Commit fb768f9

add distributed version of examples using docker with both strimzi and debezium images

elakito committed Sep 10, 2020
1 parent 95fdb70 commit fb768f9
Showing 38 changed files with 1,084 additions and 75 deletions.
13 changes: 1 addition & 12 deletions README.md
@@ -136,17 +136,6 @@ of the column can be `Int, Float, Decimal, Timestamp`. This considers SAP DB Tim
* `{topic}.partition.count` - This setting specifies the number of topic partitions that the Source connector uses to publish the data. The value should be an `Integer`. Default value is `1`.
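For illustration, a minimal sketch of how this setting might appear in a source connector properties file (the topic name `test_topic_1` follows the other samples in this repository; the count of `4` is an arbitrary example):

```
# hypothetical: publish data for test_topic_1 across 4 partitions
test_topic_1.partition.count=4
```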


#### Sample Configurations
- Source and Sink Properties
  - Sample Connectors: [connect-hana-source.properties](config/connect-hana-source.properties), [connect-hana-sink.properties](config/connect-hana-sink.properties)
  - PERSONS1 batch-mode Connectors: [connect-hana-source-1.properties](config/connect-hana-source-1.properties), [connect-hana-sink-1.properties](config/connect-hana-sink-1.properties)
  - PERSONS2 incrementing-mode Connectors: [connect-hana-source-2.properties](config/connect-hana-source-2.properties), [connect-hana-sink-2.properties](config/connect-hana-sink-2.properties)
- Connector Properties
  - Standalone Connector with JSON: [connect-standalone.properties](config/connect-standalone.properties)
  - Standalone Connector with Avro using Confluent Schema Registry: [connect-avro-confluent.properties](config/connect-avro-confluent.properties)
  - Standalone Connector with Avro using Apicurio Schema Registry: [connect-avro-apicurio.properties](config/connect-avro-apicurio.properties)


## Examples

Folder [`examples`](examples) includes some example scenarios. In addition, the `unit tests` provide examples of every possible mode in which the connector can be configured.
@@ -157,7 +146,7 @@ We welcome comments, questions, and bug reports. Please [create an issue](https:

## Contributing

Contributions are accepted by sending Pull Requests to this repo.
Contributions are accepted by sending Pull Requests to this repo. Please do not forget to sign the [Contribution License Agreement](https://cla-assistant.io/SAP/kafka-connect-sap).

## Todos

2 changes: 1 addition & 1 deletion config/connect-hana-sink-1.properties
@@ -5,7 +5,7 @@ name=test_topic_1_sink
connector.class=com.sap.kafka.connect.sink.hana.HANASinkConnector
tasks.max=1
topics=test_topic_1
connection.url=jdbc:sap://<url>/
connection.url=jdbc:sap://<host>/
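# hypothetical filled-in example (host name and port are placeholders, not from this repository):
# connection.url=jdbc:sap://myhana.example.com:30015/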
connection.user=<username>
connection.password=<password>
auto.create=true
2 changes: 1 addition & 1 deletion config/connect-hana-sink-2.properties
@@ -5,7 +5,7 @@ name=test_topic_2_sink
connector.class=com.sap.kafka.connect.sink.hana.HANASinkConnector
tasks.max=1
topics=test_topic_2
connection.url=jdbc:sap://<url>/
connection.url=jdbc:sap://<host>/
connection.user=<username>
connection.password=<password>
auto.create=true
2 changes: 1 addition & 1 deletion config/connect-hana-sink.properties
@@ -3,7 +3,7 @@ connector.class=com.sap.kafka.connect.sink.hana.HANASinkConnector
tasks.max=2

topics=test_topic1,test_topic2
connection.url=jdbc:sap://<url>/
connection.url=jdbc:sap://<host>/
connection.user=<username>
connection.password=<password>
batch.size=3000
2 changes: 1 addition & 1 deletion config/connect-hana-source-1.properties
@@ -6,7 +6,7 @@ name=test-topic-1-source
connector.class=com.sap.kafka.connect.source.hana.HANASourceConnector
tasks.max=1
topics=test_topic_1
connection.url=jdbc:sap://<url>/
connection.url=jdbc:sap://<host>/
connection.user=<username>
connection.password=<password>
test_topic_1.table.name=<schemaname>."PERSONS1"
2 changes: 1 addition & 1 deletion config/connect-hana-source-2.properties
@@ -6,7 +6,7 @@ name=test-topic-2-source
connector.class=com.sap.kafka.connect.source.hana.HANASourceConnector
tasks.max=1
topics=test_topic_2
connection.url=jdbc:sap://<url>/
connection.url=jdbc:sap://<host>/
connection.user=<username>
connection.password=<password>
mode=incrementing
2 changes: 1 addition & 1 deletion config/connect-hana-source.properties
@@ -3,7 +3,7 @@ connector.class=com.sap.kafka.connect.source.hana.HANASourceConnector
tasks.max=1

topics=test_topic1,test_topic2
connection.url=jdbc:sap://<url>/
connection.url=jdbc:sap://<host>/
connection.user=<username>
connection.password=<password>
mode=incrementing
3 changes: 3 additions & 0 deletions examples/README.md
@@ -5,3 +5,6 @@
* [persons3](persons3) : standalone Source and Sink HANA-Connectors using Apicurio Schema Registry (JSON)
* [persons4](persons4) : standalone Source and Sink HANA-Connectors using Apicurio Schema Registry (Avro)
* [persons6](persons6) : standalone Source and Sink HANA-Connectors using Confluent Schema Registry (Avro)
* [persons1ds](persons1ds) : distributed version of persons1 using Docker with Strimzi Kafka image
* [persons1db](persons1db) : distributed version of persons1 using Docker with Debezium Kafka-Connect image
* [persons4ds](persons4ds) : distributed version of persons4 using Docker with Strimzi Kafka image
13 changes: 13 additions & 0 deletions examples/persons1/Makefile
@@ -0,0 +1,13 @@
#
#
# make get_libs - download the required jar files into target
#

DOCKER_TAG ?= latest

.PHONY: get_libs
get_libs:
@echo "Getting jar files into target ..."
@mkdir -p target
@cp ../../target/kafka-connect-hana-*.jar target
@cat driver-jars.txt | xargs -I '{}' mvn -q dependency:get -Dartifact='{}' -Dtransitive=false -Ddest=target
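The xargs pipeline above issues one `mvn dependency:get` call per line of `driver-jars.txt`. As a sketch, the equivalent manual invocation for the HANA JDBC driver listed in that file would be:

```
$ mvn -q dependency:get -Dartifact=com.sap.cloud.db.jdbc:ngdbc:2.5.49 -Dtransitive=false -Ddest=target
```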
27 changes: 11 additions & 16 deletions examples/persons1/README.md
@@ -31,30 +31,25 @@ For more information regarding how to start Kafka, refer to https://kafka.apache.

We install the jar files into a dedicated directory within the `plugins` directory that we create at `$KAFKA_HOME`.

First, we create a plugins directory `$KAFKA_HOME/plugins` if not yet created and create directory `kafka-connect-hana` within this directory. Assuming this project has been built (see Building), copy the connector's jar file into this directory.

First, we create a plugins directory `$KAFKA_HOME/plugins` if not yet created and create directory `kafka-connect-hana` within this directory.
```
$ mkdir -p $KAFKA_HOME/plugins/kafka-connect-hana
$ cp $KAFKA_CONNECT_SAP/target/kafka-connect-hana-1.0-SNAPSHOT.jar $KAFKA_HOME/plugins/kafka-connect-hana
$
```

Download the HANA JDBC driver and put it into the target `$KAFKA_HOME/plugins/kafka-connect-hana` directory.
Assuming this project has been built (see Building), run `make get_libs` to place the required jar files, including the HANA JDBC driver, into the `target` directory.

```
$ wget https://repo1.maven.org/maven2/com/sap/cloud/db/jdbc/ngdbc/2.5.49/ngdbc-2.5.49.jar
--2020-07-24 20:14:06-- https://repo1.maven.org/maven2/com/sap/cloud/db/jdbc/ngdbc/2.5.49/ngdbc-2.5.49.jar
Resolving repo1.maven.org (repo1.maven.org)... 151.101.112.209
Connecting to repo1.maven.org (repo1.maven.org)|151.101.112.209|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1219123 (1.2M) [application/java-archive]
Saving to: 'ngdbc-2.5.49.jar'
ngdbc-2.5.49.jar 100%[===========================================>] 1.16M 5.23MB/s in 0.2s
2020-07-24 20:14:06 (5.23 MB/s) - 'ngdbc-2.5.49.jar' saved [1219123/1219123]
$ make get_libs
Getting jar files into target ...
$ ls target
kafka-connect-hana-1.0-SNAPSHOT.jar ngdbc-2.5.49.jar
$
```
Copy these jar files into the `$KAFKA_HOME/plugins/kafka-connect-hana` directory.

$ mv ngdbc-2.5.49.jar $KAFKA_HOME/plugins/kafka-connect-hana
```
$ cp target/*.jar $KAFKA_HOME/plugins/kafka-connect-hana
$
```
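For Kafka Connect to discover the plugins, the worker configuration's standard `plugin.path` property must point at the plugins directory; a minimal sketch (the actual path depends on your `$KAFKA_HOME`):

```
# e.g. in config/connect-standalone.properties
plugin.path=/opt/kafka/plugins
```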

1 change: 1 addition & 0 deletions examples/persons1/driver-jars.txt
@@ -0,0 +1 @@
com.sap.cloud.db.jdbc:ngdbc:2.5.49
7 changes: 7 additions & 0 deletions examples/persons1db/Dockerfile
@@ -0,0 +1,7 @@
FROM debezium/connect:1.2
USER root:root
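# The Debezium Connect base image loads connector plugins from /kafka/connect
# (cf. "Plugins are loaded from /kafka/connect" in the Step 5 startup log)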

RUN mkdir -p /kafka/connect/kafka-connector-hana
COPY ./target/ /kafka/connect/kafka-connector-hana/

USER 1001
19 changes: 19 additions & 0 deletions examples/persons1db/Makefile
@@ -0,0 +1,19 @@
#
#
# make get_libs - download the required jar files into target
# make docker_build - build the docker image
#

DOCKER_TAG ?= latest

.PHONY: get_libs
get_libs:
@echo "Getting jar files into target ..."
@mkdir -p target
@cp ../../target/kafka-connect-hana-*.jar target
@cat driver-jars.txt | xargs -I '{}' mvn -q dependency:get -Dartifact='{}' -Dtransitive=false -Ddest=target

.PHONY: docker_build
docker_build:
@echo "Building docker image ..."
docker build . -t debezium-connector-hana-min
220 changes: 220 additions & 0 deletions examples/persons1db/README.md
@@ -0,0 +1,220 @@
### Example persons1db: kafka-connect-hana using Debezium Kafka images with Docker

This example is a distributed version of example persons1 using Debezium Kafka Connect Docker images.

#### Prerequisites

- This project is built (or its jar file is available)
- Access to HANA
- Docker

#### Running

This description assumes Docker and Docker-Compose are available on the local machine.

##### Step 1: Start Zookeeper and Kafka

First, we start both Zookeeper and Kafka using the Debezium Zookeeper and Kafka images. If you are not familiar with Debezium, you can find more information in the Debezium tutorial at https://debezium.io/documentation/reference/1.2/tutorial.html

Start Zookeeper using the following `docker run` command.

```
$ docker run -it --rm --name zookeeper -p 2181:2181 -p 2888:2888 -p 3888:3888 debezium/zookeeper:1.2
Starting up in standalone mode
ZooKeeper JMX enabled by default
Using config: /zookeeper/conf/zoo.cfg
2020-09-09 22:42:33,018 - INFO [main:QuorumPeerConfig@135] - Reading configuration from: /zookeeper/conf/zoo.cfg
2020-09-09 22:42:33,031 - INFO [main:QuorumPeerConfig@387] - clientPortAddress is 0.0.0.0:2181
...
```

Start Kafka using the following `docker run` command.

```
$ docker run -it --rm --name kafka -p 9092:9092 --link zookeeper:zookeeper debezium/kafka:1.2
WARNING: Using default BROKER_ID=1, which is valid only for non-clustered installations.
Using ZOOKEEPER_CONNECT=172.17.0.2:2181
Using KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://172.17.0.3:9092
2020-09-09 22:43:05,396 - INFO [main:Log4jControllerRegistration$@31] - Registered kafka:type=kafka.Log4jController MBean
2020-09-09 22:43:05,934 - INFO [main:X509Util@79] - Setting -D jdk.tls.rejectClientInitiatedRenegotiation=true to disable client-initiated TLS renegotiation
...
```

Before we start Kafka-Connect, we must prepare the Docker image that contains kafka-connect-hana.

##### Step 2: Build the Docker image for kafka-connect-hana

First, run `make get_libs` to place the required jar files into the `target` directory.

```
$ make get_libs
Getting jar files into target ...
$
$ ls target
guava-20.0.jar ngdbc-2.5.49.jar
kafka-connect-hana-1.0-SNAPSHOT.jar
$
```

Next, run `make docker_build` to build the Docker image based on Debezium's Kafka Connect image, adding the jar files to its plugins directory.


```
$ make docker_build
Building docker image ...
docker build . -t debezium-connector-hana-min
Sending build context to Docker daemon 3.868MB
Step 1/5 : FROM debezium/connect:1.2
---> 66f074fce2f0
Step 2/5 : USER root:root
---> Using cache
---> 64a9079cfa93
Step 3/5 : RUN mkdir -p /kafka/connect/kafka-connector-hana
---> Using cache
---> 174c3e1fb6db
Step 4/5 : COPY ./target/ /kafka/connect/kafka-connector-hana/
---> Using cache
---> 8f300532bf25
Step 5/5 : USER 1001
---> Using cache
---> b38cc3546555
Successfully built b38cc3546555
Successfully tagged debezium-connector-hana-min:latest
$
```
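To confirm that the image is now available locally, you can list it with the standard Docker CLI (the CREATED and SIZE values below are illustrative; the image ID matches the build output above):

```
$ docker images debezium-connector-hana-min
REPOSITORY                    TAG      IMAGE ID       CREATED          SIZE
debezium-connector-hana-min   latest   b38cc3546555   2 minutes ago    700MB
```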

##### Step 3: Prepare the connector configuration files (Follow Step 2 of the [persons1ds example](../persons1ds))

##### Step 4: Prepare the source table (Follow Step 4 of example persons1)

##### Step 5: Start Kafka-Connect

Start Kafka-Connect using the following `docker run` command.

```
$ docker run -it --rm --name connect -p 8083:8083 -e GROUP_ID=1 -e CONFIG_STORAGE_TOPIC=my_connect_configs -e OFFSET_STORAGE_TOPIC=my_connect_offsets -e STATUS_STORAGE_TOPIC=my_connect_statuses --link zookeeper:zookeeper --link kafka:kafka debezium-connector-hana-min:latest
Plugins are loaded from /kafka/connect
Using the following environment variables:
GROUP_ID=1
CONFIG_STORAGE_TOPIC=my_connect_configs
OFFSET_STORAGE_TOPIC=my_connect_offsets
STATUS_STORAGE_TOPIC=my_connect_statuses
BOOTSTRAP_SERVERS=172.17.0.3:9092
REST_HOST_NAME=172.17.0.4
...
```

After starting these Docker containers, we can use curl to verify that Kafka-Connect is running.

```
$ curl -i http://localhost:8083/
HTTP/1.1 200 OK
Date: Wed, 09 Sep 2020 22:44:49 GMT
Content-Type: application/json
Content-Length: 91
Server: Jetty(9.4.24.v20191120)
{"version":"2.5.0","commit":"66563e712b0b9f84","kafka_cluster_id":"1NEvm9a4TW2t-f5Jkk4peg"}
$
$ curl -i http://localhost:8083/connectors
HTTP/1.1 200 OK
Date: Wed, 09 Sep 2020 22:45:35 GMT
Content-Type: application/json
Content-Length: 2
Server: Jetty(9.4.24.v20191120)
[]
$
```

The above result shows that Kafka Connect using Kafka 2.5.0 is running and that no connectors are deployed yet.

We prepare the connector JSON files `connect-hana-source-1.json` and `connect-hana-sink-1.json`, which are the JSON representations of the configuration files created for example persons1.
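For orientation, a sketch of what `connect-hana-source-1.json` may look like, derived from `config/connect-hana-source-1.properties` (the `<host>`, `<username>`, `<password>`, and `<schemaname>` placeholders must be replaced with real values):

```
{
  "name": "test-topic-1-source",
  "config": {
    "connector.class": "com.sap.kafka.connect.source.hana.HANASourceConnector",
    "tasks.max": "1",
    "topics": "test_topic_1",
    "connection.url": "jdbc:sap://<host>/",
    "connection.user": "<username>",
    "connection.password": "<password>",
    "test_topic_1.table.name": "<schemaname>.\"PERSONS1\""
  }
}
```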

Finally, we deploy the connectors by posting the connector configuration JSON files to the Kafka Connect REST API. Assuming these JSON files were prepared in Step 3, use curl to post them.

```
$ curl -i -X POST -H 'content-type:application/json' -d @connect-hana-source-1.json http://localhost:8083/connectors
HTTP/1.1 201 Created
Date: Wed, 09 Sep 2020 22:46:30 GMT
Location: http://localhost:8083/connectors/test-topic-1-source
Content-Type: application/json
Content-Length: 399
Server: Jetty(9.4.24.v20191120)
{"name":"test-topic-1-source","config":{"connector.class":"com.sap.kafka.connect.source.hana.HANASourceConnector","tasks.max":"1","topics":"test_topic_1","connection.url":"jdbc:sap://...
$
$ curl -i -X POST -H 'content-type:application/json' -d @connect-hana-sink-1.json http://localhost:8083/connectors
HTTP/1.1 201 Created
Date: Wed, 09 Sep 2020 22:46:39 GMT
Location: http://localhost:8083/connectors/test-topic-1-sink
Content-Type: application/json
Content-Length: 414
Server: Jetty(9.4.24.v20191120)
{"name":"test-topic-1-sink","config":{"connector.class":"com.sap.kafka.connect.sink.hana.HANASinkConnector","tasks.max":"1","topics":"test_topic_1","connection.url":"jdbc:sap://...
$
$ curl -i http://localhost:8083/connectors
HTTP/1.1 200 OK
Date: Wed, 09 Sep 2020 22:47:35 GMT
Content-Type: application/json
Content-Length: 43
Server: Jetty(9.4.24.v20191120)
["test-topic-1-source","test-topic-1-sink"]
$
```

The above result shows that the connectors are deployed.
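A deployed connector can be inspected further through the standard Kafka Connect REST API; for example, its status endpoint reports the connector and task states (this check is not part of the original walkthrough, and the output below is abbreviated):

```
$ curl -i http://localhost:8083/connectors/test-topic-1-source/status
HTTP/1.1 200 OK
Content-Type: application/json
{"name":"test-topic-1-source","connector":{"state":"RUNNING","worker_id":"..."},"tasks":[{"id":0,"state":"RUNNING","worker_id":"..."}],"type":"source"}
```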

##### Step 6: Verify the result

You can use Debezium's watch-topic utility to look at the topic.

```
$ docker run -it --rm --name watcher --link zookeeper:zookeeper --link kafka:kafka debezium/kafka:1.2 watch-topic -a -k test_topic_1
WARNING: Using default BROKER_ID=1, which is valid only for non-clustered installations.
Using ZOOKEEPER_CONNECT=172.17.0.2:2181
Using KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://172.17.0.5:9092
Using KAFKA_BROKER=172.17.0.3:9092
Contents of topic test_topic_1:
null {"schema":{"type":"struct","fields":[{"type":"int32","optional":false,"field":"PERSONID"},{"type":"string","optional":true,"field":"LASTNAME"},{"type":"string","optional":true,"field":"FIRSTNAME"}],"optional":false,"name":"d025803persons1"},"payload":{"PERSONID":1,"LASTNAME":"simpson","FIRSTNAME":"homer"}}
null {"schema":{"type":"struct","fields":[{"type":"int32","optional":false,"field":"PERSONID"},{"type":"string","optional":true,"field":"LASTNAME"},{"type":"string","optional":true,"field":"FIRSTNAME"}],"optional":false,"name":"d025803persons1"},"payload":{"PERSONID":2,"LASTNAME":"simpson","FIRSTNAME":"merge"}}
...
```

You can also start another Kafka container with an interactive bash shell to inspect the topic using kafka-console-consumer.sh.

First, start an interactive bash shell using the Debezium Kafka image.

```
$ docker run -it --rm --link zookeeper:zookeeper --link kafka:kafka debezium/kafka:1.2 /bin/bash
WARNING: Using default BROKER_ID=1, which is valid only for non-clustered installations.
Using ZOOKEEPER_CONNECT=172.17.0.2:2181
Using KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://172.17.0.5:9092
[kafka@2f92f972f2b6 ~]$
```

In the interactive shell, use kafka-console-consumer.sh to connect to kafka:9092 and inspect the topic.

```
[kafka@2f92f972f2b6 ~]$ bin/kafka-console-consumer.sh --bootstrap-server kafka:9092 --topic test_topic_1 --from-beginning
{"schema":{"type":"struct","fields":[{"type":"int32","optional":false,"field":"PERSONID"},{"type":"string","optional":true,"field":"LASTNAME"},{"type":"string","optional":true,"field":"FIRSTNAME"}],"optional":false,"name":"d025803persons1"},"payload":{"PERSONID":1,"LASTNAME":"simpson","FIRSTNAME":"homer"}}
{"schema":{"type":"struct","fields":[{"type":"int32","optional":false,"field":"PERSONID"},{"type":"string","optional":true,"field":"LASTNAME"},{"type":"string","optional":true,"field":"FIRSTNAME"}],"optional":false,"name":"d025803persons1"},"payload":{"PERSONID":2,"LASTNAME":"simpson","FIRSTNAME":"merge"}}
...
```

Note that this scenario builds the Docker image without a schema registry and runs Kafka Connect in distributed mode. Additional connectors that use the same distributed-connect.properties configuration can be deployed to this Kafka Connect instance.
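Connectors can likewise be removed through the same REST API without restarting the worker; a sketch for the source connector deployed above:

```
$ curl -i -X DELETE http://localhost:8083/connectors/test-topic-1-source
HTTP/1.1 204 No Content
```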

##### Step 7: Clean up

Use `docker stop` to terminate the containers.

```
$ docker stop watcher connect kafka zookeeper
watcher
connect
kafka
zookeeper
$
```
2 changes: 2 additions & 0 deletions examples/persons1db/driver-jars.txt
@@ -0,0 +1,2 @@
com.sap.cloud.db.jdbc:ngdbc:2.5.49
com.google.guava:guava:20.0