I am building a Kafka CDC pipeline, but the document I am following runs many docker run commands.
I want to put them all into a docker-compose.yml,
but there is one last command that I cannot convert.
These are the commands:
docker run -d --name postgres \
  -p 5432:5432 \
  -e POSTGRES_USER=start_data_engineer \
  -e POSTGRES_PASSWORD=password debezium/postgres:12
docker run -d --name zookeeper -p 2181:2181 -p 2888:2888 -p 3888:3888 debezium/zookeeper:1.1
docker run -d --name kafka -p 9092:9092 --link zookeeper:zookeeper debezium/kafka:1.1
docker run -d --name connect -p 8083:8083 --link kafka:kafka \
  --link postgres:postgres \
  -e BOOTSTRAP_SERVERS=kafka:9092 \
  -e GROUP_ID=sde_group \
  -e CONFIG_STORAGE_TOPIC=sde_storage_topic \
  -e OFFSET_STORAGE_TOPIC=sde_offset_topic debezium/connect:1.1
This is the command I cannot convert:
docker run -it --rm --name consumer --link zookeeper:zookeeper \
  --link kafka:kafka debezium/kafka:1.1 \
  watch-topic -a bankserver1.bank.holding --max-messages 1 | grep '^{' | jq
Here is my docker-compose.yml so far:
version: '2'
services:
  zookeeper:
    image: debezium/zookeeper
    ports:
      - 2181:2181
      - 2888:2888
      - 3888:3888
  kafka:
    image: debezium/kafka
    ports:
      - 9092:9092
    links:
      - zookeeper
    environment:
      - ZOOKEEPER_CONNECT=zookeeper:2181
  postgres:
    image: debezium/postgres:9.6
    ports:
      - "5432:5432"
    environment:
      - POSTGRES_USER=user
      - POSTGRES_PASSWORD=password
  connect:
    image: debezium/connect
    ports:
      - 8083:8083
      - 5005:5005
    links:
      - kafka
      - postgres
      - zookeeper
    environment:
      - BOOTSTRAP_SERVERS=kafka:9092
      - GROUP_ID=1
      - CONFIG_STORAGE_TOPIC=my_connect_configs
      - OFFSET_STORAGE_TOPIC=my_connect_offsets
      - STATUS_STORAGE_TOPIC=my_source_connect_statuses
  consumer:
    image: debezium/kafka:1.1
    links:
      - zookeeper
      - kafka
    command: watch-topic -a bankserver1.bank.holding --max-messages 1 | grep '^{' | jq
When I run docker-compose up, everything runs normally, but the consumer always fails with this output:
The ZOOKEEPER_CONNECT variable must be set, or the container must be linked to one that runs Zookeeper.
consumer_1 | WARNING: Using default BROKER_ID=1, which is valid only for non-clustered installations.
consumer_1 | The ZOOKEEPER_CONNECT variable must be set, or the container must be linked to one that runs Zookeeper.
Update:
For now I just want to read the topic and shut down, to make sure it works first.
Later I will have a source handle the reading.
docker run -it --rm --name consumer --link zookeeper:zookeeper --link kafka:kafka debezium/kafka:1.1 watch-topic -a bankserver1.bank.holding | grep --line-buffered '^{' | <your-file-path>/stream.py > my-output/holding_pivot.txt
2 Answers
Following will work…
The points are:
As the error says, you need to provide a ZOOKEEPER_CONNECT. However, you should be using entrypoint there, not command.
In any case, I don’t know if the Debezium container will have the Python modules for you to pipe into stream.py, or what watch-topic does, but you don’t need another debezium/kafka container since you can exec into the running one.
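To make the two points above concrete, here is a minimal sketch of how the consumer service might look. It is not taken from the Debezium documentation and rests on several assumptions: that the image's startup script lives at /docker-entrypoint.sh, that bash is available in the image, that watch-topic also needs a KAFKA_BROKER variable, and that jq is not installed in the container (so it is left out of the pipe). The background is that links in a version 2 Compose file do not inject the legacy link environment variables the way docker run --link does, so ZOOKEEPER_CONNECT has to be set explicitly, and that Compose does not run command through a shell, so the | characters are passed to watch-topic as literal arguments unless a shell is invoked.

```yaml
# Fragment to place under the existing top-level "services:" key.
  consumer:
    image: debezium/kafka:1.1
    links:
      - zookeeper
      - kafka
    environment:
      # Set explicitly, because Compose v2 links do not create the
      # legacy link variables the startup script falls back to.
      - ZOOKEEPER_CONNECT=zookeeper:2181
      # Assumption: watch-topic reads the broker address from this variable.
      - KAFKA_BROKER=kafka:9092
    # Run through a shell so the pipe is interpreted as a pipe.
    # Assumption: the image's startup script is /docker-entrypoint.sh.
    entrypoint:
      - /bin/bash
      - -c
      - "/docker-entrypoint.sh watch-topic -a bankserver1.bank.holding --max-messages 1 | grep '^{'"
```

If the output should still go through jq or stream.py, it is probably simpler to keep those on the host and pipe the container's output into them, as the original docker run command does.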