
I am building a Kafka CDC pipeline, but the document I am following runs many docker run commands.

I want to put them all into a docker-compose.yml, but there is one last command that I cannot convert.

These are the commands:

docker run -d --name postgres \
           -p 5432:5432 \
           -e POSTGRES_USER=start_data_engineer \
           -e POSTGRES_PASSWORD=password debezium/postgres:12

docker run -d --name zookeeper -p 2181:2181 -p 2888:2888 -p 3888:3888 debezium/zookeeper:1.1
docker run -d --name kafka -p 9092:9092 --link zookeeper:zookeeper debezium/kafka:1.1

docker run -d --name connect -p 8083:8083 --link kafka:kafka \
                                          --link postgres:postgres \
                             -e BOOTSTRAP_SERVERS=kafka:9092 \
                             -e GROUP_ID=sde_group \
                             -e CONFIG_STORAGE_TOPIC=sde_storage_topic \
                             -e OFFSET_STORAGE_TOPIC=sde_offset_topic debezium/connect:1.1

This is the command I cannot convert:

docker run -it --rm --name consumer --link zookeeper:zookeeper \
                                    --link kafka:kafka debezium/kafka:1.1 \
               watch-topic -a bankserver1.bank.holding --max-messages 1 | grep '^{' | jq

Here is my docker-compose.yml so far

version: '2'
services:
  zookeeper:
    image: debezium/zookeeper
    ports:
     - 2181:2181
     - 2888:2888
     - 3888:3888
  kafka:
    image: debezium/kafka
    ports:
     - 9092:9092
    links:
     - zookeeper
    environment:
     - ZOOKEEPER_CONNECT=zookeeper:2181
  postgres:
    image: debezium/postgres:9.6
    ports:
     - "5432:5432"
    environment:
     - POSTGRES_USER=user
     - POSTGRES_PASSWORD=password
  connect:
    image: debezium/connect
    ports:
     - 8083:8083
     - 5005:5005
    links:
     - kafka
     - postgres
     - zookeeper
    environment:
     - BOOTSTRAP_SERVERS=kafka:9092
     - GROUP_ID=1
     - CONFIG_STORAGE_TOPIC=my_connect_configs
     - OFFSET_STORAGE_TOPIC=my_connect_offsets
     - STATUS_STORAGE_TOPIC=my_source_connect_statuses
  consumer:
    image: debezium/kafka:1.1
    links:
     - zookeeper
     - kafka
    command: watch-topic -a bankserver1.bank.holding --max-messages 1 | grep '^{' | jq

When I run docker-compose up, everything else runs normally, but the consumer always fails with this output:

The ZOOKEEPER_CONNECT variable must be set, or the container must be linked to one that runs Zookeeper.
consumer_1   | WARNING: Using default BROKER_ID=1, which is valid only for non-clustered installations.
consumer_1   | The ZOOKEEPER_CONNECT variable must be set, or the container must be linked to one that runs Zookeeper.

Update:
For now I just want to read the topic and shut down, to make sure it works first.

Later I will have a script handle the reading, like this:

docker run -it --rm --name consumer --link zookeeper:zookeeper --link kafka:kafka debezium/kafka:1.1 watch-topic -a bankserver1.bank.holding | grep --line-buffered '^{' | <your-file-path>/stream.py > my-output/holding_pivot.txt

2 Answers


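  1. The following will work.
    The key points are:

    • I don’t know why, but ZOOKEEPER_CONNECT and KAFKA_BROKER are not set automatically, so set them explicitly.
    • You must break the command into a list.
    • Finally, the piped commands (grep, jq) never ran inside the container; in the original docker run line they ran on the host, so they do not belong in command.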
    version: '2'
    services:
      zookeeper:
        image: debezium/zookeeper
        ports:
         - 2181:2181
         - 2888:2888
         - 3888:3888
      kafka:
        image: debezium/kafka
        ports:
         - 9092:9092
        environment:
         - ZOOKEEPER_CONNECT=zookeeper:2181
      postgres:
        image: debezium/postgres:9.6
        ports:
         - "5432:5432"
        environment:
         - POSTGRES_USER=user
         - POSTGRES_PASSWORD=password
      connect:
        image: debezium/connect
        ports:
         - 8083:8083
         - 5005:5005
        environment:
         - BOOTSTRAP_SERVERS=kafka:9092
         - GROUP_ID=1
         - CONFIG_STORAGE_TOPIC=my_connect_configs
         - OFFSET_STORAGE_TOPIC=my_connect_offsets
         - STATUS_STORAGE_TOPIC=my_source_connect_statuses
      consumer:
        image: debezium/kafka:1.1
        environment:
         - ZOOKEEPER_CONNECT=zookeeper:2181
         - KAFKA_BROKER=kafka:9092
        command: 
         - watch-topic 
         - -a 
         - bankserver1.bank.holding 
         - --max-messages 
         - "1"
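
As a minimal sketch of the last point: Compose hands `command` to the image's entrypoint as plain arguments, so no shell ever interprets `|`. One option is to keep the filtering on the host, as the original `docker run` line did. The `json_lines` helper name and the sample input are made up for illustration, and the commented `docker-compose run` line assumes the stack above is already up.

```shell
#!/bin/sh
# Hypothetical helper: keep only lines that look like JSON objects,
# dropping log noise such as WARNING lines.
json_lines() {
  grep --line-buffered '^{'
}

# Assumed usage against the Compose stack (needs Docker, so left commented out):
#   docker-compose run --rm -T consumer | json_lines | jq .

# Stand-alone demo with made-up sample input:
demo_output=$(printf '%s\n' 'WARNING: Using default BROKER_ID=1' '{"sample": 1}' | json_lines)
echo "$demo_output"
```

The `-T` flag disables pseudo-TTY allocation, which tends to behave better when the output is piped into other programs.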
    
  2. the consumer always fail with this output.

    As the error says, you need to provide a ZOOKEEPER_CONNECT. However, you should be using entrypoint there, not command.
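
The error text suggests the image's startup script looks for either an explicit `ZOOKEEPER_CONNECT` or variables left behind by a legacy container link. The sketch below is a hypothetical illustration of that kind of check, not the actual Debezium script. The function name `resolve_zookeeper` is invented, and the fallback variable `ZOOKEEPER_PORT_2181_TCP_ADDR` (the naming pattern that legacy `docker run --link` uses) is an assumption about what such a script might read.

```shell
#!/bin/sh
# Hypothetical sketch, NOT the real Debezium entrypoint.
resolve_zookeeper() {
  if [ -n "${ZOOKEEPER_CONNECT:-}" ]; then
    # Explicit setting, e.g. from a Compose `environment:` entry.
    echo "$ZOOKEEPER_CONNECT"
  elif [ -n "${ZOOKEEPER_PORT_2181_TCP_ADDR:-}" ]; then
    # Variable of the kind injected by legacy `docker run --link zookeeper:zookeeper`.
    echo "${ZOOKEEPER_PORT_2181_TCP_ADDR}:2181"
  else
    echo "The ZOOKEEPER_CONNECT variable must be set, or the container must be linked to one that runs Zookeeper." >&2
    return 1
  fi
}

# Demo: with the variable set explicitly, the address resolves.
resolved=$(ZOOKEEPER_CONNECT=zookeeper:2181; export ZOOKEEPER_CONNECT; resolve_zookeeper)
echo "$resolved"
```

As far as I know, Compose file version 2 lets services reach each other by hostname but does not inject link-style variables, which would explain why the variable has to be set explicitly in `environment:`.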

    In any case, I don’t know whether the Debezium container has the Python modules you need to pipe into stream.py, or what watch-topic does, but you don’t need another debezium/kafka container, since you can exec into the running one.

    docker-compose exec kafka \
      bash -c "watch-topic -a bankserver1.bank.holding | grep --line-buffered '^{' | <your-file-path>/stream.py > my-output/holding_pivot.txt"
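
The `bash -c "..."` wrapper is what makes the pipe work: without a shell to parse it, `|` is passed along as an ordinary argument. Here is a small stand-alone sketch of the difference, using `echo` and `tr` as stand-ins for the real commands:

```shell
#!/bin/sh
# Exec form: every word, including "|", reaches the program as a plain argument.
exec_form=$(echo hello '|' tr a-z A-Z)

# Shell form: an inner shell parses the string, so "|" becomes a real pipe.
shell_form=$(sh -c "echo hello | tr a-z A-Z")

echo "exec form:  $exec_form"
echo "shell form: $shell_form"
```

Everything inside the quoted string, including the `>` redirect, is evaluated inside the container, so the output file in the command above would be written there rather than on the host.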
    