
Is there any way, in Kafka, to produce a message once several related messages have been consumed (without having to control it manually in the application code)?

The use case is to take a huge file, split it into several chunks, publish a message for each chunk to a topic, and, once all of these messages have been consumed, produce another message on a different topic announcing the result.

We can do it with a database or Redis to track the state, but I wonder if there's a higher-level approach that relies only on the Kafka ecosystem.

2 Answers


  1. You can use ConsumerGroupCommand to check whether a given consumer group has finished processing all messages in a particular topic:

    $ kafka-consumer-groups --bootstrap-server broker_host:port --describe --group chunk_consumer

    OR

    $ kafka-run-class kafka.admin.ConsumerGroupCommand ...

    Zero lag on every partition indicates that the consumer group has consumed all messages currently in the topic and committed its offsets.

    Alternatively, you can choose to subscribe to the __consumer_offsets topic and process messages from it yourself, but using ConsumerGroupCommand seems like a more straightforward solution.
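To make the "zero lag" check concrete, here is a minimal sketch of the arithmetic behind it. It does not talk to Kafka: the two dictionaries stand in for the committed and log-end offsets that the describe command reports, and the names `partition_lag`, `all_consumed`, and the `chunks` topic are hypothetical, chosen only for illustration.

```python
from typing import Dict, Tuple

# (topic, partition) -> offset. A hypothetical shape used only in this sketch.
Offsets = Dict[Tuple[str, int], int]


def partition_lag(committed: Offsets, log_end: Offsets) -> Dict[Tuple[str, int], int]:
    """Return the lag per partition: log-end offset minus committed offset."""
    lag = {}
    for tp, end in log_end.items():
        # Simplification: a partition with no committed offset is treated
        # as if nothing had been read from it yet.
        done = committed.get(tp, 0)
        lag[tp] = max(end - done, 0)
    return lag


def all_consumed(committed: Offsets, log_end: Offsets) -> bool:
    """True when every partition has zero lag."""
    return all(value == 0 for value in partition_lag(committed, log_end).values())


if __name__ == "__main__":
    log_end = {("chunks", 0): 120, ("chunks", 1): 98}
    committed = {("chunks", 0): 120, ("chunks", 1): 95}
    print(partition_lag(committed, log_end))  # {('chunks', 0): 0, ('chunks', 1): 3}
    print(all_consumed(committed, log_end))   # False
```

Note that zero lag only says the group has caught up with the log end at that moment. If the producer is still publishing chunks, the lag can reach zero and then grow again, so the check is only meaningful once all chunk messages have been published.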

  2. One approach is as follows:

    1. After consuming each chunk, the application should produce a status message (status "Consumed", plus the chunk number).
    2. A second application (a Kafka Streams one) should aggregate these results and, once it has processed the status messages for all chunks, produce a final message saying that the file has been processed.
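The aggregation step in this approach can be sketched without any Kafka dependency. The answer only says that each status message carries a status and a chunk number; the `file_id` and `total_chunks` fields below, and the `ChunkStatus` and `CompletionTracker` names, are assumptions added so the aggregator can tell files apart and know when it has seen every chunk.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass(frozen=True)
class ChunkStatus:
    file_id: str        # assumption: identifies the file the chunk belongs to
    chunk_number: int   # from the answer: which chunk was consumed
    total_chunks: int   # assumption: how many chunks the file was split into


@dataclass
class CompletionTracker:
    seen: Dict[str, Set[int]] = field(default_factory=dict)
    done: Set[str] = field(default_factory=set)

    def on_status(self, status: ChunkStatus) -> Optional[str]:
        """Record one status message; return the final message exactly once."""
        if status.file_id in self.done:
            return None  # late duplicate for a file already reported
        chunks = self.seen.setdefault(status.file_id, set())
        chunks.add(status.chunk_number)  # a set makes redeliveries harmless
        if len(chunks) == status.total_chunks:
            del self.seen[status.file_id]
            self.done.add(status.file_id)
            return f"file {status.file_id} processed"
        return None


if __name__ == "__main__":
    tracker = CompletionTracker()
    for number in (0, 2, 2, 1):
        result = tracker.on_status(ChunkStatus("report.csv", number, 3))
        if result is not None:
            print(result)  # file report.csv processed
```

In a real Kafka Streams application, the same idea would roughly correspond to grouping the status topic by file key and aggregating the set of chunk numbers in a state store, with the final message written to the result topic. The in-memory dictionaries here are only a stand-in for that state.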