Is there any way in Kafka to produce a message once several related messages have been consumed, without having to track that manually in application code?
The use case is to take a huge file, split it into several chunks, publish one message per chunk to a topic, and, once all of those messages have been consumed, produce another message on a different topic announcing the result.
We could keep the state in a database or Redis, but I wonder whether there is a higher-level approach that relies only on the Kafka ecosystem.
2 Answers
You can use ConsumerGroupCommand to check whether a given consumer group has finished processing all messages in a particular topic:
$ kafka-consumer-groups --bootstrap-server broker_host:port --describe --group chunk_consumer
OR
$ kafka-run-class kafka.admin.ConsumerGroupCommand ...
Zero lag on every partition indicates that all messages have been consumed and that the consumer has committed their offsets.
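To make the "zero lag on every partition" check concrete, here is a minimal sketch in Python. It assumes you have already obtained the log-end and committed offsets per partition by some other means (for example from the CURRENT-OFFSET and LOG-END-OFFSET columns of the --describe output); the function names and the topic name "chunks" are hypothetical and only serve the illustration.

```python
from typing import Dict, Tuple

# Maps (topic, partition) to an offset.
Offsets = Dict[Tuple[str, int], int]


def partition_lag(end_offsets: Offsets, committed: Offsets) -> Offsets:
    """Return the lag per partition: log-end offset minus committed offset."""
    lag = {}
    for tp, end in end_offsets.items():
        # Assumption: a partition without a committed offset counts as
        # not consumed at all, so its lag is the full log-end offset.
        lag[tp] = max(end - committed.get(tp, 0), 0)
    return lag


def all_chunks_consumed(end_offsets: Offsets, committed: Offsets) -> bool:
    """True when every partition has zero lag (vacuously true if there are none)."""
    return all(v == 0 for v in partition_lag(end_offsets, committed).values())


if __name__ == "__main__":
    end = {("chunks", 0): 120, ("chunks", 1): 95}
    caught_up = {("chunks", 0): 120, ("chunks", 1): 95}
    behind = {("chunks", 0): 120, ("chunks", 1): 90}
    print(all_chunks_consumed(end, caught_up))  # True
    print(all_chunks_consumed(end, behind))     # False
```

Keep in mind that lag is a point-in-time snapshot: the check only means "all chunks are done" if the producer has already finished publishing every chunk before you take the reading.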
Alternatively, you can subscribe to the __consumer_offsets topic and process its messages yourself, but using ConsumerGroupCommand seems like the more straightforward solution.
The approach can be as follows: