I am writing a pub/sub implementation which uses Azure EventHub as the underlying event ingestion service. In my application, the publishers will publish events to a particular EventHub partition and the consumers who are subscribed to that particular partition will receive events. Usually a consumer will be assigned to a unique EventHub ConsumerGroup, and in some cases there can be multiple consumer assignments to the same ConsumerGroup.
Let’s say I have two consumers (consumer-1, consumer-2) in the same ConsumerGroup (consumer-group-1) who are subscribed to events of a particular EventHub partition (partition ‘0’ of event-hub-1).
When we send an event to partition ‘0’ of ‘event-hub-1’, how will the message be delivered?
- Will both consumers (consumer-1, consumer-2) get the same message?
- Or will the ConsumerGroup load-balance the messages among the consumers, as in traditional Kafka, so that only one consumer gets the message?
Sample Code: https://github.com/ballerina-platform/ballerina-standard-library/issues/3483#issuecomment-1272824977
Note: The application is written in the Ballerina language, which internally uses the Kafka Java client.
2 Answers
Kafka Consumer supports two models to consume messages from a topic:
- Join a ConsumerGroup and subscribe to the topic.
- Manually assign the consumer to specific partitions of the topic.
Both models are mutually exclusive; a given Kafka Consumer should use only one model to consume messages.
When a Kafka Consumer joins a particular ConsumerGroup, the consumer will be assigned a set of partitions from the topic to which it has subscribed. Two consumers from the same ConsumerGroup are never assigned the same partition(s) of a given topic. As per the Kafka documentation, this is handled either by ZooKeeper or by the Kafka cluster itself.
But when we assign partitions manually to a consumer, the consumer does not use the consumer group management functionality.
In the above scenario I have manually assigned partitions to the consumers, and hence neither consumer uses the group management functionality. So both consumers will get all the messages sent to that particular partition. This is properly explained in the EventHub documentation.
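To make the difference concrete, here is a minimal sketch of the two delivery behaviours described above. It is a toy in-memory model in plain Java, not the Kafka client or the EventHub API: the class name DeliverySketch and both method names are hypothetical, and picking the first group member as the partition owner is an arbitrary assumption standing in for whatever the real group coordinator decides.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: illustrates who receives the events of ONE partition.
// It does not talk to Kafka or EventHub.
class DeliverySketch {

    // Manual assignment: each consumer reads the partition independently,
    // so every consumer receives every event.
    static Map<String, List<String>> manualAssignment(List<String> consumers, List<String> events) {
        Map<String, List<String>> received = new LinkedHashMap<>();
        for (String consumer : consumers) {
            received.put(consumer, new ArrayList<>(events));
        }
        return received;
    }

    // Group management: the partition is owned by exactly one group member.
    // Assumption: the first consumer is chosen as the owner; the others get nothing.
    static Map<String, List<String>> groupManaged(List<String> consumers, List<String> events) {
        Map<String, List<String>> received = new LinkedHashMap<>();
        for (int i = 0; i < consumers.size(); i++) {
            List<String> inbox = (i == 0) ? new ArrayList<>(events) : new ArrayList<>();
            received.put(consumers.get(i), inbox);
        }
        return received;
    }

    public static void main(String[] args) {
        List<String> consumers = List.of("consumer-1", "consumer-2");
        List<String> events = List.of("e1", "e2");
        System.out.println("manual assignment: " + manualAssignment(consumers, events));
        System.out.println("group managed:     " + groupManaged(consumers, events));
    }
}
```

With manual assignment both consumer-1 and consumer-2 end up with e1 and e2, while in the group-managed case only the owning consumer does. In the real Kafka Java client the two models correspond to calling assign(...) versus subscribe(...) on a KafkaConsumer.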
For more information about the inner workings of the Kafka Consumer, we could refer to the "Standalone Consumer: Why and How to Use a Consumer Without a Group" section of Kafka: The Definitive Guide, Chapter 4.
A consumer group is a "group of consumers", as the name suggests. Each consumer group gets a copy of the message, and exactly one of the consumers in that group receives it. So, in your scenario, either consumer-1 or consumer-2 will get the message, since they are in the same consumer group.