skip to Main Content

I am trying to replicate Kafka cluster with MirrorMaker 2.0. I am using following mm2.properties:

name = mirror-site1-site2
topics = .*
connector.class = org.apache.kafka.connect.mirror.MirrorSourceConnector
tasks.max = 1
plugin.path=/usr/share/java/kafka/plugin
clusters = site1, site2

# for demo, source and target clusters are the same
source.cluster.alias = site1
target.cluster.alias = site2

site1.sasl.mechanism=SCRAM-SHA-256
site1.security.protocol=SASL_PLAINTEXT
site1.sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required 
   username="<someuser>" 
   password="<somepass>";

site2.sasl.mechanism=SCRAM-SHA-256
site2.security.protocol=SASL_PLAINTEXT
site2.sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required 
   username="<someuser>" 
   password="<somepass>";

site1.bootstrap.servers = <IP1>:9093, <IP2>:9093, <IP3>:9093, <IP4>:9093
site2.bootstrap.servers = <IP5>:9093, <IP6>:9093, <IP7>:9093, <IP8>:9093

site1->site2.enabled = true
site1->site2.topics = topic1


# use ByteArrayConverter to ensure that records are not re-encoded
key.converter = org.apache.kafka.connect.converters.ByteArrayConverter
value.converter = org.apache.kafka.connect.converters.ByteArrayConverter

So here’s the issue, mm2 seems to allways replicate x3 messages :

# Manual message production: 

 kafkacat -P -b <IP1>:9093,<IP2>:9093,<IP3>:9093,<IP4>:9093 -t "topic1"


# Result in the source topic (site1 cluster): 

% Reached end of topic topic1 [2] at offset 405
Message1
% Reached end of topic topic1 [2] at offset 406
Message2
% Reached end of topic topic1 [6] at offset 408
Message3
% Reached end of topic topic1 [2] at offset 407

 kafkacat -P -b <IP5>:9093,<IP6>:9093,<IP7>:9093,<IP8>:9093 -t "site1.topic1"

# Result in the target topic (site2 cluster): 

% Reached end of topic site1.titi [2] at offset 1216
Message1
Message1
Message1
% Reached end of topic site1.titi [2] at offset 1219
Message2
Message2
Message2
% Reached end of topic site1.titi [6] at offset 1229
Message3
Message3
Message3

I tried using Kafka from confluent package and kafka_2.13-2.4.0 directly from Apache, both with Debian 10.1.

I first encouraged this behaviour with confluent 5.4, thought it could be a bug in their package as they have replicator and should not really care about mm2, but I reproduced exactly the same issue with kafka_2.13-2.4.0 directly from Apache without any change.

I’m aware that mm2 is not yet idempotent and can’t guarantee once delivery. In my tests (I tried many things including producer tuning or bigger batch of thousand messages). In all these test mm2 always duplicate X3 all messages.

Did I miss something, did someone encourage the same thing ? As a site note with legacy mm1 with the same packages I don’t have this issue.

Appreciate any help… Thanks !


Even if the changelog didnt made me very confident about an improvement I tried again to run a mm2, from kafka 2.4.1 this time. => no change allways these strange duplications.

I installed this released on a new server to ensure the strange behaviour I met wasnt something related to the server.

As I use ACL does I need special right ? I put “all” thinking it cant be more permisive… Even if mm2 isnt idempotent yep, I’ll give a try to the right related to that.

That suprise me the more is that I cant find anything reporting an issue like this, for sure I must do something wrong, but what that is the question…

2

Answers


  1. Enabling idempotence to the client config will fix the issue. By default it will be set to false. Add the below to the mm2.properties file

    source.cluster.producer.enable.idempotence = true
    target.cluster.producer.enable.idempotence = true
    
    Login or Signup to reply.
  2. You need to remove connector.class = org.apache.kafka.connect.mirror.MirrorSourceConnector from your configuration, because this is telling Mirror Maker to use this class for Heartbeats and Checkpoints connectors that it generates along with the Source connector that replicates data, and this class makes them behave exactly like a Source connector, so that’s why you get 3 messages replicated each time, you’ve actually generated 3 Source connectors.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search