I have two services, P (Producer) and C (Consumer). P needs to:

  • Create an object X
  • Create an object Y (dependent on X)
  • Create an object Z (dependent on Y)
  • Notify C about X, Y, and Z (via Redis Streams)

C needs data from Z, Y, and X to do some local data persistence. Updates to Y are fairly common, but updates to X are rare.

From C’s perspective, is there a way to guarantee that it has all the info it needs for successful persistence?

I know that services like Kafka and Redis Streams are not generally built for this stuff, but how does one overcome this?

Idea 1:

  • Send X, Y, and Z in that particular order to the same consumer group. But if we scale the number of workers to anything above 1, we run into a problem: different workers may process the messages out of order.

Idea 2:

  • Instead of sending X and Y separately to C, I can send a compound object Z, which has Y and X embedded. But that seems like overkill, doesn’t it?

Is there any obvious way to handle object dependencies?

2 Answers


  1. This is a good question covered in this note about Redis Streams.

    We could say that schematically the following is true:

    If you use 1 stream -> 1 consumer, you are processing messages in
    order.

    If you use N streams with N consumers, so that only a given consumer
    hits a subset of the N streams, you can scale the above model of 1
    stream -> 1 consumer.

    If you use 1 stream -> N consumers, you are load balancing to N
    consumers, however in that case, messages about the same logical item
    may be consumed out of order, because a given consumer may process
    message 3 faster than another consumer is processing message 4.

    So basically Kafka partitions are more similar to using N different
    Redis keys, while Redis consumer groups are a server-side load
    balancing system of messages from a given stream to N different
    consumers.

    So if you want to process them in order and use more than one consumer you will need to deal with that yourself.
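One way to deal with it yourself, following the "N streams with N consumers" model from the quoted note, is to route every message about the same logical item to the same stream. A minimal sketch, assuming each message carries the id of the X it descends from; the stream names (`events:0`, `events:1`, ...), the `NUM_STREAMS` value, the `root_id` field, and the `stream_for` helper are all hypothetical and not part of Redis:

```python
import zlib

NUM_STREAMS = 4  # assumption: one stream per consumer


def stream_for(root_id: str, num_streams: int = NUM_STREAMS) -> str:
    """Map a logical item to a fixed stream so its messages stay in order."""
    # crc32 is stable across processes, unlike Python's built-in hash() for strings.
    partition = zlib.crc32(root_id.encode("utf-8")) % num_streams
    return f"events:{partition}"


# X, Y and Z all carry the id of the X they descend from, so they share a stream.
messages = [
    {"type": "X", "root_id": "x-42"},
    {"type": "Y", "root_id": "x-42"},
    {"type": "Z", "root_id": "x-42"},
]
targets = [stream_for(m["root_id"]) for m in messages]
```

P would then add each message to the stream returned by `stream_for`, and each consumer would read only its own stream, so messages for one item are seen in the order they were written while different items are still spread across consumers.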

  2. I believe Idea 2 is the better solution, because I think keeping the whole message in one data structure is a good idea.
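A minimal sketch of what Idea 2 could look like, assuming the objects are JSON-serializable dicts. The field names (`x`, `y`, `z`, `payload`) and both helper functions are hypothetical. P builds one self-contained entry, and C refuses to persist anything that is missing a part:

```python
import json

REQUIRED_PARTS = ("x", "y", "z")


def build_entry(x: dict, y: dict, z: dict) -> dict:
    """Producer side: pack Z together with the Y and X it depends on."""
    # Stream entries are flat field/value pairs, so nest the objects as one JSON string.
    return {"payload": json.dumps({"x": x, "y": y, "z": z})}


def parse_entry(fields: dict) -> dict:
    """Consumer side: decode the entry and check that nothing is missing."""
    data = json.loads(fields["payload"])
    missing = [part for part in REQUIRED_PARTS if part not in data]
    if missing:
        raise ValueError(f"incomplete message, missing: {missing}")
    return data


entry = build_entry({"id": 1}, {"id": 2, "x_id": 1}, {"id": 3, "y_id": 2})
parts = parse_entry(entry)
```

Because every entry carries its own X and Y, any worker in the consumer group can process it without depending on what other workers have already seen. The cost is some duplicated data when Y changes often and X rarely does.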

    You could also try using multiple keys.

    For example:

    On Service P

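    import time
    import redis  # redis-py client

    r = redis.Redis()
    now_timestamp = int(time.time())  # let's say it is 1515151551

    # Register the batch, then store each object under a key tagged with it.
    r.sadd("not_processed_timestamp", now_timestamp)
    r.set(f"X_{now_timestamp}", "INFO_OF_X")
    r.set(f"Y_{now_timestamp}", "INFO_OF_Y")
    r.set(f"Z_{now_timestamp}", "INFO_OF_Z")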
    

    On service C, create a new thread

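    import time
    import redis  # redis-py client

    r = redis.Redis(decode_responses=True)
    ts = r.spop("not_processed_timestamp")  # let's say it is "1515151551"; assumes the set is not empty
    keys = [f"X_{ts}", f"Y_{ts}", f"Z_{ts}"]
    # Redis has no blocking GET, so poll until P has set all three keys.
    while None in (values := r.mget(keys)):
        time.sleep(0.05)
    x, y, z = values

    # process the rest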
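
Waiting for the three keys can block forever if P crashes between writes, so it may be worth bounding the wait. A rough sketch, not tied to any client library: `fetch_all` and its parameters are hypothetical, and `get` is any callable that returns `None` for a missing key (for example, the `get` method of a Redis client). A plain dict stands in for Redis here:

```python
import time
from typing import Callable, Optional


def fetch_all(get: Callable[[str], Optional[str]], keys: list,
              timeout: float = 5.0, interval: float = 0.05) -> dict:
    """Poll until every key has a value, or raise TimeoutError."""
    deadline = time.monotonic() + timeout
    found = {}
    while True:
        for key in keys:
            if key not in found:
                value = get(key)
                if value is not None:
                    found[key] = value
        if len(found) == len(keys):
            return found
        if time.monotonic() >= deadline:
            missing = [k for k in keys if k not in found]
            raise TimeoutError(f"still missing: {missing}")
        time.sleep(interval)


# A dict stands in for Redis in this example.
store = {
    "X_1515151551": "INFO_OF_X",
    "Y_1515151551": "INFO_OF_Y",
    "Z_1515151551": "INFO_OF_Z",
}
keys = [f"{name}_1515151551" for name in "XYZ"]
result = fetch_all(store.get, keys, timeout=1.0)
```

On a timeout, C could put the timestamp back into `not_processed_timestamp` or log it for inspection instead of persisting partial data.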
    