skip to Main Content

I am using global window with repeated forever after processing time trigger to process streaming data from pub-sub as below :

PCollection<KV<String,SMMessage>> perMSISDNLatestEvents = messages
        .apply("Apply global window",Window.<SMMessage>into(new GlobalWindows())
                .triggering(Repeatedly.forever(AfterProcessingTime.pastFirstElementInPane().plusDelayOf(Duration.standardMinutes(1))))
                .discardingFiredPanes())
        .apply("Convert into kv of msisdn and SM message", ParDo.of(new SmartcareMessagetoKVFn()))
        .apply("Get per MSISDN latest event",Latest.perKey()).apply("Write into Redis", ParDo.of(new WriteRedisFn()));

Is there a way to make repeatedly forever apache beam trigger to only execute after the previous execution is completed ? The reason for my question is because the next trigger processing will need to read data from redis, written by the previous trigger execution.

Thank You

2

Answers


  1. You could try and explicitly declare a global window trigger, as the example below:

    Trigger subtrigger = AfterProcessingTime.pastFirstElementInPane(); 
    Trigger maintrigger = Repeatedly.forever(subtrigger);
    

    I think that triggers would help you on your case, since it will allow you to create event times, which will run when you or your code trigger them, so you would only run repeatedly forever when a trigger finishes first.

    I found this documentation which might guide you on the triggers you are trying to create.

    Login or Signup to reply.
  2. So the trigger here would fire at the interval you provided. The trigger is not aware of any downstream processing so it’s unable to depend on such steps of your pipeline.

    Instead of depending on the trigger for consistency here, you could add a barrier (a DoFn) that exists before the Write step and only gives up execution after you see the previous data in Redis.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search