
I have a problem where I need to decide what TTL to set on SQS messages.

I have an SQS queue that receives messages to copy the details (copyDetail) of a transaction. There is no point in processing a message if the txn was declined. For example, say a copyDetail message was added to the queue at T+0. Due to some issue such as an outage, the service can only process that message at T+n, and there is no upper limit on 'n'. If I have millions of stale copyDetail messages, I can't process new copyDetail messages, because the service is busy copying stale details while the new requests are blocked. So we need to evaluate how to set a TTL for such messages.

Limitations.

  • I can't contact the other service that decides whether the txn is valid. At copyDetail processing time there is no way to know the status of the txn.

NOTE

A particular txn is valid for 20 minutes only.

I'm thinking of applying a retention period of 30 minutes. But even then, if the service receives a copyDetail message within that window, it will process the message even if the transaction failed at an earlier stage.

Is there any feasible solution to this, or should I drop the idea of setting a TTL on the messages?

2 Answers


  1. It sounds to me like messages older than 20 minutes are useless? If that's the case, the naive solution would be to set the retention period (TTL) of the queue to 20 minutes (or slightly under, since you may need to allow for some processing time to clear a backlog of messages after a failure of the consumer). This approach means only processable items are retained in the queue, regardless of whether the consumer is running (but you will "lose" all un-processable messages). If the consumer fails and restarts some time later, it won't waste effort processing obviously un-processable messages that are over 20 minutes old.

    I say naive solution because in the real world you tend to want to know about failures and maybe take action (logging failures, alerting someone, starting up extra consumers to process high volumes of messages, etc.). You might also need more nuanced, system-specific behavior, like re-processing failed messages with a delay in the case of a temporary failure of the consumer. I would suggest reading up on the AWS SQS docs, specifically the available queue/message attributes (for delays and reprocessing policies), and how to use dead-letter queues as a better way of handling un-processed messages.
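
    As a rough sketch of the retention-period idea, assuming Python with boto3: the queue attribute is MessageRetentionPeriod, given in seconds (SQS accepts 60 seconds up to 14 days). The helper names retention_attributes and apply_retention are made up for illustration, and the queue URL is whatever your queue uses.

```python
MIN_RETENTION_SECONDS = 60          # smallest retention period SQS accepts
MAX_RETENTION_SECONDS = 1_209_600   # largest retention period SQS accepts (14 days)


def retention_attributes(minutes: int) -> dict:
    """Build the Attributes dict for a retention period given in minutes."""
    seconds = minutes * 60
    if not MIN_RETENTION_SECONDS <= seconds <= MAX_RETENTION_SECONDS:
        raise ValueError(f"retention of {seconds}s is outside the SQS limits")
    # SQS expects attribute values as strings.
    return {"MessageRetentionPeriod": str(seconds)}


def apply_retention(queue_url: str, minutes: int) -> None:
    """Apply the retention period to an existing queue (hypothetical helper)."""
    import boto3  # imported lazily so retention_attributes works without the AWS SDK

    sqs = boto3.client("sqs")
    sqs.set_queue_attributes(
        QueueUrl=queue_url,
        Attributes=retention_attributes(minutes),
    )
```

    With the 20-minute validity from the question, apply_retention(queue_url, 20) would ask SQS to discard anything older than 1200 seconds on its own.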

  2. Even though you've mentioned that you can't contact another service, I'll still offer my suggestions here, because it's impossible to predict whether the message is a success or failure without contacting any other service.

    Two things you can consider:

    1. Use EventBridge Pipes: it lets you filter SQS messages using patterns and send them to a variety of targets such as SQS, Lambda, API Gateway, SNS, Kinesis, etc.

    For example, if the message is like this

    Sample Message 1

    {
     "status":"failure"
    }
    

    Sample Message 2

    {
     "status":"success"
    }
    

    The filter pattern in the Pipe looks like this. Since the whole message is enclosed in the body, the pattern is nested under "body":

    {
      "body": {
        "status": ["success"]
      }
    }
    

    Note: The pipe will receive and acknowledge all the messages, but only send the messages that match the pattern to the target. So any message that doesn't match the pattern will be lost.

    Also, this works for messages that are already in the queue.
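
    To see which of the two sample messages would get through, here is a small local imitation of the pattern above in Python. It is only a sketch of exact-value matching on scalar fields, not the EventBridge implementation, and the function name matches is made up for illustration.

```python
import json


def matches(pattern: dict, event: dict) -> bool:
    """Return True if every field named in the pattern has an allowed value."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested pattern: descend into the nested object.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            # Leaf pattern: a list of allowed values.
            return False
    return True


pattern = {"body": {"status": ["success"]}}

# The JSON body of each sample message, wrapped the way the pattern expects.
message_1 = {"body": json.loads('{"status": "failure"}')}
message_2 = {"body": json.loads('{"status": "success"}')}

print(matches(pattern, message_1))  # False
print(matches(pattern, message_2))  # True
```

    Sample Message 2 would be forwarded to the target, while Sample Message 1 would be dropped.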

    2. Use Lambda:
      You can create a Lambda function and set the SQS queue as its trigger. Inside the function you can add your own logic to filter messages and redrive them to another queue, or trigger anything else you want to do.
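
    A minimal sketch of such a function in Python, assuming the time the message was sent is a good enough stand-in for the start of the 20-minute txn window. Each record in the SQS event carries a SentTimestamp attribute (epoch milliseconds, as a string). The copy_detail function is a hypothetical placeholder for the real copy logic.

```python
import json
import time

TXN_VALIDITY_SECONDS = 20 * 60  # a txn is valid for 20 minutes (from the question)


def is_stale(record: dict, now_ms: int) -> bool:
    """True if the message was sent more than TXN_VALIDITY_SECONDS ago."""
    sent_ms = int(record["attributes"]["SentTimestamp"])
    return now_ms - sent_ms > TXN_VALIDITY_SECONDS * 1000


def copy_detail(payload: dict) -> None:
    """Hypothetical placeholder for the real copyDetail processing."""


def handler(event, context):
    now_ms = int(time.time() * 1000)
    processed = skipped = 0
    for record in event["Records"]:
        if is_stale(record, now_ms):
            # Skip stale messages; when the handler returns without error,
            # Lambda deletes the whole batch from the queue, skipped ones included.
            skipped += 1
            continue
        copy_detail(json.loads(record["body"]))
        processed += 1
    return {"processed": processed, "skipped": skipped}
```

    This way a backlog of stale messages drains quickly after an outage, because each stale message costs only a timestamp comparison.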

    Additionally, you can set the retention period like MisterSmith suggested.
