We plan to use AWS SQS to queue events created by a web service and then use several workers to process those events. Each event must be processed only once. According to the AWS SQS documentation, a standard queue can “occasionally” deliver duplicate messages but offers unlimited throughput, while a FIFO queue will not deliver duplicates but is limited to 300 API calls per second (equivalent to 3,000 messages per second with batchSize=10). Our current peak-hour traffic is only 80 messages per second, so both queue types meet our throughput requirement. However, when I started using an AWS SQS FIFO queue, I found that it requires extra work, such as providing the “MessageGroupId” and “MessageDeduplicationId” parameters or enabling the “ContentBasedDeduplication” setting. So I am not sure which is the better solution. We only need the messages not to be duplicated; we do not need them to be FIFO.
Solution #1:
Use an AWS SQS FIFO queue. For each message, generate a UUID for the “MessageGroupId” and “MessageDeduplicationId” parameters.
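A minimal sketch of Solution #1 with boto3 (the queue URL and event payload below are placeholders; an existing FIFO queue is assumed):

```python
import uuid
import boto3

sqs = boto3.client("sqs")

# Placeholder URL; FIFO queue names must end in ".fifo".
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/events.fifo"

def publish_event(event_body: str) -> None:
    # Solution #1: a fresh UUID for both MessageGroupId and
    # MessageDeduplicationId. A unique group id per message means there is
    # no ordering constraint across messages, which matches the
    # "we don't need FIFO" requirement.
    unique_id = str(uuid.uuid4())
    sqs.send_message(
        QueueUrl=QUEUE_URL,
        MessageBody=event_body,
        MessageGroupId=unique_id,
        MessageDeduplicationId=unique_id,
    )
```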
Solution #2:
Use an AWS SQS FIFO queue with “ContentBasedDeduplication” enabled. For each message, generate a UUID for the “MessageGroupId” parameter only.
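Similarly, a sketch of Solution #2 (the queue name is a placeholder). With “ContentBasedDeduplication” enabled, SQS derives the deduplication id from a SHA-256 hash of the message body, so only “MessageGroupId” has to be supplied:

```python
import uuid
import boto3

sqs = boto3.client("sqs")

# Create (or look up) a FIFO queue with content-based deduplication enabled.
queue_url = sqs.create_queue(
    QueueName="events.fifo",  # placeholder name; must end in ".fifo"
    Attributes={
        "FifoQueue": "true",
        "ContentBasedDeduplication": "true",
    },
)["QueueUrl"]

def publish_event(event_body: str) -> None:
    # Solution #2: only MessageGroupId is required; SQS computes the
    # deduplication id from the message body.
    sqs.send_message(
        QueueUrl=queue_url,
        MessageBody=event_body,
        MessageGroupId=str(uuid.uuid4()),
    )
```

One design consideration: with content-based deduplication, two distinct events that happen to have identical bodies within the deduplication interval would be collapsed into one, so the body may need to carry a unique field (e.g., an event id).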
Solution #3:
Use an AWS SQS standard queue with AWS ElastiCache (either Redis or Memcached). For each message, save the “MessageId” field in the cache server and check it for duplicates later on: if the ID already exists, the message has been processed. (By the way, how long should the “MessageId” stay in the cache server? The AWS SQS documentation does not say how far back a message could be duplicated.)
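For Solution #3, a worker-side sketch using ElastiCache for Redis via redis-py (the Redis endpoint, queue URL, and the 24-hour TTL are assumptions, not values from the SQS documentation). It uses SET with NX so that recording a MessageId is atomic and duplicates are skipped:

```python
import boto3
import redis

sqs = boto3.client("sqs")
# Placeholder ElastiCache endpoint.
cache = redis.Redis(host="my-cache.abc123.0001.use1.cache.amazonaws.com", port=6379)

QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/events"  # standard queue
DEDUP_TTL_SECONDS = 24 * 60 * 60  # assumed retention window for seen MessageIds

def process_event(body: str) -> None:
    ...  # your actual event handler

def poll_once() -> None:
    resp = sqs.receive_message(
        QueueUrl=QUEUE_URL, MaxNumberOfMessages=10, WaitTimeSeconds=20
    )
    for msg in resp.get("Messages", []):
        # SET ... NX EX is atomic: it returns True only for the first worker
        # that records this MessageId, so later duplicates are skipped.
        first_time = cache.set(
            f"sqs:{msg['MessageId']}", "1", nx=True, ex=DEDUP_TTL_SECONDS
        )
        if first_time:
            process_event(msg["Body"])
        sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])
```

The TTL is a judgment call, since (as noted above) the documentation does not specify a duplication window for standard queues; for comparison, the FIFO deduplication interval is 5 minutes.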
Answers
You are making your system more complicated than it needs to be with SQS.
We have moved to Kinesis Streams, and it works flawlessly. Here are the benefits we have seen:
Buggier Implementation of the process
Hope it helps.