I am using DynamoDB. I have a table with a GSI (Global Secondary Index). The GSI partition key values are unique.

I want to use the Query operation to get items by GSI partition key. Usually, when you use Query, you need to follow LastEvaluatedKey to get all the results. In my case I know that there should be 0 or 1 results.

Do I need to make multiple Query requests using LastEvaluatedKey to get this item, or should one request be enough?

2 Answers


  1. There is no need to supply a start key (ExclusiveStartKey) in your first DynamoDB Query.

    However, it is still recommended that you check whether the response contains a LastEvaluatedKey, and only then issue the next Query with ExclusiveStartKey set to that LastEvaluatedKey.

    An example in Python:

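    from boto3.dynamodb.conditions import Key

    # DDB_TABLE is assumed to be an existing boto3 Table resource.
    query_kwargs = {
        'IndexName': 'gsi-status',
        'KeyConditionExpression': Key('status').eq('COMPLETED'),
    }

    results = []

    response = DDB_TABLE.query(**query_kwargs)
    results.extend(response.get('Items', []))

    # Keep querying while the response says more data may remain.
    # Without ExclusiveStartKey the same page would be fetched forever.
    while response.get('LastEvaluatedKey'):
        response = DDB_TABLE.query(
            ExclusiveStartKey=response['LastEvaluatedKey'],
            **query_kwargs
        )
        results.extend(response.get('Items', []))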
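
For the 0-or-1 case in the question, the pagination loop above could be folded into a small helper that stops as soon as an item turns up. This is only a sketch: `query_single_item` and `FakeTable` are hypothetical names, and `FakeTable` is a stand-in that replays canned responses so the snippet runs without AWS access. With a real table you would pass the same `IndexName` and `KeyConditionExpression` as in the example above.

```python
def query_single_item(table, **query_kwargs):
    """Return the first matching item, or None if the query finds nothing.

    `table` is assumed to expose a boto3-style `query(**kwargs)` method.
    """
    kwargs = dict(query_kwargs)
    while True:
        response = table.query(**kwargs)
        items = response.get('Items', [])
        if items:
            return items[0]
        last_key = response.get('LastEvaluatedKey')
        if not last_key:
            return None  # no items and nothing left to read
        # Empty page, but more data may remain: continue from the cursor.
        kwargs['ExclusiveStartKey'] = last_key


class FakeTable:
    """Hypothetical stand-in for a boto3 Table; replays canned pages."""

    def __init__(self, pages):
        self._pages = list(pages)
        self.calls = []

    def query(self, **kwargs):
        self.calls.append(kwargs)
        return self._pages.pop(0)


# The first page is empty but carries a LastEvaluatedKey;
# the single item only shows up on the second page.
fake = FakeTable([
    {'Items': [], 'LastEvaluatedKey': {'status': 'COMPLETED', 'id': '1'}},
    {'Items': [{'id': '42', 'status': 'COMPLETED'}]},
])
item = query_single_item(fake, IndexName='gsi-status')
print(item)
```

In the common case the helper costs exactly one request; the extra request only happens when the first response is empty and still carries a LastEvaluatedKey.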
    
  2. Can you be 100% guaranteed that you’ll never get a LastEvaluatedKey in response to a Query returning 0 or 1 items? No.

    Fun fact: Query calls won’t cross partition boundaries. Should a boundary be hit, you’ll see a LastEvaluatedKey and have to make a second request to read into the next partition.

    Now, this is an implementation detail subject to change. The API contract you’re given with the Query call is to expect that an LEK might be returned and be prepared to do a second call as required. Deciding "Nah I don’t think I’ll need to" is a risky move. Even if you’re safe today, will you be safe tomorrow? When the docs don’t promise you a behavior, best not to rely too much on it.

    OK, so you know you should, and you probably knew that before. What you really want to know is whether not "wearing a helmet" here will ever "conk you on the head". Can we invent a scenario where DynamoDB can’t reliably know in advance which partition to start in when processing a Query to find the 0 or 1 items to return?

    Imagine that over time an item collection has been split across partitions, and items have been added and removed. There will be a set of partitions, more than one partition might cover the same PK value, and each will cover some subset of the SK values.

    Partition A: PK = "x", SK=[1 to 10]
    Partition B: PK = "x", SK=[11 to max]
    

    Then imagine you do a Query where PK = "x" and SK > 9 limiting results to 1. Will that item be in Partition A or Partition B? Well, we don’t know for sure. DynamoDB will have to start at Partition A but might not find any items and need to continue with B. You’ll see LastEvaluatedKey.

    Conk!
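
The Partition A / Partition B scenario can be played through with a toy model. To be clear, this is not how DynamoDB is implemented: `PARTITIONS`, `toy_query`, and the `next_partition` cursor are all made up for illustration (a real LastEvaluatedKey holds key attributes, not a partition index), and the model only covers the boundary case described above. It assumes the item with SK = 10 was deleted at some point, so Partition A still covers the requested range but holds nothing above 9.

```python
# Toy model only: two partitions holding the same PK, split by sort key range.
# SK 10 is assumed to have been deleted, so Partition A has nothing above 9.
PARTITIONS = [
    {'name': 'A', 'max_sk': 10, 'stored_sks': [3, 7]},
    {'name': 'B', 'max_sk': float('inf'), 'stored_sks': [15]},
]


def toy_query(sk_greater_than, limit, exclusive_start_key=None):
    """Evaluate at most one partition per call and never cross a boundary."""
    start = 0 if exclusive_start_key is None else exclusive_start_key['next_partition']
    for index in range(start, len(PARTITIONS)):
        partition = PARTITIONS[index]
        if partition['max_sk'] <= sk_greater_than:
            continue  # key range rules this partition out without reading it
        matches = sorted(
            sk for sk in partition['stored_sks'] if sk > sk_greater_than
        )[:limit]
        response = {'Items': matches}
        if len(matches) < limit and index + 1 < len(PARTITIONS):
            # Boundary hit before the limit was satisfied: hand back a cursor.
            response['LastEvaluatedKey'] = {'next_partition': index + 1}
        return response
    return {'Items': []}


first = toy_query(sk_greater_than=9, limit=1)
second = toy_query(9, 1, exclusive_start_key=first['LastEvaluatedKey'])
print(first)
print(second)
```

In this model the first call comes back with no items and a cursor, and only the second call returns the single matching item, which is the "conk" a caller would miss by ignoring LastEvaluatedKey.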
