skip to Main Content

I am trying to retrieve a fixed number(lets take it as 5 for now) of items from a dynamo db table.

This is the code I am using.

response = table.query(
                       KeyConditionExpression=Key('pk').eq('goo'),
                       Limit=5,
                       FilterExpression=Attr('goo').eq('bar'))

I am getting only 4 items from this. But if I remove FilterExpression, the item count will be 5. So is there any other way to get the fixed number of item even if I am using FilterExpression?

2

Answers


  1. Chosen as BEST ANSWER

    This is the answer I found.

    paginator = dynamo_db_client.get_paginator('query')
    response_iterator = paginator.paginate(
                            TableName='table_name',
                            KeyConditionExpression='#P=:p',
                            FilterExpression='#T=:t',
                            ExpressionAttributeNames={'#P':'pk','#T':'goo'},
                            ExpressionAttributeValues={
                                   ':p': {'S':'goo'},
                                   ':t': {'S':'bar'}
                            },
                            PaginationConfig={
                                    'MaxItems':5
                            }     
                      )
      
    for page in response_iterator:
        print(len(page['Items']))
    

    link for paginator doc : DynamoDB.Paginator.Query


  2. Filter Expressions are applied after items are read from the table in order to reduce the number of records sent over the wire. The Limit is applied during the query operation, i.e. before the filter expression.

    If the Query read 5 items and only 4 of them match the FilterExpression you’re getting only 4 items back.

    The pragmatic thing would be to remove the limit from the Query and apply the limit client-side. The drawback is that you may pay for more Read Capacity Units.

    If you want to avoid that you may have to reconsider your datamodel – a generic solution is difficult here.


    In your specific case, you could create a Global Secondary Index with the partition key pk and the sort key goo (it doesn’t have to be unique for GSIs). You can then fire your Query against the GSI with Limit 5 and it will give you what you want. But: you pay for the GSI storage + throughput.


    Edit: This question is pretty much a duplicate except for the Python code

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search