
I am trying to understand how partitions work in a DynamoDB table. According to the AWS docs:

A DynamoDB partition can support 3000 read operations and 1000 write operations. It keeps a divider between read and write ops so they do not interfere with each other. If you had a table that was configured to support 18000 reads and 6000 writes, you'd have at least 12 partitions, but probably a few more for some headroom.

A partition can be added when the storage size grows larger than 10 GB.

So it seems the number of partitions is determined by the table's data size and throughput.
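The arithmetic implied by the quoted figures can be sketched as follows. This is only a rough estimate built from the numbers above (3000 reads, 1000 writes, and 10 GB per partition); the function name `estimate_partitions` is hypothetical, and DynamoDB does not expose its real partition count, so the actual number may differ.

```python
import math

READS_PER_PARTITION = 3000   # read operations per partition, from the quote
WRITES_PER_PARTITION = 1000  # write operations per partition, from the quote
GB_PER_PARTITION = 10        # storage per partition, from the quote


def estimate_partitions(reads, writes, size_gb):
    """Rough lower bound on partitions for the given throughput and size."""
    # Reads and writes each consume their own share of a partition.
    by_throughput = math.ceil(
        reads / READS_PER_PARTITION + writes / WRITES_PER_PARTITION
    )
    by_size = math.ceil(size_gb / GB_PER_PARTITION)
    # A table always has at least one partition.
    return max(by_throughput, by_size, 1)


print(estimate_partitions(18000, 6000, 5))  # 12, matching the quoted example
```

With 18000 reads and 6000 writes the throughput term dominates (6 + 6 = 12); for a small-throughput table holding 50 GB, the size term would dominate instead.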

What confuses me is how these partitions relate to the partition key used in the table. Items with different partition keys can be in the same partition and share the same RCU/WCU limit. If that is true, how can I make the items span different partitions? My current idea for improving query performance is to append a random number to the partition key value of each item.

For example, suppose the partition key value is TEST for all items in a table. If I query the PK TEST, it will hit all items, which could be slow because the RCU is limited to 3000 per partition. My solution is to append a random number, e.g. TEST_01, TEST_02 ... TEST_10, to the key of every item. When I need to query all items, I send 10 queries in parallel, one for each partition key value. I expect the 10 values to be spread across 10 partitions to get around the 3000 RCU limit. But if they all end up in the same partition, there is no point in appending a random number to the PK.
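The scheme described above can be sketched like this. It is a minimal illustration, not a complete client: `SHARD_COUNT`, `sharded_key`, `key_for_write`, `all_shard_keys`, and `query_all` are hypothetical names, and `query_one` is a caller-supplied function that returns the items for a single partition key value.

```python
import random
from concurrent.futures import ThreadPoolExecutor

SHARD_COUNT = 10  # number of suffixes, TEST_01 .. TEST_10 in the example


def sharded_key(base, shard):
    """Build a suffixed partition key such as 'TEST_03'."""
    return f"{base}_{shard:02d}"


def key_for_write(base):
    """Pick a random suffix when writing a new item."""
    return sharded_key(base, random.randint(1, SHARD_COUNT))


def all_shard_keys(base):
    """List every suffixed key that a full read has to query."""
    return [sharded_key(base, s) for s in range(1, SHARD_COUNT + 1)]


def query_all(base, query_one):
    """Run one query per suffixed key in parallel and merge the results."""
    keys = all_shard_keys(base)
    with ThreadPoolExecutor(max_workers=SHARD_COUNT) as pool:
        pages = list(pool.map(query_one, keys))
    return [item for page in pages for item in page]
```

With boto3, `query_one` could wrap `Table.query(KeyConditionExpression=Key("pk").eq(key))`, where the attribute name `pk` is an assumption, and would also need to follow `LastEvaluatedKey` to page through large results.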

So my question is: what is a better way to improve query performance? Or am I misunderstanding something?

2 Answers


  1. Having multiple PKs is definitely a good idea, because you benefit from having multiple partitions and hence more RCUs. If you only choose one PK like TEST, you’ll probably (depending on your load) get a hot partition.

    This is one of my preferred articles on partitioning and optimisation. I liked the idea of having multiple GSIs to improve read performance.

  2. The quote you shared is not from the docs as you state. Where is it from?

    To answer your question, you don’t need to think about the physical partitions as DynamoDB handles the sharding for you.

    What you do have to think about is your access patterns, and having a low-cardinality key such as TEST is a NoSQL anti-pattern.

    DynamoDB does have a feature called Adaptive Capacity: if it detects that you need more than 3000 RCU for TEST, it can split the items related to that key across 2 or more partitions. However, you should not design a data model that relies on this behaviour.

    How does DynamoDB determine which partition an item goes to? It hashes the partition key you provide, and the hash determines the partition, since each partition is responsible for a range of the key space. For a table with 1 partition, all items will be sent to that single partition.

    How do you know how many partitions you have? You don’t, and you don’t need to as it’s handled for you. If you need 10k WCU, then you can be sure you have at least 10 partitions.
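The hash-to-range mapping described in the second answer can be illustrated with a small sketch. DynamoDB's real hash function and partition boundaries are internal, so MD5 and the equal-width ranges used here are stand-in assumptions, and `partition_for` is a hypothetical name.

```python
import hashlib

KEY_SPACE = 2 ** 128  # size of the MD5 output space (stand-in hash)


def partition_for(partition_key, partition_count):
    """Map a key to the partition that owns its slice of the hash space."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    position = int.from_bytes(digest, "big")
    # Partition i owns the i-th equal-width range of the key space.
    return position * partition_count // KEY_SPACE


keys = [f"TEST_{n:02d}" for n in range(1, 11)]
print({key: partition_for(key, 4) for key in keys})
```

Two things follow from this model: with a single partition every key maps to partition 0, and with several partitions distinct keys such as TEST_01 and TEST_02 can still hash into the same range, so ten suffixes do not guarantee ten partitions.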
