I have a single dynamodb table. One of the type records is called "Result" and one of its PK stores a composite key (class, student and exercise). This is the example design:
PK | SK | SCORE | … | GSI PK 1 | GSI SK 1 |
---|---|---|---|---|---|
RESULT#001 | RESULT#001 | 90 | … | CLASS#MATH#STUDENT#TOM#EXERCISE#1 | RESULT#001 |
Use cases:
- Find result by student (
gsi pk 1 = "CLASS#MATH#STUDENT#TOM#EXERCISE#1", gsi sk 1 = begins_with("RESULT")
)
Now, there’s a new use case:
2. Find all results by student
To meet this requirement, I could either:
- Add a new GSI which does not store the exercise ID. The query will look: (
gsi pk 2 = "CLASS#MATH#STUDENT#TOM", gsi sk 2 = begins_with("RESULT")
) - In the backend service, iterate over all exercises and execute multiple dynamo db queries re-using GSI 1
First option may be performant. However, it requires to update the dynamo db table whereas second option uses same design. What’s a recommended design criteria to follow?
3
Answers
The best design would put the exercise identifier into the prefix of the sort key of the single GSI so you could answer both queries out of indexes.
So: The GSI PK is the class/student. The SK is the exercise/result.
I’m also wondering if this could be your base table schema.
How about
This way you can query only by SK and effectively load the full partition, which is all the results for a student. This assumes that student and result form an unique pair
Perhaps modeling queries related to the exercises is missing, could you clarify it a little more?
Apart from that you could try
Query results by student using begin with
sk = begins_with(RESULT#)
Query a specific result using
sk = RESULT#result_id
Get rid of the GSI with this key design