AWS DynamoDB Best Practices

DynamoDB

Just a quick overview, what is DynamoDB ?

DynamoDB is a fully managed NoSQL database (key-value pair and document) service provided by AWS that offers fast and predictable performance with seamless scalability

Item 1: Prefer Query to Scan

First, let’s check the definition of scan and query operation in DynamoDB

Scan operation always scans the entire table or secondary index. It then filters out values to provide the result you want.
Query operation searches only primary key attribute values and supports a subset of comparison operators on key attribute values to refine the search process.

From here, we can easily tell that Scan operation is VERY slow/inefficient and resource consuming (since you need RCU to read all the items in the DDB). So you should design you DDB structure so that query operaion can fit almost all the case of your backend. In case it does not, you can always update your DDB design

What you can do:

Utilize Partition Key and Sort Key for the most common request pattern
Add LSI or GSI to accomodate advanced query patterns. BUT only add GSI/LSI when needed, since this consume RCU/WCU

Item 2: Secondary Indexes

Rule 1: Keep the number of indexes to a minimum
Rule 2: If you have to use an Index, choose wisely

There are two types of secondary index in DynamoDB:

Global secondary index (GSI): An index with a partition key and a sort key that can be different from those on the base table.
- A global secondary index is considered “global” because queries on the index can span all of the data in the base table, across all partitions.
Local secondary index (LSI): An index that has the same partition key as the base table, but a different sort key.
- A local secondary index is “local” in the sense that every partition of a local secondary index is scoped to a base table partition that has the same partition key value.

So the basic difference is the “scope” of each index is different. GSI has the global view while LSI has view only those data share the same partition key

When you use these two types of indexes, there is also a difference when you try to retrieve data back

LSI supports Strongly Consistency and Eventual Consistency
1. which will always return the latest version of records if you can enable this by changing the ConsistentRead property when sending out a query
GSI supports Eventual Consistency
1. which might return a stale version of records, and you dont have the option to enable strong consistency

So, how can I choose the correct DDB index? I found this flow chart very helpful:

Image source: https://www.dynamodbguide.com/local-or-global-choosing-a-secondary-index-type-in-dynamo-db/#the-too-long-didnt-read-version-of-choosing-an-index

Item 3: `SaveBehavior` Configuration

When you use DynamoDB for CRUD operations, you will find out that DynamoDB does not have the actual “Update” concept. However, there is a SaveBehavior configuration which serves as the missed role in DynamoDB. Let’s take a look.

DynamoDBMapper mapper = new DynamoDBMapper(dynamoDBClient);

// SaveBehavior.UPDATE is the default value
mapper.save(something, new DynamoDBMapperConfig(SaveBehavior.UPDATE));

There are four different configurations for SaveBehavior

UPDATE (default)
UPDATE_SKIP_NULL_ATTRIBUTE
CLOBBER
APPEND_SET

Given this existing record and DynamoDB schema

AttributeName	key	modeled_scalar	modeled_set	unmodeled
KeyType	Hash	Non-key	Non-key	Non-key
AttributeType	Number	String	String set	String

{
    "key" : "99",
    "modeled_scalar" : "foo", 
    "modeled_set" : [
        "foo0", 
        "foo1"
    ], 
    "unmodeled" : "bar" 
}

together with this POJO class object

TestTableItem obj = new TestTableItem();
obj.setKey(99);
obj.setModeledScalar(null);
obj.setModeledSet(Collections.singleton("foo2");

SaveBehavior.UPDATE

UPDATE will not affect unmodeled attributes on a save operation, and a null value for the modeled attribute will remove it from that item in DynamoDB.

so basically after onvoking mapper.save(), the record in DDB will be as follows

{
  "key" : "99",
  "modeled_set" : [
    "foo2"
  ], 
  "unmodeled" : "bar" 
}

You can see that modeled_set has been updated, but since modeled_scalar is passed in using a null value, so it is removed from DDB.

In this case. if you wanna use SaveBehavior.UPDATE to update something, you have to include all the fields with proper value in your DynamoDB POJO class. Otherwise, it will be removed from DDB table instead of keep them as is.

SaveBehavior.UPDATE_SKIP_NULL_ATTRIBUTES

UPDATE_SKIP_NULL_ATTRIBUTES is similar to UPDATE, except that it ignores any null value attribute(s) and will NOT remove them from that item in DynamoDB.

{
  "key" : "99",
  "modeled_scalar" : "foo",
  "modeled_set" : [
    "foo2"
  ], 
  "unmodeled" : "bar" 
}

As you can find out, even though we set modeled_scalar to be null, this field is still persisted in DDB. So with this configuration, when you trying to update something with only specifying the hashKey and the field you wanna alter, it will properly update the field and keep the others the same without deleting.

SaveBehavior.CLOBBER

CLOBBER will clear and replace all attributes, including unmodeled ones

(delete and recreate) on save.

{
  "key" : "99", 
  "modeled_set" : [
    "foo2"
  ]
}

This is already self-explained. delete the record with same hashKey and re-create a new one with the specified fields.

SaveBehavior.APPEND_SET

APPEND_SET treats scalar attributes (String, Number, Binary) the same as UPDATE_SKIP_NULL_ATTRIBUTES does.

However, for set attributes, it will append to the existing attribute value, instead of overriding it.

{
  "key" : "99",
  "modeled_scalar" : "foo",
  "modeled_set" : [
    "foo0", 
    "foo1", 
    "foo2"
  ], 
  "unmodeled" : "bar" 
}

Scalar attributes are kept, but set attributes are appended.

SaveBehavior	On unmodeled attribute	On null-value attribute	On set attribute
UPDATE	keep	remove	override
UPDATE_SKIP_NULL_ATTRIBUTES	keep	keep	override
CLOBBER	remove	remove	override
APPEND_SET	keep	keep	append

References

Last updated Feb 13, 2022

This work is licensed under a Attribution-NonCommercial 4.0 International license.

Venmo	Paypal

PREVIOUSDesign Patterns: Proxy Pattern

NEXTMicroservice Communication Patterns

DynamoDB

Item 1: Prefer Query to Scan

Item 2: Secondary Indexes

Item 3: SaveBehavior Configuration

References

Item 3: `SaveBehavior` Configuration