You should always use DynamoDB global tables now
Jun 08, 2023 • 7 Minute Read
Here it is - the tweet heard around the NoSQL world:
Long had I waited for that day (and those of us who know, know it was a long time).
The new AWS::DynamoDB::GlobalTable CloudFormation resource finally delivers DynamoDB's global table feature as a configurable IaC resource, without needing custom resources backed by Lambda functions to automate it all. Where I work, I had opted not to invest in custom resources, and instead left the configuring of global replicas as a manual step for the services we were working on that used this mode.
I mean, this is not something that you're messing with frequently. Create replicas when standing up, and create replicas when you're expanding. A manual step that's not all that painful, and investing in a custom resource to do it didn't feel like a big return on investment.
But now that we have full CloudFormation support for creating and configuring our global tables, I have some thoughts.
How CloudFormation supports DynamoDB global tables
First, I want to address the elephant in the room: I'm actually very disappointed that this is an entirely new resource and not an extension of the existing AWS::DynamoDB::Table resource.
Let's take a look at a basic single table representation of both resources:
MyTable:
  Type: AWS::DynamoDB::Table
  Properties:
    BillingMode: PAY_PER_REQUEST
    AttributeDefinitions:
      - AttributeName: pk
        AttributeType: S
      - AttributeName: sk
        AttributeType: S
      - AttributeName: gsi1pk
        AttributeType: S
      - AttributeName: gsi1sk
        AttributeType: S
    KeySchema:
      - AttributeName: pk
        KeyType: HASH
      - AttributeName: sk
        KeyType: RANGE
    GlobalSecondaryIndexes:
      - IndexName: GSI1
        KeySchema:
          - AttributeName: gsi1pk
            KeyType: HASH
          - AttributeName: gsi1sk
            KeyType: RANGE
        Projection:
          ProjectionType: ALL

MyGlobalTable:
  Type: AWS::DynamoDB::GlobalTable
  Properties:
    BillingMode: PAY_PER_REQUEST
    AttributeDefinitions:
      - AttributeName: pk
        AttributeType: S
      - AttributeName: sk
        AttributeType: S
      - AttributeName: gsi1pk
        AttributeType: S
      - AttributeName: gsi1sk
        AttributeType: S
    KeySchema:
      - AttributeName: pk
        KeyType: HASH
      - AttributeName: sk
        KeyType: RANGE
    GlobalSecondaryIndexes:
      - IndexName: GSI1
        KeySchema:
          - AttributeName: gsi1pk
            KeyType: HASH
          - AttributeName: gsi1sk
            KeyType: RANGE
        Projection:
          ProjectionType: ALL
    Replicas:
      - Region: !Ref AWS::Region
Spot the difference? The new Replicas property has an interesting rule about it: "The list must contain at least one element, the region where the stack defining the global table is deployed." So, no matter what, we need this populated, and in my template snippet above I'm opting to use the AWS region pseudo-parameter.
There are, in fact, a LOT of rules around the new global table resource (bear in mind I'm copying and pasting from the CloudFormation docs):
- You cannot convert a resource of type AWS::DynamoDB::Table into a resource of type AWS::DynamoDB::GlobalTable by changing its type in your template. Doing so might result in the deletion of your DynamoDB table.
This one is a shame... I'll come back to it further down.
- When using provisioned billing mode, CloudFormation will create an auto scaling policy on each of your replicas to control their write capacities. You must configure this policy using the WriteProvisionedThroughputSettings property. CloudFormation will ensure that all replicas have the same write capacity auto scaling property. You cannot directly specify a value for write capacity for a global table.
- If your table uses provisioned capacity, you must configure auto scaling directly in the AWS::DynamoDB::GlobalTable resource. You should not configure additional auto scaling policies on any of the table replicas or global secondary indexes, either via API or via AWS::ApplicationAutoScaling::ScalableTarget or AWS::ApplicationAutoScaling::ScalingPolicy. Doing so might result in unexpected behavior and is unsupported.
The TL;DR here is CloudFormation will manage all of the capacity configuration of your replicas. This should not be unexpected, and it's welcome.
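As a rough sketch of what that looks like for a provisioned-capacity table, the fragment below sets write auto scaling once at the table level and read auto scaling on the replica. The resource name, capacity numbers, and target utilization are placeholders I made up for illustration, not recommendations, and it's worth checking the property names against the current CloudFormation reference before relying on them.

```yaml
# Hypothetical example: the resource name and all numbers are placeholders.
MyProvisionedGlobalTable:
  Type: AWS::DynamoDB::GlobalTable
  Properties:
    BillingMode: PROVISIONED
    AttributeDefinitions:
      - AttributeName: pk
        AttributeType: S
      - AttributeName: sk
        AttributeType: S
    KeySchema:
      - AttributeName: pk
        KeyType: HASH
      - AttributeName: sk
        KeyType: RANGE
    # Write capacity is declared once here; CloudFormation applies the
    # same auto scaling settings to every replica.
    WriteProvisionedThroughputSettings:
      WriteCapacityAutoScalingSettings:
        MinCapacity: 5
        MaxCapacity: 100
        TargetTrackingScalingPolicyConfiguration:
          TargetValue: 70.0
    Replicas:
      - Region: !Ref AWS::Region
        # Read capacity, by contrast, is configured per replica.
        ReadProvisionedThroughputSettings:
          ReadCapacityAutoScalingSettings:
            MinCapacity: 5
            MaxCapacity: 100
            TargetTrackingScalingPolicyConfiguration:
              TargetValue: 70.0
```

Note there is no separate AWS::ApplicationAutoScaling resource anywhere in the fragment; everything capacity-related lives inside the global table resource itself.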
- In AWS CloudFormation, each global table is controlled by a single stack, in a single region, regardless of the number of replicas. When you deploy your template, CloudFormation will create/update all replicas as part of a single stack operation. You should not deploy the same AWS::DynamoDB::GlobalTable resource in multiple regions. Doing so will result in errors, and is unsupported. If you deploy your application template in multiple regions, you can use conditions to only create the resource in a single region. Alternatively, you can choose to define your AWS::DynamoDB::GlobalTable resources in a stack separate from your application stack, and make sure it is only deployed to a single region.
This last part, "...define your resources in a stack separate from your application...", is what I would consider best practice if you have been building multi-region DynamoDB-backed applications using the latest version of global tables.
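If you do keep the table in an application template that gets deployed to several regions, the "use conditions" option from the docs might look something like the sketch below. It assumes us-east-1 is the one region that owns the global table and us-west-2 is the second replica; both region choices and the condition name are mine, purely for illustration.

```yaml
# Assumption: us-east-1 is the single region that manages the global table.
Conditions:
  IsPrimaryRegion: !Equals [!Ref AWS::Region, us-east-1]

Resources:
  MyGlobalTable:
    Type: AWS::DynamoDB::GlobalTable
    # The resource is skipped when this template is deployed anywhere else.
    Condition: IsPrimaryRegion
    Properties:
      BillingMode: PAY_PER_REQUEST
      AttributeDefinitions:
        - AttributeName: pk
          AttributeType: S
        - AttributeName: sk
          AttributeType: S
      KeySchema:
        - AttributeName: pk
          KeyType: HASH
        - AttributeName: sk
          KeyType: RANGE
      # A stream must be specified once there is more than one replica.
      StreamSpecification:
        StreamViewType: NEW_AND_OLD_IMAGES
      Replicas:
        - Region: us-east-1   # must include the region the stack runs in
        - Region: us-west-2
```

This is only a fragment: the rest of the application's resources would sit alongside the table, and anything that references the table would need the same condition. That extra bookkeeping is part of why I prefer the separate-stack route.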
Assuming that someone was following this practice, I feel that AWS could have allowed converting AWS::DynamoDB::Table resources to AWS::DynamoDB::GlobalTable, with the new resource adopting all of the replicas that it discovered. That would have been a happy path to adoption. Instead, we're left with the prospect of a database migration to do so.
Working with tables in multiple AWS regions
Now, let's dive into some of the behaviors around that Replicas property.
- The list must contain at least one element, the region where the stack defining the global table is deployed. For example, if you define your table in a stack deployed to us-east-1, you must have an entry in Replicas with the region us-east-1.
- You can create a new global table with up to two replicas. You can add or remove replicas after table creation, but you can only add or remove a single replica in each update.
Hmm.... what does it look like if we define a table with more than two regions at the start? We get this error:
"Invalid request provided: You can declare at most 2 replicas when you create a global table."
Oof. How about if we add more than one region at a time on an update?
"Invalid request provided: You can only add or remove a single replica in each update operation."
Double oof!! It's GSI updates all over again! At least there we can define all of them on creation up to the total limit.
This makes rolling out a global service that targets 3+ regions from the onset difficult. You need to deploy your template with your table with replicas in two regions first, and then perform incremental updates adding each additional region until you've hit the mark. That means the end result template cannot be deployed on its own as a new stack. You'd have to revert back to the start, or introduce a bunch of conditional statements to control the regions you're expanding to (don't @ me CDK users - I hear you).
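For the conditional route, here is a minimal sketch of how a single template could be made to work for both the initial create and the later expansion. The parameter name, condition name, and region choices are all hypothetical, and it assumes the stack itself is deployed to us-east-1.

```yaml
Parameters:
  # Hypothetical switch: create the stack with "false", then update to "true".
  EnableThirdRegion:
    Type: String
    AllowedValues: ["true", "false"]
    Default: "false"

Conditions:
  AddThirdRegion: !Equals [!Ref EnableThirdRegion, "true"]

Resources:
  MyGlobalTable:
    Type: AWS::DynamoDB::GlobalTable
    Properties:
      BillingMode: PAY_PER_REQUEST
      AttributeDefinitions:
        - AttributeName: pk
          AttributeType: S
        - AttributeName: sk
          AttributeType: S
      KeySchema:
        - AttributeName: pk
          KeyType: HASH
        - AttributeName: sk
          KeyType: RANGE
      # A stream must be specified once there is more than one replica.
      StreamSpecification:
        StreamViewType: NEW_AND_OLD_IMAGES
      Replicas:
        # Assumption: the stack is deployed to us-east-1.
        - Region: us-east-1
        - Region: us-west-2
        # Only present once the switch is flipped in a later update;
        # AWS::NoValue drops the entry from the list otherwise.
        - !If
          - AddThirdRegion
          - Region: eu-west-1
          - !Ref AWS::NoValue
```

You would create the stack with EnableThirdRegion left at "false" (two replicas, which is within the creation limit) and then run an update that changes only that parameter. Each further region needs its own switch and its own update, which is exactly the pile of conditionals I was grumbling about.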
Adding or removing a replica also has a note attached to it:
- Adding a replica might take a few minutes for an empty table, or up to several hours for large tables. If you want to add or remove a replica, we recommend submitting an UpdateStack operation containing only that change.
This leans even harder into the best practice of defining your DynamoDB table in its own stack. Expand to new regions at your data layer first before you expand your application!
When to use DynamoDB global tables in CloudFormation
With all of these caveats I've spent the entire article combing through, as well as my lamenting about why AWS didn't just expand on the original resource, you might be wondering: why does this article's title say that you should always use the new one?
Functionally, the two types of resources operate in exactly the same way when you deploy a CloudFormation stack. If you're building an app that will only ever be in a single region, OK, I grant you there's maybe not a reason to always default to AWS::DynamoDB::GlobalTable. But if you're building an app where you plan to expand to multiple regions, or want to leave the door open to do so, you should just use it and ease the burden of multi-region expansion later on. The point is the option is there even if you never turn it on!
And, of course, if you are building a multi-region app backed by DynamoDB, this is a no-brainer.
A thanks and a shout out to the CloudFormation team for finally delivering unto us global tables as code!