How to safely recreate a CDK-baked DynamoDB table using S3 backups

Pascal Euhus
Jul 26, 2024 · 5 min read


I love working with AWS CDK, but some things get nasty because of the way CloudFormation works and the way it has been designed. One of those things is performing an update on a critical infrastructure component, like a DynamoDB table, that requires a replacement.

This is a common scenario when you want to change the key schema of a table or rename it¹.

Note that this approach requires a write stop on the table for the duration of the migration. If you are looking for a zero-downtime migration, you should consider a streaming solution like DynamoDB Streams to replicate the data to a new table.
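
That streaming approach is not covered in this article, but as a rough sketch: a Lambda function attached to the old table's stream could forward every change to the new table. The NEW_AND_OLD_IMAGES stream view and the NEW_TABLE_NAME environment variable below are assumptions, not something from the stack shown later.

// replication handler sketch (not part of the migration described below)
import { DynamoDBStreamEvent } from "aws-lambda";
import { DeleteItemCommand, DynamoDBClient, PutItemCommand } from "@aws-sdk/client-dynamodb";

const ddb = new DynamoDBClient({});
// hypothetical environment variable pointing at the new table
const NEW_TABLE_NAME = process.env.NEW_TABLE_NAME ?? "MyNewTable";

export const handler = async (event: DynamoDBStreamEvent) => {
  for (const record of event.Records) {
    if (record.eventName === "REMOVE" && record.dynamodb?.Keys) {
      // the cast bridges the aws-lambda and @aws-sdk AttributeValue types
      await ddb.send(new DeleteItemCommand({ TableName: NEW_TABLE_NAME, Key: record.dynamodb.Keys as any }));
    } else if (record.dynamodb?.NewImage) {
      // stream images are already DynamoDB JSON, so the low-level client can write them as-is
      await ddb.send(new PutItemCommand({ TableName: NEW_TABLE_NAME, Item: record.dynamodb.NewImage as any }));
    }
  }
};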

DynamoDB tables integrate well with S3. You can export to S3 and restore data from S3. Instead of performing destructive updates on the production table, you can create a new table with the desired configuration and then restore the data from the S3 backup.

With the power of CDK, you can build the new table with data in parallel to the existing table and then update the references (like Lambda functions) to the new table.

The system under migration is a simple DynamoDB table with a single partition key and no sort key. A Lambda function stores and retrieves data from the table.
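
The handler itself is not important for the migration. As a minimal sketch, functions/helloWorld.ts could look like the following; the TABLE_NAME environment variable and the item shape are assumptions, since the article does not show the handler:

// functions/helloWorld.ts — hypothetical handler, keyed by the current partition key "Name"
import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient, GetCommand, PutCommand } from "@aws-sdk/lib-dynamodb";

const ddb = DynamoDBDocumentClient.from(new DynamoDBClient({}));
const TABLE_NAME = process.env.TABLE_NAME ?? "MyTable";

export const handler = async (event: { name: string }) => {
  // store an item under the partition key "Name"
  await ddb.send(new PutCommand({
    TableName: TABLE_NAME,
    Item: { Name: event.name, CreatedAt: new Date().toISOString() },
  }));

  // read it back
  const result = await ddb.send(new GetCommand({
    TableName: TABLE_NAME,
    Key: { Name: event.name },
  }));
  return result.Item;
};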

The table is defined in the CDK stack like this:

import { RemovalPolicy, Stack, StackProps } from "aws-cdk-lib";
import { AttributeType, BillingMode, Table } from "aws-cdk-lib/aws-dynamodb";
import { ApplicationLogLevel, Architecture, LoggingFormat, Runtime } from "aws-cdk-lib/aws-lambda";
import { NodejsFunction } from "aws-cdk-lib/aws-lambda-nodejs";
import { LogGroup, RetentionDays } from "aws-cdk-lib/aws-logs";
import { Construct } from "constructs";

export class MyExample extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    // create the table
    const myTable = new Table(this, "MyTable", {
      partitionKey: { name: "Name", type: AttributeType.STRING },
      billingMode: BillingMode.PAY_PER_REQUEST,
      tableName: "MyTable",
      removalPolicy: RemovalPolicy.RETAIN,
      deletionProtection: true,
      pointInTimeRecovery: true,
      timeToLiveAttribute: "ExpiresAt",
    });

    // create the Lambda function
    const consumer = new NodejsFunction(this, "MyFunction", {
      functionName: "HelloWorld",
      entry: `functions/helloWorld.ts`,
      handler: `handler`,
      runtime: Runtime.NODEJS_20_X,
      architecture: Architecture.ARM_64,
      bundling: {
        minify: true,
      },
      logGroup: new LogGroup(this, `HelloWorldLogGroup`, {
        retention: RetentionDays.ONE_WEEK,
        logGroupName: `/aws/lambda/example/hello-world`,
      }),
      loggingFormat: LoggingFormat.JSON,
      applicationLogLevelV2: ApplicationLogLevel.INFO,
    });

    // grant the Lambda function read/write access to the table
    myTable.grantReadWriteData(consumer);
  }
}

Now, let’s say we want to change the partition key of the table to UUID. We can’t do this with a simple update because it requires a replacement.

The idea is to set up the new table with the desired configuration and then restore the data from the old table into the new one.

Once confirmed that the new table is working as expected, we can update the references to the new table and delete the old table.

For that, we need to export the data from the old table to S3 and then import the data from S3 to the new table.

You can only export data to S3 if point-in-time recovery is enabled on the table. Enabling it is a non-destructive operation, and you can do so at any time.
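
The CDK example above already sets pointInTimeRecovery: true. If your existing table does not have it yet, you can enable it via the AWS CLI, for example:

aws dynamodb update-continuous-backups --table-name MyTable \
  --point-in-time-recovery-specification PointInTimeRecoveryEnabled=true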

First, we need to export the data from the old table to S3. You can either do this manually in the web console or use the AWS CLI.

Make sure you set up an S3 bucket in advance to store the export.

aws dynamodb export-table-to-point-in-time --table-arn <YOUR_TABLE_ARN> \
--s3-bucket <YOUR_BUCKET_NAME> --export-format DYNAMODB_JSON

After running this command, you will see a new folder in the S3 bucket with the data of the table.
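
The export runs asynchronously and can take a while for larger tables. You can check its status with the export ARN returned by the previous command:

aws dynamodb describe-export --export-arn <YOUR_EXPORT_ARN> \
  --query "ExportDescription.ExportStatus"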

Bear in mind that from now on, any write to the old table will not be included in the export and hence will not be restored to the new table.

The S3 bucket used to store the data in this example is called “MyTableBackupBucket”. We reference that bucket by name in the CDK stack and pass it to the importSource configuration of the new table.

First, we set up the new table with the desired configuration and import the data from the old table’s export. Everything else stays the same.

import { RemovalPolicy, Stack, StackProps } from "aws-cdk-lib";
import { AttributeType, BillingMode, InputFormat, Table } from "aws-cdk-lib/aws-dynamodb";
import { ApplicationLogLevel, Architecture, LoggingFormat, Runtime } from "aws-cdk-lib/aws-lambda";
import { NodejsFunction } from "aws-cdk-lib/aws-lambda-nodejs";
import { LogGroup, RetentionDays } from "aws-cdk-lib/aws-logs";
import { Bucket } from "aws-cdk-lib/aws-s3";
import { Construct } from "constructs";

export class MyExample extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    // create a separate table with the new configuration and import the data from the old table
    const myNewTable = new Table(this, "MyNewTable", {
      partitionKey: { name: "UUID", type: AttributeType.STRING },
      billingMode: BillingMode.PAY_PER_REQUEST,
      tableName: "MyNewTable",
      removalPolicy: RemovalPolicy.RETAIN,
      deletionProtection: true,
      pointInTimeRecovery: true,
      timeToLiveAttribute: "ExpiresAt",
      importSource: {
        bucket: Bucket.fromBucketName(this, "ImportSourceBucket", "MyTableBackupBucket"),
        inputFormat: InputFormat.dynamoDBJson(),
      },
    });

    // keep the existing table for now
    const myTable = new Table(this, "MyTable", {
      partitionKey: { name: "Name", type: AttributeType.STRING },
      billingMode: BillingMode.PAY_PER_REQUEST,
      tableName: "MyTable",
      removalPolicy: RemovalPolicy.RETAIN,
      deletionProtection: true,
      pointInTimeRecovery: true,
      timeToLiveAttribute: "ExpiresAt",
    });

    // create the Lambda function
    const consumer = new NodejsFunction(this, "MyFunction", {
      functionName: "HelloWorld",
      entry: `functions/helloWorld.ts`,
      handler: `handler`,
      runtime: Runtime.NODEJS_20_X,
      architecture: Architecture.ARM_64,
      bundling: {
        minify: true,
      },
      logGroup: new LogGroup(this, `HelloWorldLogGroup`, {
        retention: RetentionDays.ONE_WEEK,
        logGroupName: `/aws/lambda/example/hello-world`,
      }),
      loggingFormat: LoggingFormat.JSON,
      applicationLogLevelV2: ApplicationLogLevel.INFO,
    });

    // grant the Lambda function read/write access to the (still active) old table
    myTable.grantReadWriteData(consumer);
  }
}
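
Depending on where the export landed in the bucket, you may have to point the import at the exported data files explicitly. Exports are written under an AWSDynamoDB/<export-id>/data/ prefix and are gzip-compressed, so a variation of the importSource block could look like this (the export id below is hypothetical; copy the real prefix from your bucket):

importSource: {
  bucket: Bucket.fromBucketName(this, "ImportSourceBucket", "MyTableBackupBucket"),
  inputFormat: InputFormat.dynamoDBJson(),
  // hypothetical export id — check your bucket for the actual prefix
  keyPrefix: "AWSDynamoDB/01234567890123-abcdefgh/data/",
  // InputCompressionType comes from aws-cdk-lib/aws-dynamodb
  compressionType: InputCompressionType.GZIP,
},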

After changing our stack, we deploy the new change set using

cdk deploy

Once the new table has been created and the data import has finished, you can verify the new table’s contents and configuration.
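
For a quick sanity check, you can inspect the import status and do a rough item count on the new table (the count is a full scan, so only do this on small tables):

aws dynamodb list-imports --table-arn <YOUR_NEW_TABLE_ARN>
aws dynamodb scan --table-name MyNewTable --select COUNT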

Once you are confident that the new table is working as expected, you can update the references to the new table and delete the old table.

import { RemovalPolicy, Stack, StackProps } from "aws-cdk-lib";
import { AttributeType, BillingMode, InputFormat, Table } from "aws-cdk-lib/aws-dynamodb";
import { ApplicationLogLevel, Architecture, LoggingFormat, Runtime } from "aws-cdk-lib/aws-lambda";
import { NodejsFunction } from "aws-cdk-lib/aws-lambda-nodejs";
import { LogGroup, RetentionDays } from "aws-cdk-lib/aws-logs";
import { Bucket } from "aws-cdk-lib/aws-s3";
import { Construct } from "constructs";

export class MyExample extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    // the new table with the new configuration; the old table has been removed from the stack
    const myNewTable = new Table(this, "MyNewTable", {
      partitionKey: { name: "UUID", type: AttributeType.STRING },
      billingMode: BillingMode.PAY_PER_REQUEST,
      tableName: "MyNewTable",
      removalPolicy: RemovalPolicy.RETAIN,
      deletionProtection: true,
      pointInTimeRecovery: true,
      timeToLiveAttribute: "ExpiresAt",
      importSource: {
        bucket: Bucket.fromBucketName(this, "ImportSourceBucket", "MyTableBackupBucket"),
        inputFormat: InputFormat.dynamoDBJson(),
      },
    });

    // create the Lambda function
    const consumer = new NodejsFunction(this, "MyFunction", {
      functionName: "HelloWorld",
      entry: `functions/helloWorld.ts`,
      handler: `handler`,
      runtime: Runtime.NODEJS_20_X,
      architecture: Architecture.ARM_64,
      bundling: {
        minify: true,
      },
      logGroup: new LogGroup(this, `HelloWorldLogGroup`, {
        retention: RetentionDays.ONE_WEEK,
        logGroupName: `/aws/lambda/example/hello-world`,
      }),
      loggingFormat: LoggingFormat.JSON,
      applicationLogLevelV2: ApplicationLogLevel.INFO,
    });

    // grant the Lambda function read/write access to the new table
    myNewTable.grantReadWriteData(consumer);
  }
}

Deploy the changes once again using

cdk deploy

That’s it. You have successfully migrated your DynamoDB table. Note that the example sets the removal policy to RETAIN, which means the old table is not deleted when it is removed from the stack. You can now delete it manually via the web console or the AWS CLI. If you specified a more destructive removal policy, the table may already have been deleted when you deployed the latest change set.
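
Because the example also enables deletion protection, that flag has to be switched off before the old table can be dropped, for example:

aws dynamodb update-table --table-name MyTable --no-deletion-protection-enabled
aws dynamodb delete-table --table-name MyTable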

The new table keeps the “importSource” configuration in its definition. Changes to that configuration will not trigger a replacement of the table, since importSource is only used during table creation.

Feel free to reach out if you have any questions or feedback.

[1] According to the official docs, it is recommended to let CloudFormation name your resources and put a custom name into resource tags. In reality, however, people (myself included) tend to name things because the web console is easier to read with human-readable names.

