Introduction to AWS CDK

Malte Hallström / November 20, 2021

11 min read

Background / Initiative

In order to future-proof the architecture and also keep it under version control in a clear and easy-to-understand way, I decided to give CDK a go.

Infrastructure requirements

The project that this was built around is weather-scraper, a recurring job that contained the following steps:

Download weather data
Parse and enrich said data
Run DB migrations and put data into database tables
Clean irrelevant/old data

The goal was to turn this whole project into a container that could be deployed on ECS and run at regular intervals, perhaps 2-3 times per hour. To make this possible, a Docker image was created that reads its configuration from environment variables, supplied through docker-compose for local use and injected by ECS when deployed (explained later).

With some help, I was able to set up a basic configuration that integrated the following AWS services:

ECS
ECR
RDS
Secrets Manager (SM)
CloudWatch

Prerequisites

In order to get started, you need to have the AWS CLI installed and be signed in to your AWS account with it. You also have to install the AWS CDK CLI by running yarn global add aws-cdk.

Also, make sure that you specify a default region by running aws configure and following the steps. I used eu-west-1.

The basics

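The first thing I did was to create a CDK app by issuing cdk init app --language typescript with the CDK CLI installed earlier. Doing so left me with a project folder containing, among other files, lib/cdk-init-stack.ts, bin/cdk-init.ts and cdk.json, each of which is described below.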

The stack

The cdk-init-stack.ts file was pretty basic to begin with, containing only this:

import * as cdk from "@aws-cdk/core";

export class CdkInitStack extends cdk.Stack {
  constructor(scope: cdk.Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);
    // The code that defines your stack goes here
  }
}

The stack defines a set of Amazon services. Some services depend on others, in which case variable references can be used within a stack in order to link them together.

The app

The cdk-init.ts file was also quite empty, with the exception of some comments:


#!/usr/bin/env node

import 'source-map-support/register'
import * as cdk from '@aws-cdk/core'
import { CdkInitStack } from '../lib/cdk-init-stack'


const app = new cdk.App()

new CdkInitStack(app, 'CdkInitStack', {

 /* If you don't specify 'env', this stack will be environment-agnostic.
 * Account/Region-dependent features and context lookups will not work,
 * but a single synthesized template can be deployed anywhere. */
 /* Uncomment the next line to specialize this stack for the AWS Account
 * and Region that are implied by the current CLI configuration. */
 // env: { account: process.env.CDK_DEFAULT_ACCOUNT, region: process.env.CDK_DEFAULT_REGION },
 /* Uncomment the next line if you know exactly what Account and Region you
 * want to deploy the stack to. */
 // env: { account: '123456789012', region: 'us-east-1' },
 /* For more information, see https://docs.aws.amazon.com/cdk/latest/guide/environments.html */
})

The app's purpose, as I later learned, is to tie the stacks together.

The configuration

The cdk.json file had the following default content:

{
  "app": "npx ts-node --prefer-ts-exts bin/cdk-init.ts",
  "context": {
    "@aws-cdk/aws-apigateway:usagePlanKeyOrderInsensitiveId": true,
    "@aws-cdk/core:enableStackNameDuplicates": true,
    "aws-cdk:enableDiffNoFail": true,
    "@aws-cdk/core:stackRelativeExports": true,
    "@aws-cdk/aws-ecr-assets:dockerIgnoreSupport": true,
    "@aws-cdk/aws-secretsmanager:parseOwnedSecretName": true,
    "@aws-cdk/aws-kms:defaultKeyPolicies": true,
    "@aws-cdk/aws-s3:grantWriteWithoutAcl": true,
    "@aws-cdk/aws-ecs-patterns:removeDefaultDesiredCount": true,
    "@aws-cdk/aws-rds:lowercaseDbIdentifier": true,
    "@aws-cdk/aws-efs:defaultEncryptionAtRest": true,
    "@aws-cdk/aws-lambda:recognizeVersionProps": true,
    "@aws-cdk/aws-cloudfront:defaultSecurityPolicyTLSv1.2_2021": true
  }
}

Useful commands

When using CDK, many common operations are done through the CDK CLI. From the documentation:

cdk deploy: deploy this stack to your default AWS account/region
cdk diff: compare deployed stack with current state
cdk synth: emits the synthesized CloudFormation template

Customizing the template to fit the project

In order to satisfy the infrastructure requirements listed earlier, the existing files had to change.

First, all dependencies needed were installed and imported. This was the result:

import * as cdk from "@aws-cdk/core";
import * as ecs from "@aws-cdk/aws-ecs";
import * as ecr from "@aws-cdk/aws-ecr";
import * as ec2 from "@aws-cdk/aws-ec2";
import * as secretsmanager from "@aws-cdk/aws-secretsmanager";
import * as logs from "@aws-cdk/aws-logs";

Creating the main stack

The first step was to define the class that would represent our stack. To keep it modular and usable across different environments, the class accepts its settings as parameters on initialization.

Here is the resulting base class along with an interface that describes the options, placed in a file called cdk-stack.ts:

export interface WeatherScraperStackProps extends cdk.StackProps {
  environment: "development" | "staging" | "production";
  ecrRepo: ecr.Repository;
  vpcId: string;
  postgresSecurityGroupIds: string[];
}

export class WeatherScraperStack extends cdk.Stack {
  securityGroup: ec2.SecurityGroup;
  constructor(
    scope: cdk.Construct,
    id: string,
    props: WeatherScraperStackProps
  ) {
    super(scope, id, props);
  }
}

Where:

ecrRepo will be a reference to a repository on the Elastic Container Registry. This is needed in order to reference and pull the latest Docker images.
vpcId will be the ID of the Virtual Private Cloud where the database lives. This is needed in order to create security rules that can then be added to the security groups specified in postgresSecurityGroupIds.
postgresSecurityGroupIds will be a list of IDs of the specific security groups within the VPC that are associated with the database that we need to connect to.
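
Because environment is a string-literal union, a value that arrives from outside the type system (a command-line argument or a CI variable, for instance) has to be narrowed before it can be handed to the stack. Below is a minimal sketch of such a guard. ENVIRONMENTS, isEnvironment and parseEnvironment are hypothetical names introduced for illustration; they are not part of the original project or of the CDK.

```typescript
// The three values accepted by WeatherScraperStackProps.environment.
const ENVIRONMENTS = ["development", "staging", "production"] as const;
type Environment = (typeof ENVIRONMENTS)[number];

// Type guard: narrows an arbitrary string to the Environment union.
function isEnvironment(value: string): value is Environment {
  return (ENVIRONMENTS as readonly string[]).indexOf(value) !== -1;
}

// Returns the validated environment, or throws with a readable message.
function parseEnvironment(value: string): Environment {
  if (!isEnvironment(value)) {
    throw new Error(
      `Unknown environment "${value}", expected one of: ${ENVIRONMENTS.join(", ")}`
    );
  }
  return value;
}
```

Failing early here means a typo such as "prod" is rejected before any resources are named after it.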

The task definition

An ECS task definition is a blueprint for the tasks that can be run in ECS. It can reference one or more containers, and can be configured with a literal sea of parameters in typical AWS fashion.

In order to add a task definition to the stack, the following code was used:

const taskDefinition = new ecs.Ec2TaskDefinition(this, "WeatherScraperTask", {
  networkMode: ecs.NetworkMode.AWS_VPC,
});

The second argument of the Ec2TaskDefinition constructor is the construct id of the task definition, which has to be unique within the stack. Here it is based on the name of the project.

The networkMode is the Docker networking mode to use for the containers in the task. The valid values are none, bridge, awsvpc, and host.
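
On the construct id passed as the second argument: in the CDK an id only has to be unique within its scope (here, the stack), not globally. The toy model below illustrates that rule. It is a simplified sketch and not the CDK's implementation; ToyConstruct is a hypothetical class written for this example.

```typescript
// Toy model of a construct tree. NOT the CDK implementation.
class ToyConstruct {
  readonly path: string;
  private readonly childIds: string[] = [];

  constructor(scope: ToyConstruct | undefined, readonly id: string) {
    // The path is the chain of ids from the root down to this construct.
    this.path = scope ? `${scope.path}/${id}` : id;
    if (scope) {
      scope.register(id);
    }
  }

  // Rejects a second child with the same id in the same scope.
  private register(childId: string): void {
    if (this.childIds.indexOf(childId) !== -1) {
      throw new Error(
        `There is already a construct named "${childId}" in "${this.path}"`
      );
    }
    this.childIds.push(childId);
  }
}

const stack = new ToyConstruct(undefined, "WeatherScraperStackDevelopment");
const task = new ToyConstruct(stack, "WeatherScraperTask");
// task.path is "WeatherScraperStackDevelopment/WeatherScraperTask"
```

Creating a second "WeatherScraperTask" under the same stack would throw, while reusing the id under a different stack is fine.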

The security rules

As previously mentioned, we need to configure security rules in order to securely connect to RDS (in this case a Postgres database).

To do that, we first create a reference to the VPC in question, and then a new security group inside it, using this bit of code:

const vpc = ec2.Vpc.fromLookup(this, 'Vpc', { vpcId: props.vpcId })
this.securityGroup = new ec2.SecurityGroup(this, 'WeatherScraperSecurityGroup', { vpc })


Where:

props.vpcId is explained in the section "Creating the main stack".
The second argument of ec2.SecurityGroup is the construct id of the security group that will be created.
vpc is a direct reference to a VPC, in this case the one where the database is located.

This is not all, however, since we still have not added any rules to the security groups of the database. In order to do that, we can use this code:


props.postgresSecurityGroupIds.forEach((pgSgId) => {
 const pgSecurityGroup = ec2.SecurityGroup.fromSecurityGroupId(this, `PostgresSecurityGroup${pgSgId}`, pgSgId)
 pgSecurityGroup.addIngressRule(this.securityGroup, ec2.Port.tcp(5432), `Weather Scraper ${props.environment}`)
})

Where:

postgresSecurityGroupIds is explained in the section "Creating the main stack".
SecurityGroup.fromSecurityGroupId imports an existing security group by ID. It is assumed that the group allows all outbound traffic, and thus it is only possible to add ingress rules.
addIngressRule adds an inbound rule to the already existing security group. By specifying the securityGroup that we created earlier as the source, we make sure that any service placed in that security group will be able to reach the database on port 5432.
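
To make the loop above easier to reason about, the sketch below computes, without touching AWS, what it would create for a given list of security group IDs: one construct id, port and description per group. planIngressRules and PlannedIngressRule are hypothetical names used only for this illustration. The duplicate check is an addition of this sketch, on the assumption that a repeated ID would produce two constructs with the same id in the stack.

```typescript
interface PlannedIngressRule {
  constructId: string; // id passed to SecurityGroup.fromSecurityGroupId
  securityGroupId: string; // existing database security group
  port: number; // TCP port opened for the scraper
  description: string; // description attached to the rule
}

// Mirrors the naming used in the forEach loop of the stack.
function planIngressRules(
  securityGroupIds: string[],
  environment: string
): PlannedIngressRule[] {
  const seen: string[] = [];
  return securityGroupIds.map((securityGroupId) => {
    if (seen.indexOf(securityGroupId) !== -1) {
      throw new Error(`Duplicate security group id: ${securityGroupId}`);
    }
    seen.push(securityGroupId);
    return {
      constructId: `PostgresSecurityGroup${securityGroupId}`,
      securityGroupId,
      port: 5432,
      description: `Weather Scraper ${environment}`,
    };
  });
}

const planned = planIngressRules(["sg-6af69013"], "development");
// planned[0].constructId is "PostgresSecurityGroupsg-6af69013"
```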

The logging

In order to send the stdout of the container to CloudWatch, the following simple code snippet was used:

const logGroup = new logs.LogGroup(this, "WeatherScraperLogGroup", {
  logGroupName: `weather-scraper-${props.environment}`,
  removalPolicy: cdk.RemovalPolicy.DESTROY,
  retention: 1,
});

Where:

Most parameters are self-explanatory.
retention: 1 keeps the logs for one day. The same value is available as the named constant logs.RetentionDays.ONE_DAY, which reads more clearly.

The secrets!

In order to connect to the database, credentials were of course needed along with a valid hostname. The container used here can read these from the environment, no build arguments necessary. This ultimately makes it possible to run the same image in any environment (development / staging etc.) without rebuilding.
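
As an illustration of the container side, here is a minimal sketch of reading those values at startup and failing fast when one is missing. It is written in TypeScript for consistency with the rest of this article, whatever language the scraper itself uses. The variable names DB_HOST and PGPASSWORD are the ones injected in the container definition later on; readDbConfig and DbConfig are hypothetical names.

```typescript
interface DbConfig {
  host: string;
  password: string;
}

// `env` is passed in explicitly so the function is easy to test;
// inside the container it would be the process environment.
function readDbConfig(env: { [name: string]: string | undefined }): DbConfig {
  const required = ["DB_HOST", "PGPASSWORD"];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return {
    host: env["DB_HOST"] as string,
    password: env["PGPASSWORD"] as string,
  };
}
```

Because nothing is baked in at build time, the same function works unchanged under docker-compose and under ECS.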

In order to safely provide the container with these secrets, I wanted a way of specifying them using references to secrets stored in AWS Secrets Manager. The secrets already existed; there just had to be a way of fetching them from inside the CDK script.

Fortunately, this was very simple using CDK:

const postgresSecrets = secretsmanager.Secret.fromSecretNameV2(
  this,
  `PostgresSecret${props.environment}`,
  `postgres-${props.environment}`
);

Using the now-defined postgresSecrets variable, I could write the following helper function to retrieve secrets by key:

const getSecret = (key: string) =>
  ecs.Secret.fromSecretsManager(postgresSecrets, key);

This will be used in the next section.
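
For a key lookup like this to resolve, the secret stored in Secrets Manager has to be a JSON object containing that key. ECS task definitions reference a single JSON key using the form secret-arn:json-key:version-stage:version-id, where the last two parts may be left empty. The sketch below models both halves: building such a reference and picking one field out of the secret's JSON. The function names and the example ARN are hypothetical, and the lookup is a simplified model of what happens at task start, not AWS code.

```typescript
// Builds the reference format ECS accepts for one JSON key of a secret.
function secretFieldReference(secretArn: string, jsonKey: string): string {
  return `${secretArn}:${jsonKey}::`;
}

// Simplified model of the lookup: parse the secret's JSON, pick one key.
function pickSecretField(secretString: string, jsonKey: string): string {
  const parsed: { [key: string]: unknown } = JSON.parse(secretString);
  const value = parsed[jsonKey];
  if (typeof value !== "string") {
    throw new Error(`Secret has no string field "${jsonKey}"`);
  }
  return value;
}

// Hypothetical ARN and secret content, for illustration only.
const exampleArn =
  "arn:aws:secretsmanager:eu-west-1:123456789012:secret:postgres-development-AbCdEf";
const exampleSecret = '{"password": "s3cret", "host": "db.internal"}';

const passwordRef = secretFieldReference(exampleArn, "password");
const host = pickSecretField(exampleSecret, "host");
```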

Adding the container to the task definition

With the above sections out of the way, all the necessary infrastructure was in place. All that was left to do was to add the actual container to the mix, and to specify the CPU and memory it would be allowed to use.

taskDefinition.addContainer(`WeatherScraperContainer`, {
  image: ecs.ContainerImage.fromEcrRepository(props.ecrRepo, props.environment),
  memoryLimitMiB: 2048,
  memoryReservationMiB: 1024,
  cpu: 800,
  logging: ecs.LogDriver.awsLogs({
    logGroup,
    streamPrefix: "/ecs/WeatherScraper",
  }),
  secrets: {
    PGPASSWORD: getSecret("password"),
    DB_HOST: getSecret("host"),
  },
});

Here, you can see the results of some of the earlier sections being used:

The ecrRepo, a reference to the repository to which new container images are pushed.
The getSecret helper, used to fetch an environment-specific database secret from Secrets Manager.
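
A note on the resource numbers in the container definition: ECS counts CPU in units where 1024 units equal one vCPU, memoryLimitMiB is the hard limit, and memoryReservationMiB is the soft limit, which has to be lower than the hard limit when both are given. The sketch below checks those relationships for the values used above. The function and interface names are hypothetical and the checks are intentionally minimal.

```typescript
const CPU_UNITS_PER_VCPU = 1024;

interface ContainerResources {
  cpu: number; // ECS CPU units
  memoryLimitMiB: number; // hard limit
  memoryReservationMiB: number; // soft limit
}

// Converts ECS CPU units into a fraction of a vCPU.
function cpuUnitsToVcpus(cpuUnits: number): number {
  return cpuUnits / CPU_UNITS_PER_VCPU;
}

// Returns a list of problems; an empty list means the values are consistent.
function checkContainerResources(resources: ContainerResources): string[] {
  const problems: string[] = [];
  if (resources.cpu <= 0) {
    problems.push("cpu must be positive");
  }
  if (resources.memoryReservationMiB >= resources.memoryLimitMiB) {
    problems.push("memoryReservationMiB must be lower than memoryLimitMiB");
  }
  return problems;
}

const scraperResources: ContainerResources = {
  cpu: 800,
  memoryLimitMiB: 2048,
  memoryReservationMiB: 1024,
};
const scraperVcpus = cpuUnitsToVcpus(scraperResources.cpu); // 0.78125
```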

Tying it all together

In order for this stack to execute successfully, we need to specify some parameters. This can be done in a separate file, as described in the section "The app".

The first thing to note is that the stack currently relies on being passed a reference to an ECR repository. This can also be done using the CDK! I created a new file called ecr-stack.ts and put the following in it:

import * as cdk from "@aws-cdk/core";
import * as ecr from "@aws-cdk/aws-ecr";

export interface EcrRepoStackProps extends cdk.StackProps {
  appName: string;
}

export class EcrRepoStack extends cdk.Stack {
  repo: ecr.Repository;

  constructor(scope: cdk.Construct, id: string, props: EcrRepoStackProps) {
    super(scope, id, props);

    this.repo = new ecr.Repository(this, "EcrRepo", {
      repositoryName: `${props.appName}`,
    });
  }
}

It was then time to bring it all together in a file called cdk-app.ts:

#!/usr/bin/env node

import "source-map-support/register";
import * as cdk from "@aws-cdk/core";
import { WeatherScraperStack } from "./cdk-stack";
import { EcrRepoStack } from "./ecr-stack";

const app = new cdk.App();

const ecrWeatherScraperRepo = new EcrRepoStack(
  app,
  "EcrWeatherScraperRepoStack",
  {
    appName: "weather-scraper",
    env: {
      region: "eu-west-1",
      account: "282915464914",
    },
    description: "Scraping weather and importing into postgres DB",
    tags: {
      application: "weather-scraper",
      realm: "charts",
    },
  }
);

First, a new CDK application is initialized and its value is assigned to the app constant.

Then, a reference to a new ECR repository is created using the class we defined earlier and some new parameters:

appName, the name of the repository shown in the ECR page in the AWS console.
env, the AWS environment (account/region) where this stack will be deployed.

Note: If either region or account is not set, the Stack will be considered "environment-agnostic". Environment-agnostic stacks can be deployed to any environment but may not be able to take advantage of all features of the CDK. For example, they will not be able to use environmental context lookups such as ec2.Vpc.fromLookup.
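
The rule in the note can be expressed as a small check. This is a simplified model for illustration, using hypothetical names (StackEnv, isEnvironmentAgnostic, assertLookupsAllowed); the CDK performs its own validation and does not expose these functions.

```typescript
interface StackEnv {
  account?: string;
  region?: string;
}

// A stack is environment-agnostic unless BOTH account and region are set.
function isEnvironmentAgnostic(env?: StackEnv): boolean {
  return !env || !env.account || !env.region;
}

// Guard to run before relying on context lookups such as Vpc.fromLookup.
function assertLookupsAllowed(stackId: string, env?: StackEnv): void {
  if (isEnvironmentAgnostic(env)) {
    throw new Error(
      `${stackId} needs both account and region for context lookups`
    );
  }
}
```

With only a region set, isEnvironmentAgnostic returns true, which is why both values are passed to the stacks in this article.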

Finally, we initialize the stack that we have been working on for most of this document using the following code:

new WeatherScraperStack(app, "WeatherScraperStackDevelopment", {
  env: {
    region: "eu-west-1",
    account: "282915464914",
  },
  vpcId: "vpc-0987bc6c",
  postgresSecurityGroupIds: ["sg-6af69013"],
  environment: "development",
  ecrRepo: ecrWeatherScraperRepo.repo,
  tags: {
    application: "weather-scraper",
    realm: "charts",
    environment: "development",
  },
});

This creates an environment-specific stack, in this case for development.

When it comes to the parameters:

env is the same as in the section above.
vpcId and postgresSecurityGroupIds are explained in the section "Creating the main stack".
The previously defined ecrWeatherScraperRepo variable, referencing an ECR repository, is used as value for the ecrRepo option.
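
When staging and production stacks are added later, the stack id and tags can be derived from the environment name instead of being typed out for each stack. A minimal sketch, assuming the naming pattern used for the development stack carries over to the other environments; stackIdFor and tagsFor are hypothetical helpers.

```typescript
type DeployEnvironment = "development" | "staging" | "production";

function capitalize(word: string): string {
  return word.charAt(0).toUpperCase() + word.slice(1);
}

// "development" -> "WeatherScraperStackDevelopment"
function stackIdFor(environment: DeployEnvironment): string {
  return `WeatherScraperStack${capitalize(environment)}`;
}

// Same tags as in the development stack, with the environment filled in.
function tagsFor(environment: DeployEnvironment): { [key: string]: string } {
  return {
    application: "weather-scraper",
    realm: "charts",
    environment,
  };
}
```

These would be passed as the second constructor argument and the tags option of WeatherScraperStack respectively, while vpcId and postgresSecurityGroupIds still have to be supplied per environment.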

Final steps

And that's it! With all of the above steps done, we now have a full infrastructure defined in code. All that's left to do is to synthesize a CloudFormation template from the code, which will catch common errors. This can be done using cdk synth --all.

After confirming that the output looks OK without any errors, the final command can be run: cdk deploy --all. With a bit of luck, your environment will now be fully configured and ready to use!

In this particular case, the next step would be to schedule tasks in ECS.
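
As a pointer toward that step: the goal stated at the beginning was to run the job 2-3 times per hour. One common way to trigger ECS tasks on a schedule is an EventBridge rule with a rate expression, which is an assumption here since scheduling is outside the scope of this article. The hypothetical helper below only converts "runs per hour" into such an expression.

```typescript
// Converts a number of runs per hour into an EventBridge rate expression.
// Only values that divide 60 evenly are accepted, to keep whole minutes.
function rateExpressionFor(runsPerHour: number): string {
  if (!(runsPerHour > 0) || 60 % runsPerHour !== 0) {
    throw new Error("runsPerHour must be positive and divide 60 evenly");
  }
  const minutes = 60 / runsPerHour;
  // Rate expressions use the singular unit when the value is 1.
  const unit = minutes === 1 ? "minute" : "minutes";
  return `rate(${minutes} ${unit})`;
}

const threeTimesPerHour = rateExpressionFor(3); // "rate(20 minutes)"
const twicePerHour = rateExpressionFor(2); // "rate(30 minutes)"
```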