- What is DCE?
- Implementing DCE
- Usage
- API Reference
- SNS Topic Reference
- Scripts
- Build & Deploy
- Database Schema
- Database Backups
- Reset
- Account Provisioning & Decommissioning
- API Spec
- Notification via SES
- Budget Features
Disposable Cloud Environments (DCE) is a mechanism for providing temporary, limited Amazon Web Services (AWS) accounts. Accounts can be "leased" for a period of seven days (by default). After the lease expires, the account is reset and returned to a pool of accounts to be leased again.
At a high level, DCE consists of AWS Lambda functions (implemented in Go), Amazon DynamoDB tables, Amazon Simple Notification Service (SNS) topics, and APIs exposed with Amazon API Gateway. These resources are created in the AWS account using HashiCorp Terraform.
This repository provides a set of components which implementors (you) can use to deploy your own DCE instance. These components come in the form of:
- Terraform modules to deploy DCE infrastructure to AWS
- Packaged Go modules and other assets to deploy to Lambda, CodeBuild, etc. in your AWS master account
With these resources deployed, you will have access to a set of integration points, for working with your DCE instance:
- APIs for managing DCE resources (accounts, leases, etc.)
- SNS topics, allowing you to hook into DCE events and implement your own custom business logic
Before you can deploy DCE into your account, you will need the following:
- An AWS account to serve as the master account.
- Terraform 0.12+
- Go 1.12+
- GNU Make 3.81+
To run the terraform init command later, make sure that you have API access to your AWS account. To double-check your AWS API credentials, you can use the aws configure command as shown here:
$ aws configure list
Name Value Type Location
---- ----- ---- --------
profile <not set> None None
access_key <not set> None None
secret_key <not set> None None
region <not set> None None
This output shows an account that has not yet been initialized. If your configuration looks like this, make sure to configure your API and CLI access before continuing. See Configuring the AWS CLI to learn how to configure the CLI and API credentials.
To deploy DCE into your master account, follow these steps:
- Clone this repository.
$ git clone git@github.com:Optum/DCE.git
- Change into the repository directory.
$ cd DCE
- Run make deploy_local to build and deploy the artifacts to your AWS account.
$ make deploy_local ...
Once the last command has completed successfully, you will have DCE deployed in your AWS account!
TODO
DCE provides a number of SNS topics, which allow you to hook into DCE events, and implement your own custom business logic. Out of the box, DCE is unopinionated about how you manage the details of your DCE accounts. Some questions which are left to you to answer are:
- How do you grant and remove access to AWS Accounts?
- What do you do when an account reaches a budget threshold?
To answer these questions, you can subscribe to SNS topics provided by DCE. For example, you could subscribe to the Lease Added topic, create an IAM User, and email an invite to the lease principal to log in. On Lease Removed, you might delete that IAM User, and notify the lease principal that they no longer have access.
See the SNS Topic Reference for details on available SNS topics.
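As a minimal sketch of the Lease Added / Lease Removed example above, the Go program below decodes an SNS notification and decides which action to take. It assumes the notification arrives in the standard SNS JSON envelope (as delivered to HTTP(S) endpoints, or to SQS without raw message delivery). The planAction function, the placeholder topic ARNs, and the action strings are illustrative only and are not part of DCE.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// snsEnvelope is the subset of the SNS notification JSON used here.
type snsEnvelope struct {
	TopicArn string `json:"TopicArn"`
	Message  string `json:"Message"`
}

// lease holds the lease fields this sketch needs.
type lease struct {
	AccountID   string `json:"accountId"`
	PrincipalID string `json:"principalId"`
	LeaseStatus string `json:"leaseStatus"`
}

// planAction decides what to do for one notification. The topic ARNs are
// supplied by the caller (for example, from the Terraform outputs).
func planAction(raw []byte, leaseAddedArn, leaseRemovedArn string) (string, error) {
	var env snsEnvelope
	if err := json.Unmarshal(raw, &env); err != nil {
		return "", fmt.Errorf("decode SNS envelope: %v", err)
	}
	// The lease object is a JSON document nested inside the Message string.
	var l lease
	if err := json.Unmarshal([]byte(env.Message), &l); err != nil {
		return "", fmt.Errorf("decode lease payload: %v", err)
	}
	switch env.TopicArn {
	case leaseAddedArn:
		return fmt.Sprintf("grant %s access to account %s", l.PrincipalID, l.AccountID), nil
	case leaseRemovedArn:
		return fmt.Sprintf("revoke %s access to account %s", l.PrincipalID, l.AccountID), nil
	default:
		return "", fmt.Errorf("unexpected topic %q", env.TopicArn)
	}
}

func main() {
	// "arn:added" and "arn:removed" are placeholders, not real ARNs.
	raw := []byte(`{"TopicArn":"arn:added","Message":"{\"accountId\":\"1234567890\",\"principalId\":\"jdoe17\",\"leaseStatus\":\"Active\"}"}`)
	action, err := planAction(raw, "arn:added", "arn:removed")
	if err != nil {
		panic(err)
	}
	fmt.Println(action)
}
```

In a real integration, the returned action would be replaced by calls to whatever system you use to grant and revoke access.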
When a new AWS account is added to the pool, a role is created to allow principal users to log in to the account, designated by the principalRoleArn field on the account object. By default, this role has an Assume Role Policy allowing IAM principals from the same account to assume it.
To integrate with alternative identity providers, you may modify the Assume Role Policy on the IAM role. You may listen to events on the account-created SNS topic, which include the principalRoleArn in the message body.
TODO: what's the simplest / least opinionated approach to integrating with DCE
In order to reset AWS accounts, DCE uses the open source aws-nuke tool. This tool tries its darndest to delete every single resource in your account, and will make several attempts to ensure everything is wiped clean.
To prevent aws-nuke from deleting certain resources, you may provide a YAML configuration with a list of resource filters (see the aws-nuke docs for the YAML filter configuration syntax). By default, DCE filters out resources which are critical to running DCE, such as the IAM roles for your account's adminRoleArn / principalRoleArn.
As a DCE implementor, you may have additional resources you wish to protect from aws-nuke. If this is the case, you may specify your own custom aws-nuke YAML configuration:
- Copy the contents of default-nuke-config-template.yml into your own file, and modify as needed.
- Upload your YAML configuration file to an S3 bucket
- Set the Terraform reset_nuke_template_bucket and reset_nuke_template_key variables to point at your YAML configuration file on S3
- Make sure you have aws-nuke enabled
DCE allows you to use a number of template parameters within your aws-nuke YAML config, which will be resolved at runtime:
Parameter | Description |
---|---|
{{id}} | The AWS Account ID, for the account currently being nuked |
{{admin_role}} | The name of the IAM role assumed by the DCE master account to manage the child account. |
{{principal_role}} | The name of the IAM role assumed by end users of DCE, in order to log in to their AWS account |
{{principal_policy}} | The IAM policy assigned to the principal_role |
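To illustrate what resolving these parameters means, here is a minimal Go sketch that substitutes the four placeholders into a config string. It is not DCE's actual implementation: the resolveNukeConfig function, the nukeParams type, and the sample YAML fragment and role names are assumptions made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// nukeParams holds the values substituted into the aws-nuke config.
type nukeParams struct {
	ID              string
	AdminRole       string
	PrincipalRole   string
	PrincipalPolicy string
}

// resolveNukeConfig replaces each {{...}} placeholder with its value.
func resolveNukeConfig(tmpl string, p nukeParams) string {
	return strings.NewReplacer(
		"{{id}}", p.ID,
		"{{admin_role}}", p.AdminRole,
		"{{principal_role}}", p.PrincipalRole,
		"{{principal_policy}}", p.PrincipalPolicy,
	).Replace(tmpl)
}

func main() {
	// Illustrative fragment only; see the aws-nuke docs for the full syntax.
	tmpl := "accounts:\n" +
		"  \"{{id}}\":\n" +
		"    filters:\n" +
		"      IAMRole:\n" +
		"        - \"{{admin_role}}\"\n" +
		"        - \"{{principal_role}}\"\n"
	fmt.Print(resolveNukeConfig(tmpl, nukeParams{
		ID:              "123456789012",
		AdminRole:       "DCEAdmin",
		PrincipalRole:   "DCEPrincipal",
		PrincipalPolicy: "DCEPrincipalDefaultPolicy", // hypothetical policy name
	}))
}
```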
By default, aws-nuke is set to execute in Dry Run mode, so that you don't accidentally destroy critical resources in your AWS account. To enable aws-nuke, you will need to set the Terraform reset_nuke_toggle variable to "true".
To add an account to the DCE Account Pool, you may use the POST /accounts endpoint.
eg.
POST /accounts
{
"id": "123456789012",
"adminRoleArn": "arn:aws:iam::123456789012:role/DCEAdmin"
}
Response:
{
"id": "1234567890",
"accountStatus": "NotReady",
"adminRoleArn": "arn:aws:iam::1234567890123:role/adminRole",
"principalRoleArn": "arn:aws:iam::1234567890123:role/DCEPrincipal",
"principalPolicyHash": "",
"createdOn": 1560306008,
"lastModifiedOn": 1560306008,
"metadata": {}
}
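The request and response shapes above can be modeled in Go roughly as follows. This is a sketch under the assumption that the JSON field names are exactly those shown in the example; the type and function names are hypothetical, and sending the request (with Signature Version 4 signing) is left out.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// createAccountRequest is the body sent to POST /accounts.
type createAccountRequest struct {
	ID           string `json:"id"`
	AdminRoleArn string `json:"adminRoleArn"`
}

// account mirrors the fields of the example response.
type account struct {
	ID                  string                 `json:"id"`
	AccountStatus       string                 `json:"accountStatus"`
	AdminRoleArn        string                 `json:"adminRoleArn"`
	PrincipalRoleArn    string                 `json:"principalRoleArn"`
	PrincipalPolicyHash string                 `json:"principalPolicyHash"`
	CreatedOn           int64                  `json:"createdOn"`
	LastModifiedOn      int64                  `json:"lastModifiedOn"`
	Metadata            map[string]interface{} `json:"metadata"`
}

// encodeCreateAccount builds the JSON request body.
func encodeCreateAccount(id, adminRoleArn string) ([]byte, error) {
	return json.Marshal(createAccountRequest{ID: id, AdminRoleArn: adminRoleArn})
}

// decodeAccount parses a response body into an account.
func decodeAccount(body []byte) (account, error) {
	var a account
	err := json.Unmarshal(body, &a)
	return a, err
}

func main() {
	body, err := encodeCreateAccount("123456789012", "arn:aws:iam::123456789012:role/DCEAdmin")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))

	// A trimmed-down response, for illustration.
	resp := []byte(`{"id":"123456789012","accountStatus":"NotReady","createdOn":1560306008}`)
	acct, err := decodeAccount(resp)
	if err != nil {
		panic(err)
	}
	fmt.Println(acct.ID, acct.AccountStatus)
}
```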
The IAM Role passed in as adminRoleArn must be assumable by the DCE master account, and have appropriate IAM access to manage the DCE Account (eg. can run aws-nuke in the account).
When adding the account to the pool, the account will be marked as NotReady, and queued for reset. You will need to wait for reset to complete and the account to be marked as Ready before requesting leases against the account.
DCE will create a new IAM Role to be assumed by principal users of the account. The ARN for this role will be included in the response, as principalRoleArn. The principal's role has near-admin access to the account, with the following exceptions:
- Cannot create resources which cannot be deleted by DCE
- Cannot create support tickets, or increase service limits
- Is restricted to us-east-1 and us-west-1
- Cannot modify resources managed by DCE (including itself)
See Integrating with Identity Providers for documentation on assuming the DCE principal role using an identity provider.
DCE exposes an API for managing DCE accounts and leases.
See swaggerRedbox.yaml for endpoint documentation (better Swagger docs to come...).
The API is hosted by Amazon API Gateway. The base URL is exposed as a Terraform output. To retrieve the base URL of the API, run the following command from your Terraform module directory:
terraform output api_url
The DCE API is authorized via IAM. To access the API, you must have access to an IAM principal with appropriate IAM access to execute the API.
All API requests must be signed with Signature Version 4. See AWS documentation for signing requests.
The IAM principal used to send requests to the DCE API must have sufficient permissions to execute API requests.
The Terraform module in the repo provides an IAM policy with appropriate permissions for executing DCE API requests. You can access the policy name and ARN as Terraform outputs.
terraform output api_access_policy_name
terraform output api_access_policy_arn
The Terraform module comes with a sane starting policy that is applied to the IAM principal. This policy is applied when a new account is added or when a lease is unlocked. You can change the policy by providing the Terraform variable dcs_principal_policy. The value of this variable is the location of a policy file, which can be a Go template. The file is uploaded to S3 and read from there when the policy is applied.
The AWS SDK for Go exposes a signer/v4 package, which may be used to sign API requests. For example:
import (
	"net/http"
	"time"

	"github.com/aws/aws-sdk-go/aws/credentials"
	sigv4 "github.com/aws/aws-sdk-go/aws/signer/v4"
)

// apiUrl is the base URL of the DCE API (see terraform output api_url).
var apiUrl string

// sendRequest signs a body-less request with Signature Version 4 and sends it.
func sendRequest(method, endpoint string) (*http.Response, error) {
	// Create an API request
	req, err := http.NewRequest(method, apiUrl+endpoint, nil)
	if err != nil {
		return nil, err
	}
	// Load credentials from env vars, or a credentials file
	awsCreds := credentials.NewChainCredentials([]credentials.Provider{
		&credentials.EnvProvider{},
		&credentials.SharedCredentialsProvider{Filename: "", Profile: ""},
	})
	// Sign the request in place; the returned headers are not needed here.
	// The region must match the region where the API is deployed.
	signer := sigv4.NewSigner(awsCreds)
	if _, err := signer.Sign(req, nil, "execute-api", "us-east-1", time.Now()); err != nil {
		return nil, err
	}
	// Send the API request
	return http.DefaultClient.Do(req)
}
See AWS docs with example code for signing requests in Python.
Alternatively, you could consider open-source libraries like aws-requests-auth for signing requests.
See AWS docs for sending signed requests in Postman.
An account was added to the account pool
This message includes a payload as JSON, with the following fields:
Field | Type | Description |
---|---|---|
id | string | AWS Account ID |
accountStatus | "Ready", "NotReady", or "Leased" | Account status |
adminRoleArn | string | ARN for the IAM role used by the DCE master account to manage the account |
lastModifiedOn | int | Last modified timestamp |
createdOn | int | Creation timestamp |
metadata | JSON object | Metadata field contains any organization specific data pertaining to the account that needs to be persisted |
Example:
{
"id": "1234567890",
"accountStatus": "NotReady",
"adminRoleArn": "arn:aws:iam::1234567890123:role/adminRole",
"principalRoleArn": "arn:aws:iam::1234567890123:role/DCEPrincipal",
"principalPolicyHash": "\"d41d8cd98f00b204e9800998ecf8427e-38\"",
"createdOn": 1560306008,
"lastModifiedOn": 1560306008,
"metadata": {}
}
This SNS topic ARN is provided as a Terraform output:
terraform output account_created_topic_arn
An account was deleted from the account pool
This SNS topic ARN is provided as a Terraform output:
terraform output account_deleted_topic_arn
Triggered when a lease is created.
This message includes a payload as JSON, with the following fields:
Field | Type | Description |
---|---|---|
accountId | string | AWS Account ID |
principalId | string | ID of the principal user, associated with the lease |
leaseStatus | string | Status of the lease. |
createdOn | integer | Timestamp (epoch) of creation |
lastModifiedOn | integer | Timestamp (epoch) of last modification |
leaseStatusModifiedOn | integer | Timestamp (epoch) of last lease status modification |
Example:
{
"accountId": "1234567890",
"principalId": "jdoe17",
"leaseStatus": "Active",
"createdOn": 1560306008,
"lastModifiedOn": 1560306008,
"leaseStatusModifiedOn": 1560306008
}
This SNS topic ARN is provided as a Terraform output:
terraform output lease_added_topic_arn
Triggered when a lease is deleted.
This message includes a payload as JSON, with the following fields:
Field | Type | Description |
---|---|---|
accountId | string | AWS Account ID |
principalId | string | ID of the principal user associated with the lease |
leaseStatus | string | Status of the lease. |
createdOn | integer | Timestamp (epoch) of creation |
lastModifiedOn | integer | Timestamp (epoch) of last modification |
leaseStatusModifiedOn | integer | Timestamp (epoch) of last lease status modification |
Example:
{
"accountId": "1234567890",
"principalId": "jdoe17",
"leaseStatus": "Decommissioned",
"createdOn": 1560306008,
"lastModifiedOn": 1560306008,
"leaseStatusModifiedOn": 1560306008
}
This SNS topic ARN is provided as a Terraform output:
terraform output lease_removed_topic_arn
Triggered when a lease is "locked". Locking a lease means that the principal's access to the account has been temporarily disabled. For example, a lease will be locked when the AWS account reaches its max budget threshold, and unlocked again after the end of the lease period.
AWS DCE is unopinionated about how lease locks are implemented. It is up to you how to respond to this topic (eg. by removing the principal's access to the account).
This message payload is the lease object as JSON, with the following fields:
Field | Type | Description |
---|---|---|
accountId | string | AWS Account ID |
principalId | string | ID of the principal user, associated with the lease |
leaseStatus | string | Status of the lease. |
createdOn | integer | Timestamp (epoch) of creation |
lastModifiedOn | integer | Timestamp (epoch) of last modification |
leaseStatusModifiedOn | integer | Timestamp (epoch) of last lease status modification |
Example:
{
"accountId": "1234567890",
"principalId": "jdoe17",
"leaseStatus": "Active",
"createdOn": 1560306008,
"lastModifiedOn": 1560306008,
"leaseStatusModifiedOn": 1560306008
}
This SNS topic ARN is provided as a Terraform output:
terraform output lease_locked_topic_arn
Triggered when a lease is "unlocked". Locking a lease means that the principal's access to the account has been temporarily disabled. For example, a lease will be locked when the AWS account reaches its max budget threshold, and unlocked again after the end of the lease period.
AWS DCE is unopinionated about how lease locks are implemented. It is up to you how to respond to this topic (eg. by restoring the principal's access to the account).
This message payload is the lease object as JSON, with the following fields:
Field | Type | Description |
---|---|---|
accountId | string | AWS Account ID |
principalId | string | ID of the principal user, associated with the lease |
leaseStatus | string | Status of the lease. |
createdOn | integer | Timestamp (epoch) of creation |
lastModifiedOn | integer | Timestamp (epoch) of last modification |
leaseStatusModifiedOn | integer | Timestamp (epoch) of last lease status modification |
Example:
{
"accountId": "1234567890",
"principalId": "jdoe17",
"leaseStatus": "Active",
"createdOn": 1560306008,
"lastModifiedOn": 1560306008,
"leaseStatusModifiedOn": 1560306008
}
This SNS topic ARN is provided as a Terraform output:
terraform output lease_unlocked_topic_arn
Assumes that it runs in the root directory of the repo.
Bash script to unit test all Go projects and build all executables in the cmd directory, and generate bin/build_artifacts.zip and bin/terraform_artifacts.zip, which contain the individual zipped executables and Terraform files.
Requirements
Artifacts
- bin/lambda/
  - Executables and Zips generated from Golang Lambda Functions in cmd/lambda/. Gets removed at the end.
- bin/codebuild/
  - Executables and Zips generated from Golang CodeBuild functions in cmd/codebuild/. Gets removed at the end.
- bin/build_artifacts.zip
  - The Build Artifact Zip file containing all the generated Zips from bin/lambda/ and bin/codebuild/.
    -lambda
      -acctmgr.zip
      -financelock.zip
      -resetsqs.zip
      -resettrigger.zip
    -codebuild
      -reset.zip
- bin/terraform_artifacts.zip
  - The Terraform Artifact Zip file containing the modules directory with all of the base Terraform Files.
Assumes that it runs in the root directory of the repo.
Bash script that finds bin/build_artifacts.zip, unzips it, and uploads those artifacts (CodeBuild pipeline as well as all lambdas) to the designated S3 artifact bucket. It will also update those lambdas to use the uploaded code by running lambda update-function-code. You must run scripts/build.sh prior to running scripts/deploy.sh.
~$ ./scripts/deploy.sh <namespace> <artifactBucket>
Argument | Description |
---|---|
namespace | Indicates which namespace this deployment is scoped to. |
artifactBucket | Describes which S3 artifact bucket to use. |
Run make build to compile all lambdas under the functions directory.
This will produce the binaries (as well as the zips that can be uploaded to the AWS console for manual deployment) in the bin directory. See the deploy.sh section above for automated deployment.
DCEAccount Table
Status of each Account in our pool
{
"Id": "123456789012", # *Unique AWS Account ID*
"AccountStatus": "Leased" | "Ready" | "NotReady",
"LastModifiedOn": 1555690626 # *Epoch Timestamp*
}
Hash Key: Id
Range Key: AccountStatus
DCELease Table
Current state of a user's lease to an AWS Account. Records are unique by AccountId+PrincipalId.
{
"AccountId": "123456789012", # AWS Account ID
"PrincipalId": "098765432",
"Id": "4585fca7-cb21-406b-b778-83dccb351354",
"LeaseStatus": "Active" | "FinanceLock" | "ResetLock" | "ResetFinanceLock" | "Decommissioned",
"CreatedOn": 1555690626, # *Epoch Timestamp*
"LastModifiedOn": 1555690626, # *Epoch Timestamp*
"BudgetAmount": 300,
"BudgetCurrency": "USD",
"BudgetNotificationEmail": ["user@test.com", "manager@test.com"]
}
Hash Key: AccountId
Range Key: LeaseStatus
Secondary Index: PrincipalId
Secondary Range Key: PrincipalId
UsageCache Table
Usage cost of lease per day is stored in this table.
{
"AccountId": "123456789012", # *Unique AWS Account ID*
"CostAmount": 24.00, # *usage cost amount*
"CostCurrency": "USD", # *usage cost currency*
"EndDate": 1568678399, # *Epoch Timestamp*
"PrincipalId": "TestUser1", # *User principal ID*
"StartDate": 1568592000, # *Epoch Timestamp*
"TimeToLive": 1571184000 # *TTL attribute, Epoch Timestamp*
}
Hash Key: StartDate
Range Key: PrincipalId
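Because usage is stored as one record per day, the spend for a lease over a period is the sum of its daily records. The following Go sketch shows that aggregation over in-memory records. The usageRecord type and totalCost function are hypothetical, and reading the records from DynamoDB is left out.

```go
package main

import "fmt"

// usageRecord mirrors the UsageCache attributes shown above.
type usageRecord struct {
	AccountID    string
	PrincipalID  string
	CostAmount   float64
	CostCurrency string
	StartDate    int64 // epoch seconds, start of the day
	EndDate      int64 // epoch seconds, end of the day
}

// totalCost sums CostAmount for one principal over the records whose
// StartDate falls within [from, to]. It assumes a single currency.
func totalCost(records []usageRecord, principalID string, from, to int64) float64 {
	var total float64
	for _, r := range records {
		if r.PrincipalID == principalID && r.StartDate >= from && r.StartDate <= to {
			total += r.CostAmount
		}
	}
	return total
}

func main() {
	records := []usageRecord{
		{AccountID: "123456789012", PrincipalID: "TestUser1", CostAmount: 24.00, CostCurrency: "USD", StartDate: 1568592000, EndDate: 1568678399},
		{AccountID: "123456789012", PrincipalID: "TestUser1", CostAmount: 10.00, CostCurrency: "USD", StartDate: 1568678400, EndDate: 1568764799},
		{AccountID: "210987654321", PrincipalID: "TestUser2", CostAmount: 5.00, CostCurrency: "USD", StartDate: 1568592000, EndDate: 1568678399},
	}
	fmt.Printf("%.2f USD\n", totalCost(records, "TestUser1", 1568592000, 1568764799))
}
```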
DCE does not back up your DynamoDB tables by default. However, if you want to restore a DynamoDB table from a backup, we provide a helper script in scripts/restore_db.sh. This script is also provided as a GitHub release artifact, for easy access.
To restore a DynamoDB table from a backup:
# Grab the account table name from Terraform state
table_name=$(cd modules && terraform output redbox_account_db_table_name)
# Or, grab the leases table name
table_name=$(cd modules && terraform output dcs_lease_db_table_name)
# List available backups
./scripts/restore_db.sh \
--target-table-name ${table_name} \
--list-backups
# Choose a backup from the output of the last command, and pass in the ARN
./scripts/restore_db.sh \
--target-table-name ${table_name} \
--backup-arn <backup arn>
# If the table already exists, and you want to delete and
# recreate it from a backup, pass in
# the --force-delete-table flag
./scripts/restore_db.sh \
--target-table-name ${table_name} \
--backup-arn <backup arn> \
--force-delete-table
After restoring your DynamoDB table from a backup, you should rerun terraform apply to ensure that your table is in sync with your Terraform configuration.
AWS DCE Reset returns an AWS DCE Account to a clean and secure state. The main Reset procedure is clearing the resources in the account (Nuke).
The Reset of an account is done through a CodeBuild stage in a CodePipeline.
To clear resources from an AWS DCE Account, aws-nuke is used to list out all nuke-able resources and remove them. The default configuration file used to filter out resources that should not be deleted is located here. The configuration file can also be pulled from an S3 Bucket Object by setting RESET_NUKE_TEMPLATE_BUCKET and RESET_NUKE_TEMPLATE_KEY; these default to STUB, in which case they are ignored.
CloudWatch alarms are defined in modules/alarms.tf. All alarms deliver to the SNS topic defined in modules/alarms_sns.tf.
Alarms are defined based upon the metrics available for each resource (see Metrics and Services).
These vary by service; please refer to the documentation above to create or modify alarms and alerts.
This repo makes use of the Simple Email Service (SES) from AWS, which requires a verified email address for the functionality being used. To allow for this, the email address configured in Terraform must be confirmed manually when the address is applied to the account.
The address MUST reply to a confirmation email sent from SES to verify the account before emails can commence.
Budget notifications are sent from a verified Simple Email Service address. Verifying this address is a manual process; see the "Notification via SES" section above. Some variables used in notification templates (contained in modules/variables.tf):
- IsOverBudget : Boolean determining account budget status
- Lease.PrincipalID : The UserID of the lease holder
- Lease.AccountID : The Account number of the AWS account in use
- Lease.BudgetAmount : The configured budget amount for the lease
- ActualSpend : The calculated spend on the account at time of notification
- ThresholdPercentile : The configured threshold percentage for the notification, prior to exhaustion