danielbojczuk/GenerateAndPresentData

Generate And Present Data POC

This is a POC for an application with at least two components that:

  • Generates data.
  • Uses asynchronous communication to send the generated data to another instance, which persists the data to static storage of your choice.
  • Serves the persisted data through a web app, in any form.

Solution

Because the application has no complex requirements, it can be divided cleanly into small pieces, so a serverless approach based on Lambda functions was chosen. If cold starts become a problem, Provisioned Concurrency can be enabled.

[Figure: Solution Diagram]

Data Generation

To give control over when data is generated and over what it contains, data generation is triggered by a POST request to the /data resource.
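As an illustration of the POST path, here is a minimal sketch of a handler that forwards the request payload to a queue. It is not the repository's actual code: the function names, the 202 response, the message shape, and the assumption that the authorizer exposes the caller's id as `principalId` in the request context are all hypothetical. The queue client is injected as a plain callable so the sketch runs without AWS access.

```python
import json
from typing import Any, Callable, Dict


def build_queue_message(user_id: str, payload: Dict[str, Any]) -> str:
    """Serialize the payload together with the caller's userId.

    The userId is written last so a "userId" key inside the payload
    cannot override the authenticated identity.
    """
    return json.dumps({**payload, "userId": user_id}, sort_keys=True)


def handle_post(event: Dict[str, Any], send_message: Callable[[str], None]) -> Dict[str, Any]:
    """Hypothetical Lambda proxy handler for POST /data.

    `send_message` is an assumption: in a deployed function it could wrap
    boto3's sqs.send_message(QueueUrl=..., MessageBody=...).
    """
    # Assumption: the custom authorizer sets principalId to the JWT subject.
    user_id = event["requestContext"]["authorizer"]["principalId"]
    payload = json.loads(event["body"])
    send_message(build_queue_message(user_id, payload))
    # 202 Accepted: the data is queued, not yet persisted.
    return {"statusCode": 202, "body": json.dumps({"status": "queued"})}
```

Returning before the data is persisted is what makes the communication asynchronous from the client's point of view.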

Data Presenter

Since a web interface is not necessary, you can retrieve the persisted data through a GET request to the /data resource. This API could be used by a Web Interface or any other service.

Data Persister

When a POST request is made to the /data resource, the payload is placed on a queue, which triggers the data-persister function to store the data in a DynamoDB table.

A queue was used to fulfill the asynchronous communication requirement, on the assumption that only one consumer (data-persister) needs this information. If that assumption does not hold, the queue could be replaced with a pub/sub system.

For persisting the data, a DynamoDB table was chosen due to the simplicity of the data and the ease of querying.
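The queue-to-table step described above can be sketched as follows. This is a sketch under assumptions, not the repository's implementation: the function name, the use of `messageId` as part of the stored item, and the item shape are hypothetical. The `table` argument only needs a `put_item(Item=...)` method, which matches the boto3 DynamoDB Table resource and also allows a fake table in tests.

```python
import json
from typing import Any, Dict


def persist_records(event: Dict[str, Any], table: Any) -> int:
    """Store the body of every SQS record in the table and return the count.

    `event` follows the SQS-to-Lambda event shape: a "Records" list whose
    entries carry "messageId" and a string "body".
    """
    stored = 0
    for record in event.get("Records", []):
        item = json.loads(record["body"])
        # Assumption: messageId is kept so each stored item is unique per message.
        item["messageId"] = record["messageId"]
        table.put_item(Item=item)
        stored += 1
    return stored
```

In a deployed function, `table` could be created with `boto3.resource("dynamodb").Table(<table name>)`; the table name and key schema are not stated in this document.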

Infrastructure as Code

All AWS services are defined and provisioned using Terraform: GitHub Repository.

It uses an S3 bucket as a backend and workspaces to manage different environments.

CI/CD

A pipeline with GitHub Actions was created to build, test, and deploy the resources.

[Figures: Pipeline Overview; Pipeline Build Job]

Monitoring

Logging

The solution uses CloudWatch to store the logs.

Tracing

X-Ray is used to manage the tracing.

[Figures: X-Ray GET; X-Ray POST]

Metrics

The solution uses the default AWS metrics. Alarms are deployed only in the production environment, along with an SNS topic and a subscription that sends alerts via email.

[Figures: Alarms; Alarms in Alarm; Alarms Notification]

Security

  • To keep application traffic off the public internet and to control internal service access, all Lambdas run inside a private VPC and use VPC endpoints to reach AWS services.

  • A custom authorizer was implemented in the API gateway.

Automated Testing

  • A simple unit test was implemented in the data-api-authorizer function app.
  • A simple integration test using a Bash script was implemented to run in the dev environment via the pipeline.

How to Use

To generate and retrieve data, you need a Bearer token. You can generate one using a JWT builder website or use the test token from the integration test.

The authorizer validates the roles in the Role field. It should have:

  • data.read for the GET request
  • data.write for the POST request

The application uses the value of the sub field in the JWT as the userId. A userId can only retrieve its own data.
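The role check described above can be sketched roughly as follows. This is an illustration only, not the data-api-authorizer code: the function names are hypothetical, the claim names `Role` and `sub` are taken from the description above, and whether `Role` holds a single string or a list is an assumption, so both are handled. The sketch decodes the token payload without verifying the signature, which a real authorizer must not skip.

```python
import base64
import json
from typing import Any, Dict

# Role required for each HTTP method on /data, as listed above.
REQUIRED_ROLE = {"GET": "data.read", "POST": "data.write"}


def decode_claims(token: str) -> Dict[str, Any]:
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    payload_b64 = token.split(".")[1]
    # JWT segments are base64url without padding; restore it before decoding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def is_allowed(claims: Dict[str, Any], http_method: str) -> bool:
    """Return True if the claims carry the role required for the method."""
    roles = claims.get("Role", [])
    if isinstance(roles, str):
        roles = [roles]
    required = REQUIRED_ROLE.get(http_method)
    return required is not None and required in roles
```

A token whose Role contains only data.read would pass for GET and be rejected for POST; the userId would then come from `claims["sub"]`.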

Production URL: https://db6hxelybk.execute-api.eu-west-1.amazonaws.com/prd/data

POST

Example payload:

{
   "informationOne": "aaa23",
   "informationTwo": "bbb24"
}
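A client call with this payload might look like the sketch below. It uses only the Python standard library; the helper name `build_post_request` is hypothetical, the token is a placeholder you must supply, and the production URL is the one listed above, which may no longer be reachable.

```python
import json
import urllib.request

API_URL = "https://db6hxelybk.execute-api.eu-west-1.amazonaws.com/prd/data"


def build_post_request(token: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST /data request carrying a Bearer token."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_post_request(
    "<your-jwt>",  # placeholder: a token whose Role includes data.write
    {"informationOne": "aaa23", "informationTwo": "bbb24"},
)
# To actually send it: urllib.request.urlopen(request)
```

The same helper with `method="GET"` and no body would cover the GET request, provided the token carries data.read.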

[Figure: Solution Diagram]

GET

[Figure: Solution Diagram]

Improvements/Next Steps

Solution

  • Implement Dead Letter Queue.
  • Improve the application to process more than one queue item per Lambda execution.
  • Configure throttling for the API.
  • Add a web interface using a SPA hosted in an S3 bucket distributed with CloudFront.
  • Add other metrics/alarms.
  • Replace the email notification with a better approach.
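For the item about handling more than one queue item per execution, one possible direction is Lambda's partial batch response for SQS, where the function reports only the failed message ids so successful ones are not retried. The sketch below assumes that feature (ReportBatchItemFailures) is enabled on the event source mapping; `process_batch` and `handle_record` are hypothetical names, not part of the current code.

```python
from typing import Any, Callable, Dict, List


def process_batch(
    event: Dict[str, Any],
    handle_record: Callable[[Dict[str, Any]], None],
) -> Dict[str, List[Dict[str, str]]]:
    """Process every SQS record and report the ones that failed.

    The returned shape is the partial batch response Lambda expects:
    {"batchItemFailures": [{"itemIdentifier": <messageId>}, ...]}.
    """
    failures: List[Dict[str, str]] = []
    for record in event.get("Records", []):
        try:
            handle_record(record)
        except Exception:
            # Only this message returns to the queue (and eventually to a
            # Dead Letter Queue, if one is configured).
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```

An empty `batchItemFailures` list tells Lambda the whole batch succeeded.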

Pipeline

  • Add Terraform plan and manual steps to check the plan output before deployment.
  • Remove region and other hardcoded configurations.
  • Add smoke test after production deployment.
  • Add GitHub Actions workflow for PR check.

Security

  • Add encryption at rest and in transit.

Backup

  • Add Point-in-time recovery for DynamoDB.
