This repository has been archived by the owner on Feb 25, 2024. It is now read-only.

bentoml/aws-ec2-deploy

AWS EC2 deployment tool

AWS EC2 is a great choice for deploying containerized, load-balanced services in the cloud. Its autoscaling and automated health checking make it attractive to users who want to reduce cost and scale horizontally based on traffic.

demo of aws-ec2-deploy tool

Prerequisites

Deploy IrisClassifier from the BentoML quick start guide to AWS EC2

  1. Build and save the Bento bundle from the BentoML quick start guide.

  2. Copy the sample config file and edit it to match your deployment specifications. See the configuration options section for the available settings.

  3. Create EC2 deployment with the deployment tool.

    Run the deploy script from the command line:

    $ BENTO_BUNDLE_PATH=$(bentoml get IrisClassifier:latest --print-location -q)
    $ python deploy.py $BENTO_BUNDLE_PATH my-first-ec2-deployment ec2_config.json

    Get EC2 deployment information and status

    $ python describe.py my-first-ec2-deployment
    
    # Sample output
    {
      "InstanceDetails": [
        {
          "instance_id": "i-03ff2d1b9b717a109",
          "endpoint": "3.101.38.18",
          "state": "InService",
          "health_status": "Healthy"
        }
      ],
      "Endpoints": [
        "3.101.38.18:5000/"
      ],
      "S3Bucket": "my-ec2-deployment-storage",
      "TargetGroup": "arn:aws:elasticloadbalancing:us-west-1:192023623294:targetgroup/my-ec-Targe-3G36XKKIJZV9/d773b029690c84d3",
      "Url": "http://my-ec2-deployment-elb-2078733703.us-west-1.elb.amazonaws.com"
    }
  4. Make a sample request against the deployed service. The endpoint URL is given in the output of the describe command; you can also check the API Gateway through the AWS console.

    $ curl -i \
      --header "Content-Type: application/json" \
      --request POST \
      --data '[[5.1, 3.5, 1.4, 0.2]]' \
      https://ps6f0sizt8.execute-api.us-west-2.amazonaws.com/predict
    
    # Sample output
    HTTP/1.1 200 OK
    Content-Type: application/json
    Content-Length: 3
    Connection: keep-alive
    Date: Tue, 21 Jan 2020 22:43:17 GMT
    x-amzn-RequestId: f49d29ed-c09c-4870-b362-4cf493556cf4
    x-amz-apigw-id: GrC0AEHYPHcF3aA=
    X-Amzn-Trace-Id: Root=1-5e277e7f-e9c0e4c0796bc6f4c36af98c;Sampled=0
    X-Cache: Miss from cloudfront
    Via: 1.1 bb248e7fabd9781d3ed921f068507334.cloudfront.net (CloudFront)
    X-Amz-Cf-Pop: SFO5-C1
    X-Amz-Cf-Id: HZzIJUcEUL8aBI0KcmG35rsG-71KSOcLUNmuYR4wdRb6MZupv9IOpA==
    
    [0]
  5. Delete EC2 deployment

    $ python delete.py my-first-ec2-deployment
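The sample request in step 4 can also be prepared from Python. The following is a minimal sketch using only the standard library; the helper name `build_predict_request` is hypothetical and not part of this tool, and the base URL is a placeholder copied from the sample describe output. The actual send is left commented out because it needs a live deployment.

```python
import json
import urllib.request


def build_predict_request(base_url, features):
    """Build (but do not send) a JSON POST request for the /predict endpoint."""
    url = base_url.rstrip("/") + "/predict"
    body = json.dumps(features).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder URL: substitute the "Url" value reported by describe.py.
request = build_predict_request(
    "http://my-ec2-deployment-elb-2078733703.us-west-1.elb.amazonaws.com",
    [[5.1, 3.5, 1.4, 0.2]],
)
print(request.get_method(), request.full_url)

# Sending requires a running deployment, so it is not executed here:
# with urllib.request.urlopen(request, timeout=10) as response:
#     print(response.status, response.read().decode("utf-8"))
```

Building the request separately from sending it makes it easy to inspect the URL and payload before any traffic reaches the deployment.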

Deployment operations

Configuration options

  • region: AWS region for EC2 deployment
  • ec2_auto_scale:
    • min_size: The minimum number of instances in the Auto Scaling group.
    • desired_capacity: The desired capacity of the Auto Scaling group. The group starts by launching as many instances as specified here.
    • max_size: The maximum number of instances in the Auto Scaling group.
  • instance_type: Instance type for the EC2 deployment. See https://aws.amazon.com/ec2/instance-types/ for more info
  • enable_gpus: (Optional) Enables access to GPUs when you use a GPU-accelerated instance_type.
  • ami_id: The Amazon Machine Image (AMI) used to launch the EC2 instances. The default is /aws/service/ami-amazon-linux-latest/amzn2-ami-hvm-x86_64-gp2. See https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/AMIs.html for more information.
  • elb:
    • health_check_interval_seconds: The approximate interval, in seconds, between health checks of an individual instance. Valid Range: Minimum value of 5. Maximum value of 300.
    • health_check_path: The URL path for the health check. Default is /healthz
    • health_check_port: Health check port. Default is 5000
    • health_check_timeout_seconds: The amount of time, in seconds, during which no response means a failed health check.
    • healthy_threshold_count: The number of consecutive health checks successes required before moving the instance to the Healthy state. Valid Range: Minimum value of 2. Maximum value of 10.
  • environment_variables: A dictionary of variable/value pairs that are passed to Docker as environment variables. Use this to set BentoML-specific environment variables, e.g. "environment_variables": {"BENTOML_MB_MAX_BATCH_SIZE": "300"}
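To make the options above concrete, here is a sketch that assembles a config as a Python dictionary and serializes it to JSON. The key layout is inferred from the nesting of the list above, and the region, instance type, sizes, and health check timings are illustrative assumptions; the sample config file shipped with the repository remains the authoritative reference.

```python
import json

# Illustrative values only; adjust every field to your own deployment.
config = {
    "region": "us-west-1",
    "ec2_auto_scale": {
        "min_size": 1,
        "desired_capacity": 1,
        "max_size": 2,
    },
    "instance_type": "t2.micro",
    # Default AMI named in the option list above.
    "ami_id": "/aws/service/ami-amazon-linux-latest/amzn2-ami-hvm-x86_64-gp2",
    "elb": {
        "health_check_interval_seconds": 30,  # documented range: 5 to 300
        "health_check_path": "/healthz",      # documented default
        "health_check_port": 5000,            # documented default
        "health_check_timeout_seconds": 5,
        "healthy_threshold_count": 2,         # documented range: 2 to 10
    },
    "environment_variables": {"BENTOML_MB_MAX_BATCH_SIZE": "300"},
}

config_text = json.dumps(config, indent=2)
print(config_text)
```

The printed text could be saved as ec2_config.json and passed to the scripts described below.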

Create a deployment

Use command line

python deploy.py <Bento_bundle_path> <Deployment_name> <Config_JSON default is ./ec2_config.json>

Example:

MY_BUNDLE_PATH=$(bentoml get IrisClassifier:latest --print-location -q)
python deploy.py $MY_BUNDLE_PATH my_first_deployment ec2_config.json

Use Python API

from deploy import deploy_to_ec2

deploy_to_ec2(BENTO_BUNDLE_PATH, DEPLOYMENT_NAME, CONFIG_JSON)
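Because a deployment creates AWS resources, it may be worth checking a config against the ranges listed under Configuration options before deploying. The helper below is a hypothetical sketch and not part of this tool. It only checks the two documented ranges plus the usual Auto Scaling ordering of min_size, desired_capacity, and max_size, and it assumes the config has already been loaded into a dictionary.

```python
def validate_config(config):
    """Return a list of problems; an empty list means none were detected."""
    problems = []

    # Auto Scaling expects min_size <= desired_capacity <= max_size.
    scale = config.get("ec2_auto_scale", {})
    sizes = [scale.get(key) for key in ("min_size", "desired_capacity", "max_size")]
    if None not in sizes and not sizes[0] <= sizes[1] <= sizes[2]:
        problems.append("expected min_size <= desired_capacity <= max_size")

    elb = config.get("elb", {})

    # Documented range for the health check interval: 5 to 300 seconds.
    interval = elb.get("health_check_interval_seconds")
    if interval is not None and not 5 <= interval <= 300:
        problems.append("health_check_interval_seconds must be between 5 and 300")

    # Documented range for the healthy threshold: 2 to 10 checks.
    threshold = elb.get("healthy_threshold_count")
    if threshold is not None and not 2 <= threshold <= 10:
        problems.append("healthy_threshold_count must be between 2 and 10")

    return problems


good = {
    "ec2_auto_scale": {"min_size": 1, "desired_capacity": 1, "max_size": 2},
    "elb": {"health_check_interval_seconds": 30, "healthy_threshold_count": 2},
}
bad = {
    "ec2_auto_scale": {"min_size": 3, "desired_capacity": 1, "max_size": 2},
    "elb": {"health_check_interval_seconds": 1, "healthy_threshold_count": 20},
}

print(validate_config(good))
print(validate_config(bad))
```

Returning a list of messages rather than raising on the first failure lets all problems be reported in one pass.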

Update a deployment

Use command line

python update.py <Bento_bundle_path> <Deployment_name> <Config_JSON>

Use Python API

from update import update_deployment
update_deployment(BENTO_BUNDLE_PATH, DEPLOYMENT_NAME, CONFIG_JSON)

Get deployment's status and information

Use command line

python describe.py <Deployment_name> <Config_JSON>

Use Python API

from describe import describe_deployment
describe_deployment(DEPLOYMENT_NAME, CONFIG_JSON)
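The JSON shown in the sample describe output can be reduced to the fields most callers need. The sketch below parses a copy of that sample text; the `summarize` helper is hypothetical, and whether `describe_deployment` returns this structure or only prints it is not stated here, so the example works from the JSON text.

```python
import json

# Abbreviated copy of the sample describe output shown earlier.
SAMPLE_OUTPUT = """
{
  "InstanceDetails": [
    {
      "instance_id": "i-03ff2d1b9b717a109",
      "endpoint": "3.101.38.18",
      "state": "InService",
      "health_status": "Healthy"
    }
  ],
  "Endpoints": ["3.101.38.18:5000/"],
  "S3Bucket": "my-ec2-deployment-storage",
  "Url": "http://my-ec2-deployment-elb-2078733703.us-west-1.elb.amazonaws.com"
}
"""


def summarize(description):
    """Pick out the load balancer URL and the endpoints of healthy instances."""
    healthy = [
        instance["endpoint"]
        for instance in description.get("InstanceDetails", [])
        if instance.get("state") == "InService"
        and instance.get("health_status") == "Healthy"
    ]
    return {"url": description.get("Url"), "healthy_endpoints": healthy}


summary = summarize(json.loads(SAMPLE_OUTPUT))
print(summary)
```

A check like this could be polled after a deploy or update to wait until at least one instance reports as healthy.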

Delete deployment

Use command line

python delete.py <Deployment_name> <Config_JSON>

Use Python API

from delete import delete_deployment
delete_deployment(DEPLOYMENT_NAME, CONFIG_JSON)