[Enhancement]: Sagemaker endpoint scaling to zero #40606
Description
It was announced at re:Invent 2024 that SageMaker endpoints can be scaled down to zero instances when no activity is detected.
I was able to reproduce the setup by following the steps in the article using boto3 v1.35.83.
However, implementing this with Terraform is not yet possible, for two reasons.
First, the provider returns
Error: expected production_variants.0.managed_instance_scaling.0.min_instance_count to be at least (1), got 0
for a configuration like
resource "aws_sagemaker_endpoint_configuration" "sagemaker_configurations" {
  name_prefix = var.endpoint_name

  production_variants {
    variant_name           = "AllTraffic"
    model_name             = aws_sagemaker_model.sagemaker_model.name
    initial_instance_count = 1
    instance_type          = var.sagemaker_instance_type

    managed_instance_scaling {
      status             = "ENABLED"
      min_instance_count = 0
      max_instance_count = 1
    }
  }
}
Second, the provider does not yet have the necessary inference_component resource, although boto3 already has a create_inference_component method.
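For reference, the boto3 setup described above can be sketched roughly as follows. This is a minimal sketch, not the exact code from the article: the helper names (endpoint_config_params, inference_component_params, deploy) are hypothetical, the RuntimeConfig copy count of 1 is an assumption, and the timeout and resource values simply mirror the proposed Terraform configuration in this issue. It also assumes the endpoint must be in service before the inference component is created.

```python
"""Sketch: boto3 request parameters for a scale-to-zero SageMaker endpoint."""


def endpoint_config_params(config_name, role_arn, instance_type):
    """Build kwargs for create_endpoint_config with MinInstanceCount = 0.

    The variant carries no ModelName; the model is attached later
    through an inference component.
    """
    return {
        "EndpointConfigName": config_name,
        "ExecutionRoleArn": role_arn,
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",
                "InstanceType": instance_type,
                "InitialInstanceCount": 1,
                "ManagedInstanceScaling": {
                    "Status": "ENABLED",
                    "MinInstanceCount": 0,  # the value Terraform currently rejects
                    "MaxInstanceCount": 1,
                },
            }
        ],
    }


def inference_component_params(component_name, endpoint_name, model_name):
    """Build kwargs for create_inference_component."""
    return {
        "InferenceComponentName": component_name,
        "EndpointName": endpoint_name,
        "VariantName": "AllTraffic",
        "Specification": {
            "ModelName": model_name,
            "StartupParameters": {
                "ModelDataDownloadTimeoutInSeconds": 3600,
                "ContainerStartupHealthCheckTimeoutInSeconds": 3600,
            },
            "ComputeResourceRequirements": {
                "MinMemoryRequiredInMb": 1024,
                "NumberOfAcceleratorDevicesRequired": 1,
            },
        },
        # Assumption: start with a single copy of the model.
        "RuntimeConfig": {"CopyCount": 1},
    }


def deploy(config_name, endpoint_name, component_name, model_name,
           role_arn, instance_type):
    """Create the endpoint config, the endpoint, and the inference component."""
    import boto3  # imported lazily so the helpers above work without AWS access

    sm = boto3.client("sagemaker")
    sm.create_endpoint_config(
        **endpoint_config_params(config_name, role_arn, instance_type)
    )
    sm.create_endpoint(EndpointName=endpoint_name, EndpointConfigName=config_name)
    # Assumption: wait until the endpoint is InService before attaching the model.
    sm.get_waiter("endpoint_in_service").wait(EndpointName=endpoint_name)
    sm.create_inference_component(
        **inference_component_params(component_name, endpoint_name, model_name)
    )
```

The two helper functions correspond to the two gaps in the provider: the first needs min_instance_count = 0 to be accepted, and the second needs a new inference component resource.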
Affected Resource(s) and/or Data Source(s)
aws_sagemaker_endpoint_configuration
Potential Terraform Configuration
resource "aws_sagemaker_inference_component" "sagemaker_component" {
  name          = var.component_name
  endpoint_name = aws_sagemaker_endpoint.sagemaker_endpoint.name
  variant_name  = "AllTraffic"

  specification {
    model_name = aws_sagemaker_model.sagemaker_model.name

    startup_parameters {
      model_data_download_timeout_in_seconds            = 3600
      container_startup_health_check_timeout_in_seconds = 3600
    }

    compute_resource_requirements {
      min_memory_required_in_mb              = 1024
      number_of_accelerator_devices_required = 1
    }
  }
}
Also, once the inference component is available as a resource, model_name in the endpoint configuration's production_variants block should become optional.
resource "aws_sagemaker_endpoint_configuration" "test_configurations" {
  name               = "test-tf-config-zero-scale"
  execution_role_arn = aws_iam_role.iam_for_sagemaker.arn

  production_variants {
    variant_name           = "AllTraffic"
    initial_instance_count = 1
    instance_type          = var.sagemaker_instance_type

    managed_instance_scaling {
      status             = "ENABLED"
      min_instance_count = 0
      max_instance_count = 1
    }
  }
}
As expected with the current provider, this returns the error
Error: Missing required argument. The argument "model_name" is required, but no definition was found.
References
No response
Would you like to implement a fix?
No