[WIP][DO NOT MERGE] Proposal: Auto-scaling #546

docs/autoscaling.md
## Abstract
Auto-scaling is a data-driven feature that allows users to increase or decrease capacity as needed by controlling the number of replicas deployed
within the system automatically.

> **Contributor:** Auto-scaling of pods.

## Motivation

Applications experience peaks and valleys in usage. In order to respond to increases and decreases in load, administrators
scale their applications by adding computing resources. In the cloud computing environment this can be
done automatically based on statistical analysis and thresholds.

### Goals

* Provide a concrete proposal for implementing auto-scaling components within Kubernetes

  > **Contributor:** Auto-scaling pods.

* Implementation proposal should be in line with current discussions in existing issues:
* Resize verb - [1629](https://github.com/GoogleCloudPlatform/kubernetes/issues/1629)
* Config conflicts - [Config](https://github.com/GoogleCloudPlatform/kubernetes/blob/c7cb991987193d4ca33544137a5cb7d0292cf7df/docs/config.md#automated-re-configuration-processes)
* Rolling updates - [1353](https://github.com/GoogleCloudPlatform/kubernetes/issues/1353)
* Multiple scalable types - [1624](https://github.com/GoogleCloudPlatform/kubernetes/issues/1624)
* Document the currently known use cases

  > **Contributor:** Non-goals - solving autoscaling of nodes (add a new section).


## Constraints and Assumptions

* The auto-scale component will not be part of a replication controller

  > **Contributor:** Replication controllers should not need to know about autoscalers - they are the target of autoscaling, not the mechanism (reference the discussion where this was decided).

* Data gathering semantics will not be part of an auto-scaler but an auto-scaler may use data to perform threshold checking

  > **Contributor:** Autoscalers should be loosely coupled with data gathering in order to allow a wide variety of input sources.

* Auto-scalable resources will support a resize verb ([1629](https://github.com/GoogleCloudPlatform/kubernetes/issues/1629))
such that the auto-scaler does not directly manipulate the underlying resource.
* Thresholds will be set by the application administrator

  > **Contributor:** I don't know that this is a valid assumption. Rephrase - "Initially, most thresholds will be set by application administrators. It should be possible for an autoscaler to be written later that sets thresholds automatically based on past behavior (CPU used vs incoming requests)."

* The auto-scaler must be aware of user-defined actions so it does not override them unintentionally (for instance, someone
explicitly setting the replica count to 0 should mean that the auto-scaler does not try to scale the application up)

  > **Contributor:** It should be possible to write a custom auto-scaler and drive a replication controller without having to modify the existing auto-scaler.


## Use Cases

### Scaling based on traffic

The current, most obvious use case is scaling an application based on traffic. Within the Kubernetes ecosystem there
are routing layers that serve to direct requests to underlying `endpoints`. These routing layers are good examples of
candidates to provide data to the auto-scaler.

> **Contributor:** What kind of traffic.

> **Contributor:** No one knows what "routing layers" are, or endpoints. I don't think that's a good example here. Rephrase - "Most applications will expose one or more network endpoints for clients to connect to. Many of those endpoints will be load balanced or situated behind a proxy - the data from those proxies and load balancers can be used to estimate client to server traffic for applications. This is the primary, but not sole, source of data for making decisions."

Within Kubernetes a [kube proxy](https://github.com/GoogleCloudPlatform/kubernetes/blob/master/docs/services.md#ips-and-portals)
running on each node directs service requests to the underlying implementation.

External to the Kubernetes core infrastructure (but still within the Kubernetes ecosystem) lies the OpenShift routing layer.
OpenShift routers are `pods` with externally exposed IP addresses that are used to map service requests from the external
world to internal `endpoints` via user defined host aliases known as `routes`.

> **Contributor:** This reads weird. Just say "while the proxy provides internal inter pod connections, there will be L3 and L7 proxies and load balancers that manage traffic to backends. OpenShift, for instance, adds a "route" resource for defining external to internal traffic flow. The "routers" are HAProxy or Apache load balancers that aggregate many different services and pods and can serve as a data source for the number of backends."

### Scaling based on predictive analysis

Scaling may also occur based on predictions of system state like anticipated load, historical data, etc. Hand in hand
with scaling based on traffic, predictive analysis may be used to determine anticipated system load and scale the application automatically.

### Scaling based on arbitrary data

Administrators may wish to scale the application based on any number of arbitrary data points such as job execution time or
duration of active sessions. There are any number of reasons an administrator may wish to increase or decrease capacity, which
means the auto-scaler must be a configurable, extensible component.

## Specification

In order to facilitate talking about auto-scaling the following definitions are used:

* `ReplicationController` - the first building block of auto scaling. Pods are deployed and scaled by a `ReplicationController`.
* kube proxy - a request control point. The proxy handles internal inter-pod traffic

  > **Contributor:** Control point is weird.

* router - an OpenShift request control point. The routing layer handles outside to inside traffic requests
* auto-scaler - scales replicas up and down by using the `resize` endpoint provided by scalable resources (`ReplicationController`)


### Auto-Scaler

The Auto-Scaler is a state reconciler responsible for checking data against configured scaling thresholds
and calling the `resize` endpoint to change the number of replicas. The scaler will
use a client/cache implementation to receive watch data from the data aggregators and respond to them by
scaling the application. Auto-scalers are created and defined like other resources via REST endpoints and belong to the
namespace just as a `ReplicationController` or `Service`.
> **Contributor:** Talk about whether an autoscaler should be annotation data on a replication controller vs its own object and tradeoffs.


    //The auto scaler interface
    type AutoScalerInterface interface {
        //Adjust a resource's replica count. Calls the resize endpoint.
        ScaleApplication(num int) error
    }

    type AutoScaler struct {
        //Thresholds
        AutoScaleThresholds []AutoScaleThreshold

        //turn auto scaling on or off
        Enabled bool
        //max replicas that the auto scaler can use, empty is unlimited
        MaxAutoScaleCount int
        //min replicas that the auto scaler can use, empty == 0 (idle)
        MinAutoScaleCount int

        //the selector that provides the scaler with access to the scalable component
        Selector string
    }

> **Contributor:** (on `Selector`) What does this mean.


    //abstracts the data analysis from the auto-scaler
    //example: scale when RequestsPerSecond (type) are above 50 (value) for 30 seconds (duration)
    type AutoScaleThresholdInterface interface {
        //called by the auto-scaler to determine if this threshold is met or not
        ShouldScale() bool
    }

    //generic type definition
    type AutoScaleThreshold struct {
        //scale based on this threshold (see below for definition)
        Type Statistic
        //after this duration
        Duration time.Duration
        //when this value is passed
        Value float64
    }

### Data Aggregator

Data aggregation is opaque to the auto-scaler resource. The auto-scaler is configured to use `AutoScaleThresholds`
that know how to work with the underlying data in order to know if an application must be scaled up or down. Data aggregation
must feed a common data structure to ease the development of `AutoScaleThreshold`s, but it does not matter to the
auto-scaler whether this occurs in a push or pull implementation. For the purposes of this design I will propose a solution
for the existing routing layers that uses a pull mechanism.


    //common statistics type for monitoring routers that can be used by threshold implementations
    type Statistics struct {
        //resource type that this statistic belongs to: router, job, etc.
        ResourceType string
        //resource name that stats are being reported for
        ResourceName string
        //reporter name, to indicate where the statistics came from
        ReporterName string
        //interval start date/time
        StartTime time.Time
        //interval stop date/time
        StopTime time.Time
        //the statistics
        Stats map[Statistic]float64
    }

> **Contributor:** I don't think this is sufficient to represent realistic statistics that we can make decisions on. You've omitted concepts like averaging, smoothing, etc., which all need to be at least referenced. I also don't see a lot of the carryover from v2 and the design elements of autoscaling. I would suggest you make this section either more detailed, or less specific:
>
> We want to aggregate stats. Stats must be smoothed, but we also need to take into account spikes. There is no magic bullet for this. We need to have a simple implementation that can be replaced with a very complex implementation. We want to do the bare minimum to gather core data.
>
> I suspect that any stats implementation we will do is fundamentally flawed unless it's done by folks with a lot more experience. This doesn't map very closely to time series stats I've seen in other places.

    //some initial stat types geared toward the routing layer
    type Statistic string

    const (
        RequestsPerSecond  Statistic = "requestPerSecond"
        SessionsPerSecond  Statistic = "sessionsPerSecond"
        BytesIn            Statistic = "bytesIn"
        BytesOut           Statistic = "bytesOut"
        CPUUsage           Statistic = "cpuUsage"
        AvgRequestDuration Statistic = "avgRequestDuration"
    )


    //implementation for routing layers specified in use cases above
    type StatsGatherer interface {
        GatherStats() []Statistics
    }

    //Gather stats from the proxy, uses configured minions to find proxies
    type KubeProxyStatsController struct {}

    //OpenShift specific
    type RouterStatsController struct {
        //GatherStats delegates to router implementation which may be socket based, http based, etc.
        RouterList []router.Router
    }
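
As a purely illustrative example of the kind of sample a router-based gatherer might report using these types (the resource names and values below are made up for the example):

    //hypothetical sample reported by a router for a 30 second interval; the
    //names and numbers are invented for illustration only
    func exampleRouterSample(start time.Time) Statistics {
        return Statistics{
            ResourceType: "router",
            ResourceName: "myapp-router",
            ReporterName: "router-1",
            StartTime:    start,
            StopTime:     start.Add(30 * time.Second),
            Stats: map[Statistic]float64{
                RequestsPerSecond:  72.5,
                AvgRequestDuration: 31.0, //for example, milliseconds
            },
        }
    }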

Not shown is the initialization of a `StatsGatherer`. When creating a `StatsGatherer`, a registry will be given so that
the gatherer can save data that the `AutoScaleThreshold`s act upon. This means that other services storing statistics
can potentially piggyback on this registry.
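
A minimal sketch of what that wiring could look like is shown below; `StatsRegistry`, `runGatherer`, and the polling interval are assumptions made for this example rather than part of the proposal:

    //hypothetical registry that gatherers write to and that AutoScaleThreshold
    //implementations read from
    type StatsRegistry interface {
        //record a batch of statistics reported by a gatherer
        Save(stats []Statistics)
    }

    //illustrative wiring: each gatherer is handed the registry when it is created,
    //then polled on a fixed interval so its data is available to thresholds
    func runGatherer(gatherer StatsGatherer, registry StatsRegistry, interval time.Duration) {
        for {
            registry.Save(gatherer.GatherStats())
            time.Sleep(interval)
        }
    }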


## Use Case Realization

### Scaling based on traffic

1. User defines the application's auto-scaling resources

        {
          "id": "myapp-autoscaler",
          "kind": "AutoScaler",
          "apiVersion": "v1beta1",
          "maxAutoScaleCount": 50,
          "minAutoScaleCount": 1,
          "thresholds": [
            {
              "id": "myapp-rps",
              "kind": "AutoScaleThreshold",
              "type": "requestPerSecond",
              "durationVal": 30,
              "durationInterval": "seconds",
              "value": 50
            }
          ],
          "selector": "myapp-replcontroller"
        }

1. System creates new auto-scaler with defined thresholds.

   > **Contributor:** What is the system? You should frame this in terms of our current terminology (an auto scale controller will watch X for Y).

1. Periodically the system loops through defined thresholds calling `threshold.ShouldScale()`

   > **Contributor:** The pseudocode interface is too specific. Rephrase this: "Periodically the system should loop through defined thresholds, and determine if the threshold is exceeded."

1. The threshold looks for the `requestPerSecond` statistic for `myapp-rps` in the configured registry
1. The threshold compares the historical data and current data and determines if the app should be scaled
1. If the app must be scaled the auto-scaler calls the `resize` endpoint for `myapp-replcontroller`
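
Tying the steps above together, a rough sketch of a single reconciliation pass is shown below. The `Resizer` abstraction stands in for whatever the `resize` endpoint ([1629](https://github.com/GoogleCloudPlatform/kubernetes/issues/1629)) ends up looking like, thresholds are passed as `AutoScaleThresholdInterface` values for simplicity, and only the scale-up path is shown; none of these names are part of the proposal.

    //hypothetical stand-in for the resize endpoint discussed in issue 1629
    type Resizer interface {
        //set the replica count of the resource matched by the selector
        Resize(selector string, replicas int) error
    }

    //illustrative reconciliation pass for a single auto-scaler; only scale-up is shown
    func reconcile(scaler AutoScaler, thresholds []AutoScaleThresholdInterface, resizer Resizer, currentReplicas int) error {
        if !scaler.Enabled {
            return nil
        }
        for _, threshold := range thresholds {
            if !threshold.ShouldScale() {
                continue
            }
            //a threshold was exceeded: add a replica, clamped to the configured maximum
            replicas := currentReplicas + 1
            if scaler.MaxAutoScaleCount > 0 && replicas > scaler.MaxAutoScaleCount {
                replicas = scaler.MaxAutoScaleCount
            }
            return resizer.Resize(scaler.Selector, replicas)
        }
        return nil
    }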