Added section about node autoscaler to node design doc.
jszczepkowski committed Aug 21, 2015
1 parent 755287c commit 0f8bd60
Showing 2 changed files with 42 additions and 10 deletions.
41 changes: 41 additions & 0 deletions docs/admin/cluster-management.md
@@ -102,6 +102,47 @@ cluster/gce/upgrade.sh latest_stable
The `cluster/kube-push.sh` script will do a rudimentary update. This process is still quite experimental; we
recommend testing the upgrade on an experimental cluster before performing the update on a production cluster.

## Resizing a cluster

If your cluster runs short on resources, you can easily add more machines to it, provided your cluster is running in [Node self-registration mode](node.md#self-registration-of-nodes).
If you're using GCE or GKE, this is done by resizing the Instance Group managing your Nodes. You can do it by modifying the number of instances on the `Compute > Compute Engine > Instance groups > your group > Edit group` [Google Cloud Console page](https://console.developers.google.com) or by using the gcloud CLI:

```
gcloud compute instance-groups managed --zone compute-zone resize my-cluster-minion-group --new-size 42
```

The Instance Group will take care of putting the appropriate image on the new machines and starting them, while the Kubelet will register its Node with the API server to make it available for scheduling. If you scale the instance group down, the system will randomly choose Nodes to kill.

In other environments you may need to configure the machine yourself and tell the Kubelet on which machine the API server is running.
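
As a rough sketch, a manually provisioned machine could be pointed at the API server via the Kubelet's self-registration flags (the master address is a placeholder; verify the exact flags against `kubelet --help` for your version):

```
# Point the Kubelet at the API server and let it self-register its Node.
kubelet --api-servers=https://<master-ip> --register-node=true
```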


### Horizontal auto-scaling of nodes (GCE)

If you are using GCE, you can configure your cluster so that the number of nodes is automatically scaled based on their CPU and memory utilization.
Before setting up the cluster with `kube-up.sh`, set the `KUBE_ENABLE_NODE_AUTOSCALER` environment variable to `true` and export it (a combined example follows the parameter descriptions below).
The script will then create an autoscaler for the instance group managing your nodes.

The autoscaler will try to keep the average CPU and memory utilization of the nodes in the cluster close to the target value.
The target value can be configured via the `KUBE_TARGET_NODE_UTILIZATION` environment variable (default: 0.7) passed to `kube-up.sh` when creating the cluster.
Node utilization is the node's total CPU/memory usage (OS + k8s + user load) divided by the node's capacity.
If the number of nodes implied by CPU utilization differs from the number implied by memory utilization,
the autoscaler will choose the larger of the two.
The number of nodes set by the autoscaler will be kept between `KUBE_AUTOSCALER_MIN_NODES` (default: 1)
and `KUBE_AUTOSCALER_MAX_NODES` (default: the initial number of nodes in the cluster).
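
For example, a sketch of a cluster-creation invocation with all three variables exported (the values are illustrative, not recommendations):

```
# Illustrative settings; the defaults are 0.7, 1, and the initial cluster size.
export KUBE_ENABLE_NODE_AUTOSCALER=true
export KUBE_TARGET_NODE_UTILIZATION=0.8  # target average CPU/memory utilization
export KUBE_AUTOSCALER_MIN_NODES=3       # never scale below 3 nodes
export KUBE_AUTOSCALER_MAX_NODES=10      # never scale above 10 nodes
cluster/kube-up.sh
```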

The autoscaler is implemented as a Compute Engine Autoscaler.
The initial values of the autoscaler parameters set by `kube-up.sh`, as well as some more advanced options, can be tweaked on the
`Compute > Compute Engine > Instance groups > your group > Edit group` [Google Cloud Console page](https://console.developers.google.com)
or using the gcloud CLI:

```
gcloud preview autoscaler --zone compute-zone <command>
```
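
For example, assuming the preview CLI exposes a `list` subcommand (an assumption; check `gcloud preview autoscaler --help` for the commands actually available), you could inspect the autoscalers in a zone with:

```
# Hypothetical invocation; verify the subcommand via --help first.
gcloud preview autoscaler --zone us-central1-b list
```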

Note that autoscaling will work properly only if node metrics are accessible in Google Cloud Monitoring.
To make the metrics accessible, you need to create your cluster with `KUBE_ENABLE_CLUSTER_MONITORING`
set to `google` or `googleinfluxdb` (`googleinfluxdb` is the default).
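
For instance (a sketch; since `googleinfluxdb` is already the default, this is only needed if you want the plain `google` backend):

```
# Create the cluster with Google Cloud Monitoring as the metrics sink.
KUBE_ENABLE_CLUSTER_MONITORING=google cluster/kube-up.sh
```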

## Maintenance on a Node

If you need to reboot a node (such as for a kernel upgrade, libc upgrade, hardware repair, etc.), and the downtime is
11 changes: 1 addition & 10 deletions docs/admin/node.md
@@ -192,16 +192,6 @@ For self-registration, the kubelet is started with the following options:
Currently, any kubelet is authorized to create/modify any node resource, but in practice it only creates/modifies
its own. (In the future, we plan to limit authorization to only allow a kubelet to modify its own Node resource.)

If your cluster runs short on resources you can easily add more machines to it if your cluster is running in Node self-registration mode. If you're using GCE or GKE it's done by resizing Instance Group managing your Nodes. It can be accomplished by modifying number of instances on `Compute > Compute Engine > Instance groups > your group > Edit group` [Google Cloud Console page](https://console.developers.google.com) or using gcloud CLI:

```
gcloud compute instance-groups managed --zone compute-zone resize my-cluster-minon-group --new-size 42
```

Instance Group will take care of putting appropriate image on new machines and start them, while Kubelet will register its Node with API server to make it available for scheduling. If you scale the instance group down, system will randomly choose Nodes to kill.

In other environments you may need to configure the machine yourself and tell the Kubelet on which machine API server is running.

#### Manual Node Administration

A cluster administrator can create and modify Node objects.
@@ -258,6 +248,7 @@ Set the `cpu` and `memory` values to the amount of resources you want to reserve.
Place the file in the manifest directory (`--config=DIR` flag of kubelet). Do this
on each kubelet where you want to reserve resources.


## API Object

Node is a top-level resource in the kubernetes REST API. More details about the
