This chapter will demonstrate how to monitor a Kubernetes cluster using the following:
- Kubernetes Dashboard
- Heapster, InfluxDB and Grafana
- Prometheus, Node exporter and Grafana
A cluster with 3 master nodes and 5 worker nodes, created as explained at ../cluster-install#multi-master-multi-node-multi-az-gossip-based-cluster, is used for this chapter.
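You can double-check the cluster topology before proceeding by listing the nodes:
kubectl get nodes
The output should show all eight nodes in the Ready state.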
All configuration files for this chapter are in the cluster-monitoring directory.
Kubernetes Dashboard is a general purpose web-based UI for Kubernetes clusters.
Deploy Dashboard using the following command:
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/master/src/deploy/recommended/kubernetes-dashboard.yaml
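To confirm that the Dashboard pod started, you can filter on its label. The label k8s-app=kubernetes-dashboard below is an assumption based on the standard Dashboard manifest and may differ across Dashboard versions:
kubectl -n kube-system get pods -l k8s-app=kubernetes-dashboard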
Dashboard can be accessed by starting the Kubernetes proxy:
kubectl proxy
Now, Dashboard is accessible at http://localhost:8001/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy/.
Starting with Kubernetes 1.7, Dashboard supports authentication. Read more about it at https://github.com/kubernetes/dashboard/wiki/Access-control#introduction. We’ll use a bearer token for authentication.
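As an alternative to reusing an existing secret, you can create a dedicated service account for Dashboard access. The following is a minimal sketch: the name dashboard-admin is an arbitrary choice, and binding it to the built-in cluster-admin role grants full cluster access, so this is only appropriate for test clusters (on clusters older than 1.8 you may need the rbac.authorization.k8s.io/v1beta1 API version):
apiVersion: v1
kind: ServiceAccount
metadata:
  name: dashboard-admin
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: dashboard-admin
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: dashboard-admin
  namespace: kube-system
Kubernetes automatically creates a token secret for such a service account, and it would appear in the secret listing shown next.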
Check existing secrets in the kube-system namespace:
kubectl -n kube-system get secret
It shows output similar to this:
NAME                                      TYPE                                  DATA   AGE
attachdetach-controller-token-dhkcr       kubernetes.io/service-account-token   3      3h
certificate-controller-token-p131b        kubernetes.io/service-account-token   3      3h
daemon-set-controller-token-r4mmp         kubernetes.io/service-account-token   3      3h
default-token-7vh0x                       kubernetes.io/service-account-token   3      3h
deployment-controller-token-jlzkj         kubernetes.io/service-account-token   3      3h
disruption-controller-token-qrx2v         kubernetes.io/service-account-token   3      3h
dns-controller-token-v49b6                kubernetes.io/service-account-token   3      3h
endpoint-controller-token-hgkbm           kubernetes.io/service-account-token   3      3h
generic-garbage-collector-token-34fvc     kubernetes.io/service-account-token   3      3h
horizontal-pod-autoscaler-token-lhbkf     kubernetes.io/service-account-token   3      3h
job-controller-token-c2s8j                kubernetes.io/service-account-token   3      3h
kube-dns-autoscaler-token-s3svx           kubernetes.io/service-account-token   3      3h
kube-dns-token-92xzb                      kubernetes.io/service-account-token   3      3h
kube-proxy-token-0ww14                    kubernetes.io/service-account-token   3      3h
kubernetes-dashboard-certs                Opaque                                2      9m
kubernetes-dashboard-key-holder           Opaque                                2      9m
kubernetes-dashboard-token-vt0fd          kubernetes.io/service-account-token   3      10m
namespace-controller-token-423gh          kubernetes.io/service-account-token   3      3h
node-controller-token-r6lsr               kubernetes.io/service-account-token   3      3h
persistent-volume-binder-token-xv30g      kubernetes.io/service-account-token   3      3h
pod-garbage-collector-token-fwmv4         kubernetes.io/service-account-token   3      3h
replicaset-controller-token-0cg8r         kubernetes.io/service-account-token   3      3h
replication-controller-token-3fwxd        kubernetes.io/service-account-token   3      3h
resourcequota-controller-token-6rl9f      kubernetes.io/service-account-token   3      3h
route-controller-token-9brzb              kubernetes.io/service-account-token   3      3h
service-account-controller-token-bqlsk    kubernetes.io/service-account-token   3      3h
service-controller-token-1qlg6            kubernetes.io/service-account-token   3      3h
statefulset-controller-token-kmgzg        kubernetes.io/service-account-token   3      3h
ttl-controller-token-vbnhf                kubernetes.io/service-account-token   3      3h
We can log in using any secret of type 'kubernetes.io/service-account-token', though each of them has different privileges. In our case, we'll use the token from the secret default-token-7vh0x to log in. Use the following command to get the token for this secret:
kubectl -n kube-system describe secret default-token-7vh0x
Note that you'll need to replace default-token-7vh0x with the name of the default-token secret from your output.
It shows the output:
Name: default-token-7vh0x
Namespace: kube-system
Labels: <none>
Annotations: kubernetes.io/service-account.name=default
kubernetes.io/service-account.uid=3a3fea86-b3a1-11e7-9d90-06b1e747c654
Type: kubernetes.io/service-account-token
Data
====
ca.crt: 1046 bytes
namespace: 11 bytes
token: eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJkZWZhdWx0LXRva2VuLTd2aDB4Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQubmFtZSI6ImRlZmF1bHQiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC51aWQiOiIzYTNmZWE4Ni1iM2ExLTExZTctOWQ5MC0wNmIxZTc0N2M2NTQiLCJzdWIiOiJzeXN0ZW06c2VydmljZWFjY291bnQ6a3ViZS1zeXN0ZW06ZGVmYXVsdCJ9.GHW-7rJcxmvujkClrN6heOi_RYlRivzwb4ScZZgGyaCR9tu2V0Z8PE5UR6E_3Vi9iBCjuO6L6MLP641bKoHB635T0BZymJpSeMPQ7t1F02BsnXAbyDFfal9NUSV7HoPAhlgURZWQrnWojNlVIFLqhAPO-5T493SYT56OwNPBhApWwSBBGdeF8EvAHGtDFBW1EMRWRt25dSffeyaBBes5PoJ4SPq4BprSCLXPdt-StPIB-FyMx1M-zarfqkKf7EJKetL478uWRGyGNNhSfRC-1p6qrRpbgCdf3geCLzDtbDT2SBmLv1KRjwMbW3EF4jlmkM4ZWyacKIUljEnG0oltjA
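Alternatively, you can extract just the token with a jsonpath query. The token is stored base64-encoded in the secret, so it must be decoded:
kubectl -n kube-system get secret default-token-7vh0x -o jsonpath='{.data.token}' | base64 --decode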
Copy the value of token from this output, select Token in the Dashboard login window, and paste the text. Click on SIGN IN to see the default Dashboard view:
Click on Nodes to see a textual representation of the nodes running in the cluster:
Install a Java application as explained in Deploying applications using Kubernetes Helm charts.
Click on Pods to see a similar textual representation of the pods running in the cluster:
This will change after Heapster, InfluxDB and Grafana are installed.
Heapster is a metrics aggregator and processor. It is installed as a cluster-wide pod and gathers monitoring and events data for all containers on each node by talking to the Kubelet, which in turn fetches the data from cAdvisor. The data is persisted in InfluxDB, a time-series database, and then visualized using a Grafana dashboard, or it can be viewed in Kubernetes Dashboard.
Heapster collects and interprets various signals like compute resource usage, lifecycle events, etc., and exports cluster metrics via REST endpoints.
Heapster, InfluxDB and Grafana are Kubernetes addons.
Execute this command to install Heapster, InfluxDB and Grafana:
kubectl create -f heapster/templates/
Heapster is now aggregating metrics from the cAdvisor instances running on each node. This data is stored in an InfluxDB instance running in the cluster. Grafana dashboard, accessible at http://localhost:8001/api/v1/namespaces/kube-system/services/monitoring-grafana/proxy/?orgId=1, now shows the information about the cluster.
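You can also verify the addon pods and query Heapster's REST API through the same proxy. The k8s-app=heapster label and the model endpoint below are assumptions based on the standard addon manifests and Heapster's model API, and may vary by version:
kubectl -n kube-system get pods -l k8s-app=heapster
curl http://localhost:8001/api/v1/namespaces/kube-system/services/heapster/proxy/api/v1/model/metrics
The second command lists the cluster-level metric names Heapster currently exposes.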
Note: The Grafana dashboard will not be available if the Kubernetes proxy is not running. If the proxy is not running, it can be started with the command kubectl proxy.
There are some built-in dashboards for monitoring the cluster and workloads. They are available by clicking on the upper left corner of the screen.
The “Cluster” dashboard shows all worker nodes, and their CPU and memory metrics. Type in a node name to see its collected metrics during a chosen period of time.
The cluster dashboard looks like this:
The “Pods” dashboard allows you to see the resource utilization of every pod in the cluster. As with nodes, you can select a pod by typing its name in the top filter box.
After the deployment of Heapster, Kubernetes Dashboard now shows additional graphs such as CPU and Memory utilization for pods and nodes, and other workloads.
The updated view of the cluster in Kubernetes Dashboard looks like this:
The updated view of pods looks like this:
Prometheus is an open-source systems monitoring and alerting toolkit. Prometheus collects metrics from monitored targets by scraping metrics from HTTP endpoints on these targets.
Different targets to scrape are defined in a Prometheus configuration file. Targets may be statically configured via the static_configs parameter in the configuration file or dynamically discovered using one of the supported service-discovery mechanisms (Consul, DNS, Etcd, etc.).
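For example, a scrape configuration fragment combining a statically configured target with Kubernetes-based service discovery might look like the following. This is an illustrative sketch, not the contents of this chapter's ConfigMap:
scrape_configs:
  - job_name: 'static-example'
    static_configs:
      - targets:
          - 'localhost:9090'
  - job_name: 'kubernetes-nodes'
    kubernetes_sd_configs:
      - role: node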
Node exporter is a Prometheus exporter for hardware and OS metrics exposed by *NIX kernels.
The Prometheus configuration file is defined as a ConfigMap in the file prometheus/templates/prometheus-configmap.yaml.
We need to provide the location of the etcd server in our cluster in this configuration file. In our case, etcd is running inside the Kubernetes cluster. Find the IP address of etcd pods using this command:
$ kubectl get pods --namespace=kube-system | grep etcd
etcd-server-events-ip-172-20-102-165.eu-central-1.compute.internal   1/1       Running   0          3m
etcd-server-events-ip-172-20-59-49.eu-central-1.compute.internal     1/1       Running   0          2m
etcd-server-events-ip-172-20-87-29.eu-central-1.compute.internal     1/1       Running   0          3m
etcd-server-ip-172-20-102-165.eu-central-1.compute.internal          1/1       Running   0          3m
etcd-server-ip-172-20-59-49.eu-central-1.compute.internal            1/1       Running   0          2m
etcd-server-ip-172-20-87-29.eu-central-1.compute.internal            1/1       Running   0          3m
Note down the name of a pod that starts with etcd-server-ip. There is one etcd server per master, so in a three-master cluster like ours you'll see three entries starting with etcd-server-ip. Pick any of them.
Get more details about this pod:
kubectl describe pod/etcd-server-ip-172-20-102-165.eu-central-1.compute.internal --namespace=kube-system
The output looks like this:
Name: etcd-server-ip-172-20-102-165.eu-central-1.compute.internal
Namespace: kube-system
Node: ip-172-20-102-165.eu-central-1.compute.internal/172.20.102.165
Start Time: Fri, 03 Nov 2017 08:53:24 +0100
Labels: k8s-app=etcd-server
Annotations: kubernetes.io/config.hash=ca13c974790baaedca8c01a8eb4a3518
kubernetes.io/config.mirror=ca13c974790baaedca8c01a8eb4a3518
kubernetes.io/config.seen=2017-11-03T07:53:19.754732296Z
kubernetes.io/config.source=file
Status: Running
IP: 172.20.102.165
Containers:
etcd-container:
. . .
Node-Selectors: <none>
Tolerations: :NoExecute
Events: <none>
Note down the IP address from this output.
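Alternatively, you can fetch just the pod IP with a jsonpath query, substituting the pod name from your cluster:
kubectl -n kube-system get pod etcd-server-ip-172-20-102-165.eu-central-1.compute.internal -o jsonpath='{.status.podIP}'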
Update the file prometheus/templates/prometheus-configmap.yaml, and replace <IP> with the IP address of the etcd server in your cluster. The updated fragment will look as shown:
- job_name: 'etcd'
  static_configs:
  - targets:
    - 172.20.43.222:4001
etcd clusters deployed with the most recent version of kops listen on port 4001; if you are running a newer version of etcd deployed another way, it will be listening on the default port 2379.
Once you have saved the etcd information, you can deploy the ConfigMap:
$ kubectl create -f prometheus/templates/prometheus-configmap.yaml
configmap "prometheus" created
Next, deploy Prometheus into your cluster:
$ kubectl create -f prometheus/templates/prometheus-deployment.yaml
service "prometheus" created
deployment "prometheus" created
Next, we will deploy the node exporter DaemonSet which will read system level metrics from each node and export them to Prometheus. Node exporter is defined as a DaemonSet, and so there is a single instance running on each node of the cluster:
$ kubectl create -f prometheus/templates/node-exporter.yaml
service "node-exporter" created
daemonset "node-exporter" created
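For reference, the core of a node exporter DaemonSet looks roughly like the following sketch; the actual prometheus/templates/node-exporter.yaml in the chapter's repository may differ in image version, labels and ports, and older clusters may need the extensions/v1beta1 API version:
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter
spec:
  selector:
    matchLabels:
      app: node-exporter
  template:
    metadata:
      labels:
        app: node-exporter
    spec:
      # hostNetwork exposes the exporter on each node's own IP
      hostNetwork: true
      containers:
      - name: node-exporter
        image: prom/node-exporter
        ports:
        - containerPort: 9100
          hostPort: 9100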
Finally, deploy the Grafana dashboard:
$ kubectl create -f prometheus/templates/grafana.yml
service "grafana" created
deployment "grafana" created
Prometheus is now scraping metrics from the etcd server, the Kubernetes API server and the node exporter. Metrics exported by different sources are listed below:
- Kubernetes API server: https://github.com/kubernetes/kube-state-metrics
- Node exporter: https://github.com/prometheus/node_exporter
Let's look at these metrics in the Prometheus dashboard. There are a few ways to access the Prometheus dashboard.
You can use port forwarding. First find the pod name:
$ kubectl get pods -l app=prometheus
NAME                         READY     STATUS    RESTARTS   AGE
prometheus-570506388-8z5hq   1/1       Running   0          1m
Then forward the traffic on that pod:
$ kubectl port-forward prometheus-570506388-8z5hq 8080:9090 &
and enter http://127.0.0.1:8080/graph in your browser. Remember to replace the pod name in the port-forward command above.
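If you'd rather not copy the pod name by hand, the lookup and the port forwarding can be combined with a jsonpath query:
kubectl port-forward $(kubectl get pods -l app=prometheus -o jsonpath='{.items[0].metadata.name}') 8080:9090 &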
The Prometheus dashboard looks like this:
A wide set of metrics is available and can be seen in the dashboard. Here is a snapshot of metrics from etcd:
Here is a snapshot of metrics from the Kubernetes API server:
Here is a snapshot of metrics from the node exporter:
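Beyond browsing these snapshots, you can type ad-hoc PromQL queries into the expression box at the top of the Prometheus dashboard. For example, the expressions below query node exporter data; the exact metric names depend on your node exporter version (newer releases renamed node_cpu to node_cpu_seconds_total, for instance):
rate(node_cpu{mode!="idle"}[5m])
node_memory_MemFree
The first shows per-core non-idle CPU usage over the last five minutes; the second shows free memory in bytes on each node.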
Start the Kubernetes proxy, if it is not already running, using the command:
kubectl proxy
The Grafana dashboard is now accessible at http://localhost:8001/api/v1/proxy/namespaces/default/services/grafana/ and looks as shown: