Support health (readiness) checks #620
Comments
This is already basically possible. The kubelet implements HTTP health checks and restarts the container if it is failing, so no task should actually be failing for very long. This means that for M backends you only do M health checks. Taking it a step further, we could consider adding health checks to the Service polling, but in some ways that seems redundant, since only healthy tasks should be in the service pool anyway.
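In today's API this maps to a livenessProbe in the pod spec: the kubelet on the node does the polling and restarts the container after repeated failures. A minimal sketch, with illustrative image name, port, path, and timings:

```yaml
# Minimal sketch of a kubelet-driven HTTP liveness check.
# Image name, port, path, and timings are illustrative.
apiVersion: v1
kind: Pod
metadata:
  name: backend
  labels:
    app: backend
spec:
  containers:
  - name: backend
    image: example.com/backend:latest   # hypothetical image
    ports:
    - containerPort: 8080
    livenessProbe:
      httpGet:
        path: /healthy
        port: 8080
      periodSeconds: 5       # poll every 5 seconds, as in the discussion above
      failureThreshold: 3    # restart the container after 3 consecutive failures
```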
Can you clarify for me: let's say I have a cluster of 100 frontend containers using a backend service with 200 containers, and I have an HTTP health check on the backend service polling the URL "/healthy" every 5 seconds. How many requests to /healthy does each backend instance (container) see every 5 seconds? Also, is restarting the container something that can be configured? I may not want to restart the container; e.g. on instance migration, I might want to just remove it from the LB pool 30-60 seconds before the migration takes place, and then put it back in once the migration is complete (thus minimizing broken connections).
The current health check is a "liveness" health check, performed at the backend container. Thus, each of your 200 backends would see only one health check every 5 seconds. It is important to note that this is not a "readiness" health check, which would indicate that the container is ready to serve. We currently don't have a notion of "readiness", but we will add it eventually. When we do, we'll implement it in the same way, keeping the check at the level of the backend controller rather than the frontend service, so health checks remain 1-1 with the backend container. As for your second point, that's exactly the reason for differentiating between liveness and readiness.
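When readiness did land, it took the same shape: a readinessProbe that the kubelet runs next to the liveness probe, except that a failing readiness check only pulls the pod out of the service's endpoints instead of restarting it. A hedged fragment of the container spec sketched above (the /ready path and timings are illustrative):

```yaml
# Fragment of the container spec above: readiness gates traffic, liveness restarts.
# An app can deliberately fail /ready (e.g. shortly before a migration) to drain
# itself from the load-balancing pool without being restarted.
    readinessProbe:
      httpGet:
        path: /ready          # hypothetical endpoint; return non-2xx while draining
        port: 8080
      periodSeconds: 5
      failureThreshold: 1     # drop out of the endpoints list after one failure
```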
There are many scenarios where it is useful to differentiate between liveness and readiness. And many components (any system that disrupts pods and/or hosts, plus any system that manages sets of pods) care about readiness: rollout tools, reschedulers, kernel updaters, worker pool managers, ...
Readiness information would also be useful during rolling service updates.
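For example, with the later Deployment API (which postdates this thread), a rolling update only retires old pods as new ones pass their readiness probe. A hedged sketch with illustrative names and numbers:

```yaml
# Sketch: readiness-gated rolling update. New pods must report Ready before
# old pods are removed, so serving capacity never drops below the configured floor.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend
spec:
  replicas: 200
  selector:
    matchLabels:
      app: backend
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0   # never remove a ready pod before its replacement is Ready
      maxSurge: 10%       # bring up extras first, wait for readiness, then scale down old
  template:
    metadata:
      labels:
        app: backend
    spec:
      containers:
      - name: backend
        image: example.com/backend:v2   # hypothetical new version
        ports:
        - containerPort: 8080
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          periodSeconds: 5
```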
Readiness has been implemented. Yeah! Kudos to @mikedanese. |
GCE network load balancers can be configured with health checks (periodic HTTP requests to a user-defined endpoint), such that instances are removed from the pool if they don't respond to the health checks promptly with a 200 status code.
Kubernetes should be able to reuse the same health checks, such that if a user has created a service that they wish to use from Kubernetes, their health checks will do what they expect them to do: cause any unhealthy instances to be removed from the load balancing pool until healthy again.
Ideally, if N frontends talk to M backends, this should not result in N x M health check HTTP requests per interval (i.e. each of the N frontends independently health checking each of the M backends). If that's not possible, maybe Kubernetes could transparently create and use a GCE network load balancer for each service that has more than a certain number of replicas (whether marked as "external" or not), instead of trying to do its own load balancing.
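For reference, the shape this eventually took avoids the N x M fan-out: each backend is probed once per interval by its local kubelet, and only pods that report Ready appear in the Service's endpoints, so whatever fronts the Service (kube-proxy or a cloud load balancer) only ever sees healthy backends. A hedged sketch with illustrative names:

```yaml
# Sketch: a Service whose pool membership is driven by pod readiness.
# Unready pods are removed from the endpoints automatically; no per-frontend
# health checking is needed.
apiVersion: v1
kind: Service
metadata:
  name: backend
spec:
  type: LoadBalancer    # provisions a cloud LB (e.g. GCE) in supported environments
  selector:
    app: backend        # matches the pod labels from the sketches above
  ports:
  - port: 80
    targetPort: 8080
```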