Tune down initialDelaySeconds for readinessProbe. #33146
It seems the readiness probe succeeded on the first attempt, and the endpoints were detected and exposed around 40ms after that. The total time is ~25s since the kubelet received this pod, and it took ~22s to start all three containers. I think this setup for readinessProbe should work properly.
LGTM
timeoutSeconds: 5
periodSeconds: 5
Once it becomes ready, is there anything that would make it un-ready? Maybe this doesn't need to be small
That's true. The readiness probe would not fail unless kubedns fails, and if kubedns fails, the liveness probe will catch it.
BTW, this PR may conflict with #32406. I should wait for that to merge and then rebase.
10f9642 to 55db762
Jenkins GKE smoke e2e failed for commit 55db762. Full PR test history. The magic incantation to run this job again is
The new commit keeps periodSeconds as default.
@k8s-bot test this: github issue #IGNORE
Jenkins unit/integration failed for commit 55db762. Full PR test history. The magic incantation to run this job again is
@k8s-bot unit test this
Automatic merge from submit-queue
Removing label
send a cherrypick for 1.4.1
On Thu, Oct 6, 2016 at 8:20 PM, k8s-cherrypick-bot <notifications@github.com
…32422-#32406-#33146-#33774-upstream-release-1.4
Automatic merge from submit-queue
Automated cherry pick of #31894 #32422 #32406 #33146 #33774
Cherry pick of #31894 #32422 #32406 #33146 #33774 on release-1.4.
#31894: Support graceful termination in kube-dns
#32422: Added --log-facility flag to enhance dnsmasq logging
#32406: Split dns healthcheck into two different urls
#33146: Tune down initialDelaySeconds for readinessProbe
#33774: Bump up addon kube-dns to v20 for graceful termination
Fixed #33053.
Tuned down the initialDelaySeconds (originally 30s) for the readiness probe to 3 seconds and periodSeconds (default 10s) to 5 seconds to shorten the initial time before a DNS server pod is exposed. This configuration passed the DNS e2e tests and did not hit any readiness failure (for kube-dns) on a GCE cluster with 4 nodes during the experiments. When scaling out kube-dns servers, it took less than 10s for servers to be exposed after they appeared as running, which is much faster than the original 30+s.
failureThreshold is left as default (3), and it would not lead to a restart, because the status of the readiness probe only affects whether the endpoints are exposed in the service (from the DNS service's point of view). According to the implementation of the prober, the number of retries for a readiness probe is unbounded. Hence there is no obvious effect if the readiness probe fails several times in the beginning. The state machine of the prober could be illustrated with the figure below:
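The readiness side of that state machine can also be sketched in a few lines of Go. This is a simplified illustration, not the kubelet's prober code: the type `readinessWorker` and its fields are hypothetical names, and the thresholds are assumed to count consecutive identical results (with `successThreshold` defaulting to 1 and `failureThreshold` to 3).

```go
package main

import "fmt"

// readinessWorker is a hypothetical, simplified model of a readiness prober.
// It never restarts anything; it only flips the ready flag, which in turn
// decides whether the pod's endpoint is exposed in the service.
type readinessWorker struct {
	failureThreshold int  // consecutive failures needed to mark not ready
	successThreshold int  // consecutive successes needed to mark ready
	ready            bool // current readiness state; starts as false
	lastResult       bool // result of the previous probe
	resultRun        int  // length of the current run of identical results
}

// observe records one probe result and returns the resulting readiness.
// There is no retry limit: the worker keeps accepting results indefinitely.
func (w *readinessWorker) observe(success bool) bool {
	if success == w.lastResult {
		w.resultRun++
	} else {
		w.lastResult = success
		w.resultRun = 1
	}
	if success && w.resultRun >= w.successThreshold {
		w.ready = true
	}
	if !success && w.resultRun >= w.failureThreshold {
		w.ready = false
	}
	return w.ready
}

func main() {
	w := &readinessWorker{failureThreshold: 3, successThreshold: 1}
	// Several failures at start-up, then a success, then three failures.
	for _, ok := range []bool{false, false, false, false, true, false, false, false} {
		fmt.Printf("probe ok=%v -> ready=%v\n", ok, w.observe(ok))
	}
}
```

In this model, any number of initial failures simply leaves the pod not ready until the first success, and a pod that is already ready only becomes not ready after three consecutive failures.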
I want to see the e2e result of this PR for further evaluation.
@thockin @bprashanth