Crash kube2sky after repeated etcd mutation failures. #4980
While on the PR, I have a question about making it live after it is merged: should I build kubernetes/kube2sky:1.1 and change skydns-rc.yaml.in, or is that done in a separate process?
cc @fabioy
@@ -20,3 +20,6 @@ example, if this is set to `kubernetes.io`, then a service named "nifty" in the
"nifty.default.kubernetes.io".

`-verbose`: Log additional information.

'-etcd_mutation_timeout': For how long the application with keep retrying etcd
s/with/will
Done.
As for rebuilding kube2sky, yes. Once this is committed we should build and tag a new one and push it, then update the yaml.
6c4add6 to 297542d
Thanks for the comments, @thockin. I applied them, rebased/squashed, and verified in a cluster that it still works.
297542d to e7438f5
e7438f5 to 5026142
LGTM - do you want to rebuild and push a 1.1 container, or do you want me to?
Crash kube2sky after repeated etcd mutation failures.
I will do that and also send a PR for the skydns config.
Addresses #4814 to a large degree.
I wasn't able to implement an etcd liveness check. Instead, kube2sky keeps retrying for a configurable period of time and then crashes.
I've verified that this makes kube2sky more robust when etcd is unavailable. Previously kube2sky would simply stop working; with this PR it crashes and then comes back successfully.