-
Notifications
You must be signed in to change notification settings - Fork 40k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Emit event and retry when fail to start healthz server on kube-proxy #49267
Emit event and retry when fail to start healthz server on kube-proxy #49267
Conversation
/test pull-kubernetes-kubemark-e2e-gce |
/lgtm |
Some other components occupy port 10256, hence kube-proxy keeps restarting. kubeproxy logs from kubemark CI:
Will need to fix that before this could go in. |
This cannot go in until the NPD's port is changed in the COS config... |
Didn't look closely but you may also need to change the hollow node config used by kubemark https://github.com/kubernetes/kubernetes/blob/v1.7.1/test/kubemark/resources/hollow-node_template.yaml#L103 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An alternative is to rate limited spin trying to bind the healthz port and emit events. Seems bad to have the kube-proxy die for this error?
pkg/proxy/healthcheck/healthcheck.go
Outdated
} | ||
go func() { | ||
go wait.Until(func() { | ||
glog.V(3).Infof("Starting goroutine for healthz on %s", hs.addr) | ||
if err := server.Serve(listener); err != nil { | ||
glog.Errorf("Healhz closed: %v", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Healthz is mispelled
cc @dnardo |
pkg/proxy/healthcheck/healthcheck.go
Outdated
} | ||
go func() { | ||
go wait.Until(func() { | ||
glog.V(3).Infof("Starting goroutine for healthz on %s", hs.addr) | ||
if err := server.Serve(listener); err != nil { | ||
glog.Errorf("Healhz closed: %v", err) | ||
return | ||
} | ||
glog.Errorf("Unexpected healhz closed.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also misspelled here.
s/healhz/healthz
/retest |
/retest |
359cf90
to
934d8d3
Compare
934d8d3
to
677b2c5
Compare
677b2c5
to
dbb5a3e
Compare
Per discussion offline, we believe this is the better solution. Codes are revised to do so. |
dbb5a3e
to
db379de
Compare
Sample event:
|
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: MrHohn, thockin Associated issue: 49263 The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these OWNERS Files:
You can indicate your approval by writing |
/retest |
Would you care to explain why? :-) kube-proxy would appear to be running and ready, which seems very misleading. Even though events could indicate problems, it's probably not the best mechanism to express the status a system component... |
The main idea is to keep kube-proxy running when 99% functionality is working.
Agreed that kube-proxy shouldn't appear to be ready in this case. I will soon send a separate PR to add a readiness probe to kube-proxy. |
/retest |
Automatic merge from submit-queue (batch tested with PRs 49992, 48861, 49267, 49356, 49886) |
/release-note |
What this PR does / why we need it: Enhance kube-proxy's logic when fail to start healthz server.
Which issue this PR fixes (optional, in
fixes #<issue number>(, fixes #<issue_number>, ...)
format, will close that issue when PR gets merged): From #49263.Special notes for your reviewer:
/assign @thockin @nicksardo @bowei
Release note: