Panic in leader election, close of closed channel #26782

Closed
therc opened this issue Jun 3, 2016 · 7 comments
Labels: area/HA, kind/bug

@therc
Member

therc commented Jun 3, 2016

Built from HEAD. Edited for conciseness.

m0 kube-scheduler: E0603 14:41:02.243120   29837 event.go:257] Could not construct reference to: '&api.Endpoints{TypeMeta:unversioned.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:api.ObjectMeta{Name:"kube-scheduler", GenerateName:"", Namespace:"kube-system", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimes
m0 kube-scheduler: I0603 14:41:02.243250   29837 leaderelection.go:215] sucessfully acquired lease kube-system/kube-scheduler
m0 kube-scheduler: E0603 14:41:02.263170   29837 event.go:257] Could not construct reference to: '&api.Endpoints{TypeMeta:unversioned.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:api.ObjectMeta{Name:"kube-scheduler", GenerateName:"", Namespace:"kube-system", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimes
m0 kube-scheduler: I0603 14:41:02.263248   29837 leaderelection.go:215] sucessfully acquired lease kube-system/kube-scheduler
m0 kube-scheduler: E0603 14:41:02.263327   29837 runtime.go:58] Recovered from panic: "close of closed channel" (close of closed channel)
m0 kube-scheduler: pkg/util/runtime/runtime.go:52
m0 kube-scheduler: pkg/util/runtime/runtime.go:40
m0 kube-scheduler: /usr/local/go/src/runtime/asm_amd64.s:472
m0 kube-scheduler: /usr/local/go/src/runtime/panic.go:426
m0 kube-scheduler: /usr/local/go/src/runtime/chan.go:300
m0 kube-scheduler: pkg/client/leaderelection/leaderelection.go:216
m0 kube-scheduler: pkg/util/wait/wait.go:86
m0 kube-scheduler: pkg/util/wait/wait.go:87
m0 kube-scheduler: pkg/util/wait/wait.go:49
m0 kube-scheduler: pkg/client/leaderelection/leaderelection.go:217
m0 kube-scheduler: pkg/client/leaderelection/leaderelection.go:175
m0 kube-scheduler: pkg/client/leaderelection/leaderelection.go:189
m0 kube-scheduler: plugin/cmd/kube-scheduler/app/server.go:157
m0 kube-scheduler: plugin/cmd/kube-scheduler/scheduler.go:47
m0 kube-scheduler: /usr/local/go/src/runtime/proc.go:188
m0 kube-scheduler: /usr/local/go/src/runtime/asm_amd64.s:1998
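
For context on the panic in the trace above: Go panics whenever `close` is called on a channel that is already closed. The following is a minimal standalone sketch of that behaviour, not the Kubernetes code. `closeTwice` and `onceCloser` are hypothetical names introduced only for illustration; guarding the close with `sync.Once` is one common way to make it idempotent, and is not necessarily how this issue was fixed.

```go
package main

import (
	"fmt"
	"sync"
)

// closeTwice closes ch twice and returns the recovered panic message.
// Hypothetical helper for illustration only.
func closeTwice(ch chan struct{}) (msg string) {
	defer func() {
		if r := recover(); r != nil {
			msg = fmt.Sprint(r)
		}
	}()
	close(ch)
	close(ch) // the second close panics
	return ""
}

// onceCloser wraps a channel so that Close can be called repeatedly.
type onceCloser struct {
	once sync.Once
	ch   chan struct{}
}

func newOnceCloser() *onceCloser {
	return &onceCloser{ch: make(chan struct{})}
}

// Close closes the underlying channel at most once.
func (c *onceCloser) Close() {
	c.once.Do(func() { close(c.ch) })
}

func main() {
	fmt.Println(closeTwice(make(chan struct{}))) // close of closed channel

	c := newOnceCloser()
	c.Close()
	c.Close() // no panic: the second call is a no-op
	fmt.Println("second Close was a no-op")
}
```

A guard like this hides the symptom only; if the closing callback should never run twice, the loop that invokes it still needs fixing.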
@therc
Member Author

therc commented Jun 3, 2016

@mikedanese

@therc
Member Author

therc commented Jun 3, 2016

I removed it for brevity, but those entries all come from the same PID, 29837. I haven't dug through the code yet to figure out why the scheduler called the election code twice.

@mikedanese
Member

mikedanese commented Jun 7, 2016

Looking at the code, this seems to be a bug in wait.Until. The most recent change was 3ec25c5. I will continue to investigate, but cc @zmerlynn in case you are interested; it is possibly unrelated. The close in question is here:

timothysc added the kind/bug label Jul 27, 2016
timothysc self-assigned this Jul 27, 2016
@timothysc
Member

timothysc commented Jul 27, 2016

@therc do you have repro instructions?

What is odd is that you have back-to-back "sucessfully acquired lease" messages.

@timothysc
Member

Never mind, I see the issue(s); a fix is in flight.

@therc
Member Author

therc commented Jul 27, 2016

I haven't seen this since, but thanks for fixing it!

k8s-github-robot pushed a commit that referenced this issue Jul 29, 2016
Automatic merge from submit-queue

Fix race condition found in JitterUntil.

This was caused by the recent addition of "sliding"

manifested in: #26782
k8s-github-robot pushed a commit that referenced this issue Jul 29, 2016
Automatic merge from submit-queue

Update acquire to use newer JitterUntil vs. sleep 

Fix to prevent #26782: as previously written, acquire could race on a 0 timer because of changes in wait.

I will likely make a PR for some of the recent changes in wait as well.
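
To illustrate the kind of race described in these commit messages, here is a hedged sketch, not the actual code from `wait` or from the referenced fixes. It assumes a polling loop that selects between a stop channel and a timer: with a zero period both can be ready at once, Go's `select` then picks one at random, and the callback may run again after the stop channel has been closed. The names `untilRacy`, `untilChecked`, and `countCalls` are hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// untilRacy calls f, then waits for either stop or the timer.
// With period 0 both cases can be ready, so f may run again after stop.
func untilRacy(f func(), period time.Duration, stop <-chan struct{}) {
	for {
		f()
		select {
		case <-stop:
			return
		case <-time.After(period):
		}
	}
}

// untilChecked re-checks stop before every call, so f never runs
// once stop has been closed.
func untilChecked(f func(), period time.Duration, stop <-chan struct{}) {
	for {
		select {
		case <-stop:
			return
		default:
		}
		f()
		select {
		case <-stop:
			return
		case <-time.After(period):
		}
	}
}

// countCalls runs loop with a callback that closes stop on its first call
// and reports how many times the callback ran in total.
func countCalls(loop func(func(), time.Duration, <-chan struct{})) int {
	calls := 0
	stop := make(chan struct{})
	loop(func() {
		calls++
		if calls == 1 {
			close(stop) // guarded so the demo itself never double-closes
		}
	}, 0, stop)
	return calls
}

func main() {
	extra := 0
	for i := 0; i < 1000; i++ {
		if countCalls(untilRacy) > 1 {
			extra++
		}
	}
	// The count below varies from run to run.
	fmt.Printf("racy loop: %d of 1000 runs called f after stop was closed\n", extra)
	fmt.Println("checked loop calls:", countCalls(untilChecked))
}
```

In the racy variant, a callback that unconditionally closes the stop channel on success would hit "close of closed channel" whenever the timer case wins the `select`.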
timothysc added this to the v1.4 milestone Jul 29, 2016
@timothysc
Member

closed via #29699 and #29743
