Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

queueActionLocked requires write lock #30839

Merged
merged 1 commit into from
Aug 18, 2016
Merged

Conversation

lavalamp
Copy link
Member

@lavalamp lavalamp commented Aug 18, 2016

Fix kubernetes/minikube#368
Fix part of #30759

Hopefully. On stack dumps I couldn't see who was fighting with this.


This change is Reviewable

@k8s-github-robot k8s-github-robot added size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. release-note-label-needed labels Aug 18, 2016
@caesarxuchao caesarxuchao added lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note-none Denotes a PR that doesn't merit a release note. and removed release-note-label-needed labels Aug 18, 2016
@caesarxuchao
Copy link
Member

LGTM. Thanks. cc @wojtek-t.

@k8s-bot
Copy link

k8s-bot commented Aug 18, 2016

GCE e2e build/test passed for commit 3e69c5a.

@caesarxuchao
Copy link
Member

Bump to P2 as it's potentially fixing the scalability tests.

@caesarxuchao caesarxuchao added the priority/backlog Higher priority than priority/awaiting-more-evidence. label Aug 18, 2016
@wojtek-t
Copy link
Member

Thanks (though I'm not 100% sure it will fix the problem).
Do you think that it was fighting with itself (i.e. one Resync with the following Resync)? This would be weird since in our tests for pods the resync period IIRC is 10min, and for other resources it should be fast since there aren't a lot of them...

@k8s-github-robot
Copy link

@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]

@k8s-bot
Copy link

k8s-bot commented Aug 18, 2016

GCE e2e build/test passed for commit 3e69c5a.

@k8s-github-robot
Copy link

Automatic merge from submit-queue

@k8s-github-robot k8s-github-robot merged commit 86340fc into kubernetes:master Aug 18, 2016
@lavalamp
Copy link
Member Author

@wojtek-t my theory is that it's a false positive-- that something in the Pop() routine (which was on the stack) just finished a read and now there's a write. Actually this theory may or may not make sense depending on how the map is determining that it's being concurrently accessed.

Having Resync called concurrently could cause it too, but there wasn't evidence of that.

@timothysc
Copy link
Member

@deads2k ^ see last comment.

@andrewgdavis
Copy link

Some slowness seems to have been introduced starting / stopping pods when this PR applied.
This is quite apparent running minikube v.0.8.0 (with this change) vs the official v0.8.0 release. That said, its better than having it crash.

@lavalamp
Copy link
Member Author

Hm. This would block lists and gets, where it formerly wouldn't (and now that I look at it, there was a genuine problem where list reads the things that Resync writes).

I can make a change to unlock periodically, not sure if it will help.

@wojtek-t @gmarek fyi in case this shows up in scalability tests.

@lavalamp
Copy link
Member Author

@andrewgdavis would you like to try #30948 in your setup and see if it makes things better or worse?

@andrewgdavis
Copy link

Not sure if this should be part of this PR or if a new one should be opened and reference this.

Any way, with this PR the following occured: (note the follow-up which is 30948 has not yet been applied.)

I0819 02:15:32.493056    2446 reflector.go:288] pkg/controller/volume/persistentvolume/controller_base.go:429: next resync planned for time.Time{sec:63607169732, nsec:491794383, loc:(*time.                    Location)(0x5854060)}, forcing now
I0819 02:15:32.493110    2446 reflector.go:288] pkg/controller/volume/persistentvolume/controller_base.go:429: next resync planned for time.Time{sec:63607169732, nsec:491794383, loc:(*time.                    Location)(0x5854060)}, forcing now
I0819 02:15:32.493174    2446 reflector.go:288] pkg/controller/volume/persistentvolume/controller_base.go:434: next resync planned for time.Time{sec:63607169732, nsec:492232783, loc:(*time.                    Location)(0x5854060)}, forcing now
I0819 02:15:32.493146    2446 reflector.go:288] pkg/controller/volume/persistentvolume/controller_base.go:434: next resync planned for time.Time{sec:63607169732, nsec:492232783, loc:(*time.                    Location)(0x5854060)}, forcing now
I0819 02:15:32.493213    2446 controller_base.go:575] storeObjectUpdate updating volume "/build-output" with version 57
I0819 02:15:32.493276    2446 controller.go:336] synchronizing PersistentVolume[build-output]: phase: Bound, bound to: "default/build-output-claim (uid: f5eda67c-6599-11e6-b05a-9e6f16e3b04a)",                 boundByController: true
fatal error: concurrent map read and map writeI0819 02:15:32.493295    2446 controller.go:361] synchronizing PersistentVolume[build-output]: volume is bound to claim default/build-output-claim


goroutine 196522 [running]:
runtime.throw(0x3cecda0, 0x21)
    /usr/local/go/src/runtime/panic.go:547 +0x90 fp=0xc827b2bae8 sp=0xc827b2bad0
runtime.mapaccess2_faststr(0x2920720, 0xc8225aca20, 0xc8277f1d40, 0x18, 0x1, 0x1)
    /usr/local/go/src/runtime/hashmap_fast.go:307 +0x5b fp=0xc827b2bb48 sp=0xc827b2bae8
k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/client/cache.(*DeltaFIFO).queueActionLocked(0xc8222e6b00, 0x38c90c0, 0x4, 0x384ba60, 0xc824f42c60, 0x0, 0x0)
    /go/src/k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/client/cache/delta_fifo.go:315 +0x4dc fp=0xc827b2bcb0 sp=0xc827b2bb48
k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/client/cache.(*DeltaFIFO).Resync(0xc8222e6b00, 0x0, 0x0)
    /go/src/k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/client/cache/delta_fifo.go:511 +0x4f7 fp=0xc827b2be28 sp=0xc827b2bcb0
k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/client/cache.(*Reflector).ListAndWatch.func1(0xc8201a5ee8, 0xc821b78420, 0xc821a1e140, 0xc824b4e4e0, 0xc8201a5ef0)
    /go/src/k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/client/cache/reflector.go:289 +0x252 fp=0xc827b2bf78 sp=0xc827b2be28
runtime.goexit()
    /usr/local/go/src/runtime/asm_amd64.s:1998 +0x1 fp=0xc827b2bf80 sp=0xc827b2bf78
created by k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/client/cache.(*Reflector).ListAndWatch
    /go/src/k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/client/cache/reflector.go:296 +0xde3

k8s-github-robot pushed a commit that referenced this pull request Aug 22, 2016
Automatic merge from submit-queue

Do not hold the lock for a long time

Followup to #30839.

I'm not convinced this is a super great idea but I'll throw it out and let others decide.

Ref kubernetes/minikube#368
Ref #30759
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/backlog Higher priority than priority/awaiting-more-evidence. release-note-none Denotes a PR that doesn't merit a release note. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants