Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kubelet: storage: don't hang kubelet on unresponsive nfs #35038

Merged
merged 1 commit into from
Oct 18, 2016

Conversation

sjenning
Copy link
Contributor

@sjenning sjenning commented Oct 18, 2016

Fixes #31272

Currently, due to the nature of nfs, an unresponsive nfs volume in a pod can wedge the kubelet such that additional pods can not be run.

The discussion thus far surrounding this issue was to wrap the lstat, the syscall that ends up hanging in uninterruptible sleep, in a goroutine and limiting the number of goroutines that hang to one per-pod per-volume.

However, in my investigation, I found that the callsites that request a listing of the volumes from a particular volume plugin directory don't care anything about the properties provided by the lstat call. They only care about whether or not a directory exists.

Given that constraint, this PR just avoids the lstat call by using Readdirnames() instead of ReadDir() or ReadDirNoExit()

More detail for reviewers

Consider the pod mounted nfs volume at /var/lib/kubelet/pods/881341b5-9551-11e6-af4c-fa163e815edd/volumes/kubernetes.io~nfs/myvol. The kubelet wedges because when we do a ReadDir() or ReadDirNoExit() it calls syscall.Lstat on myvol which requires communication with the nfs server. If the nfs server is unreachable, this call hangs forever.

However, for our code, we only care what about the names of files/directory contained in kubernetes.io~nfs directory, not any of the more detailed information the Lstat call provides. Getting the names can be done with Readdirnames(), which doesn't need to involve the nfs server.

@pmorie @eparis @ncdc @derekwaynecarr @saad-ali @thockin @vishh @kubernetes/rh-cluster-infra


This change is Reviewable

@k8s-github-robot k8s-github-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. release-note-label-needed labels Oct 18, 2016
@derekwaynecarr derekwaynecarr added release-note-none Denotes a PR that doesn't merit a release note. release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-label-needed release-note-none Denotes a PR that doesn't merit a release note. labels Oct 18, 2016
@derekwaynecarr
Copy link
Member

@k8s-bot gci gke e2e test this

@derekwaynecarr
Copy link
Member

@sjenning -- per our chat, a comment that describes the problem on this pr with an actual example set of directories was helpful for simpletons like myself to follow the core problem.

@derekwaynecarr derekwaynecarr self-assigned this Oct 18, 2016
@derekwaynecarr
Copy link
Member

@k8s-bot gci gke e2e test this

@sjenning
Copy link
Contributor Author

@derekwaynecarr I updated the first comment with a more detailed example of how this happens IRL

@derekwaynecarr
Copy link
Member

@k8s-bot gci gke e2e test this

@derekwaynecarr
Copy link
Member

LGTM

@derekwaynecarr derekwaynecarr added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 18, 2016
@derekwaynecarr
Copy link
Member

fyi @kubernetes/sig-node

@wongma7
Copy link
Contributor

wongma7 commented Oct 18, 2016

any chance of this fix being cherry-picked?

@derekwaynecarr
Copy link
Member

@k8s-bot gci gke e2e test this

@k8s-cherrypick-bot
Copy link

Removing label cherrypick-candidate because no release milestone was set. This is an invalid state and thus this PR is not being considered for cherry-pick to any release branch. Please add an appropriate release milestone and then re-add the label.

@k8s-github-robot
Copy link

@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]

@k8s-ci-robot
Copy link
Contributor

Jenkins GCI GKE smoke e2e failed for commit da3683e. Full PR test history.

The magic incantation to run this job again is @k8s-bot gci gke e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@k8s-github-robot
Copy link

Automatic merge from submit-queue

@saad-ali
Copy link
Member

Thanks for digging into this and getting a workaround out for 1.5 @sjenning!

Let's get this cherry-picked back to 1.4.

@saad-ali saad-ali modified the milestones: v1.4, v1.5 Oct 24, 2016
@jingxu97
Copy link
Contributor

@sjenning, during some testing, I noticed that your fix still won't solve the problem of kubelet hung on unresponsive nfs. When a node has some directories mounting to a nfs server (running in a container) exports, after the nfs server container is deleted, ReadDirNoStat() function will hung too.

Could you please check again and see whether you get different result? Thank you!

cc @kubernetes/sig-storage

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants