
[Kubelet] Fix race condition in container manager #48123

Merged: 1 commit merged into kubernetes:master on Jun 28, 2017

Conversation

@msau42 (Member) commented Jun 27, 2017

What this PR does / why we need it:

This fixes a race condition where the container manager capacity map was being updated without synchronization. It moves the storage capacity detection to kubelet initialization, which happens serially in one thread.
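The fix described above can be sketched in Go. This is a minimal illustration, not the kubelet's actual code: `resourceList` and `containerManager` are hypothetical stand-ins for `v1.ResourceList` and the real container manager, and the capacity values are invented. The point is the pattern: populate the map once, serially, before any concurrent reader exists, and guard later reads.

```go
package main

import (
	"fmt"
	"sync"
)

// resourceList is a hypothetical stand-in for v1.ResourceList.
type resourceList map[string]int64

// containerManager caches node capacity. The bug class fixed by this PR
// is one goroutine writing to the capacity map while others read it.
type containerManager struct {
	mu       sync.RWMutex
	capacity resourceList
}

// newContainerManager discovers capacity once, serially, before any
// concurrent reader can exist -- mirroring the PR's approach of moving
// storage capacity detection into kubelet initialization.
func newContainerManager() *containerManager {
	cm := &containerManager{capacity: resourceList{}}
	// Safe without locking: no other goroutine can see cm yet.
	cm.capacity["cpu"] = 8
	cm.capacity["memory"] = 32 << 30
	return cm
}

// GetCapacity returns a copy under a read lock so concurrent callers
// never observe a partially written map.
func (cm *containerManager) GetCapacity() resourceList {
	cm.mu.RLock()
	defer cm.mu.RUnlock()
	out := make(resourceList, len(cm.capacity))
	for k, v := range cm.capacity {
		out[k] = v
	}
	return out
}

func main() {
	cm := newContainerManager()
	fmt.Println(cm.GetCapacity()["cpu"])
}
```

Without this serialization (or the lock), concurrent writers trigger Go's "concurrent map writes" runtime fault, which is the crash reported in #48045.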

Which issue this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close that issue when PR gets merged): fixes #48045

Release note:

Fixes kubelet race condition in container manager.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jun 27, 2017
@k8s-github-robot k8s-github-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. release-note Denotes a PR that will be considered when it comes time to generate release notes. labels Jun 27, 2017
@msau42 (Member, Author) commented Jun 27, 2017

/test pull-kubernetes-node-e2e

@msau42 (Member, Author) commented Jun 27, 2017

/assign @dchen1107

@msau42 (Member, Author) commented Jun 27, 2017

The difference between this change and #48096 is that this change only touches the codepath for the storage resources, and doesn't try to refactor the existing resources to use the initial capacity values. I'll leave it up to someone more familiar with the node code to refactor the rest.

@msau42 (Member, Author) commented Jun 27, 2017

cc @marun

@yujuhong yujuhong assigned yujuhong and unassigned ncdc, Random-Liu and dchen1107 Jun 27, 2017
@yujuhong (Contributor) left a review:

Looks good overall with some comments.

The difference between this change and #48096 is that this change only touches the codepath for the storage resources, and doesn't try to refactor the existing resources to use the initial capacity values. I'll leave it up to someone more familiar with the node code to refactor the rest.

Could you file an issue and leave a comment there explaining that? I think it's good to limit the scope of change in this PR since we're so close to the release, but using different capacity source in the node status function is quite puzzling. Adding a TODO to fix this would be good.

}
}

if hasDedicatedImageFs, _ := cadvisorInterface.HasDedicatedImageFs(); hasDedicatedImageFs {
Contributor comment:

I found it a bit strange that with this feature, the image fs will no longer be dedicated even though we call it that... not really pertinent to this PR though.

@@ -225,3 +225,16 @@ func (cc *cadvisorClient) getFsInfo(label string) (cadvisorapiv2.FsInfo, error)
func (cc *cadvisorClient) WatchEvents(request *events.Request) (*events.EventChannel, error) {
return cc.WatchForEvents(request)
}

// HasDedicatedImageFs returns true if the imagefs has a dedicated device.
func (cc *cadvisorClient) HasDedicatedImageFs() (bool, error) {
Contributor comment:

I am ok with this for now, but for the record, this would need to move back to a higher level package (again) once we let the runtime report imagefs info through CRI.
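For context on the method under discussion: `HasDedicatedImageFs` presumably reports whether the image filesystem is backed by a different device than the root filesystem. The sketch below shows that check with a hypothetical `fsInfo` stand-in for cAdvisor's `FsInfo`; the real wrapper queries cAdvisor rather than taking the filesystem info as parameters.

```go
package main

import "fmt"

// fsInfo is a hypothetical stand-in for cAdvisor's FsInfo; only the
// backing device matters for this check.
type fsInfo struct{ Device string }

// hasDedicatedImageFs reports whether the image filesystem lives on a
// different device than the root filesystem -- a sketch of the usual
// meaning of "dedicated image fs" (the real method asks cAdvisor for
// both filesystems' info).
func hasDedicatedImageFs(rootFs, imageFs fsInfo) bool {
	return rootFs.Device != imageFs.Device
}

func main() {
	fmt.Println(hasDedicatedImageFs(fsInfo{"/dev/sda1"}, fsInfo{"/dev/sdb1"})) // separate device
	fmt.Println(hasDedicatedImageFs(fsInfo{"/dev/sda1"}, fsInfo{"/dev/sda1"})) // shared device
}
```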

@@ -60,6 +60,9 @@ type ContainerManager interface {
// GetNodeAllocatable returns the amount of compute resources that have to be reserved from scheduling.
GetNodeAllocatableReservation() v1.ResourceList

// GetNodeCapacity returns the amount of compute resources tracked by container manager available on the node.
GetCapacity() v1.ResourceList

Contributor comment:

I'm not sure why the "container manager" is responsible for supplying all the information. If that's the case, maybe we can move HasDedicatedImageFs to here instead?

@msau42 (Member, Author) replied:

My understanding is that currently in container manager initialization, it scans the system resources and keeps a cache of the capacity.

As for HasDedicatedImageFs, it seems like the only place it is called is during container manager initialization. Would it make more sense for it to just be a private function then?

@msau42 (Member, Author) replied:

I tried removing the function from cadvisor.Interface, but it looks like it is required in the eviction.DiskInfoProvider interface.

Contributor comment:

Yes, it's used by the eviction manager. Moving the function requires passing the container manager to the eviction manager too.
This is ok for now.

// GetNodeCapacity returns the amount of compute resources tracked by container manager available on the node.
Contributor comment:

s/GetNodeCapacity/GetCapacity

@@ -23,6 +23,8 @@ import (
kubecontainer "k8s.io/kubernetes/pkg/kubelet/container"
)

// TODO: Move all these methods into the cAdvisor package to make kubelet package a consumer of libraries rather than a source.
Contributor comment:

Remove the TODO. Container stats needs a major refactor to support CRI anyway. Don't think a one-off TODO comment will help.

@msau42 (Member, Author) commented Jun 27, 2017

I added/removed some comments based on the review. PTAL @yujuhong

@yujuhong (Contributor) commented

/approve

Will LGTM after commits are squashed

@k8s-github-robot k8s-github-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 28, 2017
Centralize Capacity discovery of standard resources in Container manager.
Have storage derive node capacity from container manager.
Move certain cAdvisor interfaces to the cAdvisor package in the process.

This patch fixes a bug in container manager where it was writing to a map without synchronization.

Signed-off-by: Vishnu kannan <vishnuk@google.com>
@msau42 msau42 force-pushed the fix-allocatable-race branch from 3c82369 to 82f7820 Compare June 28, 2017 01:45
@msau42 (Member, Author) commented Jun 28, 2017

Squashed

@dchen1107 (Member) commented

/test pull-kubernetes-bazel

@dchen1107 (Member) commented

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 28, 2017
@dchen1107 dchen1107 added cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. cherrypick-candidate labels Jun 28, 2017
@k8s-github-robot commented

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dchen1107, msau42, yujuhong

Associated issue: 48045

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@dchen1107 dchen1107 added this to the v1.7 milestone Jun 28, 2017
@dchen1107 (Member) commented

cc/ @luxas Can you help to verify the fix with kubeadm? Thanks!

@k8s-github-robot commented

/test all [submit-queue is verifying that this PR is safe to merge]

@k8s-ci-robot (Contributor) commented Jun 28, 2017

@msau42: The following test failed, say /retest to rerun them all:

Test name: pull-kubernetes-federation-e2e-gce
Commit: 82f7820
Rerun command: /test pull-kubernetes-federation-e2e-gce

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-github-robot commented

Automatic merge from submit-queue (batch tested with PRs 48123, 48079)

@k8s-github-robot k8s-github-robot merged commit 13a7fdc into kubernetes:master Jun 28, 2017
@caesarxuchao (Member) commented

This one has cherrypick conflict. I'll try to do a manual cherrypick.

caesarxuchao added a commit that referenced this pull request Jun 29, 2017
…#48123-upstream-release-1.7

Automated cherry pick of #48123
@k8s-cherrypick-bot commented

Commit found in the "release-1.7" branch appears to be this PR. Removing the "cherrypick-candidate" label. If this is an error find help to get your PR picked.

@dchen1107 (Member) commented Jun 29, 2017

This PR caused cross-build failures, including on the 1.7 branch.

I0628 18:33:54.041] +++ [0628 18:30:16] darwin/amd64: go build started
I0628 18:33:54.041] # k8s.io/kubernetes/pkg/kubelet/cm
I0628 18:33:54.041] pkg/kubelet/cm/container_manager_unsupported.go:33: cannot use unsupportedContainerManager literal (type *unsupportedContainerManager) as type ContainerManager in assignment:
I0628 18:33:54.042] 	*unsupportedContainerManager does not implement ContainerManager (missing GetCapacity method)
I0628 18:33:54.042] pkg/kubelet/cm/container_manager_unsupported.go:72: cannot use unsupportedContainerManager literal (type *unsupportedContainerManager) as type ContainerManager in return argument:
I0628 18:33:54.042] 	*unsupportedContainerManager does not implement ContainerManager (missing GetCapacity method)
I0628 18:33:54.042] !!! [0628 18:33:04] Call tree:
I0628 18:33:54.043] !!! [0628 18:33:04]  1: /go/src/k8s.io/kubernetes/hack/lib/golang.sh:690 kube::golang::build_binaries_for_platform(...)
I0628 18:33:54.043] !!! [0628 18:33:04]  2: hack/make-rules/build.sh:27 kube::golang::build_binaries(...)

Tracked by #48234
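The build log above shows Go's compile-time interface satisfaction check: once GetCapacity was added to the ContainerManager interface, every implementation, including the platform-specific stub, had to define it. A minimal reproduction of that error class, using simplified hypothetical types rather than the real kubelet ones:

```go
package main

import "fmt"

// ContainerManager is a simplified stand-in for the kubelet interface,
// reduced to the newly added method.
type ContainerManager interface {
	GetCapacity() map[string]int64
}

// unsupportedContainerManager models the platform-specific stub that
// broke the cross-build.
type unsupportedContainerManager struct{}

// Deleting this method reproduces the build failure:
//   *unsupportedContainerManager does not implement ContainerManager
//   (missing GetCapacity method)
func (m *unsupportedContainerManager) GetCapacity() map[string]int64 {
	return map[string]int64{}
}

// Compile-time assertion that the stub satisfies the interface; adding
// such a line to each implementation catches this at build time.
var _ ContainerManager = &unsupportedContainerManager{}

func main() {
	var cm ContainerManager = &unsupportedContainerManager{}
	fmt.Println(len(cm.GetCapacity()))
}
```

The `var _ ContainerManager = ...` idiom is a common Go guard for exactly this failure mode: it turns a missing method into an immediate, local compile error instead of one surfacing in a distant assignment during a cross-build.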

@msau42 msau42 deleted the fix-allocatable-race branch August 15, 2017 01:40
Successfully merging this pull request may close these issues.

kubelet race condition: concurrent map writes
10 participants