Eviction manager evicts based on inode consumption #35137

Merged

dashpole
Contributor

@dashpole dashpole commented Oct 19, 2016

Fixes: #32526

Integrate cAdvisor per-container inode stats into the summary API. Make the eviction manager act on inode consumption so that it evicts the pods using the most inodes.

This PR is pending on a cAdvisor godeps update, which will be included in PR #35136.
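
As a rough illustration of "evict pods using the most inodes", the sketch below ranks pods by inode usage in descending order. It is not the eviction manager's actual ranking code: the `podInodes` type, the `rankByInodes` function, and the sample numbers are all hypothetical, and the real ranking may apply other criteria that are ignored here.

```go
package main

import (
	"fmt"
	"sort"
)

// podInodes is a hypothetical, simplified record: a pod name plus the
// number of inodes its containers and volumes consume.
type podInodes struct {
	Name       string
	InodesUsed uint64
}

// rankByInodes returns a copy of pods sorted by descending inode usage.
// The sort is stable, so pods that tie keep their input order.
func rankByInodes(pods []podInodes) []podInodes {
	ranked := append([]podInodes(nil), pods...)
	sort.SliceStable(ranked, func(i, j int) bool {
		return ranked[i].InodesUsed > ranked[j].InodesUsed
	})
	return ranked
}

func main() {
	// Made-up usage figures, for illustration only.
	pods := []podInodes{
		{"one-large-file", 12},
		{"empty-files-container", 90000},
		{"empty-files-emptydir", 85000},
	}
	// The first entry is the first candidate for eviction.
	for _, p := range rankByInodes(pods) {
		fmt.Printf("%s\t%d\n", p.Name, p.InodesUsed)
	}
}
```

The stable sort is a deliberate choice in this sketch: it makes the result for tied pods depend only on input order, which keeps tests deterministic.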


@k8s-github-robot k8s-github-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. release-note-label-needed labels Oct 19, 2016
@derekwaynecarr
Member

@dashpole can you add a test case that evicts based on inodes similar to the other test cases that work on disk?

@dashpole
Contributor Author

Yes, the test is going out in a separate PR, #33955. I will be testing 3 pods: one creates empty files in its container, one creates empty files in an emptyDir volume, and another creates one large file. The eviction manager should hit inode pressure and evict both of the pods making empty files. I still need to look into making the test skip the "empty files in container" pod when using devicemapper. I also want to run the test more, to make sure it isn't flaky before adding it.

@derekwaynecarr
Member

derekwaynecarr commented Oct 19, 2016

@dashpole -- the e2e test is good, but we should also be able to mock the test in unit testing as well via something like the following: https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/eviction/eviction_manager_test.go#L315

@dashpole
Contributor Author

dashpole commented Oct 19, 2016

@derekwaynecarr I'll add a unit test to this PR.

@dashpole dashpole force-pushed the per_container_inode_eviction branch from ac8d817 to c465205 Compare October 24, 2016 20:48
@k8s-ci-robot
Contributor

Jenkins GCE etcd3 e2e failed for commit c4652055208f3751baa41dd69c0139a8ca49194d. Full PR test history.

The magic incantation to run this job again is @k8s-bot gce etcd3 e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@k8s-ci-robot
Contributor

Jenkins GCI GCE e2e failed for commit c4652055208f3751baa41dd69c0139a8ca49194d. Full PR test history.

The magic incantation to run this job again is @k8s-bot gci gce e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@k8s-ci-robot
Contributor

Jenkins GCE e2e failed for commit c4652055208f3751baa41dd69c0139a8ca49194d. Full PR test history.

The magic incantation to run this job again is @k8s-bot cvm gce e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@k8s-ci-robot
Contributor

Jenkins GKE smoke e2e failed for commit c4652055208f3751baa41dd69c0139a8ca49194d. Full PR test history.

The magic incantation to run this job again is @k8s-bot cvm gke e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@k8s-ci-robot
Contributor

Jenkins Kubemark GCE e2e failed for commit c4652055208f3751baa41dd69c0139a8ca49194d. Full PR test history.

The magic incantation to run this job again is @k8s-bot kubemark e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@dashpole
Contributor Author

I found a bug in the eviction manager tests. We were not assigning pods a UID, so the cached stats function always returned the statsFunc of the most recently added pod. This meant that all calls to statsFunc returned the same value. This did not break tests because we always passed in the pod we wanted to fail as the first pod, and the eviction manager leaves that ordering intact when pods tie in their consumption. Changing the order of pods broke all tests in eviction_manager_test.go. Changing the newPod function in helpers_test.go to include a UID based on the name fixed the problem.
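
The failure mode can be reproduced in isolation. The sketch below assumes the cached stats lookup is keyed by pod UID, which is what the description above implies; the `pod`, `statsFunc`, and `statsCache` types are hypothetical stand-ins, not the actual test helpers.

```go
package main

import "fmt"

// pod is a hypothetical stand-in carrying only the fields that matter here.
type pod struct {
	Name string
	UID  string
}

// statsFunc returns the inode usage recorded for one pod.
type statsFunc func() uint64

// statsCache maps a pod UID to its statsFunc (assumed keying scheme).
type statsCache map[string]statsFunc

// add registers a statsFunc for p, overwriting any entry with the same UID.
func (c statsCache) add(p pod, inodes uint64) {
	c[p.UID] = func() uint64 { return inodes }
}

// get looks up the stats for p by UID.
func (c statsCache) get(p pod) uint64 { return c[p.UID]() }

func main() {
	// Without UIDs every pod shares the empty key, so the last add wins.
	a, b := pod{Name: "a"}, pod{Name: "b"}
	broken := statsCache{}
	broken.add(a, 10)
	broken.add(b, 20)
	fmt.Println(broken.get(a), broken.get(b)) // prints "20 20"

	// Deriving the UID from the name gives each pod its own entry.
	a.UID, b.UID = a.Name, b.Name
	fixed := statsCache{}
	fixed.add(a, 10)
	fixed.add(b, 20)
	fmt.Println(fixed.get(a), fixed.get(b)) // prints "10 20"
}
```

With every pod reporting identical usage, all pods tie, which is why the bug stayed hidden while the expected pod was always first in the list.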

@k8s-github-robot k8s-github-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Oct 24, 2016
@k8s-ci-robot
Contributor

Jenkins verification failed for commit c4652055208f3751baa41dd69c0139a8ca49194d. Full PR test history.

The magic incantation to run this job again is @k8s-bot verify test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@dashpole dashpole force-pushed the per_container_inode_eviction branch from c465205 to f908781 Compare October 27, 2016 18:43
@k8s-ci-robot
Contributor

Jenkins GCE Node e2e failed for commit f90878158b4670dc6b393af63358843e7dcc437c. Full PR test history.

The magic incantation to run this job again is @k8s-bot node e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@dashpole dashpole force-pushed the per_container_inode_eviction branch from f908781 to 0845625 Compare October 27, 2016 20:57
@dashpole
Contributor Author

@derekwaynecarr @dchen1107 this is all ready to go as soon as one of you has the chance to review it.

@dashpole
Contributor Author

pkg/kubelet/eviction/eviction_manager_test.go, line 103 at r1 (raw file):

      memoryWorkingSet string
  }{
      {name: "guaranteed-low", requests: newResourceList("100m", "1Gi"), limits: newResourceList("100m", "1Gi"), memoryWorkingSet: "200Mi"},

Changed ordering to prevent regression on bug described in git comments.


@dashpole
Contributor Author

pkg/kubelet/eviction/eviction_manager_test.go, line 117 at r1 (raw file):

      podStats[pod] = podStat
  }
  podToEvict := pods[5]

Eviction manager reorders pods when choosing which pod to evict, so grab a reference to the correct pod to evict now...


@dashpole
Contributor Author

pkg/kubelet/eviction/helpers_test.go, line 1570 at r1 (raw file):

      ObjectMeta: api.ObjectMeta{
          Name: name,
          UID:  types.UID(name),

This is to fix the bug described in the git comments.


@derekwaynecarr derekwaynecarr added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-label-needed labels Oct 31, 2016
@derekwaynecarr derekwaynecarr added this to the v1.5 milestone Oct 31, 2016
@derekwaynecarr
Member

/lgtm

@derekwaynecarr derekwaynecarr added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 31, 2016
@dashpole
Contributor Author

pkg/kubelet/server/stats/summary.go, line 118 at r1 (raw file):

  }

  nodeFsInodesUsed := *sb.rootFsInfo.Inodes - *sb.rootFsInfo.InodesFree

Is this safe to do?


@derekwaynecarr derekwaynecarr removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 31, 2016
Member

@derekwaynecarr derekwaynecarr left a comment

i missed this originally, thx for pointer.

@@ -115,6 +115,8 @@ func (sb *summaryBuilder) build() (*stats.Summary, error) {
      return nil, fmt.Errorf("Missing stats for root container")
  }

  nodeFsInodesUsed := *sb.rootFsInfo.Inodes - *sb.rootFsInfo.InodesFree
Member

you will need to check for nil on both values. for imagefs, it will be nil for both inodes and inodesfree values.
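
A minimal sketch of the nil check being requested, assuming the two values are `*uint64` fields as the dereferences in the quoted line suggest. The `fsInfo` type and `inodesUsed` helper are hypothetical and are not the code that was eventually merged; the underflow guard is an extra precaution of this sketch, not something the review asked for.

```go
package main

import "fmt"

// fsInfo is a hypothetical stand-in for a filesystem stats struct whose
// inode counters are optional, and therefore pointers.
type fsInfo struct {
	Inodes     *uint64
	InodesFree *uint64
}

// inodesUsed returns Inodes - InodesFree, or nil when either value is
// missing. It also returns nil if free exceeds total, because unsigned
// subtraction would wrap around (an assumption added by this sketch).
func inodesUsed(fs fsInfo) *uint64 {
	if fs.Inodes == nil || fs.InodesFree == nil {
		return nil
	}
	if *fs.InodesFree > *fs.Inodes {
		return nil
	}
	used := *fs.Inodes - *fs.InodesFree
	return &used
}

func main() {
	total, free := uint64(1000), uint64(400)
	if used := inodesUsed(fsInfo{Inodes: &total, InodesFree: &free}); used != nil {
		fmt.Println("used:", *used)
	}
	// A filesystem that reports no inode counters yields nil, not a panic.
	fmt.Println("missing counters:", inodesUsed(fsInfo{}) == nil)
}
```

Returning a pointer keeps "unknown" distinct from "zero inodes used", so a consumer can skip filesystems that report no inode counters.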

@dashpole dashpole force-pushed the per_container_inode_eviction branch from 0845625 to 4ca7f9f Compare October 31, 2016 19:10
@dashpole
Contributor Author

pkg/kubelet/server/stats/summary.go, line 118 at r1 (raw file):

Previously, derekwaynecarr (Derek Carr) wrote…

you will need to check for nil on both values. for imagefs, it will be nil for both inodes and inodesfree values.

Done.

@dashpole
Contributor Author

@k8s-bot gci gke e2e test this

@derekwaynecarr
Member

@k8s-bot gci gke e2e test this

@k8s-ci-robot
Contributor

Jenkins GCI GKE smoke e2e failed for commit 4ca7f9f. Full PR test history.

The magic incantation to run this job again is @k8s-bot gci gke e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@dashpole
Contributor Author

@k8s-bot gci gke e2e test this

@dchen1107
Member

LGTM

@dchen1107 dchen1107 added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 31, 2016
@dashpole
Contributor Author

dashpole commented Nov 1, 2016

@k8s-bot unit test this

@k8s-ci-robot
Contributor

Jenkins unit/integration failed for commit 4ca7f9f. Full PR test history.

The magic incantation to run this job again is @k8s-bot unit test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@dashpole
Contributor Author

dashpole commented Nov 1, 2016

@k8s-bot unit test this

@k8s-github-robot

Automatic merge from submit-queue

@k8s-github-robot k8s-github-robot merged commit 2244bfe into kubernetes:master Nov 1, 2016
@dashpole dashpole deleted the per_container_inode_eviction branch November 1, 2016 17:33
k8s-github-robot pushed a commit that referenced this pull request Nov 6, 2016
Automatic merge from submit-queue

Per Volume Inode Accounting

Collects volume inode stats using the same find command as cadvisor.  The command is "find _path_ -xdev -printf '.' | wc -c".  The output is passed to the summary api, and will be consumed by the eviction manager.

This cannot be merged yet, as it depends on changes adding the InodesUsed field to the summary api, and the eviction manager consuming this.  Expect tests to fail until this happens.
DEPENDS ON #35137
Labels
lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.