Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Image garbage collector: LowThresholdPercent can not be higher than HighThresholdPercent #18073

Conversation

ingvagabund
Copy link
Contributor

if LowThresholdPercent > HighThresholdPercent, amountToFree at image_manager.go:208 is negative and image GC will not free memory properly.

Justification:

  1. LowThresholdPercent > HighThresholdPercent implies (LowThresholdPercent * capacity / 100) > (HighThresholdPercent * capacity / 100)
  2. usage is at least (HighThresholdPercent * capacity / 100)
  3. amountToFree = usage - (LowThresholdPercent * capacity / 100)

Combining 1), 2) and 3) implies amountToFree can be negative.

What happens if amountToFree is negative? in freeSpace method, "for _, image := range images " loops at least once
and if everything goes fine, "delete(im.imageRecords, image.id)" is executed.
When checking for condition "if spaceFreed >= bytesToFree", it is always true as bytesToFree is negative
and spaceFreed is positive. The loop is finished, so is image GC.

At the end, only the oldest image is deleted. In situations where there is a lot of dead containers,
each container corresponing to distinct image, number of unused images can get higher.
If two new images get pulled in every 5 minutes, image GC will not work properly and will not free enough space.
Secondly, it will take a lot of time to free all unused images (hours depending on a number of unused images).

This is an incorrect configuration. Image GC should report it and refuse to work.

if LowThresholdPercent > HighThresholdPercent, amountToFree at image_manager.go:208 is negative and image GC will not free memory properly.

Justification:

1) LowThresholdPercent > HighThresholdPercent implies (LowThresholdPercent * capacity / 100) > (HighThresholdPercent * capacity / 100)
2) usage is at least (HighThresholdPercent * capacity / 100)
3) amountToFree = usage - (LowThresholdPercent * capacity / 100)

Combining 1), 2) and 3) implies amountToFree can be negative.

What happens if amountToFree is negative? in freeSpace method, "for _, image := range images " loops at least once
and if everything goes fine, "delete(im.imageRecords, image.id)" is executed.
When checking for condition "if spaceFreed >= bytesToFree", it is always true as bytesToFree is negative
and spaceFreed is positive. The loop is finished, so is image GC.

At the end, only the oldest image is deleted. In situations where there is a lot of dead containers,
each container corresponing to distinct image, number of unused images can get higher.
If two new images get pulled in every 5 minutes, image GC will not work properly and will not free enough space.
Secondly, it will take a lot of time to free all unused images (hours depending on a number of unused images).

This is an incorrect configuration. Image GC should report it and refuse to work.
@ingvagabund ingvagabund changed the title LowThresholdPercent can not be higher than HighThresholdPercent Image garbage collector: LowThresholdPercent can not be higher than HighThresholdPercent Dec 2, 2015
@k8s-github-robot
Copy link

Labelling this PR as size/XS

@k8s-github-robot k8s-github-robot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Dec 2, 2015
@k8s-bot
Copy link

k8s-bot commented Dec 2, 2015

GCE e2e test build/test passed for commit 9590b23.

@k8s-github-robot
Copy link

The author of this PR is not in the whitelist for merge, can one of the admins add the 'ok-to-merge' label?

@dchen1107 dchen1107 added sig/node Categorizes an issue or PR as relevant to SIG Node. ok-to-merge and removed needs-ok-to-merge labels Dec 3, 2015
@dchen1107
Copy link
Member

Obviously there is a incorrect configuration, and we should validate this. But on another side, I have one concern that that with this change, this wrong configuration will cause crash loop kubelet, but not very easy for the user to figure out the reason. cc/ @mikedanese is config object will handle this validation? so that Kubelet can safely assume all configuration (today's flags) are making sense?

@dchen1107
Copy link
Member

Had quick sync with @mikedanese: this will be moved to config validation once that is ready.

LGTM for now.

@dchen1107 dchen1107 added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 3, 2015
@k8s-github-robot
Copy link

@k8s-bot test this

Tests are more than 48 hours old. Re-running tests.

@k8s-bot
Copy link

k8s-bot commented Dec 4, 2015

GCE e2e test build/test passed for commit 9590b23.

@gmarek
Copy link
Contributor

gmarek commented Dec 6, 2015

@k8s-bot unit test this

@k8s-github-robot
Copy link

@k8s-bot test this

Tests are more than 48 hours old. Re-running tests.

@k8s-bot
Copy link

k8s-bot commented Dec 6, 2015

GCE e2e test build/test passed for commit 9590b23.

@k8s-github-robot
Copy link

@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]

@k8s-bot
Copy link

k8s-bot commented Dec 7, 2015

GCE e2e test build/test passed for commit 9590b23.

@k8s-github-robot
Copy link

Automatic merge from submit-queue

k8s-github-robot pushed a commit that referenced this pull request Dec 7, 2015
…-threshold-test

Auto commit by PR queue bot
@k8s-github-robot k8s-github-robot merged commit 316a8ac into kubernetes:master Dec 7, 2015
@@ -101,6 +101,9 @@ func newImageManager(runtime container.Runtime, cadvisorInterface cadvisor.Inter
if policy.LowThresholdPercent < 0 || policy.LowThresholdPercent > 100 {
return nil, fmt.Errorf("invalid LowThresholdPercent %d, must be in range [0-100]", policy.LowThresholdPercent)
}
if policy.LowThresholdPercent > policy.HighThresholdPercent {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we disallow being equal as well?

@ingvagabund ingvagabund deleted the garbage-collector-low-high-threshold-test branch February 10, 2016 17:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/node Categorizes an issue or PR as relevant to SIG Node. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants