-
Notifications
You must be signed in to change notification settings - Fork 40k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Image garbage collector: LowThresholdPercent can not be higher than HighThresholdPercent #18073
Image garbage collector: LowThresholdPercent can not be higher than HighThresholdPercent #18073
Conversation
if LowThresholdPercent > HighThresholdPercent, amountToFree at image_manager.go:208 is negative and image GC will not free memory properly. Justification: 1) LowThresholdPercent > HighThresholdPercent implies (LowThresholdPercent * capacity / 100) > (HighThresholdPercent * capacity / 100) 2) usage is at least (HighThresholdPercent * capacity / 100) 3) amountToFree = usage - (LowThresholdPercent * capacity / 100) Combining 1), 2) and 3) implies amountToFree can be negative. What happens if amountToFree is negative? in freeSpace method, "for _, image := range images " loops at least once and if everything goes fine, "delete(im.imageRecords, image.id)" is executed. When checking for condition "if spaceFreed >= bytesToFree", it is always true as bytesToFree is negative and spaceFreed is positive. The loop is finished, so is image GC. At the end, only the oldest image is deleted. In situations where there is a lot of dead containers, each container corresponing to distinct image, number of unused images can get higher. If two new images get pulled in every 5 minutes, image GC will not work properly and will not free enough space. Secondly, it will take a lot of time to free all unused images (hours depending on a number of unused images). This is an incorrect configuration. Image GC should report it and refuse to work.
Labelling this PR as size/XS |
GCE e2e test build/test passed for commit 9590b23. |
The author of this PR is not in the whitelist for merge, can one of the admins add the 'ok-to-merge' label? |
Obviously there is a incorrect configuration, and we should validate this. But on another side, I have one concern that that with this change, this wrong configuration will cause crash loop kubelet, but not very easy for the user to figure out the reason. cc/ @mikedanese is config object will handle this validation? so that Kubelet can safely assume all configuration (today's flags) are making sense? |
Had quick sync with @mikedanese: this will be moved to config validation once that is ready. LGTM for now. |
@k8s-bot test this Tests are more than 48 hours old. Re-running tests. |
GCE e2e test build/test passed for commit 9590b23. |
@k8s-bot unit test this |
@k8s-bot test this Tests are more than 48 hours old. Re-running tests. |
GCE e2e test build/test passed for commit 9590b23. |
@k8s-bot test this [submit-queue is verifying that this PR is safe to merge] |
GCE e2e test build/test passed for commit 9590b23. |
Automatic merge from submit-queue |
…-threshold-test Auto commit by PR queue bot
@@ -101,6 +101,9 @@ func newImageManager(runtime container.Runtime, cadvisorInterface cadvisor.Inter | |||
if policy.LowThresholdPercent < 0 || policy.LowThresholdPercent > 100 { | |||
return nil, fmt.Errorf("invalid LowThresholdPercent %d, must be in range [0-100]", policy.LowThresholdPercent) | |||
} | |||
if policy.LowThresholdPercent > policy.HighThresholdPercent { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we disallow being equal as well?
if LowThresholdPercent > HighThresholdPercent, amountToFree at image_manager.go:208 is negative and image GC will not free memory properly.
Justification:
Combining 1), 2) and 3) implies amountToFree can be negative.
What happens if amountToFree is negative? in freeSpace method, "for _, image := range images " loops at least once
and if everything goes fine, "delete(im.imageRecords, image.id)" is executed.
When checking for condition "if spaceFreed >= bytesToFree", it is always true as bytesToFree is negative
and spaceFreed is positive. The loop is finished, so is image GC.
At the end, only the oldest image is deleted. In situations where there is a lot of dead containers,
each container corresponing to distinct image, number of unused images can get higher.
If two new images get pulled in every 5 minutes, image GC will not work properly and will not free enough space.
Secondly, it will take a lot of time to free all unused images (hours depending on a number of unused images).
This is an incorrect configuration. Image GC should report it and refuse to work.