forked from kubernetes/kubernetes
Vertical scaling community preview #37 (Closed)
vinaykul wants to merge 42 commits into vertical-scaling-1.11 from vertical-scaling-community-preview
Conversation
- added check against QoS change from resource patch
- (2) added check against pod QoS change as a result of (1) (3) updated unit tests
- …nnotation (2) fixed the duplicated update issue in (1) (3) added unit tests
- (1) enabled job and pod resource change via patch (2) added check against QoS change as a result of (1)
- …engdu (1) added job controller changes to convert job spec resource update to annotation (2) fixed the duplicated update issue in (1) (3) added unit tests
- Kubelet changes to support vertical scaling; code review feedback fixes
- Scheduler changes to support vertical resource scaling; code review fixes
- …ce spec (2) added oomKillDisable annotation, passed down to pod-level cgroup file memory.oom_control; tested on a local environment
- added cgroup updates and oomKillDisable annotation
- vertical scaling feature gate in apiserver and controller; tested using a manual e2e test
- added vertical scaling feature gate to kubelet (related unit tests too)
- …city (#31) Added unit tests and fixes to address CR comments.
- added resource quota control for pod resource update
- Add pod resource resize status to pod conditions; handle InPlaceOnly policy through pod status update; restore running pod resource values on resizing failure
- Update cached pod resource values on resizing updates: the scheduler caches the initial pod that gets added and later deletes the cached pod, so subsequent pod resource updates make the cached resource values stale. This fix updates the cached values so the delete deducts the correct amounts. Addressed code review comments, minor fixes.
- (#35) fixed the issue where resource quota is not updated immediately when pod resources are decreased; added missing vertical scaling feature gates; fixed pod count issue per Vinay's PR comment
- …ry resize retry. (#38) Allow controller to process new resize requests after a failed request; add rudimentary resize retry.
- …izing. Refactor code. (#40) Respect the pod disruption budget when rescheduling pods; refactor code; add unit test for checkPodDisruptionBudgetOk function.
- …heduler support of vertical scaling. (#41) Update policy to allow scheduler to update pod; add comprehensive unit tests for scheduler; kubelet unit test enhancements to cover vertical scaling policies.
- …43) Clean up event logs; refactor resize action handling and add a unit test for it; update container status hash upon successful UpdateContainerResources; remove action annotation set by processResizing when rescheduling a pod.
- …on load, retry latest template.spec values on completion. (#46)
- added resource in-place update to Deployment; removed unused lines; added testing info; fixed Deployment resource update retry; added check against QoS change for Deployment spec update; added unit tests for Deployment resource update and QoS changes; addressed Vinay's PR comments; added error checking for PatchPodResourceAnnotation; added patch to the controller policy to allow patching pods in the "default" namespace (thanks @vinay!); added test cases for JD; added test for restart policy; added test case for InPlacePreferred; put testing scripts into a separate place; added a doc
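Several commits above add a check that a resource patch must not change the pod's QoS class. That invariant can be sketched as follows (illustrative Python, not the PR's actual Go code; the helper names are hypothetical, but the QoS classification follows the documented Kubernetes rules: Guaranteed when every container's requests equal its limits for CPU and memory, BestEffort when nothing is set, Burstable otherwise):

```python
def qos_class(containers):
    """Return 'Guaranteed', 'Burstable', or 'BestEffort' per Kubernetes QoS rules."""
    requests_set = limits_set = False
    guaranteed = True
    for c in containers:
        resources = c.get("resources", {})
        req = resources.get("requests", {})
        lim = resources.get("limits", {})
        if req:
            requests_set = True
        if lim:
            limits_set = True
        for res in ("cpu", "memory"):
            limit = lim.get(res)
            request = req.get(res, limit)  # requests default to limits when unset
            if limit is None or request != limit:
                guaranteed = False
    if not (requests_set or limits_set):
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"

def validate_resize(old_containers, new_containers):
    """Reject a resource patch that would change the pod's QoS class."""
    if qos_class(old_containers) != qos_class(new_containers):
        raise ValueError("resize would change pod QoS class")
```

For example, patching a Burstable pod (requests only) so that requests equal limits would flip it to Guaranteed, which `validate_resize` rejects.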
Hi, how is it going now? It seems like no one is maintaining it.
Yes, this is not being maintained. It was intended as a proof-of-concept implementation. The KEPs and latest status for this feature are tracked in kubernetes/enhancements#1342
What this PR does / why we need it:
This change implements our design proposal that gives Kubernetes the capability to resize pod resources without restarts. VPA can use this in 'Auto' mode. It is a simple way to achieve vertical scaling with well-contained changes to the core scheduler, controller, and kubelet codebase.
Based on the feedback received on our initial design proposal, we added a simple policy mechanism to control pod reschedule behavior, handling for races between multiple schedulers, and resource quota support. We also support pausing a pod that exceeds its memory limit via oom_control, giving resource monitoring enough time to react to sudden usage increases without killing the pod.
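The oom_control pausing mentioned above relies on the cgroup-v1 memory controller: writing `1` to a cgroup's `memory.oom_control` file disables the OOM killer there, so tasks that hit the memory limit are paused rather than killed. A minimal sketch of how a node agent might do this, assuming the standard cgroup-v1 mount point (the pod cgroup path used in the example is hypothetical):

```python
import os

# Assumed cgroup-v1 memory controller mount point; actual location depends on
# the node's configuration.
CGROUP_MEMORY_ROOT = "/sys/fs/cgroup/memory"

def oom_control_path(pod_cgroup):
    """Path to the memory.oom_control file for a pod-level cgroup."""
    return os.path.join(CGROUP_MEMORY_ROOT, pod_cgroup, "memory.oom_control")

def disable_oom_kill(pod_cgroup):
    """Write '1' to memory.oom_control so tasks at the memory limit are
    paused by the kernel instead of being OOM-killed."""
    with open(oom_control_path(pod_cgroup), "w") as f:
        f.write("1")
```

This only applies to cgroup v1; cgroup v2 removed `memory.oom_control` in favor of different mechanisms.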
The intent of this PR is for the community to review our design in working code, validate whether this is a good way to solve the in-place vertical scaling problem, and provide additional feedback and suggestions.
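In this preview, a resize is requested by patching the pod spec's container resources (the commit log notes the controllers convert such spec updates into an annotation internally). A hedged sketch of what such a strategic-merge patch body might look like; the container name and values are purely illustrative, and whether the apiserver accepts the patch depends on the validation this PR relaxes behind its feature gate:

```python
import json

def build_resize_patch(container_name, cpu, memory):
    """Build an illustrative strategic-merge patch body that updates one
    container's resource requests and limits in place."""
    return {
        "spec": {
            "containers": [
                {
                    "name": container_name,
                    "resources": {
                        "requests": {"cpu": cpu, "memory": memory},
                        "limits": {"cpu": cpu, "memory": memory},
                    },
                }
            ]
        }
    }

# Hypothetical usage: serialize and send via
#   kubectl patch pod <name> -p "$(python build_patch.py)"
patch = build_resize_patch("app", "500m", "512Mi")
print(json.dumps(patch))
```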
Here's a short demo.
Which issue(s) this PR fixes:
Fixes kubernetes#5774, kubernetes#9043 (item 2)
Special notes for your reviewer:
This change is based on the 1.11 codebase, and the intent is to illustrate our design proposal via a working implementation. This is a work in progress, and we are requesting this review to get early feedback.
We are investigating the following outstanding items:
Release note: