[FG:InPlacePodVerticalScaling] Introduce /resize subresource to request pod resource resizing #128266
Conversation
Force-pushed from 00aa4eb to b93b1cb (compare)
This PR may require API review. If so, when the changes are ready, complete the pre-review checklist and request an API review. Status of requested reviews is tracked in the API Review project.
Force-pushed from da3b408 to f0d1a99 (compare)
/test pull-kubernetes-e2e-inplace-pod-resize-containerd-main-v2
/assign @tallclair
/assign @jpbetz
Nice. My only open question is whether we need to do something with managed fields, but I'll let @jpbetz comment on that.
Also, please update the release note to mention that non-resize pod updates can no longer mutate resources.
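For reference, a minimal client-go sketch of what requesting a resize through the new subresource could look like; the namespace, pod name, container name, and resource values below are illustrative and not taken from this PR:

```go
package main

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a client from the local kubeconfig (illustrative; any rest.Config works).
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	// Change the container's CPU request by patching the pod's "resize"
	// subresource rather than issuing a plain pod update.
	patch := []byte(`{"spec":{"containers":[{"name":"c1","resources":{"requests":{"cpu":"500m"}}}]}}`)
	_, err = clientset.CoreV1().Pods("default").Patch(
		context.TODO(),
		"testpod1",
		types.StrategicMergePatchType,
		patch,
		metav1.PatchOptions{},
		"resize", // target the resize subresource
	)
	if err != nil {
		panic(err)
	}
}
```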
The test confirms that the subject can successfully resize the pod's resources, but cannot modify the pod as a whole.
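Not part of this change, but as an illustration of why the dedicated subresource matters for authorization, here is a hedged sketch that checks pods/resize permission through the standard SelfSubjectAccessReview API; the package, helper name, and arguments are hypothetical:

```go
package podresize

import (
	"context"
	"fmt"

	authorizationv1 "k8s.io/api/authorization/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// canResizePods reports whether the current client identity may patch the
// pods/resize subresource in the given namespace. Because resize is a
// separate subresource, this can be allowed even when updating the pod
// object itself is denied.
func canResizePods(ctx context.Context, clientset kubernetes.Interface, namespace string) (bool, error) {
	review := &authorizationv1.SelfSubjectAccessReview{
		Spec: authorizationv1.SelfSubjectAccessReviewSpec{
			ResourceAttributes: &authorizationv1.ResourceAttributes{
				Namespace:   namespace,
				Verb:        "patch",
				Resource:    "pods",
				Subresource: "resize",
			},
		},
	}
	resp, err := clientset.AuthorizationV1().SelfSubjectAccessReviews().Create(ctx, review, metav1.CreateOptions{})
	if err != nil {
		return false, fmt.Errorf("creating SelfSubjectAccessReview: %w", err)
	}
	return resp.Status.Allowed, nil
}
```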
GetResetFieldsFilter returns the fields filter for the fields reset by the pod resize strategy. This is needed to make server-side apply work correctly.
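For orientation, the long-standing reset-fields hook on REST strategies has roughly the shape below; GetResetFieldsFilter generalizes the same idea with a filter per API version. This is a sketch with an illustrative field set and a placeholder strategy type, not the code added in this PR:

```go
package podresize

import (
	"sigs.k8s.io/structured-merge-diff/v4/fieldpath"
)

// podResizeStrategy stands in for the pod resize REST strategy; the real
// type lives in the apiserver's pod registry.
type podResizeStrategy struct{}

// GetResetFields declares which fields the strategy resets before
// persisting an update, so server-side apply does not attribute those
// resets to the applier. The paths below are illustrative only.
func (podResizeStrategy) GetResetFields() map[fieldpath.APIVersion]*fieldpath.Set {
	return map[fieldpath.APIVersion]*fieldpath.Set{
		"v1": fieldpath.NewSet(
			fieldpath.MakePathOrDie("status"),
		),
	}
}
```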
Force-pushed from 201b23c to bfb0b83 (compare)
/test pull-kubernetes-e2e-inplace-pod-resize-containerd-main-v2
/retest
/lgtm
LGTM label has been added. Git tree hash: 206405be12b2086a4b6ec1fcc7b3e346e6d3b2ab
/retest
f := framework.NewDefaultFramework("pod-resize-tests")
ginkgo.BeforeEach(func(ctx context.Context) {
	f := framework.NewDefaultFramework("pod-resize-tests")
@pohly do you think this can be the cause of #128600 (comment)?
It was #128607, confirmed
_output/bin/e2e.test --ginkgo.focus="Pod.InPlace.Resize.Container" --context kind-kind --kubeconfig=/usr/local/google/home/aojea/.kube/config
I1106 15:11:00.867730 2960849 test_context.go:564] The --provider flag is not set. Continuing as if --provider=skeleton had been used.
I1106 15:11:00.867866 2960849 e2e.go:109] Starting e2e run "9452694d-8cff-4453-a16f-58957fbfc75a" on Ginkgo node 1
Running Suite: Kubernetes e2e suite - /usr/local/google/home/aojea/src/kubernetes
=================================================================================
Random Seed: 1730905860 - will randomize all specs
Will run 36 of 6598 specs
SSSSSSSSSSSSSSSSSSSSSSSS
------------------------------
• [FAILED] [4.087 seconds]
[sig-node] [Serial] Pod InPlace Resize Container (scheduler-focused) [Feature:InPlacePodVerticalScaling] [It] pod-resize-scheduler-tests [sig-node, Serial, Feature:InPlacePodVerticalScaling]
k8s.io/kubernetes/test/e2e/node/pod_resize.go:197
Timeline >>
STEP: Creating a kubernetes client @ 11/06/24 15:11:01.095
I1106 15:11:01.095575 2960849 util.go:502] >>> kubeConfig: /usr/local/google/home/aojea/.kube/config
I1106 15:11:01.102592 2960849 util.go:511] >>> kubeContext: kind-kind
STEP: Building a namespace api object, basename pod-resize-scheduler-tests @ 11/06/24 15:11:01.102
STEP: Waiting for a default service account to be provisioned in namespace @ 11/06/24 15:11:01.112
STEP: Waiting for kube-root-ca.crt to be provisioned in namespace @ 11/06/24 15:11:01.114
I1106 15:11:01.118597 2960849 pod_resize.go:202] Found 1 schedulable nodes
STEP: Find node CPU resources available for allocation! @ 11/06/24 15:11:01.118
I1106 15:11:01.120278 2960849 pod_resize.go:218] Found 10 pods on node 'kind-control-plane'
I1106 15:11:01.120291 2960849 pod_resize.go:230] Node 'kind-control-plane': NodeAllocatable MilliCPUs = 48000m. MilliCPUs currently available to allocate = 47050m.
I1106 15:11:01.120296 2960849 pod_resize.go:242] TEST1: testPod1 initial CPU request is '23525m'
I1106 15:11:01.120299 2960849 pod_resize.go:243] TEST1: testPod2 initial CPU request is '47050m'
I1106 15:11:01.120302 2960849 pod_resize.go:244] TEST1: testPod2 resized CPU request is '11762m'
STEP: TEST1: Create pod 'testpod1' that fits the node 'kind-control-plane' @ 11/06/24 15:11:01.12
STEP: TEST1: Create pod 'testpod2' that won't fit node 'kind-control-plane' with pod 'testpod1' on it @ 11/06/24 15:11:03.13
STEP: TEST1: Resize pod 'testpod2' to fit in node 'kind-control-plane' @ 11/06/24 15:11:05.138
I1106 15:11:05.139391 2960849 pod_resize.go:292] Unexpected error: failed to patch pod for resize:
<*errors.StatusError | 0xc00098fc20>:
the server could not find the requested resource
{
ErrStatus:
code: 404
details: {}
message: the server could not find the requested resource
metadata: {}
reason: NotFound
status: Failure,
}
[FAILED] in [It] - k8s.io/kubernetes/test/e2e/node/pod_resize.go:292 @ 11/06/24 15:11:05.139
I1106 15:11:05.139588 2960849 helper.go:124] Waiting up to 7m0s for all (but 0) nodes to be ready
STEP: dump namespace information after failure @ 11/06/24 15:11:05.141
STEP: Collecting events from namespace "pod-resize-scheduler-tests-3255". @ 11/06/24 15:11:05.141
STEP: Found 5 events. @ 11/06/24 15:11:05.142
I1106 15:11:05.142257 2960849 dump.go:53] At 2024-11-06 15:11:01 +0000 UTC - event for testpod1: {default-scheduler } Scheduled: Successfully assigned pod-resize-scheduler-tests-3255/testpod1 to kind-control-plane
I1106 15:11:05.142267 2960849 dump.go:53] At 2024-11-06 15:11:01 +0000 UTC - event for testpod1: {kubelet kind-control-plane} Pulled: Container image "registry.k8s.io/e2e-test-images/busybox:1.36.1-1" already present on machine
I1106 15:11:05.142274 2960849 dump.go:53] At 2024-11-06 15:11:01 +0000 UTC - event for testpod1: {kubelet kind-control-plane} Created: Created container: c1
I1106 15:11:05.142280 2960849 dump.go:53] At 2024-11-06 15:11:01 +0000 UTC - event for testpod1: {kubelet kind-control-plane} Started: Started container c1
I1106 15:11:05.142285 2960849 dump.go:53] At 2024-11-06 15:11:03 +0000 UTC - event for testpod2: {default-scheduler } FailedScheduling: 0/1 nodes are available: 1 Insufficient cpu. preemption: 0/1 nodes are available: 1 No preemption victims found for incoming pod.
I1106 15:11:05.143577 2960849 resource.go:168] POD NODE PHASE GRACE CONDITIONS
I1106 15:11:05.143606 2960849 resource.go:175] testpod1 kind-control-plane Running [{PodReadyToStartContainers True 0001-01-01 00:00:00 +0000 UTC 2024-11-06 15:11:02 +0000 UTC } {Initialized True 0001-01-01 00:00:00 +0000 UTC 2024-11-06 15:11:01 +0000 UTC } {Ready True 0001-01-01 00:00:00 +0000 UTC 2024-11-06 15:11:02 +0000 UTC } {ContainersReady True 0001-01-01 00:00:00 +0000 UTC 2024-11-06 15:11:02 +0000 UTC } {PodScheduled True 0001-01-01 00:00:00 +0000 UTC 2024-11-06 15:11:01 +0000 UTC }]
I1106 15:11:05.143615 2960849 resource.go:175] testpod2 Pending [{PodScheduled False 0001-01-01 00:00:00 +0000 UTC 2024-11-06 15:11:03 +0000 UTC Unschedulable 0/1 nodes are available: 1 Insufficient cpu. preemption: 0/1 nodes are available: 1 No preemption victims found for incoming pod.}]
I1106 15:11:05.143620 2960849 resource.go:178]
I1106 15:11:05.144721 2960849 dump.go:109]
Logging node info for node kind-control-plane
I1106 15:11:05.146004 2960849 dump.go:114] Node Info: &Node{ObjectMeta:{kind-control-plane 4b17f8b3-75b1-4ffe-9f70-596d40fe3f62 707305 0 2024-10-31 09:24:35 +0000 UTC <nil> <nil> map[beta.kubernetes.io/arch:amd64 beta.kubernetes.io/os:linux kubernetes.io/arch:amd64 kubernetes.io/hostname:kind-control-plane kubernetes.io/os:linux node-role.kubernetes.io/control-plane:] map[kubeadm.alpha.kubernetes.io/cri-socket:unix:///run/containerd/containerd.sock node.alpha.kubernetes.io/ttl:0 volumes.kubernetes.io/controller-managed-attach-detach:true] [] [] [{kubelet Update v1 2024-10-31 09:24:35 +0000 UTC FieldsV1 {"f:metadata":{"f:annotations":{".":{},"f:volumes.kubernetes.io/controller-managed-attach-detach":{}},"f:labels":{".":{},"f:beta.kubernetes.io/arch":{},"f:beta.kubernetes.io/os":{},"f:kubernetes.io/arch":{},"f:kubernetes.io/hostname":{},"f:kubernetes.io/os":{}}},"f:spec":{"f:providerID":{}}} } {kubeadm Update v1 2024-10-31 09:24:36 +0000 UTC FieldsV1 {"f:metadata":{"f:annotations":{"f:kubeadm.alpha.kubernetes.io/cri-socket":{}},"f:labels":{"f:node-role.kubernetes.io/control-plane":{}}}} } {kube-controller-manager Update v1 2024-10-31 09:24:43 +0000 UTC FieldsV1 {"f:metadata":{"f:annotations":{"f:node.alpha.kubernetes.io/ttl":{}}},"f:spec":{"f:podCIDR":{},"f:podCIDRs":{".":{},"v:\"10.244.0.0/24\"":{}}}} } {kubelet Update v1 2024-11-06 15:06:58 +0000 UTC FieldsV1 {"f:status":{"f:conditions":{"k:{\"type\":\"DiskPressure\"}":{"f:lastHeartbeatTime":{}},"k:{\"type\":\"MemoryPressure\"}":{"f:lastHeartbeatTime":{}},"k:{\"type\":\"PIDPressure\"}":{"f:lastHeartbeatTime":{}},"k:{\"type\":\"Ready\"}":{"f:lastHeartbeatTime":{},"f:lastTransitionTime":{},"f:message":{},"f:reason":{},"f:status":{}}},"f:images":{}}} status}]},Spec:NodeSpec{PodCIDR:10.244.0.0/24,DoNotUseExternalID:,ProviderID:kind://docker/kind/kind-control-plane,Unschedulable:false,Taints:[]Taint{},ConfigSource:nil,PodCIDRs:[10.244.0.0/24],},Status:NodeStatus{Capacity:ResourceList{cpu: {{48 0} {<nil>} 48 DecimalSI},ephemeral-storage: {{2153496870912 0} {<nil>} BinarySI},hugepages-1Gi: {{0 0} {<nil>} 0 DecimalSI},hugepages-2Mi: {{0 0} {<nil>} 0 DecimalSI},memory: {{126605991936 0} {<nil>} BinarySI},pods: {{110 0} {<nil>} 110 DecimalSI},},Allocatable:ResourceList{cpu: {{48 0} {<nil>} 48 DecimalSI},ephemeral-storage: {{2153496870912 0} {<nil>} BinarySI},hugepages-1Gi: {{0 0} {<nil>} 0 DecimalSI},hugepages-2Mi: {{0 0} {<nil>} 0 DecimalSI},memory: {{126605991936 0} {<nil>} BinarySI},pods: {{110 0} {<nil>} 110 DecimalSI},},Phase:,Conditions:[]NodeCondition{NodeCondition{Type:MemoryPressure,Status:False,LastHeartbeatTime:2024-11-06 15:06:58 +0000 UTC,LastTransitionTime:2024-10-31 09:24:35 +0000 UTC,Reason:KubeletHasSufficientMemory,Message:kubelet has sufficient memory available,},NodeCondition{Type:DiskPressure,Status:False,LastHeartbeatTime:2024-11-06 15:06:58 +0000 UTC,LastTransitionTime:2024-10-31 09:24:35 +0000 UTC,Reason:KubeletHasNoDiskPressure,Message:kubelet has no disk pressure,},NodeCondition{Type:PIDPressure,Status:False,LastHeartbeatTime:2024-11-06 15:06:58 +0000 UTC,LastTransitionTime:2024-10-31 09:24:35 +0000 UTC,Reason:KubeletHasSufficientPID,Message:kubelet has sufficient PID available,},NodeCondition{Type:Ready,Status:True,LastHeartbeatTime:2024-11-06 15:06:58 +0000 UTC,LastTransitionTime:2024-10-31 09:24:59 +0000 UTC,Reason:KubeletReady,Message:kubelet is posting ready 
status,},},Addresses:[]NodeAddress{NodeAddress{Type:InternalIP,Address:192.168.8.3,},NodeAddress{Type:Hostname,Address:kind-control-plane,},},DaemonEndpoints:NodeDaemonEndpoints{KubeletEndpoint:DaemonEndpoint{Port:10250,},},NodeInfo:NodeSystemInfo{MachineID:d7116c595bc346dc80ebd143e1574a80,SystemUUID:17ff3b61-28d1-4786-971b-1b18c6ba6740,BootID:141a11fe-6289-42df-b6a7-f40eb76ba33d,KernelVersion:6.9.10-1rodete5-amd64,OSImage:Debian GNU/Linux 12 (bookworm),ContainerRuntimeVersion:containerd://1.7.18,KubeletVersion:v1.32.0-alpha.2.328+cf159912e25fbd,KubeProxyVersion:v1.32.0-alpha.2.328+cf159912e25fbd,OperatingSystem:linux,Architecture:amd64,},Images:[]ContainerImage{ContainerImage{Names:[docker.io/library/import-2024-10-31@sha256:6bfbd010f2673ebc331aa4e10241ebcf648378bf523eb0fd915401a51534d2e1 registry.k8s.io/kube-apiserver-amd64:v1.32.0-alpha.2.328_cf159912e25fbd registry.k8s.io/kube-apiserver:v1.32.0-alpha.2.328_cf159912e25fbd],SizeBytes:95494129,},ContainerImage{Names:[docker.io/library/import-2024-10-31@sha256:ff641a1ca4e9ba239f64f26643e8dc69a2efbdc1b991d73ab4325c143ffce95a registry.k8s.io/kube-proxy-amd64:test registry.k8s.io/kube-proxy-amd64:v1.32.0-alpha.2.328_c4102eb7359925],SizeBytes:93927610,},ContainerImage{Names:[docker.io/library/import-2024-10-31@sha256:d0ff8bf9897902b922cf5ba82e8fa60b2f3d0bb8b6fce32a2687625d044978c4 registry.k8s.io/kube-proxy-amd64:v1.32.0-alpha.2.328_cf159912e25fbd registry.k8s.io/kube-proxy:v1.32.0-alpha.2.328_cf159912e25fbd],SizeBytes:93927098,},ContainerImage{Names:[docker.io/library/import-2024-10-31@sha256:9101f22c11b10ba4a56d26afff3af6a3070ef6179aa0556fd2ef6836b78c7c2c registry.k8s.io/kube-controller-manager-amd64:v1.32.0-alpha.2.328_cf159912e25fbd registry.k8s.io/kube-controller-manager:v1.32.0-alpha.2.328_cf159912e25fbd],SizeBytes:89480520,},ContainerImage{Names:[docker.io/library/import-2024-10-31@sha256:ff230f6561be5c3c4026ad103f5c80f62a0897ebb296efbfe1715752f229e547 registry.k8s.io/kube-scheduler-amd64:v1.32.0-alpha.2.328_cf159912e25fbd registry.k8s.io/kube-scheduler:v1.32.0-alpha.2.328_cf159912e25fbd],SizeBytes:69128008,},ContainerImage{Names:[docker.io/library/httpd@sha256:bbea29057f25d9543e6a96a8e3cc7c7c937206d20eab2323f478fdb2469d536d docker.io/library/httpd:2],SizeBytes:59389653,},ContainerImage{Names:[registry.k8s.io/etcd:3.5.16-0],SizeBytes:57680541,},ContainerImage{Names:[docker.io/kindest/kindnetd:v20241007-36f62932],SizeBytes:38600298,},ContainerImage{Names:[docker.io/kindest/local-path-provisioner:v20240813-c6f155d6],SizeBytes:19430244,},ContainerImage{Names:[registry.k8s.io/coredns/coredns:v1.11.3],SizeBytes:18562039,},ContainerImage{Names:[docker.io/kindest/local-path-helper:v20230510-486859a6],SizeBytes:3052318,},ContainerImage{Names:[registry.k8s.io/e2e-test-images/busybox@sha256:a9155b13325b2abef48e71de77bb8ac015412a566829f621d06bfae5c699b1b9 registry.k8s.io/e2e-test-images/busybox:1.36.1-1],SizeBytes:2223659,},ContainerImage{Names:[registry.k8s.io/pause:3.10],SizeBytes:320368,},},VolumesInUse:[],VolumesAttached:[]AttachedVolume{},Config:nil,RuntimeHandlers:[]NodeRuntimeHandler{},Features:nil,},}
I1106 15:11:05.146016 2960849 dump.go:116]
Logging kubelet events for node kind-control-plane
I1106 15:11:05.147145 2960849 dump.go:121]
Logging pods the kubelet thinks are on node kind-control-plane
I1106 15:11:05.155983 2960849 dump.go:128] kube-system/etcd-kind-control-plane started at 2024-10-31 09:24:37 +0000 UTC (0+1 container statuses recorded)
I1106 15:11:05.156004 2960849 dump.go:134] Container etcd ready: true, restart count 0
I1106 15:11:05.156010 2960849 dump.go:128] kube-system/kube-scheduler-kind-control-plane started at 2024-10-31 09:24:37 +0000 UTC (0+1 container statuses recorded)
I1106 15:11:05.156016 2960849 dump.go:134] Container kube-scheduler ready: true, restart count 0
I1106 15:11:05.156022 2960849 dump.go:128] default/pod started at 2024-11-01 20:22:54 +0000 UTC (0+1 container statuses recorded)
I1106 15:11:05.156026 2960849 dump.go:134] Container pod ready: false, restart count 1350
I1106 15:11:05.156031 2960849 dump.go:128] kube-system/kube-proxy-spgv9 started at 2024-10-31 09:45:29 +0000 UTC (0+1 container statuses recorded)
I1106 15:11:05.156037 2960849 dump.go:134] Container kube-proxy ready: true, restart count 0
I1106 15:11:05.156042 2960849 dump.go:128] kube-system/kube-controller-manager-kind-control-plane started at 2024-10-31 09:24:37 +0000 UTC (0+1 container statuses recorded)
I1106 15:11:05.156048 2960849 dump.go:134] Container kube-controller-manager ready: true, restart count 0
I1106 15:11:05.156053 2960849 dump.go:128] kube-system/kube-apiserver-kind-control-plane started at 2024-10-31 09:24:37 +0000 UTC (0+1 container statuses recorded)
I1106 15:11:05.156058 2960849 dump.go:134] Container kube-apiserver ready: true, restart count 0
I1106 15:11:05.156063 2960849 dump.go:128] kube-system/kindnet-ngv9r started at 2024-10-31 09:24:43 +0000 UTC (0+1 container statuses recorded)
I1106 15:11:05.156067 2960849 dump.go:134] Container kindnet-cni ready: true, restart count 0
I1106 15:11:05.156071 2960849 dump.go:128] local-path-storage/local-path-provisioner-57c5987fd4-g5jv6 started at 2024-10-31 09:24:59 +0000 UTC (0+1 container statuses recorded)
I1106 15:11:05.156076 2960849 dump.go:134] Container local-path-provisioner ready: true, restart count 0
I1106 15:11:05.156081 2960849 dump.go:128] kube-system/coredns-7c65d6cfc9-2j8sw started at 2024-10-31 09:24:59 +0000 UTC (0+1 container statuses recorded)
I1106 15:11:05.156085 2960849 dump.go:134] Container coredns ready: true, restart count 0
I1106 15:11:05.156091 2960849 dump.go:128] kube-system/coredns-7c65d6cfc9-8gfxr started at 2024-10-31 09:24:59 +0000 UTC (0+1 container statuses recorded)
I1106 15:11:05.156095 2960849 dump.go:134] Container coredns ready: true, restart count 0
I1106 15:11:05.156100 2960849 dump.go:128] pod-resize-scheduler-tests-3255/testpod1 started at 2024-11-06 15:11:01 +0000 UTC (0+1 container statuses recorded)
I1106 15:11:05.156104 2960849 dump.go:134] Container c1 ready: true, restart count 0
I1106 15:11:05.179476 2960849 kubelet_metrics.go:206]
Latency metrics for node kind-control-plane
STEP: Destroying namespace "pod-resize-scheduler-tests-3255" for this suite. @ 11/06/24 15:11:05.179
<< Timeline
[FAILED] failed to patch pod for resize: the server could not find the requested resource
In [It] at: k8s.io/kubernetes/test/e2e/node/pod_resize.go:292 @ 11/06/24 15:11:05.139
------------------------------
SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS
Ginkgo detected an issue with your spec structure
set up framework | framework.go:200
It looks like you are trying to add a [BeforeEach] node
to the Ginkgo spec tree in a leaf node after the specs started running.
To enable randomization and parallelization Ginkgo requires the spec tree
to be fully constructed up front. In practice, this means that you can
only create nodes like [BeforeEach] at the top-level or within the
body of a Describe, Context, or When.
Learn more at:
http://onsi.github.io/ginkgo/#mental-model-how-ginkgo-traverses-the-spec-hierarchy
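For reference, the constraint Ginkgo describes means the framework has to be constructed while the spec tree is being built; a minimal sketch of the usual e2e pattern, with illustrative package and spec names:

```go
package podresize

import (
	"context"

	"github.com/onsi/ginkgo/v2"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/kubernetes/test/e2e/framework"
)

var _ = ginkgo.Describe("[sig-node] pod resize", func() {
	// NewDefaultFramework registers its own BeforeEach/AfterEach nodes, so it
	// must be called directly in the Describe body while the spec tree is
	// being constructed, not from inside another BeforeEach.
	f := framework.NewDefaultFramework("pod-resize-tests")

	ginkgo.It("lists pods in the test namespace", func(ctx context.Context) {
		// The framework's client and namespace are populated by its BeforeEach.
		_, err := f.ClientSet.CoreV1().Pods(f.Namespace.Name).List(ctx, metav1.ListOptions{})
		framework.ExpectNoError(err)
	})
})
```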
Retrospective feedback on the changelog; this could have been:
-A new /resize subresource was added to request pod resource resizing. Update your k8s client code to utilize the /resize subresource for Pod resizing operations.
+Added a new `resize` subresource to Pods; you can use this to request changes to resource requests or limits.
What type of PR is this?
/kind feature
/kind api-change
What this PR does / why we need it:
This PR is a continuation of @iholder101's PR (#127320), which originally introduced the /resize subresource, and @mouuii's PR (#124887), which added support for pod resize in the LimitRanger admission plugin.
It introduces the /resize subresource to request pod resource resizing.
Which issue(s) this PR fixes:
Fixes #109553
Fixes #124855
Special notes for your reviewer:
Does this PR introduce a user-facing change?
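A new /resize subresource was added to request pod resource resizing. Update your k8s client code to utilize the /resize subresource for Pod resizing operations.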
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: