Revert "[FG:InPlacePodVerticalScaling] Graduate to Beta" #128875

pacoxu · 2024-11-20T06:55:47Z

https://storage.googleapis.com/k8s-triage/index.html?test=%5C%5Bsig-node%5C%5D.*Serial

None

pacoxu · 2024-11-20T06:56:20Z

/test pull-kubernetes-node-kubelet-serial-containerd
/test pull-kubernetes-node-kubelet-serial-containerd-alpha-features
/test pull-kubernetes-node-kubelet-serial-containerd-kubetest2
/test pull-kubernetes-node-kubelet-serial-containerd-sidecar-containers

pacoxu · 2024-11-20T07:15:20Z

/test pull-kubernetes-e2e-capz-windows-master

pacoxu · 2024-11-20T08:59:14Z

/kind bug

pacoxu · 2024-11-20T08:59:41Z

/cc @tallclair @SergeyKanzhelev @mrunalp @derekwaynecarr @dchen1107

k8s-ci-robot · 2024-11-20T14:32:49Z

@cpanato: The provided milestone is not valid for this repository. Milestones in this repository: [next-candidate, v1.26, v1.27, v1.28, v1.29, v1.30, v1.31, v1.32, v1.33, v1.34]

Use /milestone clear to clear the milestone.

In response to this:

/milestone 1.32

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

cpanato · 2024-11-20T14:33:06Z

/milestone v1.32

cpanato · 2024-11-20T14:33:28Z

/triage accepted

aojea · 2024-11-20T14:34:50Z

Weirdly, this also fixed a bunch of sig-storage tests in https://testgrid.k8s.io/sig-release-master-informing#capz-windows-master&width=20 (like "ConfigMap should be consumable from pods in volume"). That seems to indicate that enabling this feature is doing something odd with pod / volume lifecycle ordering that impacts the permissions set on volumes (even in tests where no resize changes were being made), which is really surprising.

/lgtm

liggitt · 2024-11-20T14:38:28Z

Collecting issues to resolve before re-enabling this:

- serial-containerd test timeouts - [Failing Test] node-kubelet-serial-containerd #128874
- solid test failures on windows - [Failing Test] Container Runtime blackbox test on terminated container should report termination message if TerminationMessagePath is set as non-root user and at a non-default path [NodeConformance] [Conformance] #128783
  - example at https://prow.k8s.io/view/gs/kubernetes-ci-logs/logs/ci-kubernetes-e2e-capz-master-windows/1859221655700639744
  - lots of sig-storage permission test failures
  - sig-node "Container Runtime blackbox test on terminated container should report termination message if TerminationMessagePath is set as non-root user and at a non-default path"
- ensure defaulting new fields within containers does not restart pods on <1.31 kubelets - [FG:InPlacePodVerticalScaling] Graduate to Beta #128682 (comment)
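
To illustrate why the last item matters, here is a minimal sketch of one plausible mechanism, assuming a node agent detects spec changes by comparing a hash of the container spec. The `container` type, the `specHash` helper, and the policy strings are hypothetical and heavily simplified; this is not the kubelet's actual hashing code.

```go
package main

import (
	"encoding/json"
	"fmt"
	"hash/fnv"
)

// container is a hypothetical, heavily reduced container spec.
// ResizePolicy stands in for a field that a newer API server starts defaulting.
type container struct {
	Name         string   `json:"name"`
	Image        string   `json:"image"`
	ResizePolicy []string `json:"resizePolicy,omitempty"`
}

// specHash hashes the JSON form of the spec with FNV-1a.
// Assumption: the node agent treats a changed hash as "spec changed, restart".
func specHash(c container) uint32 {
	data, err := json.Marshal(c)
	if err != nil {
		panic(err)
	}
	h := fnv.New32a()
	h.Write(data)
	return h.Sum32()
}

func main() {
	// Spec as originally stored, before any defaulting of the new field.
	stored := container{Name: "app", Image: "registry.example/app:1"}

	// Same container after the server fills in defaults for the new field.
	defaulted := stored
	defaulted.ResizePolicy = []string{"cpu=NotRequired", "memory=NotRequired"}

	fmt.Println("hash changed by defaulting:", specHash(stored) != specHash(defaulted))
}
```

Under that assumption, a pod that nobody edited would still look "changed" to the node once the default appears, which is the kind of unintended restart the item asks to rule out.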

aojea · 2024-11-20T14:58:42Z

#128682 merged on Nov 12, 8pm GMT+1.

Testgrids to look at that may give signal, since they seem to have started failing around that date (a possible source for correlation).

dims · 2024-11-20T17:01:45Z

@aojea thanks for listing those! let's watch the upcoming runs and see what else is hiding behind this one. 🤞🏾

(I've kicked off a few runs)

dims · 2024-11-20T19:20:53Z

ALL jobs @aojea pointed to above have 🟢 ✅ runs now!! thanks a ton @pacoxu

esotsal · 2024-11-20T20:51:33Z

Was InPlacePodVerticalScaling Beta the issue also for the Windows CI pipeline failure?

Checking https://testgrid.k8s.io/sig-release-master-informing#capz-windows-master : why did the Windows pipeline suddenly fail with the same commit? (last successful / first failed)

esotsal · 2024-11-20T21:00:16Z

Collecting issues to resolve before re-enabling this:

@liggitt I think it is worth adding this analysis to the list; I believe it explains a lot of the failures seen in the pipelines (related to the tests, not to the InPlacePodVerticalScaling feature itself).

esotsal · 2024-11-21T07:27:05Z

Was InPlacePodVerticalScaling Beta the issue also for the Windows CI pipeline failure?

Answering myself :-) : yes, but the cause is the tests, not the feature itself. FYI, this run from https://prow.k8s.io/pr-history/?org=kubernetes&repo=kubernetes&pr=128880 passes without the revert: https://prow.k8s.io/view/gs/kubernetes-ci-logs/pr-logs/pull/128880/pull-kubernetes-e2e-capz-windows-master/1859388054804893696 . Retriggered to double-check this. The kubetest2 failure seems to be the last of the three failures where we still have problems.

pacoxu · 2024-11-21T10:30:58Z

Was InPlacePodVerticalScaling Beta the issue also for the Windows CI pipeline failure?

Answering myself :-) : yes, but the cause is the tests, not the feature itself. FYI, this run from https://prow.k8s.io/pr-history/?org=kubernetes&repo=kubernetes&pr=128880 passes without the revert: https://prow.k8s.io/view/gs/kubernetes-ci-logs/pr-logs/pull/128880/pull-kubernetes-e2e-capz-windows-master/1859388054804893696 . Retriggered to double-check this. The kubetest2 failure seems to be the last of the three failures where we still have problems.

See https://github.com/kubernetes/kubernetes/pull/128880/files#r1850861491 and the PR for more information.

The CI failure seems to be caused by the e2e pod cleanup logic, not by the feature. ┓( ´∀` )┏

dims · 2024-11-21T12:41:23Z

@pacoxu unfortunately PR landed E_TOO_LATE_IN_THE_CYCLE for us to dig in like this.. so let's do this one right in 1.33.

esotsal · 2024-11-21T12:43:28Z

Was InPlacePodVerticalScaling Beta the issue also for the Windows CI pipeline failure?

Answering myself :-) : yes, but the cause is the tests, not the feature itself. FYI, this run from https://prow.k8s.io/pr-history/?org=kubernetes&repo=kubernetes&pr=128880 passes without the revert: https://prow.k8s.io/view/gs/kubernetes-ci-logs/pr-logs/pull/128880/pull-kubernetes-e2e-capz-windows-master/1859388054804893696 . Retriggered to double-check this. The kubetest2 failure seems to be the last of the three failures where we still have problems.

See https://github.com/kubernetes/kubernetes/pull/128880/files#r1850861491 and the PR for more information.

The CI failure seems to be caused by the e2e pod cleanup logic, not by the feature. ┓( ´∀` )┏

One more update: besides the cleanup, it seems that with InPlacePodVerticalScaling enabled the timeout passed to DeleteSync in some tests was no longer sufficient.

kubernetes/test/e2e_node/memory_manager_test.go

Line 558 in 776fb24

    
           e2epod.NewPodClient(f).DeleteSync(ctx, testPod2.Name, metav1.DeleteOptions{}, 2*time.Minute)

The timeout above resulted in failures in the kubetest2 pipeline. The same commit shared by pacoxu tries to test this theory by using e2epod.DefaultPodDeletionTimeout (which is 3 minutes) instead, to see how the tests behave.

esotsal · 2024-11-21T12:47:16Z

@pacoxu unfortunately PR landed E_TOO_LATE_IN_THE_CYCLE for us to dig in like this.. so let's do this one right in 1.33.

I agree it is unfortunate that it hasn't made it; to be honest, it seems to me the issues were not rooted in InPlacePodVerticalScaling. I wish I had looked at testgrid earlier, but it seems we are close to fixing those.

dims · 2024-11-21T13:27:20Z

Agree @esotsal. Good news is that there isn't any other breakage hiding behind this .. the bleeding has stopped

https://storage.googleapis.com/k8s-triage/index.html?job=.*Serial.*

tallclair · 2024-11-21T18:34:08Z

Thanks all for holding the high quality bar. I'm disappointed that InPlacePodVerticalScaling won't make it in the v1.32 release, but I'd much rather a smooth rollout!

Revert "[FG:InPlacePodVerticalScaling] Graduate to Beta"

03a15fa

k8s-ci-robot requested review from aojea and mrunalp November 20, 2024 06:56

pacoxu changed the title ~~[WIP]Revert "[FG:InPlacePodVerticalScaling] Graduate to Beta"~~ Revert "[FG:InPlacePodVerticalScaling] Graduate to Beta" Nov 20, 2024

k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. and removed do-not-merge/needs-kind Indicates a PR lacks a `kind/foo` label and requires one. labels Nov 20, 2024

k8s-ci-robot requested review from dchen1107, derekwaynecarr and SergeyKanzhelev November 20, 2024 08:59

k8s-ci-robot added this to the v1.32 milestone Nov 20, 2024

k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Nov 20, 2024

k8s-ci-robot merged commit bf70d28 into kubernetes:master Nov 20, 2024
21 of 24 checks passed

k8s-ci-robot assigned aojea Nov 20, 2024

This was referenced Nov 20, 2024

[FG:InPlacePodVerticalScaling] Graduate to Beta #128682

Merged

In-Place Update of Pod Resources kubernetes/enhancements#1287

Open

Update In-Place Pod Resize docs for v1.32 kubernetes/website#48503

Merged

esotsal mentioned this pull request Nov 20, 2024

[FG:InPlacePodVerticalScaling] pull-kubernetes-e2e-capz-windows-master test fail with InPlacePodVerticalScaling Beta #128897

Open

tallclair mentioned this pull request Nov 21, 2024

[FG:InPlacePodVerticalScaling] Remove ResizePolicy defaulting #128920

Merged

esotsal mentioned this pull request Dec 4, 2024

[FG:InPlacePodVerticalScaling] Emit a events when resize status changes #127172

Open

dims mentioned this pull request Dec 5, 2024

Add Podresize endpoints to pending_eligible_endpoints.yaml #129099

Merged

AnishShah mentioned this pull request Dec 9, 2024

Run in-place pod resize node tests in parallel kubernetes/test-infra#33792

Merged
