E2E Node tests for image pull backoff and crashloopbackoff behavior #128559
/test pull-kubernetes-node-e2e-containerd
Hiya @SergeyKanzhelev (cc @tallclair) I'm going to keep working on this tomorrow, but heads up if you have any opinions on how this is shaping up (as I intend to use something similar to e2e test container restarts too), the latest commit has some informative TODOs of where I'm currently at. Most relevantly I am looking for (/ possibly need to make?) a util that can snag kubelet logs for a defined time period because that's where the data I need to parse really is, as I don't seem to be able to do it with the events API; if you know of something that already does that in the node e2e suite please point me in the direction as I haven't found anything so far. Thanks! |
Ideas:
Is the question how to check that the image pull backoff did not inherit the container crash loop backoff? The easiest thing you can do is check how fast you receive the next image pull in the CRI proxy. If the container crash loop backoff is configured to 5 seconds, make sure you are not getting image pull backoffs shorter than that so many times in one minute.
This should merge with #128374; adding e2e tests separately from the feature is not good practice.
Signed-off-by: Laura Lorenz <lauralorenz@google.com>
/approve
I looked at the tests and they make sense to me, but I'm not able to dig into the underlying framework as much as someone who knows it very well.
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: lauralorenz, thockin The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/lgtm
LGTM label has been added. Git tree hash: 7cf3281708e43a7febbc701a3195cec8c9df0420
Focused too much on the container restart one in commit that fixed that Signed-off-by: Laura Lorenz <lauralorenz@google.com>
@tallclair I had changed the sleeps to shorter times in 285d433, but focused on running the container restart tests (which are green on prow here) and didn't update the expectation of the shorter sleep on the image pull test. Now in |
/test pull-kubernetes-node-e2e-cri-proxy-serial
Please make sure the pull-kubernetes-node-e2e-cri-proxy-serial run passes before removing the hold.
LGTM label has been added. Git tree hash: 0db6e203d06421fab66b369522ccf14644c3d914
/triage accepted
/milestone v1.32
/retest-required
What type of PR is this?
/kind cleanup
What this PR does / why we need it:
Add e2e tests built on top of the CRI proxy framework to test the backoff behavior of image pulls and container restarts. Includes a case where container restarts are configured using the alpha feature from KEP-4603.
Which issue(s) this PR fixes:
Related to kubernetes/enhancements#4603
Special notes for your reviewer:
Test freeze exception: https://groups.google.com/g/kubernetes-sig-node/c/zYclDRIyD0w
How to run:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:
/hold