-
Notifications
You must be signed in to change notification settings - Fork 277
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix "Should set the condition rejected when the provision fails" test. #3170
Fix "Should set the condition rejected when the provision fails" test. #3170
Conversation
✅ Deploy Preview for kubernetes-sigs-kueue ready!
To edit notification comments on pull requests, go to your Netlify site configuration. |
/cc @mimowo |
The code changes lgtm, but I'm not sure how they fix the issue. Let me ask some questions to better understand the problem.
|
Most of the tests fail after waiting for I would say that it's very strange that we haven't had issues with this before. :). To fix this, we can move this test to a separate |
Yes, this happens before it. |
I tried to move creation of |
But the admission check is eventually moved to Ready? And so, LocalQueue and ClusterQueue are moved to active.
I see, makes sense to have only one test for Pending admission check. This will make the other tests run faster also. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
/approve
I synced with @mbobrovskyi and I agree that the PR has a good potential to eliminate the flake (though it wasn't confirmed by actual experiment, just static log analysis).
The change also makes it easier to reason about the scenarios by splitting the positive setup (CQ active), and the negative setup (CQ remains inactive due to pending admission check).
LGTM label has been added. Git tree hash: 4da3c6d68c37b154c8ed8a78b442d6259cb1ff42
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: mbobrovskyi, mimowo The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
What type of PR is this?
/kind cleanup
What this PR does / why we need it:
Fix "Provisioning when A workload is using a provision admission check Should set the condition rejected when the provision fails" flaky test.
Which issue(s) this PR fixes:
Fixes #3079
Special notes for your reviewer:
After investigating the logs in https://prow.k8s.io/view/gs/kubernetes-jenkins/pr-logs/pull/kubernetes-sigs_kueue/3078/pull-kueue-test-integration-main/1836042561878233088, I found that the failure occurred because the
LocalQueue
reconciler ran after theWorkload
reserved quota. As a result, we now have QuotaReserved=False with the message "Workload is inadmissible because of missing LocalQueue."kueue/pkg/controller/core/workload_controller.go
Lines 282 to 287 in c03277c
As seen in the logs above, even after the
LocalQueue
becomes active, we still haveActive=False
condition on theClusterQueue
. The workload status is now "Workload is inadmissible because ClusterQueue is inactive," due to a pending admission check.The solution is to wait for the
ClusterQueue
andLocalQueue
to become active and move pending admission to anotherWhen
block.Does this PR introduce a user-facing change?