Double container probe timeout #61721
Conversation
In some environments, we see a combination of container start latency and the corresponding effect on sync-pod latency causing the status manager to fail to report within the 2 minute window.
This reduces flakiness in extended suites where long start delays cause this test to fail. A hedged sketch of the kind of readiness wait whose window the change extends follows below.
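For context, here is a minimal Go sketch of such a readiness wait, assuming a recent client-go and apimachinery. The helper name `waitForContainersReady`, the 4-minute value (the original 2-minute window doubled), and the 2-second poll interval are illustrative assumptions, not taken from this PR's diff.

```go
// Hypothetical sketch only (not the actual diff from this PR): it illustrates
// what an e2e-style wait with a doubled timeout could look like using
// client-go and apimachinery's wait helpers.
package e2esketch

import (
	"context"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/wait"
	"k8s.io/client-go/kubernetes"
)

// Assumed: the original window was 2 minutes; doubling it tolerates slow image
// pulls and sync-pod latency on heavily contended nodes.
const containerReadyTimeout = 4 * time.Minute

// waitForContainersReady polls the pod until every container reports Ready,
// returning an error if the (doubled) timeout elapses first.
func waitForContainersReady(ctx context.Context, c kubernetes.Interface, ns, name string) error {
	return wait.PollUntilContextTimeout(ctx, 2*time.Second, containerReadyTimeout, true,
		func(ctx context.Context) (bool, error) {
			pod, err := c.CoreV1().Pods(ns).Get(ctx, name, metav1.GetOptions{})
			if err != nil {
				return false, err
			}
			if len(pod.Status.ContainerStatuses) == 0 {
				return false, nil
			}
			for _, cs := range pod.Status.ContainerStatuses {
				if !cs.Ready {
					return false, nil
				}
			}
			return true, nil
		})
}
```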
/sig testing
LGTM, but can we get someone from the relevant SIG (api-machinery or networking?) to review this?
@smarterclayton, any suggestion for the best approver for this? It's more a general "what's the longest a container could take to start in a test env" question.
/lgtm
Right now about 2 minutes is the longest I've seen, so I think 3 is a good rule of thumb (a slow pull on a heavily contended, low-CPU node without high IOPS).
Generally I would expect any latency issues to be caught by the sig-scalability tests that measure for this, which run in more controlled environments (a high-parallelism e2e run is not controlled, since multiple conflicting workloads can vary wildly in scope).
/approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: fejta, liggitt, smarterclayton. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing `/approve` in a comment.
/hold cancel
/test all [submit-queue is verifying that this PR is safe to merge]
Automatic merge from submit-queue (batch tested with PRs 46903, 61721, 62317). If you want to cherry-pick this change to another branch, please follow the instructions here.