TestUnschedulableNodes: Failed to observe reflected update for setting unschedulable=true #25845
Comments
Assigning to @davidopp per team/control-plane label.
Not sure if this is still happening. Will take a look on Monday.
This error message comes from a place where we retry for an arbitrary period of time (20s) before giving up. If the failures are rare, this looks like the standard problem of picking a timeout that fails quickly enough but doesn't flake all the time. I have tried cranking the timeout down to 50ms to see if I can reproduce it (no luck yet), but any transient failure around this timeout is basically by design. The "problem" could be that other threads, or just etcd, haven't gotten enough resources to finish within 20 seconds of wall-clock time. It might be interesting to get a thread dump when this fails, and maybe to turn on etcd debug logging.
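For anyone reading along, the retry-until-timeout pattern described above looks roughly like the sketch below. This is only a Python illustration of the general idea, not the actual test code (which is Go); `wait_for` and its parameters are hypothetical names chosen for the example.

```python
import time


def wait_for(condition, timeout, interval=0.1):
    """Poll condition() until it returns True or `timeout` seconds elapse.

    Hypothetical helper for illustration only. Returns True if the
    condition was observed in time, False if the deadline passed first.
    """
    # Use a monotonic clock so wall-clock adjustments don't affect the deadline.
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            # Giving up here is what surfaces as a "failed to observe" error.
            return False
        time.sleep(interval)
```

If the process under test is starved of CPU or I/O, `condition()` may not become true before the deadline even though nothing is logically wrong, which would match the rare failures seen here. A shorter timeout makes that more likely, and a longer one only makes it rarer.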
For sure, running etcd with --debug and sending stdout to a file we keep would be super useful. It logs every request with timestamps.
It sometimes happens. I saw it yesterday in a completely unrelated PR.
I think I wrote this test at some point, btw. Let me take a look.
I took a look and, to be honest, I have no idea what is happening there. I sent out a PR that slightly extends the logging, which may help with debugging; see #27401.
Automatic merge from submit-queue. Extend logging for UnschedulableNodes. Ref #25845
I couldn't find an example of this happening at all in June. Downgrading to P2 and we'll wait until it happens again.
This happened to me on Tuesday in #26880.
[FLAKE-PING] @mml This flaky-test issue would love to have more attention...
Marking this as fixed. Please reopen if necessary.
https://console.cloud.google.com/storage/kubernetes-jenkins/logs/kubernetes-test-go/12266/