TestUnschedulableNodes: Failed to observe reflected update for setting unschedulable=true #25845

Closed
lavalamp opened this issue May 19, 2016 · 26 comments
Labels
kind/flake Categorizes issue or PR as related to a flaky test. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release.

Comments

@lavalamp
Member

https://console.cloud.google.com/storage/kubernetes-jenkins/logs/kubernetes-test-go/12266/

--- FAIL: TestUnschedulableNodes (20.07s)
    scheduler_test.go:166: Failed to observe reflected update for setting unschedulable=true: timed out waiting for the condition
@lavalamp lavalamp added priority/important-soon team/control-plane kind/flake labels May 19, 2016
@fejta
Contributor

fejta commented May 23, 2016

Assigning to @davidopp per team/control-plane label

@mml
Contributor

mml commented Jun 3, 2016

Not sure if this is still happening. Will take a look on Monday.

@mml
Contributor

mml commented Jun 6, 2016

This error message comes from a place where we retry for an arbitrary period of time (20s) before giving up. If the failures are rare, this looks like the standard problem of trying to pick a timeout that fails quickly enough but doesn't flake all the time.

I have tried cranking the timeout down to 50ms to see if I can reproduce it (no luck yet), but any transient failure around this timeout is basically by design. The "problem" could be that other threads, or just etcd, haven't gotten enough resources to finish within 20 seconds of wall-clock time.

It might be interesting if we could get a thread dump when this fails. And maybe turn on etcd debug logging.

@mml
Contributor

mml commented Jun 6, 2016

For sure, running etcd with --debug and sending stdout to a file we keep would be super useful. It logs every request with timestamps.

@wojtek-t
Member

wojtek-t commented Jun 8, 2016

It sometimes happens. I saw it yesterday in a completely unrelated PR.

@wojtek-t
Member

wojtek-t commented Jun 8, 2016

I think I wrote this test at some point btw. Let me take a look.

@wojtek-t
Member

wojtek-t commented Jun 8, 2016

I took a look and, to be honest, I have no idea what is happening there. I sent out a PR that slightly extends the logging, which may help with debugging: see #27401.

k8s-github-robot pushed a commit that referenced this issue Jun 8, 2016
Automatic merge from submit-queue

Extend logging for UnschedulableNodes

Ref #25845
@mml
Contributor

mml commented Jun 9, 2016

I couldn't find an example of this happening at all in June. Downgrading to P2 and we'll wait until it happens again.

@mml mml added priority/backlog and removed priority/important-soon labels Jun 9, 2016
@wojtek-t
Member

wojtek-t commented Jun 9, 2016

This happened to me on Tuesday in #26880

@wojtek-t wojtek-t added priority/important-soon kind/flake and removed kind/flake priority/backlog labels Jun 9, 2016
@k8s-github-robot

[FLAKE-PING] @mml

This flaky-test issue would love to have more attention...

1 similar comment

@k8s-github-robot

[FLAKE-PING] @mml

This flaky-test issue would love to have more attention.

11 similar comments

@calebamiles
Contributor

@mml @lavalamp @fejta @wojtek-t any updates on this issue? Should it be closed?

@dims
Member

dims commented Nov 17, 2016

Marking this as fixed. Please reopen if necessary.

@dims dims closed this as completed Nov 17, 2016