
TestUnschedulableNodes is flaky #12312

Closed
dchen1107 opened this issue Aug 5, 2015 · 9 comments
Assignees
Labels
kind/flake Categorizes issue or PR as related to a flaky test. priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now.

Comments

@dchen1107
Member

I observed that TestUnschedulableNodes failed a couple of times today on Shippable:

--- FAIL: TestUnschedulableNodes (48.13 seconds)
scheduler_test.go:262: Test 0: Pod did not get scheduled on an unschedulable node
scheduler_test.go:275: Test 0: failed to schedule a pod: timed out waiting for the condition
scheduler_test.go:262: Test 1: Pod did not get scheduled on an unschedulable node
scheduler_test.go:275: Test 1: failed to schedule a pod: timed out waiting for the condition
W0805 20:15:05.445184 2078 master.go:249] Network range for service cluster IPs is unspecified. Defaulting to 10.0.0.0/24.
I0805 20:15:05.445401 2078 master.go:275] Node port range unspecified. Defaulting to 30000-32767.
I0805 20:15:05.446485 2078 master.go:297] Will report 172.17.10.248 as public IP address.
E0805 20:15:05.540941 2078 reflector.go:209] pkg/runtime/proc.c:1445: Failed to watch *api.PersistentVolumeClaim: Get http://127.0.0.1:47600/api/v1/watch/persistentvolumeclaims?resourceVersion=167: dial tcp 127.0.0.1:47600: connection refused
E0805 20:15:05.541304 2078 reflector.go:209] pkg/runtime/proc.c:1445: Failed to watch *api.PersistentVolume: Get http://127.0.0.1:47600/api/v1/watch/persistentvolumes?resourceVersion=167: dial tcp 127.0.0.1:47600: connection refused
I0805 20:15:05.741411 2078 etcd_utils.go:66] Deleting all etcd keys
W0805 20:15:05.805577 2078 controller.go:212] Resetting endpoints for master service "kubernetes" to &{{ } {kubernetes default 0 0001-01-01 00:00:00 +0000 UTC <nil> map[] map[]} [{[{172.17.10.248 <nil>}] [{ 6443 TCP}]}]}
E0805 20:15:05.876542 2078 repair.go:52] unable to persist the updated port allocations: 100: Key not found (/kubernetes.io) [197]
E0805 20:15:06.579242 2078 reflector.go:209] pkg/runtime/proc.c:1445: Failed to watch *api.PersistentVolumeClaim: Get http://127.0.0.1:47600/api/v1/watch/persistentvolumeclaims?resourceVersion=167: dial tcp 127.0.0.1:47600: connection refused
E0805 20:15:06.579560 2078 reflector.go:209] pkg/runtime/proc.c:1445: Failed to watch *api.PersistentVolume: Get http://127.0.0.1:47600/api/v1/watch/persistentvolumes?resourceVersion=167: dial tcp 127.0.0.1:47600: connection refused
E0805 20:15:07.653616 2078 reflector.go:209] pkg/runtime/proc.c:1445: Failed to watch *api.PersistentVolume: Get http://127.0.0.1:47600/api/v1/watch/persistentvolumes?resourceVersion=167: dial tcp 127.0.0.1:47600: connection refused
E0805 20:15:07.653937 2078 reflector.go:209] pkg/runtime/proc.c:1445: Failed to watch *api.PersistentVolumeClaim: Get http://127.0.0.1:47600/api/v1/watch/persistentvolumeclaims?resourceVersion=167: dial tcp 127.0.0.1:47600: connection refused
E0805 20:15:08.745323 2078 reflector.go:209] pkg/runtime/proc.c:1445: Failed to watch *api.PersistentVolumeClaim: Get http://127.0.0.1:47600/api/v1/watch/persistentvolumeclaims?resourceVersion=167: dial tcp 127.0.0.1:47600: connection refused
E0805 20:15:08.746963 2078 reflector.go:209] pkg/runtime/proc.c:1445: Failed to watch *api.PersistentVolume: Get http://127.0.0.1:47600/api/v1/watch/persistentvolumes?resourceVersion=167: dial tcp 127.0.0.1:47600: connection refused
E0805 20:15:09.820005 2078 reflector.go:209] pkg/runtime/proc.c:1445: Failed to watch *api.PersistentVolume: Get http://127.0.0.1:47600/api/v1/watch/persistentvolumes?resourceVersion=167: dial tcp 127.0.0.1:47600: connection refused
E0805 20:15:09.820322 2078 reflector.go:209] pkg/runtime/proc.c:1445: Failed to watch *api.PersistentVolumeClaim: Get http://127.0.0.1:47600/api/v1/watch/persistentvolumeclaims?resourceVersion=167: dial tcp 127.0.0.1:47600: connection refused
W0805 20:15:10.140209 2078 master.go:249] Network range for service cluster IPs is unspecified. Defaulting to 10.0.0.0/24.
I0805 20:15:10.140510 2078 master.go:275] Node port range unspecified. Defaulting to 30000-32767.
I0805 20:15:10.141214 2078 master.go:297] Will report 172.17.10.248 as public IP address.
E0805 20:15:10.822095 2078 reflector.go:209] pkg/runtime/proc.c:1445: Failed to watch *api.PersistentVolume: Get http://127.0.0.1:47600/api/v1/watch/persistentvolumes?resourceVersion=167: dial tcp 127.0.0.1:47600: connection refused

@dchen1107 dchen1107 added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. team/master labels Aug 5, 2015
@dchen1107
Member Author

cc/ @davidopp

@davidopp davidopp self-assigned this Aug 5, 2015
@dchen1107
Member Author

You can access the latest failure at: https://app.shippable.com/builds/55c25e6145d4c50b00fe18b0

@ghost ghost added team/control-plane and removed team/master labels Aug 19, 2015
@brendandburns
Contributor

This just flaked again. Raising to P0.

@brendandburns brendandburns added priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. and removed priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. labels Sep 23, 2015
@brendandburns
Contributor

https://app.shippable.com/builds/5602e5d74c57620b003f441f

scheduler_test.go:263: Test 0: Pod did not get scheduled on an unschedulable node
scheduler_test.go:278: Test 0: Pod got scheduled on a schedulable node
scheduler_test.go:263: Test 1: Pod did not get scheduled on an unschedulable node
scheduler_test.go:276: Test 1: failed to schedule a pod: timed out waiting for the condition

@freehan
Contributor

freehan commented May 10, 2016

--- FAIL: TestUnschedulableNodes (34.47s)
    scheduler_test.go:246: Pod scheduled successfully on unschedulable nodes
    scheduler_test.go:249: Test 0: failed while trying to confirm the pod does not get scheduled on the node: <nil>
    scheduler_test.go:266: Test 0: Pod got scheduled on a schedulable node
    scheduler_test.go:251: Test 1: Pod did not get scheduled on an unschedulable node
    scheduler_test.go:266: Test 1: Pod got scheduled on a schedulable node

@goltermann
Contributor

Is this still valid? P0 from 10 months ago, no recent work.

@dims
Member

dims commented Jun 6, 2016

We should close this as a dup of #25845

@mml
Contributor

mml commented Jun 6, 2016

@dims this is a different symptom. At least, the two issues describe failures with totally different messages.

@davidopp
Member

davidopp commented Jun 7, 2016

Closing due to inactivity.

@davidopp davidopp closed this as completed Jun 7, 2016

7 participants