
PR builder: Cluster failed to initialize within 300 seconds #28641

Closed
timstclair opened this issue Jul 7, 2016 · 11 comments
Labels
kind/flake: Categorizes issue or PR as related to a flaky test.
priority/critical-urgent: Highest priority. Must be actively worked on as someone's top priority right now.

Comments

timstclair commented Jul 7, 2016

Happened several times in the last few runs:
https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/pr-logs/pull/28543/kubernetes-pull-build-test-e2e-gce/48158/
https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/pr-logs/pull/26696/kubernetes-pull-build-test-e2e-gce/48157/
https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/pr-logs/pull/27243/kubernetes-pull-build-test-e2e-gce/48156/

Waiting up to 300 seconds for cluster initialization.

  This will continually check to see if the API for kubernetes is reachable.
  This may time out if there was some uncaught error during start up.

...........................................Cluster failed to initialize within 300 seconds.
2016/07/07 15:09:32 e2e.go:218: Error running up: exit status 2
2016/07/07 15:09:32 e2e.go:214: Step 'up' finished in 7m56.450395031s
2016/07/07 15:09:32 e2e.go:114: Error starting e2e cluster. Aborting.
exit status 1
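The dots in that output come from a readiness poll against the apiserver. A minimal sketch of that kind of wait loop, not the actual kube-up.sh code (the `wait_for_api` helper, `KUBE_MASTER_IP` variable, curl health check, and 2-second interval are all illustrative assumptions):

```shell
# Hypothetical sketch of the "wait for cluster initialization" loop.
# KUBE_MASTER_IP and the curl /healthz check are assumptions.
wait_for_api() {
  local timeout=${1:-300}               # seconds to wait, default 300
  local deadline=$((SECONDS + timeout))
  until curl -ksf --max-time 2 "https://${KUBE_MASTER_IP}/healthz" >/dev/null 2>&1; do
    if (( SECONDS >= deadline )); then
      echo "Cluster failed to initialize within ${timeout} seconds." >&2
      return 2                          # mirrors the "exit status 2" above
    fi
    printf '.'                          # the dots seen in the log
    sleep 2
  done
  echo "Kubernetes cluster is running."
}
```

Each dot is one failed probe; the hard failure only surfaces after the full timeout, which is why the underlying cause needs the master's logs rather than this output.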
timstclair added the priority/critical-urgent and kind/flake labels on Jul 7, 2016
timstclair (Author) commented:

@krousey @fejta

timstclair (Author) commented:

Looks like a GCE issue:

ERROR: (gcloud.compute.firewall-rules.delete) Some requests did not succeed:
 - The resource 'projects/k8s-jkns-pr-gce/global/firewalls/e2e-gce-agent-pr-38-0-minion-e2e-gce-agent-pr-38-0-http-alt' was not found

ERROR: (gcloud.compute.firewall-rules.delete) Some requests did not succeed:
 - The resource 'projects/k8s-jkns-pr-gce/global/firewalls/e2e-gce-agent-pr-38-0-minion-e2e-gce-agent-pr-38-0-nodeports' was not found
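These deletes failed only because the rules were already gone, so tear-down generally wants to treat "not found" as success. A hedged sketch of that idempotent-cleanup pattern (the `delete_ignore_missing` helper and the match on the error text are assumptions, not the real cluster scripts):

```shell
# Hypothetical helper: run a delete command and treat "was not found"
# (resource already gone) as success, so cleanup stays idempotent.
delete_ignore_missing() {
  local out
  if out=$("$@" 2>&1); then
    return 0
  elif grep -q "was not found" <<<"$out"; then
    return 0   # already deleted: not an error for tear-down
  else
    echo "$out" >&2
    return 1   # a real failure worth surfacing
  fi
}

# Illustrative usage with one of the rules from the log:
# delete_ignore_missing gcloud compute firewall-rules delete \
#   e2e-gce-agent-pr-38-0-minion-e2e-gce-agent-pr-38-0-http-alt --quiet
```

With this pattern, a half-created cluster can be torn down repeatedly without "Some requests did not succeed" noise masking real failures.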

A recent run failed with a slightly different error:
https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/pr-logs/pull/28639/kubernetes-pull-build-test-e2e-gce/48160/

ERROR: (gcloud.compute.instances.create) Some requests did not succeed:
 - The resource 'projects/google-containers/global/images/gci-dev-53-8530-6-0' is obsolete.  New uses are not allowed.  A suggested replacement is 'projects/google-containers/global/images/gci-dev-53-8490-0-0'.

Created [https://www.googleapis.com/compute/v1/projects/k8s-jkns-pr-gce/global/firewalls/e2e-gce-agent-pr-27-0-minion-all].
NAME                              NETWORK                SRC_RANGES     RULES                     SRC_TAGS  TARGET_TAGS
e2e-gce-agent-pr-27-0-minion-all  e2e-gce-agent-pr-27-0  10.180.0.0/14  tcp,udp,icmp,esp,ah,sctp            e2e-gce-agent-pr-27-0-minion
Some commands failed.
...
ERROR: (gcloud.compute.instances.describe) Could not fetch resource:
 - The resource 'projects/k8s-jkns-pr-gce/zones/us-central1-f/instances/e2e-gce-agent-pr-27-0-master' was not found

2016/07/07 15:14:40 e2e.go:218: Error running up: exit status 1
2016/07/07 15:14:40 e2e.go:214: Step 'up' finished in 2m0.195399639s
2016/07/07 15:14:40 e2e.go:114: Error starting e2e cluster. Aborting.
exit status 1
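The "is obsolete" failure reflects GCE's image deprecation model: an image's deprecated.state can be DEPRECATED (new uses allowed with a warning), OBSOLETE, or DELETED (both of which block new uses). A hedged sketch of a pre-flight check on that field (the `image_usable` helper is hypothetical; the state value would come from `gcloud compute images describe IMAGE --format='value(deprecated.state)'`):

```shell
# Hypothetical pre-flight check: given an image's deprecated.state
# (empty for an active image), decide whether new instances may use it.
image_usable() {
  case "$1" in
    OBSOLETE|DELETED) return 1 ;;  # GCE rejects new uses of these
    *)                return 0 ;;  # active, or DEPRECATED (warning only)
  esac
}

# Illustrative usage with the image from the log:
# state=$(gcloud compute images describe gci-dev-53-8530-6-0 \
#   --project google-containers --format='value(deprecated.state)')
# image_usable "$state" || echo "use the suggested replacement image instead"
```

Checking this before `instances.create` would turn the mid-run error above into an early, actionable message pointing at the suggested replacement.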

timstclair (Author) commented:

Possibly a duplicate of #28612?

timstclair (Author) commented:

This appears to have resolved itself.

timstclair (Author) commented:

Happened again.

ixdy commented Jul 25, 2016

Still flaking. cc @kubernetes/test-infra-maintainers

gmarek commented Jul 25, 2016

@ixdy - can you paste the link to logs or a suite/run number?

ixdy commented Jul 25, 2016

My own pet peeve! Sorry about that.
https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/pr-logs/pull/29477/kubernetes-pull-build-test-e2e-gce/50321/ is the failing run.

gmarek commented Jul 25, 2016

The problem here is that the kubelet on the master node didn't start the apiserver (or any other master components). @dchen1107

dchen1107 (Member) commented:

The most recent failure reported by @ixdy has this error message in kubelet.log on the master node:

Failed to start cAdvisor inotify_add_watch /sys/fs/cgroup/memory/system.slice/kube-logrotate.service: no such file or directory

This prevented the kubelet from syncing pods. I think it's a duplicate of #28997, which should be fixed by #29492.
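The "no such file or directory" error suggests a race: cAdvisor tries to add an inotify watch on a cgroup directory for a systemd unit that hasn't been created yet (or was already removed). A minimal sketch of that failure mode, with a hypothetical `cgroup_watchable` helper:

```shell
# Hypothetical sketch: inotify_add_watch on a missing cgroup path fails
# with ENOENT ("no such file or directory"); checking first shows the race.
cgroup_watchable() {
  local dir=$1
  if [ -d "$dir" ]; then
    echo "watchable: $dir"
  else
    echo "missing: $dir (inotify_add_watch would fail with ENOENT)"
    return 1
  fi
}

# The path from the kubelet.log quoted above:
# cgroup_watchable /sys/fs/cgroup/memory/system.slice/kube-logrotate.service
```

Whether the directory exists depends entirely on timing relative to systemd, which is why this manifests as a flake rather than a hard failure.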

ixdy commented Jul 25, 2016

Thanks @dchen1107! I should probably have also pointed out that the failure I linked was on a PR being cherry-picked into release-1.3, so it might be fixed in master already.
