[job failure] gce-master-1.8-downgrade-cluster #56244
Comments
Can we have a status update on this issue from the SIG? This issue has become critical for the 1.9 release. Thanks!
The test timed out waiting for the node to be recreated after the node drain.
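For anyone trying to reproduce this by hand, the sequence the test is waiting on looks roughly like the following. This is only a sketch; the node, group, and zone names below are placeholders, not values from the failing job.

```bash
# Placeholders for an e2e GCE cluster; substitute real names.
NODE=bootstrap-e2e-minion-group-xxxx
GROUP=bootstrap-e2e-minion-group
ZONE=us-central1-f

# Drain the node, as the downgrade path does before recreating it.
kubectl drain "${NODE}" --ignore-daemonsets --force

# Recreate the VM so it boots from the (new) instance template.
gcloud compute instance-groups managed recreate-instances "${GROUP}" \
  --instances="${NODE}" --zone="${ZONE}"

# Wait for the node to re-register and report Ready; this is the step
# the test times out on.
until [[ "$(kubectl get node "${NODE}" \
  -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}' 2>/dev/null)" == "True" ]]; do
  sleep 10
done
```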
Now tracking against v1.9.0-beta.2 (kubernetes/sig-release#39)
@yguo0905 is going to take a look
@yguo0905 status update?
For the failed run https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/logs/ci-kubernetes-e2e-gce-master-new-downgrade-cluster/168?log#log, node bootstrap-e2e-minion-group-9sch cannot register with the master because it was created from the new instance template bootstrap-e2e-minion-template-v1-8-5-beta-0-60-dcbe09a08ac68d, which does not contain what is set in kubernetes/cluster/gce/upgrade.sh, lines 270 to 276 (at commit 2175199).
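For reference, a rough way to confirm which template the minion group is actually using and what metadata it was built with (the zone below is an assumption, not copied from the job):

```bash
GROUP=bootstrap-e2e-minion-group
ZONE=us-central1-f   # placeholder zone

# Which instance template is the managed instance group pointing at?
gcloud compute instance-groups managed describe "${GROUP}" --zone="${ZONE}" \
  --format='value(instanceTemplate)'

# Dump the template the failing node booted from, e.g. to check whether
# the expected kube-env metadata entries are present.
gcloud compute instance-templates describe \
  bootstrap-e2e-minion-template-v1-8-5-beta-0-60-dcbe09a08ac68d \
  --format=yaml
```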
This doesn't seem to be a node issue (in scope of sig-node). @zmerlynn, do you happen to know whether some change caused this? Is this test critical for the 1.9 release?
@yguo0905 historically we have treated failing jobs/tests on the https://k8s-testgrid.appspot.com/sig-release-master-upgrade dashboard as release-blockers; this is how I'm treating them as well, as the CI Signal Lead for this release (https://github.com/kubernetes/sig-release/blob/master/release-process-documentation/release-team-guides/ci-signal-playbook.md#code-freeze).
Now tracking against v1.9.0 (kubernetes/sig-release#40). All automated downgrade jobs are failing; this could really use some attention.
Could someone from sig-cluster-lifecycle take a look at the issue in #56244 (comment)?
@spiffxp FWIW, I'm running some basic downgrade tests manually using kubeadm to get some coverage generally, but it really doesn't test everything, only roughly what's in the Conformance tests, which is a low bar, but anyway...
@enisoc Well... I got through a node downgrade, which is where it was hanging. Now master downgrade doesn't work. I think it's because we changed etcd versions and etcd is refusing to downgrade. |
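That matches etcd's behavior as I understand it: once the data directory has been written by a newer minor version, an older etcd binary will refuse to start against it, so downgrading the master also means rolling etcd back. A quick way to see which etcd version the master is actually running, assuming etcd is serving on the default client port on the master's loopback:

```bash
# Run on (or via SSH to) the e2e master; 127.0.0.1:2379 is the default
# client endpoint and may differ in other setups.
curl -s http://127.0.0.1:2379/version
# The response is JSON with "etcdserver" and "etcdcluster" version fields.
```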
Just ran a test: if we deploy the 1.9 cluster with ETCD_VERSION=3.0.17 (the etcd version used by 1.8), then the master downgrade succeeds.
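In case it helps anyone reproduce that experiment, the invocation is roughly the following, assuming the GCE kube-up scripts pick ETCD_VERSION up from the environment as described above:

```bash
# Bring up the 1.9 cluster with etcd pinned to the 1.8 default, so the
# subsequent master downgrade does not have to roll etcd's data back.
export KUBERNETES_PROVIDER=gce
export ETCD_VERSION=3.0.17   # assumption: honored by the cluster/gce config scripts
./cluster/kube-up.sh
```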
xref: #57013
The downgrade test is now running, but some of the tests are failing: https://k8s-testgrid.appspot.com/sig-release-master-upgrade#gce-master-1.8-downgrade-cluster

- Cluster downgrade [sig-apps] daemonset-upgrade
- [sig-cluster-lifecycle] Downgrade [Feature:Downgrade] cluster downgrade should maintain a functioning cluster [Feature:ClusterDowngrade]
- [k8s.io] [sig-node] Kubelet [Serial] [Slow] [k8s.io] [sig-node] regular resource usage tracking resource tracking for 100 pods per node
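To iterate on one of these cases without re-running the whole downgrade job, something like the following against an already-downgraded cluster may help. The focus regex is only an approximation of the test name above, and the real job passes additional upgrade-specific arguments that are omitted here.

```bash
# Re-run just the kubelet resource-usage tracking case from the list above.
go run hack/e2e.go -- --provider=gce --test \
  --test_args="--ginkgo.focus=resource\stracking\sfor\s100\spods\sper\snode"
```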
[MILESTONENOTIFIER] Milestone Issue: Current
The DaemonSet failures appear to be a flaky test. The condition it wants is actually true (there is one Pod on each Node), but the test seems to have the wrong idea of which Nodes exist.
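A quick manual version of the check that test performs (the label selector here is hypothetical; substitute whatever the daemonset-upgrade test actually labels its DaemonSet with):

```bash
# Number of nodes vs. number of DaemonSet pods; they should match if the
# "one Pod on each Node" condition really holds.
kubectl get nodes --no-headers | wc -l
kubectl get pods --all-namespaces -o wide -l name=ds1 --no-headers | wc -l   # 'name=ds1' is a hypothetical label

# Eyeball which node each DaemonSet pod actually landed on.
kubectl get pods --all-namespaces -o wide -l name=ds1
```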
I'm also not terribly concerned about the other test showing 0.002 more CPU usage than desired.
It seems those were indeed flakes. The latest run is fully green.
/priority critical-urgent
/priority failing-test
/kind bug
/status approved-for-milestone
@kubernetes/sig-cluster-lifecycle-test-failures
This job has been failing since at least 2017-11-08. It's on the sig-release-master-upgrade dashboard, and prevents us from cutting v1.9.0-beta.1 (kubernetes/sig-release#34). Is there work ongoing to bring this job back to green?
https://k8s-testgrid.appspot.com/sig-release-master-upgrade#gce-master-1.8-downgrade-cluster