kubernetes-e2e-gce-master-on-cvm has been broken since 5/31 #26850

Closed
zmerlynn opened this issue Jun 5, 2016 · 10 comments

@zmerlynn
Member

zmerlynn commented Jun 5, 2016

We have a test suite called kubernetes-e2e-gce-master-on-cvm running to allow us to, in theory, switch back from GCI masters to Container VM masters at any point.

We succeeded at http://kubekins.dls.corp.google.com/job/kubernetes-e2e-gce-master-on-cvm/lastStableBuild/ (c1c0567) and then failed at http://kubekins.dls.corp.google.com/job/kubernetes-e2e-gce-master-on-cvm/338 (ee412ef). The failures are consistently DNS-related, so I'm pretty sure #26335 broke this suite entirely.

It's also somewhat blocking me from evaluating a new version of Container VM, but not really - I think we're safe to evaluate it on nodes only for now.

cc @roberthbailey @thockin

@zmerlynn added the priority/important-soon label Jun 5, 2016
@zmerlynn added this to the v1.3 milestone Jun 5, 2016
@zmerlynn
Member Author

zmerlynn commented Jun 5, 2016

An alternate resolution to this bug is that we abandon CVM masters entirely in open source, but that's a larger decision.

@roberthbailey
Contributor

We want to keep CVM working for now (which is why I stuck this test into the 1.3 tab in Jenkins).

@girishkalele -- can you please take a look tomorrow?

@girishkalele

I'll fix it; it should be straightforward. #26335 was actually meant to fix the DNS breakage from when we moved to GCI masters. The skydns YAML files were moved, and the DNS pod is probably not even starting up on CVM masters.

@girishkalele

@zmerlynn

Is there a way I can ask the testbot to trigger this specific Jenkins (CVM-on-Master) test on my pull request?

The fix is simple - svc and rc DNS files were renamed.

Jun  6 17:03:48 jenkins-e2e-master startupscript: [INFO    ] Executing state file.managed for /etc/kubernetes/addons/dns/kubedns-svc.yaml
Jun  6 17:03:48 jenkins-e2e-master startupscript: [ERROR   ] Unable to cache file 'salt://kube-dns/kubedns-svc.yaml.in' from saltenv 'base'.
Jun  6 17:03:48 jenkins-e2e-master startupscript: [ERROR   ] Source file salt://kube-dns/kubedns-svc.yaml.in not found
Jun  6 17:03:48 jenkins-e2e-master startupscript: [ERROR   ] Unable to cache file 'salt://kube-dns/kubedns-rc.yaml.in' from saltenv 'base'.
Jun  6 17:03:48 jenkins-e2e-master startupscript: [ERROR   ] Source file salt://kube-dns/kubedns-rc.yaml.in not found
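For anyone triaging a similar failure, here is a minimal sketch of pulling the missing Salt sources out of a startup log. It only assumes the "Source file ... not found" message format seen in the lines above; the `missing_salt_sources` helper name is made up for this example, and the log text is embedded inline so the snippet runs on its own.

```shell
# Startup-log lines copied from the comment above.
log=$(cat <<'EOF'
Jun  6 17:03:48 jenkins-e2e-master startupscript: [INFO    ] Executing state file.managed for /etc/kubernetes/addons/dns/kubedns-svc.yaml
Jun  6 17:03:48 jenkins-e2e-master startupscript: [ERROR   ] Unable to cache file 'salt://kube-dns/kubedns-svc.yaml.in' from saltenv 'base'.
Jun  6 17:03:48 jenkins-e2e-master startupscript: [ERROR   ] Source file salt://kube-dns/kubedns-svc.yaml.in not found
Jun  6 17:03:48 jenkins-e2e-master startupscript: [ERROR   ] Unable to cache file 'salt://kube-dns/kubedns-rc.yaml.in' from saltenv 'base'.
Jun  6 17:03:48 jenkins-e2e-master startupscript: [ERROR   ] Source file salt://kube-dns/kubedns-rc.yaml.in not found
EOF
)

# Hypothetical helper: read log text on stdin and print each salt:// path
# that was reported as "not found", one per line, de-duplicated.
missing_salt_sources() {
  sed -n 's|.*Source file \(salt://[^ ]*\) not found.*|\1|p' | sort -u
}

missing=$(printf '%s\n' "$log" | missing_salt_sources)
printf '%s\n' "$missing"
```

On the lines quoted here this lists the two kube-dns templates (the rc and svc files), which matches the rename described above. On a real master you would feed it the actual startup log instead of the embedded sample.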

@zmerlynn
Member Author

zmerlynn commented Jun 6, 2016

@girishkalele: No, but it's easy to kube-up a CVM master (override KUBE_OS_DISTRIBUTION to debian), and if you only need to check whether DNS comes up, you can do that manually fairly fast.

@girishkalele

Got an error setting just the KUBE_OS_DISTRIBUTION flag.

Then I set all image-related KUBE variables to the values below, and the master never becomes ready.

KUBE_GCE_NODE_IMAGE=container-v1-3-v20160604
KUBE_GCE_MASTER_IMAGE=container-v1-3-v20160604
KUBE_OS_DISTRIBUTION=debian

@zmerlynn
Member Author

zmerlynn commented Jun 6, 2016

@girishkalele: That last set is exactly what kubernetes-e2e.yaml sets, so I'm not sure what's different between your setup and Jenkins:

    export KUBE_OS_DISTRIBUTION="debian"
    export KUBE_GCE_MASTER_IMAGE="container-v1-3-v20160604"
    export KUBE_GCE_NODE_IMAGE="container-v1-3-v20160604"
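
As a rough sketch of reproducing the Jenkins configuration by hand, the three settings above can be exported and then handed to kube-up. This assumes you are in a Kubernetes checkout of that era where `cluster/kube-up.sh` exists; the `RUN_KUBE_UP` guard is not a real Kubernetes variable, it is added here only so the snippet does not create a cluster by accident.

```shell
# Image settings quoted in the thread (the same values the Jenkins job exports).
export KUBE_OS_DISTRIBUTION="debian"
export KUBE_GCE_MASTER_IMAGE="container-v1-3-v20160604"
export KUBE_GCE_NODE_IMAGE="container-v1-3-v20160604"

# Show the effective settings before doing anything expensive.
env | grep -E '^KUBE_(OS_DISTRIBUTION|GCE_(MASTER|NODE)_IMAGE)=' | sort

# RUN_KUBE_UP is a hypothetical opt-in guard for this sketch only.
# Without it (or outside a Kubernetes checkout) nothing is launched.
if [ "${RUN_KUBE_UP:-0}" = "1" ] && [ -x cluster/kube-up.sh ]; then
  cluster/kube-up.sh
else
  echo "dry run: not invoking cluster/kube-up.sh"
fi
```

Printing the variables first makes it easier to compare a local run against what the Jenkins job exports before spending time on a cluster bring-up.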

@girishkalele

The debian-8 images are not visible when I run gcloud compute instances list as myself.

I think they are not accessible to folks who don't have read perms for google_containers.

@zmerlynn
Member Author

zmerlynn commented Jun 6, 2016

That shouldn't be launching debian-8 at all.

@girishkalele

Closed via #26902.

Merged at 2:09 PM; Jenkins jobs started passing right after.
