Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kubectl e2e tests broken #8300

Closed
ixdy opened this issue May 15, 2015 · 11 comments
Closed

kubectl e2e tests broken #8300

ixdy opened this issue May 15, 2015 · 11 comments
Assignees
Labels
area/kubectl area/test-infra priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release.

Comments

@ixdy
Copy link
Member

ixdy commented May 15, 2015

Several of the kubectl e2e tests were broken today by #8249, which added a missing check of an err return value, all of which now fail when doing cleanup and calling kubectl stop.

Example:

kubectl guestbook should create and stop a working application

/go/src/github.com/GoogleCloudPlatform/kubernetes/_output/dockerized/go/src/github.com/GoogleCloudPlatform/kubernetes/test/e2e/kubectl.go:125
Error running &{/jenkins-master-data/jobs/kubernetes-e2e-gce/workspace/kubernetes/hack/../cluster/../platforms/linux/amd64/kubectl [kubectl --server=https://104.197.99.179 --kubeconfig=/var/lib/jenkins/jobs/kubernetes-e2e-gce/workspace/.kube/config stop -f examples/guestbook] []  <nil>  Error: [Resizing the controller failed with: replicationControllers "frontend" not found; Current resource version Unknown, Resizing the controller failed with: replicationControllers "redis-master" not found; Current resource version Unknown, Resizing the controller failed with: replicationControllers "redis-slave" not found; Current resource version Unknown]
[] <nil> 0xc2083e6860 exit status 1 <nil> true [0xc20803a3f8 0xc20803a438 0xc20803a470] [0xc20803a3f8 0xc20803a438 0xc20803a470] [0xc20803a430 0xc20803a468] [0x657730 0x657730] 0xc20841cc60}:
Command stdout:

stderr:
Error: [Resizing the controller failed with: replicationControllers "frontend" not found; Current resource version Unknown, Resizing the controller failed with: replicationControllers "redis-master" not found; Current resource version Unknown, Resizing the controller failed with: replicationControllers "redis-slave" not found; Current resource version Unknown]

I think this may have been originally broken in #6194. That change refactored the implementation of Stop, which started masking the error, but it also changed the order of code a bit.

Previously, the Stop implementation in pkg/kubectl/stop.go first queried for ReplicationControllers before creating the ResizeCondition; the first thing in the method was

  rc := reaper.ReplicationControllers(namespace)
  controller, err := rc.Get(name)

After the change, this is done following the ResizeCondition in pkg/kubectl/resize.go:

rc := &api.ReplicationController{ObjectMeta: api.ObjectMeta{Namespace: namespace, Name: name}}

Is it possible that we're trying to query for something that's already been deleted, and thus it will always fail?

cc @quinton-hoole @roberthbailey

@ixdy ixdy added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. area/kubectl area/test-infra labels May 15, 2015
@ixdy
Copy link
Member Author

ixdy commented May 15, 2015

OK, it seems like that's not what's broken; the actual Resize seems like it might be the problem.

It could have broken anytime since April 3, unfortunately.

cc @Kargakis as well.

@ixdy
Copy link
Member Author

ixdy commented May 15, 2015

Looks like #7544 is the culprit. It moved the kubectl tests into separate namespaces, but the cleanup() method is calling kubectl stop with the default namespace. (Also, the namespace may have even been deleted by the time cleanup() is called - but not sure.)

Let's include everyone. @jayunit100

@jayunit100
Copy link
Member

interesting find ; [update] i mis-read this earlier - yes this makes sense ... and also it is interesting that kubectl stop wasn't failing even though it was running on the wrong namespaces. sounds like a bug with that was now fixed, such that it errors when told to cleanup non existent resources.

thanks for fixing this .

@derekwaynecarr
Copy link
Member

The speed of deleting a namespace was improved when moving to new controller framework:

#6771

There previously was a 1 minute delay before noticing that a namespace could be purged.

----- Original Message -----
From: "jay vyas" notifications@github.com
To: "GoogleCloudPlatform/kubernetes" kubernetes@noreply.github.com
Sent: Friday, May 15, 2015 8:07:31 AM
Subject: Re: [kubernetes] kubectl e2e tests broken (#8300)

interesting find ; Looks like now maybe whats happening, is that namespaces are being deleted fast enough that this causes a error. before NS deletion took a long time?


Reply to this email directly or view it on GitHub:
#8300 (comment)

@ghost ghost added this to the v1.0-candidate milestone May 15, 2015
@ghost
Copy link

ghost commented May 15, 2015

I love it when stuff makes sense :-)

@roberthbailey
Copy link
Contributor

#8309 partially fixes this. kubectl update-demo should do a rolling update of a replication controller is still failing consistently.

@ghost
Copy link

ghost commented May 15, 2015

@ixdy Can you keep digging on this one? I'm busy on #8326. Provisionally assigning to you.

@ghost ghost assigned ixdy May 15, 2015
@ixdy
Copy link
Member Author

ixdy commented May 15, 2015

So I actually mentioned this particular case in #8303, though I wasn't sure if it was expected or not.

The test creates update-demo-nautilus, then does a rolling update to update it to update-demo-kitten. We have the cleanup function try to delete both, in case something breaks in the middle of the test, but it's failing because update-demo-nautilus doesn't exist - as expected?

We can just remove the cleanup for this test case, since cleanup should be handled by the namespace deletion. I'm curious how this ever passed before, though.

@ixdy
Copy link
Member Author

ixdy commented May 15, 2015

Created #8333 to remove the cleanup if we think that's a reasonable solution.

@roberthbailey
Copy link
Contributor

@ixdy has this been fixed?

@ixdy
Copy link
Member Author

ixdy commented May 19, 2015

I believe so, yes. It is at least not consistently failing anymore.

@ixdy ixdy closed this as completed May 19, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/kubectl area/test-infra priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants