
Implementation of e2e test that re-sizes cluster #8243

Merged: 1 commit into kubernetes:master on May 22, 2015

Conversation

jszczepkowski
Contributor

Implementation of e2e test that re-sizes cluster; currently works on GCE.

// Context for checking pod responses by issuing GETs to them and verifying that the answer matches the pod name.
type podResponseChecker struct {
c *client.Client
ns string
Contributor

I don't like using abbreviations, especially one for namespace.

Contributor Author

"ns" is a commonly used abbreviation for namespace in kubernetes, and, AFAK, golang standard in general encourages short variable names.

Contributor

OK, I just wanted to mention that I don't like it:) It took me ~1sec to figure out what it stands for.

Contributor

It took me a minute the first time I saw one too, but I think at this point ns is pretty common.

Contributor

Yeah. If it's not only me maybe we should start using full 'namespace' for this...


I agree with @gmarek. Feel free to do a holistic cleanup of the e2e tests in this regard.

Contributor

Now that I'm used to it I kind of like the brevity... but regardless, bikeshedding for a separate PR :-)
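For readers outside the thread, here is a rough sketch of the idea behind podResponseChecker, using plain net/http instead of the Kubernetes client the PR uses; the function name, the port, and the details below are illustrative assumptions, not the PR's actual code.

// A sketch only: checks that a "serve hostname" pod answers a GET with its own name.
package e2esketch

import (
	"fmt"
	"io"
	"net/http"
)

// checkPodResponse issues a GET to a pod's serve-hostname endpoint and
// verifies the body matches the pod's name.
func checkPodResponse(podIP, podName string) error {
	resp, err := http.Get(fmt.Sprintf("http://%s:9376/", podIP))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return err
	}
	if string(body) != podName {
		return fmt.Errorf("pod %s answered %q, expected its own name", podName, body)
	}
	return nil
}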

@zmerlynn
Member

I'm ... ugh. I wish I knew a better way to write this test, because the invasiveness of the variables at the top level, the grossness of shelling out to gcloud in the test - there's a lot that I see wrong and potentially very brittle here.

cc @cjcullen
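For context, the "shelling out to gcloud" being criticized amounts to resizing the GCE managed instance group from within the test via the gcloud CLI. A minimal sketch of that pattern follows; the exact gcloud subcommand and flags are assumptions (they have changed across gcloud releases) and the helper name is hypothetical.

package e2esketch

import (
	"fmt"
	"os/exec"
	"strconv"
)

// resizeGroupSketch resizes a GCE managed instance group by invoking gcloud.
func resizeGroupSketch(project, zone, group string, size int) error {
	out, err := exec.Command("gcloud", "compute", "instance-groups", "managed", "resize",
		group,
		"--size="+strconv.Itoa(size),
		"--zone="+zone,
		"--project="+project).CombinedOutput()
	if err != nil {
		return fmt.Errorf("gcloud resize failed: %v, output: %s", err, out)
	}
	return nil
}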

@jszczepkowski assigned ghost and unassigned gmarek on May 14, 2015
@roberthbailey
Contributor

@mbforbes was looking into something similar and may have some suggestions.

@@ -55,6 +55,8 @@ func init() {
flag.StringVar(&cloudConfig.MasterName, "kube-master", "", "Name of the kubernetes master. Only required if provider is gce or gke")
flag.StringVar(&cloudConfig.ProjectID, "gce-project", "", "The GCE project being used, if applicable")
flag.StringVar(&cloudConfig.Zone, "gce-zone", "", "GCE zone being used, if applicable")
flag.StringVar(&cloudConfig.NodeInstanceGroup, "node-instance-group", "", "Name of the manage instance group for nodes. Valid only for gce and gke")
Contributor

nit: s/manage/managed

Contributor

Also note this is not valid for GKE

Contributor Author

Done.
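A plausible corrected form of that flag registration, folding in both review comments (the exact wording that landed may differ):

flag.StringVar(&cloudConfig.NodeInstanceGroup, "node-instance-group", "", "Name of the managed instance group for nodes. Valid only for gce")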

@ghost assigned mbforbes and unassigned ghost on May 14, 2015
@jszczepkowski
Contributor Author

@mbforbes
All comments are applied, PTAL

}
}

func verifyPodsListining(c *client.Client, ns, name string, pods *api.PodList) {
Contributor

verifyPodsResponding? By "listing" I understand calling 'get pods' on the API server.

Contributor

same comment here—please change to return an error

Contributor

Please either change to verifyPodsResponding as @gmarek suggested, or s/Listining/Listening

Contributor Author

Done
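A sketch of the shape the reviewers are asking for: verifyPodsResponding returning an error rather than failing directly. The polling loop below reuses the hypothetical checkPodResponse helper from the earlier sketch and is an illustration, not the PR's client-based implementation.

package e2esketch

import (
	"fmt"
	"time"
)

// verifyPodsResponding polls every pod until it answers with its own name,
// or returns an error once the timeout is exceeded.
func verifyPodsResponding(podIPsByName map[string]string, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for name, ip := range podIPsByName {
		for {
			// checkPodResponse is the helper from the earlier sketch.
			err := checkPodResponse(ip, name)
			if err == nil {
				break
			}
			if time.Now().After(deadline) {
				return fmt.Errorf("pod %s never responded correctly: %v", name, err)
			}
			time.Sleep(2 * time.Second)
		}
	}
	return nil
}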

Failf("Couldn't restore the original node instance group size: %s", err)
}
if err := waitForNodeInstanceGroupSize(testContext.CloudConfig.NumNodes); err != nil {
Failf("Couldn't restore the original node instance group size: %s", err)
Contributor

nit: I think errors are typically logged with %v

Contributor Author

Done
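Presumably the resulting line in the hunk above ends up as:

Failf("Couldn't restore the original node instance group size: %v", err)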

@mbforbes
Contributor

OK, I think I've thrown enough comments at you for the time being :-) Please ping and I'll be happy to look again.

The one thing I wanted to ask explicitly—so it doesn't get lost in my syntactic comments—is whether the pods you're creating ("serve hostname") are guaranteed to run only one per node? I think that's what this test is intending, and what is implied, but I wanted to make sure that it's true. I think this can be done by ensuring that the port is locked so it'll conflict if two of the same pod try to get scheduled on a single node. Is the Ports: []api.ContainerPort{{ContainerPort: 9376}}, line in newReplicationController(...) doing that?

@jszczepkowski
Contributor Author

@mbforbes
All comments are applied now, PTAL.

Regarding distributing a single pod per node: I achieve it by creating a service spanning all the pods (our scheduler distributes pods from the same service to different nodes).
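A sketch of that spreading trick in present-day client-go types (the PR itself used the then-internal api package): a Service whose selector spans the serve-hostname pods, so the scheduler's service-spreading priority prefers to place them on different nodes. The names and label key below are assumptions.

package e2esketch

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/intstr"
)

// serveHostnameService builds a Service selecting the serve-hostname pods.
func serveHostnameService(name string) *corev1.Service {
	return &corev1.Service{
		ObjectMeta: metav1.ObjectMeta{Name: name},
		Spec: corev1.ServiceSpec{
			// Selecting the same label the replication controller stamps on
			// its pods makes them all members of one service.
			Selector: map[string]string{"name": name},
			Ports: []corev1.ServicePort{{
				Port:       9376,
				TargetPort: intstr.FromInt(9376),
			}},
		},
	}
}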

return -1, err
}
pattern := "currentSize: "
i := bytes.Index(output, []byte(pattern))
Contributor

sorry if I missed this in the giant comment stream, but is there a reason not to use strings methods (https://golang.org/pkg/strings/) so you don't have to convert to []byte?

Contributor Author

Sorry, I missed it. Done

Contributor Author

Done
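A sketch of the strings-based version of that parsing, assuming the gcloud output is already available as a string; the surrounding function name is hypothetical.

package e2esketch

import (
	"fmt"
	"strconv"
	"strings"
)

// parseCurrentSize extracts the instance group size from output such as
// "... currentSize: 3 ...".
func parseCurrentSize(output string) (int, error) {
	const pattern = "currentSize: "
	i := strings.Index(output, pattern)
	if i < 0 {
		return -1, fmt.Errorf("could not find %q in output: %s", pattern, output)
	}
	rest := output[i+len(pattern):]
	// The number ends at the first whitespace (or at end of output).
	if j := strings.IndexAny(rest, " \t\n"); j >= 0 {
		rest = rest[:j]
	}
	return strconv.Atoi(rest)
}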

@mbforbes
Contributor

Awesome, this is looking gorgeous; thanks so much for your patience @jszczepkowski.

I've left a handful more minor comments, but this LGTM after they're resolved.

ginkgo_args=()
if [[ ${GINKGO_PARALLEL} =~ ^[yY]$ ]]; then
ginkgo_args+=("-p")
fi

# Use the kubectl binary from the same directory as the e2e binary.
Contributor

Is this an intentionally added line? It looks like it might be a merge artifact?

Contributor Author

It seems to be an artifact, removed.

@jszczepkowski
Contributor Author

@mbforbes PTAL

@mbforbes
Contributor

Looks good, thanks @jszczepkowski!

I'm still a little unclear on the pod rescheduling: because we decrease the number of nodes and ensure the pods are still running, it seems like the pods must not run with a hard requirement of 1/node. That means that the resize-up test could pass without any work being scheduled on the new node.

But regardless, it'd be good to get this test in; on the off chance I'm not confused about this, we can always tweak it a bit later.

mbforbes added a commit that referenced this pull request May 22, 2015
Implementation of e2e test that re-sizes cluster
@mbforbes merged commit 14e47f2 into kubernetes:master on May 22, 2015