Node Conformance Test: Containerize the node e2e test #31093

Random-Liu · 2016-08-22T06:57:48Z

For #30122, #30174.
Based on #32427, #32454.

Please only review the last 3 commits.

This PR packages the node e2e test into a docker image:

1st commit: Add a NodeConformance flag to the node e2e framework so that it does not start kubelet or collect system logs. We do this because:
- There are many ways to manage kubelet and system logs, and each situation requires mounting different things into the container and running different commands. Handling that complexity inside the test suite is hard and unnecessary.
2nd commit: Remove all sudo from the test container. We do this because:
- Most containers have no sudo command, and there is no need to use sudo inside a container.
- Using sudo inside the test introduces complexity (e2e_node tests leak kube-apiserver processes #29211, Node e2e framework killing issues #26748). In fact, we only need to run the test suite itself with sudo.
3rd commit: Package the test into a docker container, with a corresponding Makefile and Dockerfile. We also added a run_test.sh script that starts kubelet and runs the test container. The script is only for demonstration purposes, although we'll also use it in our node e2e framework. In the future, we should update the script to start kubelet in a production way (maybe with systemd or supervisord).

@dchen1107 @vishh
/cc @kubernetes/sig-node @kubernetes/sig-testing

Release note:

Release alpha version node test container gcr.io/google_containers/node-test-ARCH:0.2 for users to verify their node setup.

luxas · 2016-08-22T09:37:00Z

test/e2e_node/conformance/build/Dockerfile

+
+# The following environment variables can be override when starting the container.
+# FOCUS is regex matching test to run. By default run all conformance test.
+ENV FOCUS="\[Conformance\]"


To save some layers, you can combine all ENV statements

@luxas Thanks, good to know that! Will do.

Random-Liu · 2016-08-22T21:33:08Z

@luxas Thanks for reviewing! Addressed your comments. :)

Random-Liu · 2016-08-26T20:58:31Z

@timstclair @dchen1107 @vishh Can any of you take a look at this PR? :)

vishh · 2016-08-26T21:55:36Z

Reviewing now

timstclair · 2016-11-07T19:32:45Z

test/e2e_node/conformance/build/Dockerfile

+# MANIFEST_PATH is the kubelet manifest path in the container.
+# FLAKE_ATTEMPTS is the time to retry when there is a test failure. By default 2.
+# TEST_ARGS is the test arguments passed into the test.
+ENV FOCUS="\[Conformance\]" SKIP="" PARALLELISM=8 REPORT_PATH="/var/result" MANIFEST_PATH="/etc/manifest" FLAKE_ATTEMPTS=2 TEST_ARGS=""


There aren't currently any serial conformance tests, but I think we should either skip serial, or run serially. Since flakes are more likely with parallel tests, I wonder if running serially would be better anyway?

The only problem is that running serially is too slow. :(
I'll make it skip serial and flaky tests now.

FWIW there are some [Conformance] tests under SchedulerPredicates [Serial], so the only way to catch everything conformance-related in one single e2e run is running in serial

But I agree in practice parallel gives signal much faster, and the serial tests are really just refinement
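
As a rough illustration of the skip approach discussed above, the sketch below emulates focus/skip selection with grep. It is not how ginkgo is implemented; the test names are hypothetical, and the SKIP value is an assumed regex for "serial and flaky" rather than the one committed in the PR.

```shell
# Emulate ginkgo-style selection: a test is selected if its name matches
# FOCUS and does not match SKIP. Both values are extended regexes.
FOCUS='\[Conformance\]'
SKIP='\[Serial\]|\[Flaky\]'   # assumed value; must be non-empty in this sketch

select_tests() {
  # Reads test names on stdin and prints the ones that would run.
  grep -E -- "$FOCUS" | grep -Ev -- "$SKIP"
}

# Hypothetical test names, for illustration only.
selected=$(printf '%s\n' \
  'Pods should start [Conformance]' \
  'SchedulerPredicates [Serial] validates limits [Conformance]' \
  'Kubelet should report stats [Flaky]' | select_tests)

echo "$selected"
```

Only the first name survives: the second is a conformance test but is tagged serial, which is exactly the case raised above where a parallel run with serial tests skipped misses some conformance coverage.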

timstclair · 2016-11-07T19:34:34Z

test/e2e_node/conformance/build/Dockerfile

+# MANIFEST_PATH is the kubelet manifest path in the container.
+# FLAKE_ATTEMPTS is the time to retry when there is a test failure. By default 2.
+# TEST_ARGS is the test arguments passed into the test.
+ENV FOCUS="\[Conformance\]" SKIP="" PARALLELISM=8 REPORT_PATH="/var/result" MANIFEST_PATH="/etc/manifest" FLAKE_ATTEMPTS=2 TEST_ARGS=""


nit: style seems to put each var on a separate ENV line

@timstclair Each ENV is a separate image layer, so I put them on one line intentionally to avoid extra layers (#31093 (comment)).

Ah, good to know, thanks. Consider putting 1 var per line then? E.g.

ENV FOCUS=... \
    SKIP=... \
    ...

timstclair · 2016-11-07T19:35:46Z

test/e2e_node/conformance/build/Dockerfile

+# TEST_ARGS is the test arguments passed into the test.
+ENV FOCUS="\[Conformance\]" SKIP="" PARALLELISM=8 REPORT_PATH="/var/result" MANIFEST_PATH="/etc/manifest" FLAKE_ATTEMPTS=2 TEST_ARGS=""
+
+ENTRYPOINT ginkgo --focus="$FOCUS" --skip="$SKIP" --nodes=$PARALLELISM --flakeAttempts=$FLAKE_ATTEMPTS /usr/local/bin/e2e_node.test -- --conformance=true --prepull-images=false --manifest-path=$MANIFEST_PATH --report-dir=$REPORT_PATH $TEST_ARGS


nit: quote $MANIFEST_PATH and $REPORT_PATH
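
The reason for the quoting nit can be shown with a small sketch: in a POSIX shell, an unquoted expansion is split on whitespace, so a path containing a space turns into two arguments. The `count_args` helper and the path with a space are hypothetical, used only to make the effect visible.

```shell
# Hypothetical helper: prints how many arguments it received.
count_args() { echo "$#"; }

# Hypothetical report path containing a space.
REPORT_PATH="/var/result dir"

# Unquoted: the shell splits the expansion into two words.
unquoted=$(count_args --report-dir=$REPORT_PATH)

# Quoted: the value is passed as a single argument.
quoted=$(count_args --report-dir="$REPORT_PATH")

echo "unquoted: $unquoted, quoted: $quoted"
```

With the default paths in the Dockerfile nothing breaks either way; quoting only matters once a user overrides the variables with values containing whitespace or glob characters.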

timstclair · 2016-11-07T19:38:50Z

test/e2e_node/conformance/build/Makefile

+endif
+ifeq ($(ARCH),ppc64le)
+	BASEIMAGE?=ppc64le/debian:jessie
+endif


optional: Cleaner to do:

BASEIMAGE_amd64=debian:jessie
BASEIMAGE_arm=armel/debian:jessie
BASEIMAGE_arm64=aarch64/debian:jessie
BASEIMAGE_ppc64le=ppc64le/debian:jessie
BASEIMAGE?=${BASEIMAGE_${ARCH}}

Good idea! Thanks!

timstclair · 2016-11-07T19:43:34Z

test/e2e_node/conformance/run_test.sh

+ARCH=${ARCH:-"amd64"}
+
+# VERSION is the version of the test container image.
+VERSION=${VERSION:-"v0.1"}


Consider dropping the v from the version (it doesn't conform to the semantic versioning standard)
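
One way to apply this suggestion while still tolerating callers that pass a prefixed value is to strip a leading "v" with shell parameter expansion. This is a minimal sketch; the `strip_v` helper is hypothetical and not part of run_test.sh.

```shell
# Hypothetical helper: drop a single leading "v" from a version string.
# ${1#v} removes the shortest leading match of the pattern "v", if present.
strip_v() { printf '%s\n' "${1#v}"; }

# Keep the script's defaulting behavior, then normalize the value.
VERSION=$(strip_v "${VERSION:-v0.1}")
echo "$VERSION"
```

Values that already lack the prefix pass through unchanged, so the normalization is safe to apply unconditionally.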

timstclair · 2016-11-07T19:52:41Z

test/e2e_node/conformance/run_test.sh

+  # * kubelet manifest path is mounted to /etc/manifest;
+  # * log collect directory is mounted to /var/result;
+  # * root file system is mounted to /rootfs.
+  sudo docker run -it --rm --privileged=true --net=host  -v /:/rootfs \


Why -it ? Is the test interactive?

We need -it so that we can use Ctrl-C to stop the test at any time.
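
To make the mounts described in the script comments concrete, the sketch below prints (without running) a command of the same shape. The `node_test_cmd` helper and the host paths are hypothetical; the container paths come from the comments in the diff, and the image name follows the pattern given in the release note.

```shell
# Hypothetical helper: print a docker command like the one in run_test.sh.
#   $1 host directory holding the kubelet manifests  -> /etc/manifest
#   $2 host directory that receives logs and results -> /var/result
#   $3 architecture (default amd64), $4 image version (default 0.2)
node_test_cmd() {
  manifest_dir=$1
  result_dir=$2
  arch=${3:-amd64}
  version=${4:-0.2}
  # -i keeps stdin open and -t allocates a terminal for the container.
  printf 'docker run -it --rm --privileged=true --net=host -v /:/rootfs -v %s:/etc/manifest -v %s:/var/result gcr.io/google_containers/node-test-%s:%s\n' \
    "$manifest_dir" "$result_dir" "$arch" "$version"
}

# Host paths below are illustrative assumptions.
cmd=$(node_test_cmd /etc/kubernetes/manifests /tmp/node-test-results)
echo "$cmd"
```

Printing the command first is a convenient way to check the mount mapping before running it with sudo on a real node.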

timstclair · 2016-11-07T19:53:56Z

test/e2e_node/conformance/run_test.sh

+pod_cidr=10.180.0.0/24
+log_level=4
+start_kubelet --api-servers $apiserver \
+  --hostname-override=$(hostname) \


nit: Isn't this the default?

timstclair · 2016-11-07T19:58:05Z

test/e2e_node/e2e_node_suite_test.go

@@ -172,12 +186,11 @@ func validateSystem() error {
 	if err != nil {
 		return fmt.Errorf("can't get current binary: %v", err)
 	}
-	// TODO(random-liu): Remove sudo in containerize PR.
-	output, err := exec.Command("sudo", testBin, "--system-validate-mode").CombinedOutput()
+	output, err := exec.Command(testBin, append([]string{"--system-validate-mode"}, os.Args[1:]...)...).CombinedOutput()


Are you intentionally including all the flags here? If yes, add a comment. Otherwise, use flag.Args()

Yeah, I want to make sure that all processes see the same flag set, or else it confuses people sometimes.
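
For context on the question: in Go, os.Args[1:] holds every command-line argument including flags, while flag.Args() returns only the non-flag arguments left after parsing. The forwarding pattern in the diff is roughly analogous to the shell sketch below, where `child` is a hypothetical stand-in for the test binary.

```shell
# Hypothetical stand-in for the test binary: prints each argument on its own line.
child() { printf '%s\n' "$@"; }

# Prepend the mode flag and forward every original argument unchanged,
# similar in spirit to append([]string{"--system-validate-mode"}, os.Args[1:]...).
run_validate() { child --system-validate-mode "$@"; }

out=$(run_validate --conformance=true --report-dir="/var/result")
echo "$out"
```

Because the full argument list is forwarded, the child sees the same flags as the parent, which is the behavior the author says is intended.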

timstclair · 2016-11-07T20:01:23Z

test/e2e_node/remote/remote.go

@@ -352,8 +358,20 @@ func getSshCommand(sep string, args ...string) string {
 	return fmt.Sprintf("'%s'", strings.Join(args, sep))
 }

+// Ssh executes ssh command with runSshCommand as root. The `sudo` makes sure that all commands
+// are executed by root, so that there won't be permission mismatch between different commands.
+func Ssh(host string, cmd ...string) (string, error) {


nit: s/Ssh/SSH/ (same below)

timstclair · 2016-11-07T20:02:13Z

test/e2e_node/services/services.go

-		"--logtostderr",
-		"--vmodule=*="+LOG_VERBOSITY_LEVEL,
-	)
+	startCmd := exec.Command(testBin, append([]string{"--run-services-mode"}, os.Args[1:]...)...)


ditto re:flags

Random-Liu · 2016-11-07T22:59:33Z

@timstclair Addressed comments. Will squash before merging.

timstclair · 2016-11-07T23:11:32Z

LGTM. You can self-apply after squash.

timstclair · 2016-11-07T23:12:42Z

Remove v from container version in release note.

k8s-ci-robot · 2016-11-07T23:14:36Z

Jenkins GCI GKE smoke e2e failed for commit 488e260c236703054d05031cfe3086bd383c06bb. Full PR test history.

The magic incantation to run this job again is @k8s-bot gci gke e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

Random-Liu · 2016-11-07T23:24:02Z

Remove v from container version in release note.

Good catch. And I also need to update the node conformance test document http://kubernetes.io/docs/admin/node-conformance/.

Random-Liu · 2016-11-07T23:28:43Z

Squashed, will self apply LGTM based on #31093 (comment), after all tests pass.

Random-Liu · 2016-11-08T00:02:40Z

Apply LGTM based on #31093 (comment).

luxas · 2016-11-08T00:44:52Z

Great to see this is getting in 👍

k8s-ci-robot · 2016-11-08T04:54:51Z

Jenkins GCE e2e failed for commit 9345e12. Full PR test history.

The magic incantation to run this job again is @k8s-bot cvm gce e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

Random-Liu · 2016-11-08T05:38:34Z

@k8s-bot cvm gce e2e test this kubernetes/test-infra#937.

k8s-ci-robot · 2016-11-08T06:35:02Z

Jenkins GCE etcd3 e2e failed for commit 9345e12. Full PR test history.

The magic incantation to run this job again is @k8s-bot gce etcd3 e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

Random-Liu · 2016-11-08T06:52:06Z

@k8s-bot gce etcd3 e2e test this

k8s-github-robot · 2016-11-08T07:41:25Z

Automatic merge from submit-queue

@vishh

…r-node-e2e

Automatic merge from submit-queue

Add separate build process for node test.

This PR is part of kubernetes#31093. However, because currently node e2e is built on `KUBE_TEST_PLATFORMS`, which includes linux/amd64, darwin/amd64, windows/amd64 and linux/arm, it caused kubernetes#32251 to fail. In fact, node e2e is running on the same node with kubelet, and it also has built-in apiserver, etcd and namespace controller. All of them are only built on `KUBE_SERVER_PLATFORMS`, so node e2e should also only be built on those platforms.

```
KUBE_SERVER_PLATFORMS=(
  linux/amd64
  linux/arm
  linux/arm64
)
```

This PR added a separate build process for node e2e to address this.

@vishh Do you need this for v1.4? because this blocks your kubernetes#32251.

/cc @dchen1107

(cherry picked from commit dae3bdd)

Random-Liu added area/test sig/node Categorizes an issue or PR as relevant to SIG Node. area/node-e2e labels Aug 22, 2016

Random-Liu added this to the v1.4 milestone Aug 22, 2016

Random-Liu assigned timstclair, vishh and dchen1107 Aug 22, 2016

googlebot added the cla: yes label Aug 22, 2016

Random-Liu added the release-note Denotes a PR that will be considered when it comes time to generate release notes. label Aug 22, 2016

k8s-github-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. release-note-label-needed labels Aug 22, 2016

Random-Liu removed the release-note-label-needed label Aug 22, 2016

Random-Liu mentioned this pull request Aug 22, 2016

Node Conformance Test: Package Node Conformance Test #30174

Closed

10 tasks

luxas reviewed Aug 22, 2016

Random-Liu force-pushed the containerize-node-e2e-test branch from fd046a8 to c87e7f0 Compare August 22, 2016 21:32

k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 23, 2016

Random-Liu mentioned this pull request Aug 23, 2016

kubelet-gce-e2e-ci is broken and blocking the SQ #31296

Closed

Random-Liu force-pushed the containerize-node-e2e-test branch from c87e7f0 to 4d162b0 Compare August 25, 2016 08:14

k8s-github-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 25, 2016

k8s-github-robot mentioned this pull request Aug 25, 2016

[k8s.io] ScheduledJob should not schedule new jobs when ForbidConcurrent {Kubernetes e2e suite} #30549

Closed

Random-Liu force-pushed the containerize-node-e2e-test branch from 4d162b0 to ca928aa Compare August 25, 2016 21:13

k8s-github-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Aug 25, 2016

dchen1107 mentioned this pull request Aug 26, 2016

Node Conformance Test kubernetes/enhancements#84

Closed

18 tasks

Add containerize flag to avoid starting kubelet and collecting logs.

13a50e3

Random-Liu force-pushed the containerize-node-e2e-test branch from 434bbda to 255eb61 Compare November 7, 2016 04:18

k8s-github-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Nov 7, 2016

timstclair reviewed Nov 7, 2016


Random-Liu force-pushed the containerize-node-e2e-test branch 2 times, most recently from d5f75a4 to 488e260 Compare November 7, 2016 22:59

Random-Liu added 2 commits November 7, 2016 15:27

Remove sudo in test suite and run test with sudo.

919935b

Add Dockerfile and Makefile to containerize node conformance test.

9345e12

Random-Liu force-pushed the containerize-node-e2e-test branch from 488e260 to 9345e12 Compare November 7, 2016 23:28

Random-Liu added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 8, 2016

k8s-github-robot merged commit 0df6384 into kubernetes:master Nov 8, 2016

Random-Liu deleted the containerize-node-e2e-test branch November 8, 2016 08:16

chentao1596 mentioned this pull request Dec 5, 2016

WIP:kubelet: support multi-headers when getting pod from HTTP source #38089

Closed

mmerkes mentioned this pull request Sep 14, 2020

node-kubelet-master and node-kubelet-conformance seem like duplicates kubernetes/test-infra#18973

Closed
