-
Notifications
You must be signed in to change notification settings - Fork 40.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add density (batch pods creation latency and resource) and resource performance tests to `test-e2e-node' #29764
Add density (batch pods creation latency and resource) and resource performance tests to `test-e2e-node' #29764
Conversation
/cc @kubernetes/sig-node |
@coufon, there are many commits in your PR. Could you squash them and only leave functionally meaningful commits? Thanks! |
/cc @rrati @mffiedler @jeremyeder - add to node-vertical. |
@@ -251,6 +251,9 @@ func (es *e2eService) startKubeletServer() (*killCmd, error) { | |||
"--file-check-frequency", "10s", // Check file frequently so tests won't wait too long | |||
"--v", LOG_VERBOSITY_LEVEL, "--logtostderr", | |||
"--pod-cidr=10.180.0.0/24", // Assign a fixed CIDR to the node because there is no node controller. | |||
"--cgroup-root=/", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't work with systemd node.
cc/ @yifan-gu on CoreOS configuration, and cc/ @derekwaynecarr on the rest based on "standardized" systemd node.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This flag (and the other cgroup related flags) should not be explicitly set. Just take default kubelet behaviors.
Looks good to me overall. We need to figure out how to pass those different cgroup configurations though. |
Copyright 2016 The Kubernetes Authors. | ||
Licensed under the Apache License, Version 2.0 (the "License"); | ||
you may not use this file except in compliance with the License. | ||
You may obtain a copy of the License at |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The boilerplate is wrong. If you compare this with other files in the repository, you'll notice that an empty line missing before an after the URL. They probably got lost during copy/paste. Need to fix that to pass the tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK
densityTests := []DensityTest{ | ||
{ | ||
podsNr: 10, | ||
interval: 0 * time.Millisecond, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer defining the pod startup throughput (pods per second) and convert it into an interval. Pod throughput is more readable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is that the `throughput' in configuration may be different from the throughput measured from result. For example, if the interval is zero, the throughput perhaps is infinite. In this case how to define it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it was a naming issue, just name it more precisely. E.g., targetThroughput
I am not sure we should have a 0 interval anyway...
The build is failed with the error:
Forgot uploading some changes? |
LGTM |
GCE e2e build/test passed for commit 04f83c7. |
Automatic merge from submit-queue |
I believe it broke k8s build. Reverting - I'll re-revert if it won't help. What's strange is that the build works locally for me... @fejta - it may be some problem with our build system |
Revert fixed the build. @coufon @dchen1107 PTAL. |
Looks like it skipped re-test because tests already succeeded at that SHA. |
I think the issue is that this PR only builds on linux. e.g. https://github.com/opencontainers/runc/blob/519529febe2e0acb14e39cf54112c292ccb2eabe/libcontainer/cgroups/utils.go, which defines To make SQ faster, we only build for Linux in PR Jenkins. So it passed there. |
Automatic merge from submit-queue Resolve docker-daemon cgroup issue for both systemd and non-systemd node for node e2e tests Fixed #29827 cc/ @coufon this should unblock your pr: #29764 I validated both containervm image and coreos image, and works as expected. This is also required for adding gci image to node e2e test infrastructure.
Automatic merge from submit-queue Add density (batch pods creation latency and resource) and resource performance tests to `test-e2e-node' built for Linux only This PR adds `+build linux' to density_test.go, resource_usage.go and resource_collector.go to last PR #29764. #29764 fails build because it depends on cgroup which can not be built for os other than Linux.
This PR contains two new tests (migrate from e2e test):
It tests both batch creation and sequential (back-to-back) creation and report the throughputs.
The test creates a new resource controller for `test-node-e2e' (resource_controller.go) which monitors resource through a standalone Cadvisor pod (port 8090) with 1s housekeeping interval.