-
Notifications
You must be signed in to change notification settings - Fork 2.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow gathering of stats for root cgroup on v2 #2801
Conversation
Hi @odinuge. Thanks for your PR. I'm waiting for a google member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/ok-to-test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
if cgroups.IsCgroup2UnifiedMode() { | ||
// On cgroup v2 there are no stats at the root cgroup | ||
// so check whether it is the root cgroup | ||
// On cgroup v2 the root cgroup stats have been introduced in recent kernel versions, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you know from what kernel versions root cgroup stats are supported? That would be good to note.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there has been a floating transition to add various stats to the root cgroups in different versions. I am running mainline right now, and there are still some stats lacking right now as well.
I can do a small digging job, but I guess it will take some time (unless I find the data in a structured format somewhere)... :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no biggie if it's hard to find, was just curious :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will take a look anyway! Always nice with some background info.
container/libcontainer/handler.go
Outdated
} | ||
|
||
cgroupStats, err := h.cgroupManager.GetStats() | ||
if err != nil && !ignoreStatsError { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does it make sense to log something that we've ignored this error? otherwise we're kinda silencing this error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh, yeah, that is smart!
Do you think we should use klog.V(4)
to avoid log spam, or should we always display it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In libcontainer, they already do this for rootless containers: https://github.com/opencontainers/runc/blob/6bae53f5894d9a9811239deb756406d9652843de/libcontainer/cgroups/fs2/fs2.go#L144-L147
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, a klog.V(4)
sounds reasonable to avoid spam.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Will fix!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a V(4)
now. Still not sure what to write in order to avoid avoid misunderstanding... 🤔
@odinuge Thanks for this PR. I am looking into the failures of prow job Currently on cgroup v2, I see failures in retrieving I was wondering if your PR is trying addresses the issues we are seeing |
I ran How to run the tests[harshal@localhost kubernetes]$ git diff
diff --git a/test/e2e/framework/test_context.go b/test/e2e/framework/test_context.go
index 474b61cf996..f395b95ade3 100644
--- a/test/e2e/framework/test_context.go
+++ b/test/e2e/framework/test_context.go
@@ -296,7 +296,7 @@ func RegisterCommonFlags(flags *flag.FlagSet) {
flags.StringVar(&TestContext.ReportPrefix, "report-prefix", "", "Optional prefix for JUnit XML reports. Default is empty, which doesn't prepend anything to the default name.")
flags.StringVar(&TestContext.ReportDir, "report-dir", "", "Path to the directory where the JUnit XML reports should be saved. Default is empty, which doesn't generate these reports.")
flags.Var(cliflag.NewMapStringBool(&TestContext.FeatureGates), "feature-gates", "A set of key=value pairs that describe feature gates for alpha/experimental features.")
- flags.StringVar(&TestContext.ContainerRuntime, "container-runtime", "docker", "The container runtime of cluster VM instances (docker/remote).")
+ flags.StringVar(&TestContext.ContainerRuntime, "container-runtime", "remote", "The container runtime of cluster VM instances (docker/remote).")
flags.StringVar(&TestContext.ContainerRuntimeEndpoint, "container-runtime-endpoint", "unix:///var/run/dockershim.sock", "The container runtime endpoint of cluster VM instances.")
flags.StringVar(&TestContext.ContainerRuntimeProcessName, "container-runtime-process-name", "dockerd", "The name of the container runtime process.")
flags.StringVar(&TestContext.ContainerRuntimePidFile, "container-runtime-pid-file", "/var/run/docker.pid", "The pid file of the container runtime.")
[harshal@localhost kubernetes]$ export KUBE_SSH_USER=core && make test-e2e-node REMOTE=true RUNTIME=remote IMAGE_CONFIG_FILE="/home/harshal/Downloads/image-config-cgrpv2.yaml" INSTANCE_PREFIX="harpatil" CLEANUP=false CONTAINER_RUNTIME_ENDPOINT="unix:///var/run/crio/crio.sock" FOCUS="Summary API" SKIP="\[Flaky\]|\[Slow\]|\[Serial\]" TEST_ARGS='--kubelet-flags="--cgroup-driver=systemd --cgroups-per-qos=true --cgroup-root=/ --runtime-cgroups=/system.slice/crio.service --non-masquerade-cidr=0.0.0.0/0" --extra-log="{\"name\": \"crio.log\", \"journalctl\": [\"-u\", \"crio\"]}"' Without this change the failure output looks like, Failed /stats/summary without this change
And with this change the failure output looks like, Failed /stats/summary with this change
Notably, with this change we no longer see the failure in getting |
I am trying to address general issues with cgroup v2, so I guess a side effect is that some tests in k8s will start working for cgroup v2. 😄
Yeah, that is true. cgroup v2 do not have all stats available for the root cgroup, so this is "the best" we can do. I think this is a broader discussion about what we should do with stuff supported in cgroup v1 and not in v2. We can say "this is the best we can do", or we can gather the stats via |
I opt for relying on cgroups rather than procfs. Interested users will update eventually and silent majority will be using v1 for a while (this is just an assumption and I might be wrong). |
I think we should not break the existing API. The So if cgroups v2 do not have those CPU stats for root cgroup, then either cadvisor or kubelet will still have to fetch that information from |
This is a valid point but I'm not sure if we can get same values from procfs. |
I think it may be worth opening separate issue on k/k regarding summary API and cgroup v2 :) @harche Would be interested to understand what specific things are missing on summary API when using cAdvisor on cgroup v2. Definitely agree we need to aim to not break the data returned in summary API on cgroup v2. |
@bobbypage I created an issue on k/k to track the failures we are seeing in NodeConformance and NodeFeatures tests, kubernetes/kubernetes#99230 The issue links to the failed job run, which shows the missing fields in the failed tests' output. |
@odinuge I came to know from @giuseppe that on newer kernel versions, 5.10 and above, we have |
823e8c2
to
748deaa
Compare
748deaa
to
6ba10a3
Compare
LGTM, thanks! |
移植upstream对kubelet及cadvisor的修改,修复使用cgroupv2时指标收集统计的问题 1. port cadvisor pr google/cadvisor#2839 reading cpu stats on cgroup v2 2. port cadvisor pr google/cadvisor#2837 read "max" value for cgroup v2 3. port cadvisor pr google/cadvisor#2801 gathering of stats for root cgroup on v2 4. port cadvisor pr google/cadvisor#2800: Update heuristic for container creation time 5. Fix cgroup handling for systemd with cgroup v2 6. test: adjust summary test for cgroup v2
Some of the root stats are used by kubernetes metrics endpoints, and recent kernel versions expose a lot of the stats on the root cgroup as well.