
Failing kubelet /stats/summary API and ResourceMetricsAPI tests on cgroup v2 #99230

Closed
harche opened this issue Feb 19, 2021 · 23 comments · Fixed by #102483 or #102726
Assignees
Labels
kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test.
sig/node Categorizes an issue or PR as relevant to SIG Node.
triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@harche
Contributor

harche commented Feb 19, 2021

Which jobs are failing:

pull-kubernetes-node-crio-cgrpv2-e2e

Which test(s) are failing:

E2eNode Suite: [k8s.io] Summary API [NodeConformance] when querying /stats/summary should report resource usage through the stats api
E2eNode Suite: [k8s.io] ResourceMetricsAPI [NodeFeature:ResourceMetrics] when querying /resource/metrics should report resource usage through the resource metrics api

Since when has it been failing:

From what I have observed, these tests have always failed on cgroup v2 hosts.

Testgrid link:

https://prow.k8s.io/view/gcs/kubernetes-jenkins/pr-logs/pull/99110/pull-kubernetes-node-crio-cgrpv2-e2e/1361561789094957056/

Reason for failure:

cgroup v2 exposes metrics through a different set of files than cgroup v1. Many of the places in cAdvisor where these metrics end up being read still assume cgroup v1 paths and files.
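
For reference, a minimal sketch of where the two hierarchies expose one of the counters involved (pgmajfault). The paths assume a systemd-managed node with the cgroup tree mounted at /sys/fs/cgroup and are only illustrative; they are not the exact code paths cAdvisor uses.

package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// statValue scans a flat-keyed cgroup stat file (such as memory.stat) and
// returns the value recorded for the given key.
func statValue(path, key string) (uint64, error) {
	f, err := os.Open(path)
	if err != nil {
		return 0, err
	}
	defer f.Close()
	s := bufio.NewScanner(f)
	for s.Scan() {
		fields := strings.Fields(s.Text())
		if len(fields) == 2 && fields[0] == key {
			return strconv.ParseUint(fields[1], 10, 64)
		}
	}
	return 0, fmt.Errorf("key %q not found in %s", key, path)
}

func main() {
	// cgroup v1: per-controller hierarchy, e.g. the memory controller lives
	// under /sys/fs/cgroup/memory/... and "pgmajfault" counts only faults
	// charged to this exact cgroup.
	v1, v1err := statValue("/sys/fs/cgroup/memory/kubepods.slice/memory.stat", "pgmajfault")
	// cgroup v2: unified hierarchy, a single memory.stat per cgroup directly
	// under /sys/fs/cgroup/..., and (as discussed in this issue) the counter
	// also covers descendant cgroups.
	v2, v2err := statValue("/sys/fs/cgroup/kubepods.slice/memory.stat", "pgmajfault")
	fmt.Println("cgroup v1 kubepods pgmajfault:", v1, v1err)
	fmt.Println("cgroup v2 kubepods pgmajfault:", v2, v2err)
}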

Anything else we need to know:

The following PRs address the issue partially, but not completely:

google/cadvisor#2801
google/cadvisor#2800

@harche added the kind/failing-test label on Feb 19, 2021
@k8s-ci-robot added the needs-sig and needs-triage labels on Feb 19, 2021
@harche
Contributor Author

harche commented Feb 19, 2021

/sig node

@k8s-ci-robot added the sig/node label and removed the needs-sig label on Feb 19, 2021
@harche
Contributor Author

harche commented Feb 19, 2021

/cc @odinuge @bobbypage @rphillips @mrunalp

@ehashman
Member

/assign @fromanirh
/triage accepted

@k8s-ci-robot
Contributor

@ehashman: GitHub didn't allow me to assign the following users: fromanirh.

Note that only kubernetes members, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time.
For more information please see the contributor guide

In response to this:

/assign @fromanirh
/triage accepted

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot added the triage/accepted label and removed the needs-triage label on Mar 10, 2021
@ffromani
Contributor

/assign

@ffromani
Contributor

still working my way towards a reproducer.

@giuseppe
Member

I've opened google/cadvisor#2837 to fix one issue I've seen.

I think we will also need to modify libcontainer to return memory stats from /proc/meminfo for the root cgroup.

/cc @kolyshkin
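
As a rough illustration of that suggestion, a sketch that derives root-cgroup memory figures from /proc/meminfo; the usage formula below is an assumption for illustration, not necessarily what libcontainer ended up reporting (the actual change is linked a few comments down).

package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// meminfo returns the fields of /proc/meminfo, converted from kB to bytes.
func meminfo() (map[string]uint64, error) {
	f, err := os.Open("/proc/meminfo")
	if err != nil {
		return nil, err
	}
	defer f.Close()
	out := map[string]uint64{}
	s := bufio.NewScanner(f)
	for s.Scan() {
		// Lines look like "MemTotal:       16283424 kB".
		fields := strings.Fields(s.Text())
		if len(fields) < 2 {
			continue
		}
		key := strings.TrimSuffix(fields[0], ":")
		kb, err := strconv.ParseUint(fields[1], 10, 64)
		if err != nil {
			continue
		}
		out[key] = kb * 1024
	}
	return out, s.Err()
}

func main() {
	mi, err := meminfo()
	if err != nil {
		panic(err)
	}
	// One plausible definition of root-cgroup memory usage: everything that is
	// neither free nor easily reclaimable. This is an assumption for the sketch.
	usage := mi["MemTotal"] - mi["MemFree"] - mi["Cached"] - mi["Buffers"]
	fmt.Printf("root cgroup memory usage ~ %d bytes\n", usage)
}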

@giuseppe
Member

another fix for cadvisor: google/cadvisor#2839

@giuseppe
Member

for completeness, the patch for runc/libcontainer: opencontainers/runc#2873

@giuseppe
Member

giuseppe commented Jun 2, 2021

this should be fixed with #102483

@giuseppe
Member

giuseppe commented Jun 3, 2021

@harche can we close now?

@odinuge
Member

odinuge commented Jun 4, 2021

@harche can we close now?

Test is still failing. See testgrid: https://testgrid.k8s.io/sig-node-cri-o#pr-crio-cgrpv2-gce-e2e

@giuseppe
Member

giuseppe commented Jun 4, 2021

hm.. it is failing on:

Expected
    <string>: Summary
to match fields: {
.Node.SystemContainers[pods].Memory:
	Expected
	    <string>: MemoryStats
	to match fields: {
	.MajorPageFaults:
		Expected
		    <uint64>: 198
		to be <=
		    <int>: 10
	}
	
}

I'm not sure how to deal with this test.

Should we handle it differently for cgroup v2, or only enable it for cgroup v1?

@kolyshkin @odinuge what do you think?

@odinuge
Member

odinuge commented Jun 4, 2021

I don't know how changes to conformance tests are supposed to be made. The issue in this case is that on cgroup v1, pageFaults and majorPageFaults will always be 0 on kubepods.slice because the counters are not aggregated up the hierarchy there. I'm not sure why the conformance test checks <= 0 instead of just 0. On v2 these values sum the page faults of all descendant processes below the cgroup. The same applies to pods as well, but it looks like there is no test catching that.

Updating the value in the test is the "simplest" fix, but it doesn't address the underlying difference in the metric when switching to cgroup v2 (e.g. if this causes someone's cluster to go wild due to the "extreme" increase in this value).

Who is responsible for these metrics/stats, and who should decide what to do? sig-node? I don't really know what to do..

Maybe @ehashman has some thoughts?
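
To make the hierarchy difference concrete, a small sketch (slice name and path are illustrative) that prints cgroup v1's local counter next to its total_-prefixed hierarchical counterpart; on cgroup v2 the single pgmajfault key already behaves like the latter.

package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

func main() {
	// On cgroup v1, memory.stat carries both a local counter ("pgmajfault",
	// faults charged to this exact cgroup, typically 0 for an intermediate
	// cgroup like kubepods.slice) and a hierarchical one ("total_pgmajfault",
	// summed over all descendants).
	f, err := os.Open("/sys/fs/cgroup/memory/kubepods.slice/memory.stat")
	if err != nil {
		fmt.Println("not a cgroup v1 memory hierarchy:", err)
		return
	}
	defer f.Close()
	s := bufio.NewScanner(f)
	for s.Scan() {
		line := s.Text()
		if strings.HasPrefix(line, "pgmajfault ") || strings.HasPrefix(line, "total_pgmajfault ") {
			fmt.Println(line)
		}
	}
}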

@giuseppe
Member

giuseppe commented Jun 4, 2021

I am not sure how that metric was used on cgroup v1. We could hardcode it to 0 for cgroup v2, but I don't think it is a good idea as it finally seems to make sense :-) Is it just something that should be documented?

@odinuge
Member

odinuge commented Jun 7, 2021

I am not sure how that metric was used on cgroup v1. We could hardcode it to 0 for cgroup v2, but I don't think it is a good idea as it finally seems to make sense :-) Is it just something that should be documented?

No idea myself.

A quick test on my RPi running cgroup v1 gives me this:

$ kubectl proxy &
$ curl localhost:8001/api/v1/nodes/k8s-006/proxy/stats/summary -s | jq | grep pageFaults
          "pageFaults": 0,
          "pageFaults": 11041941570, 
      "pageFaults": 889020,
        "pageFaults": 0,
        "pageFaults": 0,
        "pageFaults": 0,
        "pageFaults": 0,
        "pageFaults": 0,
        "pageFaults": 0,
        "pageFaults": 0,
        "pageFaults": 0,
        "pageFaults": 0,
        "pageFaults": 0,
        "pageFaults": 0,
        "pageFaults": 0,
        "pageFaults": 0,
        "pageFaults": 0,
        "pageFaults": 0,
        "pageFaults": 0,
        "pageFaults": 0,
        "pageFaults": 0,
        "pageFaults": 0,
        "pageFaults": 0,
        "pageFaults": 0,
        "pageFaults": 0,
        "pageFaults": 0,
        "pageFaults": 0,
        "pageFaults": 0,
        "pageFaults": 0,
        "pageFaults": 0,
        "pageFaults": 0,

The first non-null value is kubelet.slice and the other one is the root control group; that is pretty much expected, given that the value is not aggregated hierarchically on cgroup v1 (unlike on cgroup v2).

Not sure who "owns" this.. 🤔

giuseppe added a commit to giuseppe/kubernetes that referenced this issue Jun 9, 2021
On cgroup v2 the reported metric is recursive for the entire cgroup subtree and includes all the sub cgroups.

Adjust the test accordingly.

Closes: kubernetes#99230

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
@giuseppe
Member

giuseppe commented Jun 9, 2021

Thanks for trying that.

Opened a PR for comments: #102726

If that is not the right approach, the only alternative I see is to hardcode the value to 0.

@odinuge
Member

odinuge commented Jun 9, 2021

The git blame shows that this was introduced here: #57802

Do you have any thoughts about this @dashpole?

@odinuge
Member

odinuge commented Jun 9, 2021

Also, the structs were introduced here: #18544

/cc @tallclair

@dashpole
Contributor

dashpole commented Jun 9, 2021

I think changing the test is the right thing to do. It is a breaking change to change the metric, but it wasn't particularly useful before. Also, we won't be able to hide the change to the metric for pod cgroup metrics, so we might as well change it for system containers as well.

@odinuge
Member

odinuge commented Jun 9, 2021

I think changing the test is the right thing to do. It is a breaking change to change the metric, but it wasn't particularly useful before. Also, we won't be able to hide the change to the metric for pod cgroup metrics, so we might as well change it for system containers as well.

Ahh, sounds good. Thanks!
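
For illustration, a minimal sketch of what gating the expectation on the cgroup mode could look like; the IsCgroup2UnifiedMode helper is the runc/libcontainer API used elsewhere in the kubelet, but the bounds are placeholders and this is not necessarily what #102726 ended up doing.

package main

import (
	"fmt"

	libcontainercgroups "github.com/opencontainers/runc/libcontainer/cgroups"
)

func main() {
	// On cgroup v1 the "pods" system container reports essentially no major
	// page faults, so the test can keep a tight bound there; on cgroup v2 the
	// recursive counter can legitimately be large. The numbers below are
	// placeholders, not the values chosen in the real test.
	var upperBound uint64 = 10
	if libcontainercgroups.IsCgroup2UnifiedMode() {
		upperBound = 1000000
	}
	fmt.Println("expect SystemContainers[pods].Memory.MajorPageFaults <=", upperBound)
}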

@borgerli
Contributor

borgerli commented Jun 30, 2021

@harche, I'm trying to port the PR to the 1.20 release we're using with cgroup v2. Could you please give more information about the failing tests, such as detailed failure messages? Thanks very much!

Actually, I get the same error when running the test against both upstream master (with this PR) and 1.20:

upstream

• Failure [81.208 seconds]
[sig-node] ResourceMetricsAPI [NodeFeature:ResourceMetrics]
_output/local/go/src/k8s.io/kubernetes/test/e2e_node/framework.go:23
  when querying /resource/metrics
  _output/local/go/src/k8s.io/kubernetes/test/e2e_node/resource_metrics_test.go:45
    should report resource usage through the resource metrics api [It]                                                                                                                                                                                                                  
    _output/local/go/src/k8s.io/kubernetes/test/e2e_node/resource_metrics_test.go:66

    Timed out after 60.004s.
    Expected
        <string>: KubeletMetrics
    to match keys: {
    ."node_cpu_usage_seconds_total"[]:
        Expected
            <string>: Sample
        to match fields: {
        .Value:
                Expected
                    <model.SampleValue>: 0
                to be >=                                                                                                                                                                                                                                                                
                    <int>: 1
        }
    
    ."node_memory_working_set_bytes"[]:
        Expected
            <string>: Sample
        to match fields: {
        .Value:
                Expected
                    <model.SampleValue>: 0
                to be >=
                    <int64>: 10000000
        }
    
    }

1.20

• Failure [84.187 seconds]
[k8s.io] ResourceMetricsAPI [NodeFeature:ResourceMetrics]
/root/k8s-1.20-src/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e/framework/framework.go:624
  when querying /resource/metrics
  _output/local/go/src/k8s.io/kubernetes/test/e2e_node/resource_metrics_test.go:45
    should report resource usage through the resource metrics api [It]
    _output/local/go/src/k8s.io/kubernetes/test/e2e_node/resource_metrics_test.go:66

    Timed out after 60.000s.
    Expected
        <string>: KubeletMetrics
    to match keys: {
    ."node_cpu_usage_seconds_total"[]:
        Expected
            <string>: Sample
        to match fields: {
        .Value:
                Expected
                    <model.SampleValue>: 0
                to be >=
                    <int>: 1
        }
    
    ."node_memory_working_set_bytes"[]:
        Expected
            <string>: Sample
        to match fields: {
        .Value:
                Expected
                    <model.SampleValue>: 0
                to be >=
                    <int64>: 10000000
        }
    
    }

@harche
Contributor Author

harche commented Jun 30, 2021

@borgerli I see no failures in the cgroup v1 and cgroup v2 jobs for the stats summary test on master.

chenchun pushed a commit to chenchun/kubernetes that referenced this issue Mar 20, 2024
On cgroup v2 the reported metric is recursive for the entire cgroup subtree and includes all the sub cgroups.

Adjust the test accordingly.

Closes: kubernetes#99230

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>