
error setting cgroup config for procHooks process: ... cpu.weight: no such file or directory #1395

Closed
tijmenvandenbrink opened this issue Mar 18, 2024 · 8 comments
Labels
kind/bug Something isn't working

Comments

@tijmenvandenbrink

Description

We're experiencing issues with Flatcar versions that run kernel 6.x and Docker 24.x (i.e. the latest stable (3815.2.0) and latest beta (3850.1.0)). Containers fail to start because the cgroup config can't be set (specifically cpu.weight). See the following error:

Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error setting cgroup config for procHooks process: openat2 /sys/fs/cgroup/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod23bac652_98ba_4c80_a7a6_3e979420831f.slice/docker-47487ca594daa6ae9887f4781a115fc886a392efe4d83876bd95f8c4d8b6ca01.scope/cpu.weight: no such file or directory: unknown\"" pod="somenamespace/k8s-event-logger-bb99597b5-5cl2g" podUID=23bac652-98ba-4c80-a7a6-3e979420831f

Context

Our setup is as follows:

  • Rancher 2.8.2
  • RKE 1.5.5
  • Kubernetes 1.26 / 1.27
  • Flatcar 3602.2.3 trying to upgrade to latest stable (3815.2.0) or latest beta (3850.1.0)

Some things to mention:

  • Rancher uses cri-dockerd in its kubelet container to communicate with Docker.
  • We verified we're using cgroup v2 (see the check below).
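
A quick way to confirm this (a minimal check, not part of the original report; on cgroup v2 the filesystem type of /sys/fs/cgroup is cgroup2fs):

stat -fc %T /sys/fs/cgroup/
# prints "cgroup2fs" on a unified (cgroup v2) hierarchy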

Impact

Because of this issue, the kubelet is not able to start containers on the node and the node ends up in a faulty state.

Environment and steps to reproduce

  1. Set-up:

Our setup is as follows:

  • Rancher 2.8.2
  • RKE 1.5.5
  • Kubernetes 1.26 and 1.27
  • Flatcar 3850.1.0 (Beta)
  2. Task: After upgrading Flatcar 3602.2.3 to the latest stable (3815.2.0) or latest beta (3850.1.0), the node ends up in a faulty state and is unable to schedule pods.

  3. Action(s): see the steps below.

  4. Error: see the error message quoted under Description above.

  1. Start Rancher:

$ sudo docker run --privileged -d --restart=unless-stopped -p 80:80 -p 443:443 rancher/rancher
  2. Get the password:
docker logs <container> 2>&1 | grep "Bootstrap Password:"
  3. Log in to the UI and provide the URL you want Rancher to listen on. I used https://rancher. You will later need to add an entry to /etc/hosts to point rancher to the IP address of the container running Rancher.
  4. Toggle the RKE1 vs RKE2/K3s switch so an RKE1 cluster is created.
  5. Go to Cluster Management and click Create.
  6. Select Custom (use existing nodes and create a cluster using RKE).
  7. Provide a name and click Next.
  8. Tick all node roles (etcd, controlplane, worker).
  9. Copy the generated command, which you'll need later in the Ignition config.
  10. Click Done. The generated command looks like this:
sudo docker run -d --privileged --restart=unless-stopped --net=host -v /etc/kubernetes:/etc/kubernetes -v /var/run:/var/run  rancher/rancher-agent:v2.8.2 --server https://rancher --token n8pvx9t9qwfnjdg7jxhzl2m4qcbxqrjrs2kt2m4zgb4768kh8sfs49 --ca-checksum 83e459403bf410e1bebe2eb69d0d9abdae75b8253de8de86b5b6b6e3f632566a --etcd --controlplane --worker
  11. Put the command from the previous step into the rancher_agent.service unit in the Ignition config below, and use that config.json to start a Flatcar node:
{
    "ignition": {
        "config": {},
        "security": {
            "tls": {}
        },
        "timeouts": {},
        "version": "2.3.0"
    },
    "passwd": {
        "users": [
            {
                "name": "core",
                "passwordHash": "provide-some-password-hash"
            },
            {
                "groups": [
                    "wheel",
                    "docker",
                    "sudo"
                ],
                "homeDir": "/home/rke",
                "name": "rke",
                "sshAuthorizedKeys": [],
                "shell": "/bin/bash"
            }
        ]
    },
    "storage": {
        "directories": [
            {
                "filesystem": "root",
                "path": "/etc/systemd/system/docker.service.d",
                "mode": 493
            },
            {
                "filesystem": "root",
                "path": "/etc/modprobe.d",
                "mode": 493
            }
        ],
        "files": [
            {
                "filesystem": "oem",
                "path": "/grub.cfg",
                "contents": {
                    "source": "data:,set%20oem_id%3D%22vmware%22%0Aset%20linux_append%3D%22%22%0A",
                    "verification": {}
                },
                "mode": 420
            },
            {
                "filesystem": "root",
                "path": "/etc/hostname",
                "contents": {
                    "source": "data:,node-01",
                    "verification": {}
                },
                "mode": 420
            },
            {
                "filesystem": "root",
                "path": "/etc/docker/daemon.json",
                "contents": {
                    "source": "data:,%7B%0A%20%20%22log-driver%22:%20%22json-file%22,%0A%20%20%22log-opts%22:%20%7B%0A%20%20%20%20%22max-size%22:%20%2210m%22,%0A%20%20%20%20%22max-file%22:%20%223%22%0A%20%20%7D,%0A%20%20%22default-ulimits%22:%20%7B%0A%20%20%20%20%22nofile%22:%20%7B%0A%20%20%20%20%20%20%22Name%22:%20%22nofile%22,%0A%20%20%20%20%20%20%22Hard%22:%2065536,%0A%20%20%20%20%20%20%22Soft%22:%2065536%0A%20%20%20%20%7D,%0A%20%20%20%20%22nproc%22:%20%7B%0A%20%20%20%20%20%20%22Name%22:%20%22nproc%22,%0A%20%20%20%20%20%20%22Hard%22:%204096,%0A%20%20%20%20%20%20%22Soft%22:%204096%0A%20%20%20%20%7D%0A%20%20%7D,%0A%20%20%22bip%22:%20%22172.31.0.1/16%22%0A%7D",
                    "verification": {}
                },
                "mode": 420
            },
            {
                "filesystem": "root",
                "path": "/etc/sysctl.d/10-ipv6-disable.conf",
                "contents": {
                    "source": "data:,net.ipv6.conf.all.disable_ipv6%20%3D%201%0Anet.ipv6.conf.default.disable_ipv6%20%3D%201%0Anet.ipv6.conf.lo.disable_ipv6%20%3D%201%0A",
                    "verification": {}
                },
                "mode": 416
            },
            {
                "filesystem": "root",
                "path": "/boot/flatcar/hardening.sh",
                "contents": {
                    "source": "data:,%23!%2Fbin%2Fsh%0A%0A%23%20etcd%20user%2Fgroup%20fix%0Agrep%20-q%20%22%5Eetcd%3A%22%20%2Fetc%2Fgroup%20%7C%7C%20echo%20%22etcd%3Ax%3A52034%3A%22%20%3E%3E%20%2Fetc%2Fgroup%0Agrep%20-q%20%22%5Eetcd%3A%22%20%2Fetc%2Fpasswd%20%7C%7C%20echo%20%22etcd%3Ax%3A52034%3A52034%3A%3A%2Fdev%2Fnull%3A%2Fsbin%2Fnologin%22%20%3E%3E%20%2Fetc%2Fpasswd%0A%5B%5B%20-d%20%2Fopt%2Frke%2Fvar%2Flib%2Fetcd%20%5D%5D%20%26%26%20chown%20-R%20etcd%3Aetcd%20%2Fopt%2Frke%2Fvar%2Flib%2Fetcd%0Aexit%200%0A",
                    "verification": {}
                },
                "mode": 448
            },
            {
                "filesystem": "root",
                "path": "/etc/systemd/system/docker.service.d/override.conf",
                "contents": {
                    "source": "data:,%5BService%5D%0AEnvironment%3DTORCX_IMAGEDIR%3D%2Fdocker%20DOCKER_SELINUX%3D--selinux-enabled%3Dfalse%0AExecStartPre%3D%2Fbin%2Fbash%20-c%20'%2Fusr%2Fbin%2Fecho%20N%20%3E%20%2Fsys%2Fmodule%2Foverlay%2Fparameters%2Fredirect_dir'%0AExecStartPre%3D%2Fbin%2Fbash%20-c%20'%2Fusr%2Fbin%2Fecho%20N%20%3E%20%2Fsys%2Fmodule%2Foverlay%2Fparameters%2Fmetacopy'%0A",
                    "verification": {}
                },
                "mode": 420
            },
            {
                "filesystem": "root",
                "path": "/etc/profile.d/auto-logout.sh",
                "contents": {
                    "source": "data:,export%20TMOUT%3D600%0A",
                    "verification": {}
                },
                "mode": 420
            },
            {
                "filesystem": "root",
                "path": "/etc/multipath.conf",
                "contents": {
                    "source": "data:,defaults%20%7B%0A%20%20user_friendly_names%20yes%0A%20%20find_multipaths%20no%0A%7D%0A",
                    "verification": {}
                },
                "mode": 420
            },
            {
                "filesystem": "root",
                "path": "/etc/modprobe.d/disable_overlay_redirect_dir.conf",
                "contents": {
                    "source": "data:,options%20overlay%20redirect_dir%3Doff",
                    "verification": {}
                },
                "mode": 420
            },
            {
                "filesystem": "root",
                "path": "/etc/coreos/update.conf",
                "contents": {
                    "source": "data:,GROUP%3Dbeta%0A",
                    "verification": {}
                },
                "mode": 420
            }
        ],
        "filesystems": [
            {
                "mount": {
                    "device": "/dev/disk/by-label/OEM",
                    "format": "ext4",
                    "label": "OEM"
                },
                "name": "oem"
            }
        ],
        "links": [
            {
                "filesystem": "root",
                "path": "/etc/localtime",
                "target": "/usr/share/zoneinfo/Europe/Amsterdam"
            },
            {
                "filesystem": "root",
                "path": "/etc/systemd/system/multi-user.target.wants/docker.service",
                "target": "/run/systemd/system/docker.service"
            }
        ]
    },
    "systemd": {
        "units": [
            {
                "enabled": false,
                "name": "sshd.socket"
            },
            {
                "enabled": true,
                "name": "ntpd.service"
            },
            {
                "enabled": true,
                "name": "docker.service"
            },
            {
                "enabled": true,
                "name": "update-engine.service"
            },
            {
                "enabled": true,
                "name": "multipathd.service"
            },
            {
                "enabled": true,
                "name": "iscsid.service"
            },
            {
                "mask": true,
                "name": "locksmithd.service"
            },
            {
                "contents": "[Unit]\nDescription=Start Rancher Agent\nConditionPathExists=!/boot/flatcar/rancher_agent_firstboot\nAfter=network.target\n\n[Service]\nType=simple\nExecStart=/usr/bin/docker run -d --privileged --restart=unless-stopped --net=host -v /etc/kubernetes:/etc/kubernetes -v /var/run:/var/run  rancher/rancher-agent:v2.8.2 --server https://rancher --token lmn9cjlq5g7tl5c6rr5pwk6rhtsmbxhk99gphkf9q7vh7nf5s2spv7 --ca-checksum 85e6882c37f6bbc4994c9d523231fe59d19b48adbe99b802919b8a67efddbf51 --etcd --controlplane --worker \nRemainAfterExit=true\nExecStartPost=touch /boot/flatcar/rancher_agent_firstboot\nRestart=on-failure\nRestartSec=30\n\n[Install]\nWantedBy=multi-user.target\n",
                "enabled": true,
                "name": "rancher_agent.service"
            },
            {
                "contents": "[Unit]\nDescription=Hardening flatcar\nAfter=rancher_agent.service\n\n[Service]\nType=simple\nExecStart=/boot/flatcar/hardening.sh\n\n[Install]\nWantedBy=multi-user.target\n",
                "enabled": true,
                "name": "hardening.service"
            }
        ]
    }
}
  12. I'm using QEMU to start the node: ./flatcar_production_qemu.sh -i config.json -nographic -smp 4 -m 4096

  13. When the node has booted, alter /etc/hosts so the URL provided in step 3 resolves correctly, for example as shown below.
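
A hypothetical entry (substitute the IP address of the host running the rancher container):

echo '192.0.2.10 rancher' | sudo tee -a /etc/hosts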

  14. This part takes a while (depending on the resources you allocated to the machine). You can follow the bootstrap progress with journalctl -u rancher_agent.service -f, and later by looking at the docker logs of the rancher_agent container on the Flatcar node and of the rancher container; see the commands sketched below.
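
A sketch of the commands involved (container names are placeholders for the actual names on your hosts):

journalctl -u rancher_agent.service -f        # on the Flatcar node
docker logs -f <rancher_agent-container>      # on the Flatcar node, once the agent container is running
docker logs -f <rancher-container>            # on the host running the rancher container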

  15. After a while you should have a functioning one-node cluster. (Note that, due to what I think were resource constraints, my node reached the Active state but an issue prevented it from being 100% usable. The next couple of steps are therefore not tested locally, but they are tested in our environment.)

  16. Verify that pods can be scheduled and that you don't see messages like this: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error setting cgroup config for procHooks process: openat2 /sys/fs/cgroup/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod23bac652_98ba_4c80_a7a6_3e979420831f.slice/docker-47487ca594daa6ae9887f4781a115fc886a392efe4d83876bd95f8c4d8b6ca01.scope/cpu.weight: no such file or directory: unknown\"" pod="somenamespace/k8s-event-logger-bb99597b5-5cl2g" podUID=23bac652-98ba-4c80-a7a6-3e979420831f

  17. Upgrade the Flatcar node. If you set GROUP=beta in /etc/flatcar/update.conf and run update_engine_client -update, the upgrade will start; a sketch follows below.
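
A minimal sketch, assuming the node should follow the Beta channel:

echo 'GROUP=beta' | sudo tee /etc/flatcar/update.conf
sudo update_engine_client -update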

  18. Reboot the node.

  19. After the reboot you will start to see the errors. You can verify that cpu.weight is not present in any of the slices under /sys/fs/cgroup/* and that containers won't start; see the check sketched below.
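
One way to check (a minimal sketch; adjust the glob to the slices you care about):

for d in /sys/fs/cgroup/*.slice; do
  test -f "$d/cpu.weight" || echo "cpu.weight missing in $d"
done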

I'm not sure this is related, but as of Linux kernel 6.6, CFS seems to have been replaced by EEVDF. See here:

Among the core changes introduced in this release, one of particular interest is the replacement of the CFS scheduler with the earliest eligible virtual deadline first (EEVDF) CPU scheduler. EEVDF is also a virtual-time scheduler, but in contrast to CFS, which only uses one weight parameter, it employs two parameters: relative deadline and weight (see [LWN article](https://lwn.net/Articles/925371/) for more details). EEVDF has a better-defined scheduling policy; it removes a lot of the CFS heuristics and results in fewer knobs. Even though this scheduler offers improved performance and fairness, rare performance regressions are expected with some adversarial workloads; efforts to address regressions are ongoing and will continue post-release. This kernel version also significantly improves the memory efficiency of the tracing subsystem, with eventfs now assigning inodes and dentries structures needed for tracepoints only when tracing is actually used.
@ader1990

Hello,

This issue looks interesting, as cpu.weight seems to be present in my environment with Flatcar 3908 and 3850 on ARM64/AMD64.

 cat config |grep -i CONFIG_SCHED_
# CONFIG_SCHED_CORE is not set
CONFIG_SCHED_MM_CID=y
CONFIG_SCHED_AUTOGROUP=y
# CONFIG_SCHED_MC is not set
# CONFIG_SCHED_CLUSTER is not set
CONFIG_SCHED_SMT=y
CONFIG_SCHED_HRTICK=y
CONFIG_SCHED_STACK_END_CHECK=y
CONFIG_SCHED_DEBUG=y
CONFIG_SCHED_INFO=y
# CONFIG_SCHED_TRACER is not set
sh-5.2# ls -la /sys/fs/cgroup/*.slice/ |grep -i weight
-rw-r--r--.  1 root root 0 Mar 14 15:31 cpu.weight
-rw-r--r--.  1 root root 0 Mar 14 15:31 cpu.weight.nice
-rw-r--r--.  1 root root 0 Mar 14 15:31 io.bfq.weight
-rw-r--r--.  1 root root 0 Mar 14 15:30 cpu.weight
-rw-r--r--.  1 root root 0 Mar 14 15:31 cpu.weight.nice
-rw-r--r--.  1 root root 0 Mar 14 15:30 io.bfq.weight
-rw-r--r--.  1 root root 0 Mar 14 15:30 cpu.weight
-rw-r--r--.  1 root root 0 Mar 14 15:30 cpu.weight.nice
-rw-r--r--.  1 root root 0 Mar 14 15:30 io.bfq.weight
sh-5.2#
sh-5.2# uname -a
Linux sut01-altra 6.6.17-flatcar #1 SMP PREEMPT Thu Mar 14 13:20:23 -00 2024 aarch64 GNU/Linux
sh-5.2# cat /etc/os-release
NAME="Flatcar Container Linux by Kinvolk"
ID=flatcar
ID_LIKE=coreos
VERSION=3908.0.0+nightly-20240313-2100-50-g11449d2458
VERSION_ID=3908.0.0
BUILD_ID=nightly-20240313-2100-50-g11449d2458
SYSEXT_LEVEL=1.0
PRETTY_NAME="Flatcar Container Linux by Kinvolk 3908.0.0+nightly-20240313-2100-50-g11449d2458 (Oklo)"
ANSI_COLOR="38;5;75"
HOME_URL="https://flatcar.org/"
BUG_REPORT_URL="https://issues.flatcar.org"
FLATCAR_BOARD="arm64-usr"
CPE_NAME="cpe:2.3:o:flatcar-linux:flatcar_linux:3908.0.0+nightly-20240313-2100-50-g11449d2458:*:*:*:*:*:*:*"

I'll try to reproduce your workflow to see if the issue reproduces in my local environment.

@jepio
Member

jepio commented Mar 18, 2024

Thanks for these instructions, super helpful.

This is an insane issue that I can't figure out yet. The following Ignition JSON reproduces it on 3850.1.0, but I can't figure out why:

{
    "ignition": {
        "config": {},
        "security": {
            "tls": {}
        },
        "timeouts": {},
        "version": "2.3.0"
    },
    "systemd": {
        "units": [
            {
                "enabled": true,
                "name": "multipathd.service"
            },
            {
                "mask": true,
                "name": "locksmithd.service"
            }
        ]
    }
}

@ader1990

ader1990 commented Mar 18, 2024

I reproduced the full environment with Rancher. Following up on @jepio's findings, I found by trial and error that if you stop the multipathd systemd unit, even for just a moment, everything works.

To reproduce the issue, you can just enable and start the multipathd service and run:

echo '+cpu' >> /sys/fs/cgroup/cgroup.subtree_control
-bash: echo: write error: Invalid argument

From https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=931243, it looks like the cpu cgroup v2 controller cannot be enabled while any real-time priority process is running.
I then checked the multipathd process and, voilà, it is running with real-time priority:

ps -eo command,rtprio|grep -i multipathd
/sbin/multipathd -d -s          99
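
To double-check the causal link (my own sketch, based on the observation above that stopping multipathd briefly is enough):

systemctl stop multipathd.service
echo '+cpu' >> /sys/fs/cgroup/cgroup.subtree_control   # succeeds once no real-time process is running
systemctl start multipathd.service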

@jepio
Member

jepio commented Mar 18, 2024

I need to verify what the scheduler config was in previous Flatcar versions, but the best workaround I can think of is to disable realtime priority for multipathd (assuming you depend on it) by adding a drop-in:

# /etc/systemd/system/multipathd.service.d/override.conf
[Service]
RestrictRealtime=yes
Nice=-20

@jepio
Member

jepio commented Mar 18, 2024

This will likely be the fix: flatcar/scripts#1771

@jepio
Member

jepio commented Mar 19, 2024

I cherry-picked the fix to all branches; it won't be part of this week's release (#1391), only the one after.

@jepio jepio closed this as completed Mar 19, 2024
@tijmenvandenbrink
Author

@jepio, would it be possible to get it into this release? This prevents users of multipathd from upgrading, so they are also missing the runc CVE fix. It would be much appreciated.

@jepio
Member

jepio commented Mar 20, 2024

Unfortunately not - the release is already in progress and delayed a week from when it should have happened.

Sorry that you're blocked from upgrading. You can apply the fix to your nodes manually before the upgrade: create /etc/systemd/system/multipathd.service.d/override.conf with the contents:

[Service]
RestrictRealtime=yes
Nice=-20
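
One way to apply it (a sketch using the path given above):

sudo mkdir -p /etc/systemd/system/multipathd.service.d
sudo tee /etc/systemd/system/multipathd.service.d/override.conf <<'EOF'
[Service]
RestrictRealtime=yes
Nice=-20
EOF
sudo systemctl daemon-reload
sudo systemctl restart multipathd.service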
