Kubelet: Cleanup returns early when there is busy orphaned pod directory. #29078

Closed
@Random-Liu

Description

Today I found that the mirror pod node e2e test never passes on my desktop. When the static pod file is removed, the mirror pod on the apiserver is expected to be removed as well, but that did not happen.

It turns out that my machine has an orphaned pod directory, possibly left over from a previous run:

$ ls /var/lib/kubelet/pods
3feec45e-4bc0-11e6-bea0-8cdcd43ac064
$ sudo ls /var/lib/kubelet/pods/3feec45e-4bc0-11e6-bea0-8cdcd43ac064/volumes
kubernetes.io~empty-dir

The pod directory was never successfully deleted:

Failed to remove orphaned pod "3feec45e-4bc0-11e6-bea0-8cdcd43ac064" dir; err: remove /var/lib/kubelet/pods/3feec45e-4bc0-11e6-bea0-8cdcd43ac064/volumes/kubernetes.io~empty-dir/restart-count: device or resource busy

The output of mount:

$ mount
...
tmpfs on /var/lib/kubelet/pods/3feec45e-4bc0-11e6-bea0-8cdcd43ac064/volumes/kubernetes.io~empty-dir/restart-count type tmpfs (rw)

The output of fuser:

$ sudo fuser -v /var/lib/kubelet/pods/3feec45e-4bc0-11e6-bea0-8cdcd43ac064/volumes/kubernetes.io~empty-dir/restart-count
                     USER        PID ACCESS COMMAND
/var/lib/kubelet/pods/3feec45e-4bc0-11e6-bea0-8cdcd43ac064/volumes/kubernetes.io~empty-dir/restart-count:
                     root     kernel mount /var/lib/kubelet/pods/3feec45e-4bc0-11e6-bea0-8cdcd43ac064/volumes/kubernetes.io~empty-dir/restart-count
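
The mount and fuser output suggest that restart-count is still an active tmpfs mount, which would explain the "device or resource busy" error on removal. Below is a minimal Go sketch of how mount points under a pod directory could be detected before deletion is attempted. The function name mountPointsUnder and the /proc/mounts-style input are assumptions made for illustration; this is not the kubelet's code.

```go
package main

import (
	"fmt"
	"strings"
)

// mountPointsUnder is a hypothetical helper. It returns the mount points
// listed in mountTable that are equal to dir or nested below it.
// mountTable is assumed to use the /proc/mounts layout:
//   <device> <mount point> <fstype> <options> <dump> <pass>
// Escaped characters in paths (for example \040 for a space) are not decoded.
func mountPointsUnder(mountTable, dir string) []string {
	dir = strings.TrimSuffix(dir, "/")
	var found []string
	for _, line := range strings.Split(mountTable, "\n") {
		fields := strings.Fields(line)
		if len(fields) < 2 {
			continue // blank or malformed line
		}
		mp := fields[1]
		if mp == dir || strings.HasPrefix(mp, dir+"/") {
			found = append(found, mp)
		}
	}
	return found
}

func main() {
	// Sample table modelled on the mount output in this report.
	table := "proc /proc proc rw 0 0\n" +
		"tmpfs /var/lib/kubelet/pods/3feec45e-4bc0-11e6-bea0-8cdcd43ac064/volumes/kubernetes.io~empty-dir/restart-count tmpfs rw 0 0\n"
	podDir := "/var/lib/kubelet/pods/3feec45e-4bc0-11e6-bea0-8cdcd43ac064"
	for _, mp := range mountPointsUnder(table, podDir) {
		fmt.Println("still mounted:", mp)
	}
}
```

A cleanup routine could use such a check to unmount, or at least skip and report, a pod directory that still contains mounts instead of failing on it repeatedly.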

After the delete fails, the kubelet cleanup function returns immediately: https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/kubelet.go#L2077.
Because the error is permanent, the mirror pod cleanup code that follows never runs: https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/kubelet.go#L2082.

  1. In this case, the kubelet should continue with the rest of the cleanup process.
  2. Pod directory cleanup should handle this kind of orphaned pod directory.
  3. Why are busy volumes left behind and never cleaned up?
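
To illustrate point 1, here is a minimal Go sketch of a cleanup loop that records a failure and keeps going instead of returning on the first error. The names cleanupStep, runAllCleanup, and errBusy are hypothetical and exist only for this example; the real kubelet cleanup function is structured differently.

```go
package main

import (
	"errors"
	"fmt"
)

// errBusy stands in for the "device or resource busy" failure from this report.
var errBusy = errors.New("device or resource busy")

// cleanupStep is a hypothetical stand-in for one stage of pod cleanup.
type cleanupStep struct {
	name string
	run  func() error
}

// runAllCleanup runs every step in order, even when an earlier step fails,
// and returns all errors it collected (nil if every step succeeded).
func runAllCleanup(steps []cleanupStep) []error {
	var errs []error
	for _, s := range steps {
		if err := s.run(); err != nil {
			// Record the failure and continue with the remaining steps.
			errs = append(errs, fmt.Errorf("%s: %w", s.name, err))
		}
	}
	return errs
}

func main() {
	mirrorCleaned := false
	steps := []cleanupStep{
		{name: "remove orphaned pod dirs", run: func() error { return errBusy }},
		{name: "delete orphaned mirror pods", run: func() error {
			mirrorCleaned = true
			return nil
		}},
	}
	errs := runAllCleanup(steps)
	fmt.Println("mirror pod cleanup ran:", mirrorCleaned)
	for _, err := range errs {
		fmt.Println("cleanup error:", err)
	}
}
```

With this shape, a permanently busy pod directory is still reported on every pass, but it no longer blocks the mirror pod cleanup that comes after it.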

@yujuhong @saad-ali
/cc @kubernetes/sig-node @kubernetes/sig-storage

Metadata

Labels

area/kubelet, area/reliability, kind/bug, sig/node