
fix 111933 #112012

Closed
wants to merge 1 commit

Conversation

cvvz
Member

@cvvz cvvz commented Aug 24, 2022

What type of PR is this?

/kind bug

What this PR does / why we need it:

Fixes #111933: do not check the existence of the mount point during reconstruction. If the mount point does not exist, reconstruction still succeeds and the reconstructed volume is added to the ASW, where it will be unmounted and cleaned up in the subsequent reconcile.
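As a rough illustration of the idea (a minimal sketch with simplified, partly hypothetical names, not the actual kubelet reconciler code), reconstruction stops treating a missing mount point as a fatal error and lets the volume reach the ASW so normal teardown can handle it:

```go
package main

import (
	"fmt"
	"os"

	mount "k8s.io/mount-utils"
)

// reconstructedVolume is a simplified stand-in for the volume manager's
// reconstructed-volume record (hypothetical fields, not the real struct).
type reconstructedVolume struct {
	volumeName string
	mountPath  string
}

// reconstructVolume sketches the behaviour this PR proposes: a missing mount
// point no longer aborts reconstruction; the volume is still returned so it
// can be added to the ASW and torn down by the next reconcile pass.
func reconstructVolume(m mount.Interface, volumeName, mountPath string) (*reconstructedVolume, error) {
	notMnt, err := m.IsLikelyNotMountPoint(mountPath)
	if err != nil && !os.IsNotExist(err) {
		return nil, err
	}
	if notMnt || os.IsNotExist(err) {
		// Before this PR, reconstruction returned an error here, which left
		// vol_data.json and the pod volume directory orphaned when the CSI
		// driver was not (yet) available to unmount anything.
		fmt.Printf("mount point %s missing; reconstructing anyway for later cleanup\n", mountPath)
	}
	return &reconstructedVolume{volumeName: volumeName, mountPath: mountPath}, nil
}

func main() {
	vol, err := reconstructVolume(mount.New(""), "pvc-123",
		"/var/lib/kubelet/pods/<POD_UID>/volumes/kubernetes.io~csi/pvc-123/mount")
	fmt.Println(vol, err)
}
```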

Which issue(s) this PR fixes:

Fixes #111933

Special notes for your reviewer:

Does this PR introduce a user-facing change?


Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Aug 24, 2022
@k8s-ci-robot
Contributor

@cvvz: This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Aug 24, 2022
@k8s-ci-robot
Contributor

Hi @cvvz. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-priority Indicates a PR lacks a `priority/foo` label and requires one. label Aug 24, 2022
@k8s-ci-robot k8s-ci-robot added sig/storage Categorizes an issue or PR as relevant to SIG Storage. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Aug 24, 2022
@yangjunmyfm192085
Contributor

/release-note-none

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Aug 24, 2022
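For context, the review thread below is attached to this excerpt from the CSI unmounter's TearDownAt; in this revision of the PR, removeMountDir (which also removes vol_data.json) runs before the NodeUnpublishVolume call: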
if err := removeMountDir(c.plugin, dir); err != nil {
return errors.New(log("Unmounter.TearDownAt failed to clean mount dir [%s]: %v", dir, err))
}

volID := c.volumeID
Member

This does not seem correct: if NodeUnpublish fails, subsequent calls to NodeUnpublish won't even start, because the directory and the JSON file have already been removed.

I do not think fixing it here is the right solution; we need to think about it from the perspective of how cleanup should work if the driver is not yet available or the volume is not yet mounted. IMO, the fix has to be made in the reconstruction code, not here.

Member Author

@cvvz cvvz Aug 24, 2022

Actually, the CSI mount directory will be removed anyway by the housekeeping handler in removeOrphanedPodVolumeDirs; that function will even remove the entire pod volume dir, /var/lib/kubelet/pods/<POD_UID>/volumes. The reason the directory was not deleted and was left over is that there is a file called vol_data.json inside it, and the function can only delete directories; when any file exists, it returns an error. So here I put removeMountDir before the call to the CSI driver to make sure vol_data.json is cleaned up whether NodeUnpublish succeeds or not. This way, the orphaned pod volume will eventually be cleaned up by the housekeeping handler.

Anyway, the goal is to clean up the orphaned pod volume; if we can do that in this simple way, there is no need to make things more complicated.
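To illustrate the failure mode described above, here is a minimal, standalone sketch (not the real removeall package) of a directories-only cleanup that is blocked by a single leftover file such as vol_data.json:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// removeDirsOnly mimics the housekeeping behaviour described above: it walks
// the pod volume directory and deletes empty directories, but refuses to touch
// regular files, so a leftover vol_data.json keeps the whole tree alive.
func removeDirsOnly(root string) error {
	entries, err := os.ReadDir(root)
	if err != nil {
		return err
	}
	for _, e := range entries {
		p := filepath.Join(root, e.Name())
		if !e.IsDir() {
			return fmt.Errorf("expected only directories, found file %q", p)
		}
		if err := removeDirsOnly(p); err != nil {
			return err
		}
	}
	return os.Remove(root)
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: removedirs <dir>")
		return
	}
	// e.g. /var/lib/kubelet/pods/<POD_UID>/volumes with a stray vol_data.json inside
	if err := removeDirsOnly(os.Args[1]); err != nil {
		fmt.Println("cleanup blocked:", err)
	}
}
```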

Member

I agree with @gnufied, this doesn't seem like the right place to fix it.
What happens when the directory is still mounted? Removal of the directory would then fail, and with this change TearDownAt would return an error before calling NodeUnpublishVolume to unmount the volume, so it would never get unmounted. I think this breaks the main use case (unmount the volume and then remove the mountpoint).

Member Author

@cvvz cvvz Aug 24, 2022

Actually, when the directory is still mounted, removeMountDir won't return an error; the volume is still unmounted by the CSI driver first. Every cleanup action in kubelet should consider the mount state in the first place to ensure safety.

Considering that the CSI driver is responsible for unmounting and cleaning the mount directory, as also mentioned in the comment, maybe we should only remove vol_data.json here, which should be done by kubelet, rather than remove the mount dir.

Member Author

@cvvz cvvz Sep 2, 2022

@gnufied @dobsonj I agree that we should not break the original logic of TearDownAt, and I have updated the PR. This time I put the deletion of the volume data file in the kubelet housekeeping process, which is responsible for cleaning up orphaned pod volumes. Could you PTAL?

@k8s-ci-robot k8s-ci-robot added area/kubelet sig/node Categorizes an issue or PR as relevant to SIG Node. labels Sep 2, 2022
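For reference, the review thread below is attached to this hunk from the updated revision, which switches the kubelet housekeeping cleanup from the directories-only removal to a variant that takes a per-file callback: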
  podVolDir := kl.getPodVolumesDir(uid)
- if err := removeall.RemoveDirsOneFilesystem(kl.mounter, podVolDir); err != nil {
+ if err := removeall.RemoveAllOneFilesystemCommon(kl.mounter, podVolDir, func(path string) error {
Member

@dobsonj dobsonj Sep 2, 2022

This change actually re-introduces a rather serious race condition where vol_data.json may be deleted by removeOrphanedPodVolumeDirs before NodeUnstageVolume has a chance to run, which causes unmount to fail and leaves kubelet unable to clean the mountpoints on the node: #101911
So this is not the right place to remove vol_data.json; it will result in unmounts failing in some scenarios.
You may want to take a look at the volume reconstruction code in volumemanager. This is a tough bug in a complex part of the code, and the fix is not obvious to me, but when I saw it before, syncStates failed here because the /mount subdir did not exist. It then calls cleanupMounts, but that failed because the volume could not be reconstructed and the unmounter could not be created at that point. If you're able to reproduce the problem, you can check whether you're seeing the same failure scenario in kubelet.log.
IMO, we either need to be able to reconstruct the volume so it can be unmounted in this scenario, or at least find some way to create the unmounter so that cleanupMounts can remove vol_data.json even when the /mount subdir does not exist. Sorry if this direction is still a little vague, but that's as far as I've been able to dig into this.

Member Author

@cvvz cvvz Sep 4, 2022

@dobsonj Thank you so much for the explanation. I am excited that there are still so many things I didn't know 😄, and I will dig into it again to find the right solution.

Member Author

@cvvz cvvz Sep 5, 2022

@dobsonj @gnufied Actually, not only is the volume directory /mount left over, the /globalmount directory is left over as well, because in this scenario the volume manager performed neither NodeUnstageVolume nor NodeUnpublishVolume:

During reconstruction, the volume directory /mount is not a mount point, so the volume manager tries to clean up by calling NodeUnpublishVolume and removing the /mount directory and vol_data.json, but the CSI driver has not registered yet, so the cleanup fails. In fact, whether it succeeds or fails, NodeUnstageVolume won't be called, so /globalmount will be left over.

I think there may be two solutions:

  1. Let reconstruction succeed whether /mount is a mount point or not. After the volume is added to the ASW, the volume manager's reconcile process will perform NodeUnpublishVolume and NodeUnstageVolume later. However, I wonder whether there is a special reason for making reconstruction fail when the mount point does not exist.
  2. When the unmounter executes unmounting, and the directory is already not a mount point, clean up the directory and vol_data.json without calling NodeUnpublishVolume and NodeUnstageVolume (see the sketch after this list).
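A rough sketch of option 2 (illustrative names and a standalone helper, not the actual CSI unmounter code): if the target is no longer a mount point, skip the CSI call and only remove the kubelet-owned leftovers.

```go
package csisketch

import (
	"os"

	mount "k8s.io/mount-utils"
)

// TearDownAtSketch illustrates option 2 above: if the target directory is no
// longer a mount point, skip the CSI NodeUnpublishVolume call and only remove
// the leftover directory and vol_data.json; otherwise unmount first, as today.
func TearDownAtSketch(m mount.Interface, targetDir, volDataFile string, nodeUnpublish func() error) error {
	notMnt, err := m.IsLikelyNotMountPoint(targetDir)
	if err != nil && !os.IsNotExist(err) {
		return err
	}
	if err == nil && !notMnt {
		// Still mounted: let the CSI driver unmount first, as in the current flow.
		if err := nodeUnpublish(); err != nil {
			return err
		}
	}
	// Not (or no longer) a mount point: remove kubelet-owned state only.
	if err := os.Remove(volDataFile); err != nil && !os.IsNotExist(err) {
		return err
	}
	if err := os.Remove(targetDir); err != nil && !os.IsNotExist(err) {
		return err
	}
	return nil
}
```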

WDYT? Which one is better or any other idea?

@cvvz cvvz requested review from gnufied and dobsonj and removed request for xing-yang, verult and gnufied September 6, 2022 23:29
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: cvvz
Once this PR has been reviewed and has the lgtm label, please assign saad-ali for approval by writing /assign @saad-ali in a comment. For more information see: The Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@cvvz
Member Author

cvvz commented Sep 7, 2022

@dobsonj @gnufied Hi, I've updated the PR, PTAL.

The latest solution is to not check the existence of the mount point during reconstruction. If the mount point does not exist, reconstruction still succeeds and the reconstructed volume is added to the ASW, where it will be unmounted and cleaned up in the subsequent reconcile.

So why did reconstruction previously fail when the mount point does not exist? I found the original issue, #51982: it was fixing the problem that, after a node reboot, the ASW and DSW would be identical once reconstruction and the ASW update succeeded, so reconcile would never remount a volume that is no longer a mount point. However, this can no longer happen, since during reconstruction we do not update the ASW if the volume is in the DSW.
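For illustration, a minimal pseudo-Go sketch of the guard described above (simplified interfaces, not the actual volume manager types): reconstruction skips anything still present in the DSW, so only orphaned volumes are added to the ASW for later unmount and cleanup.

```go
package reconcilersketch

// Minimal illustrative types; the real volume manager interfaces are richer.
type podVolume struct{ podName, volumeSpecName string }

type desiredStateOfWorld interface {
	VolumeExists(podName, volumeSpecName string) bool
}

type actualStateOfWorld interface {
	MarkVolumeAsMounted(v podVolume)
}

// syncStatesSketch outlines the reconstruction pass assumed in the comment
// above: volumes still present in the DSW are skipped, so adding a volume
// whose mount point is gone to the ASW cannot recreate the #51982 problem;
// only orphaned volumes are reconstructed, and reconcile later unmounts and
// cleans them up.
func syncStatesSketch(onDisk []podVolume, dsw desiredStateOfWorld, asw actualStateOfWorld) {
	for _, v := range onDisk {
		if dsw.VolumeExists(v.podName, v.volumeSpecName) {
			continue // still desired: reconcile handles (re)mounting
		}
		asw.MarkVolumeAsMounted(v) // orphaned: record it so reconcile tears it down
	}
}
```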

@cvvz cvvz requested review from gnufied and dobsonj and removed request for dobsonj and gnufied September 13, 2022 02:06
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 8, 2022
@k8s-ci-robot
Contributor

@cvvz: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@dims
Member

dims commented Dec 12, 2022

If you still need this PR, please rebase; if not, please close the PR.

@cvvz cvvz closed this Dec 13, 2022
Labels
area/kubelet
cncf-cla: yes (Indicates the PR's author has signed the CNCF CLA.)
kind/bug (Categorizes issue or PR as related to a bug.)
needs-ok-to-test (Indicates a PR that requires an org member to verify it is safe to test.)
needs-priority (Indicates a PR lacks a `priority/foo` label and requires one.)
needs-rebase (Indicates a PR cannot be merged because it has merge conflicts with HEAD.)
needs-triage (Indicates an issue or PR lacks a `triage/foo` label and requires one.)
release-note-none (Denotes a PR that doesn't merit a release note.)
sig/node (Categorizes an issue or PR as relevant to SIG Node.)
sig/storage (Categorizes an issue or PR as relevant to SIG Storage.)
size/S (Denotes a PR that changes 10-29 lines, ignoring generated files.)
Development

Successfully merging this pull request may close these issues.

CSI volumes left overs are not cleaned up if CSI plugin is attachable
6 participants