Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add volume reconstruct/cleanup logic in kubelet volume manager #27970

Merged
merged 1 commit into from
Aug 15, 2016

Conversation

jingxu97
Copy link
Contributor

@jingxu97 jingxu97 commented Jun 23, 2016

Currently kubelet volume management works on the concept of desired
and actual world of states. The volume manager periodically compares the
two worlds and perform volume mount/unmount and/or attach/detach
operations. When kubelet restarts, the cache of those two worlds are
gone. Although desired world can be recovered through apiserver, actual
world can not be recovered which may cause some volumes cannot be cleaned
up if their information is deleted by apiserver. This change adds the
reconstruction of the actual world by reading the pod directories from
disk. The reconstructed volume information is added to both desired
world and actual world if it cannot be found in either world. The rest
logic would be as same as before, desired world populator may clean up
the volume entry if it is no longer in apiserver, and then volume
manager should invoke unmount to clean it up.

Fixes #27653


This change is Reviewable

@k8s-github-robot k8s-github-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. release-note-label-needed labels Jun 23, 2016
@yujuhong
Copy link
Contributor

@jingxu97 this fails the unit test FAIL k8s.io/kubernetes/pkg/kubelet [build failed]

@yujuhong yujuhong added release-note-none Denotes a PR that doesn't merit a release note. and removed release-note-label-needed labels Jun 23, 2016
@yujuhong yujuhong added this to the v1.3 milestone Jun 23, 2016
@@ -901,6 +907,74 @@ func (kl *Kubelet) setupDataDirs() error {
return nil
}

func (kl *Kubelet) getPodVolumes(podUID types.UID) ([]volumeTuple, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A general question: Is it possible to move all these logic into the volume package?

/cc @saad-ali

@erictune
Copy link
Member

Is this confirmed as fixing a bug?

@jingxu97
Copy link
Contributor Author

We decided to not use this clean up code. I found a race condition problem
related to this bug, will post it a little bit later. It is considered a
relatively lower priory for now.

On Sat, Jun 25, 2016 at 7:40 AM, Eric Tune notifications@github.com wrote:

Is this confirmed as fixing a bug?


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#27970 (comment),
or mute the thread
https://github.com/notifications/unsubscribe/ASSNxemtyJSW-VBlqzVXr08839GuxhPjks5qPT3NgaJpZM4I9M0t
.

  • Jing

@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 29, 2016
@idvoretskyi
Copy link
Member

Proposed for 1.3.1

@matchstick matchstick modified the milestones: next-candidate, v1.3 Jun 29, 2016
@matchstick matchstick added the priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. label Jun 29, 2016
@jingxu97 jingxu97 force-pushed the restartKubelet-6-22 branch from e867383 to f4857df Compare July 20, 2016 00:55
@k8s-github-robot k8s-github-robot removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jul 20, 2016
Currently kubelet volume management works on the concept of desired
and actual world of states. The volume manager periodically compares the
two worlds and perform volume mount/unmount and/or attach/detach
operations. When kubelet restarts, the cache of those two worlds are
gone. Although desired world can be recovered through apiserver, actual
world can not be recovered which may cause some volumes cannot be cleaned
up if their information is deleted by apiserver. This change adds the
reconstruction of the actual world by reading the pod directories from
disk. The reconstructed volume information is added to both desired
world and actual world if it cannot be found in either world. The rest
logic would be as same as before, desired world populator may clean up
the volume entry if it is no longer in apiserver, and then volume
manager should invoke unmount to clean it up.
@jingxu97 jingxu97 force-pushed the restartKubelet-6-22 branch from 8fd2e93 to f19a114 Compare August 15, 2016 18:29
@k8s-bot
Copy link

k8s-bot commented Aug 15, 2016

GCE e2e build/test passed for commit f19a114.

@thockin thockin removed their assignment Aug 15, 2016
@jingxu97 jingxu97 added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 15, 2016
@k8s-github-robot
Copy link

@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]

@k8s-bot
Copy link

k8s-bot commented Aug 15, 2016

GCE e2e build/test passed for commit f19a114.

@k8s-github-robot
Copy link

Automatic merge from submit-queue

@k8s-github-robot k8s-github-robot merged commit 79ed706 into kubernetes:master Aug 15, 2016
@ixdy
Copy link
Member

ixdy commented Aug 16, 2016

This PR broke the Darwin and Windows builds, I think:

+++ [0816 15:23:49] darwin/amd64: go build started
# k8s.io/kubernetes/pkg/util/mount
pkg/util/mount/mount.go:89: cannot use Mounter literal (type *Mounter) as type Interface in return argument:
        *Mounter does not implement Interface (missing GetDeviceNameFromMount method)
+++ [0816 15:23:49] windows/amd64: go build started
# k8s.io/kubernetes/pkg/util/mount
pkg/util/mount/mount.go:89: cannot use Mounter literal (type *Mounter) as type Interface in return argument:
        *Mounter does not implement Interface (missing GetDeviceNameFromMount method)
# k8s.io/kubernetes/pkg/util/procfs
pkg/util/procfs/procfs.go:84: undefined: syscall.Kill

It looks like you're missing changes to pkg/util/mount/mount_unsupported.go?

@ixdy
Copy link
Member

ixdy commented Aug 16, 2016

cc @k8s-oncall

@ixdy
Copy link
Member

ixdy commented Aug 16, 2016

Note the breakage in pkg/util/procfs is due to #30087.

@jingxu97
Copy link
Contributor Author

Yes, I forgot to put the function into the pkg/util/mount/mount_unsupported.go.
Will you issue a PR for it now. Thank you!

Jing

On Tue, Aug 16, 2016 at 3:43 PM, Jeff Grafton notifications@github.com
wrote:

This PR broke the Darwin and Windows builds, I think:

+++ [0816 15:23:49] darwin/amd64: go build started

k8s.io/kubernetes/pkg/util/mount

pkg/util/mount/mount.go:89 http://k8s.io/kubernetes/pkg/util/mountpkg/util/mount/mount.go:89: cannot use Mounter literal (type *Mounter) as type Interface in return argument:
*Mounter does not implement Interface (missing GetDeviceNameFromMount method)
+++ [0816 15:23:49] windows/amd64: go build started

k8s.io/kubernetes/pkg/util/mount

pkg/util/mount/mount.go:89 http://k8s.io/kubernetes/pkg/util/mountpkg/util/mount/mount.go:89: cannot use Mounter literal (type *Mounter) as type Interface in return argument:
*Mounter does not implement Interface (missing GetDeviceNameFromMount method)

k8s.io/kubernetes/pkg/util/procfs

pkg/util/procfs/procfs.go:84 http://k8s.io/kubernetes/pkg/util/procfspkg/util/procfs/procfs.go:84: undefined: syscall.Kill

It looks like you're missing changes to pkg/util/mount/mount_
unsupported.go?


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#27970 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/ASSNxYkn7kpGz9wl1xKaYX7PYXHy8Mr1ks5qgjz5gaJpZM4I9M0t
.

  • Jing

k8s-github-robot pushed a commit that referenced this pull request Aug 18, 2016
#30724-upstream-release-1.3

Automatic merge from submit-queue

Automated cherry pick of #27970 #30724

Cherry pick of #27970 #30724 on release-1.3.
@fabioy fabioy added cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. and removed cherrypick-candidate labels Aug 18, 2016
@ixdy
Copy link
Member

ixdy commented Aug 19, 2016

@jingxu97 did you have a chance to create a PR for this yet? build still seems broken.

@jingxu97
Copy link
Contributor Author

Hi Jeff

Yes, #30724
This PR is already merged.

From error log message you sent in previous email, there is an error which
I don't think related to my change

k8s.io/kubernetes/pkg/util/procfs

pkg/util/procfs/procfs.go:84
http://k8s.io/kubernetes/pkg/util/procfspkg/util/procfs/procfs.go:84:
undefined: syscall.Kill

Please let me know if there is any issue. Thank you!

On Thu, Aug 18, 2016 at 7:27 PM, Jeff Grafton notifications@github.com
wrote:

@jingxu97 https://github.com/jingxu97 did you have a chance to create a
PR for this yet? build still seems broken.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#27970 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/ASSNxZN__MH5SgdHKXrMO79TiK5hSzQbks5qhRSXgaJpZM4I9M0t
.

  • Jing

glog.V(3).Infof("Orphaned pod %q found, but volumes are not cleaned up; err: %v", uid, err)
continue
}
// Check whether volume is still mounted on disk. If so, do not delete directory
if volumeNames, err := kl.getPodVolumeNameListFromDisk(uid); err != nil || len(volumeNames) != 0 {
glog.V(3).Infof("Orphaned pod %q found, but volumes are still mounted; err: %v, volumes: %v ", uid, err, volumeNames)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not unmount unused mounted volumes here?
I think that's what #31596 is asking for

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@luxas https://github.com/luxas, we do unmount unused mounted volumes,
this logic is implemented in pkg/kubelet/volumemanager/reconciler/reconciler.go
because this is the place for all mount/unmount related logic.

Please let me know if you have any questions about it.

Jing

On Tue, Aug 30, 2016 at 10:15 PM, Lucas Käldström notifications@github.com
wrote:

In pkg/kubelet/kubelet.go
#27970 (comment)
:

        glog.V(3).Infof("Orphaned pod %q found, but volumes are not cleaned up; err: %v", uid, err)
        continue
    }
  •   // Check whether volume is still mounted on disk. If so, do not delete directory
    
  •   if volumeNames, err := kl.getPodVolumeNameListFromDisk(uid); err != nil || len(volumeNames) != 0 {
    
  •       glog.V(3).Infof("Orphaned pod %q found, but volumes are still mounted; err: %v, volumes: %v ", uid, err, volumeNames)
    

Why not unmount unused mounted volumes here?
I think that's what #31596
#31596 is asking for


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/kubernetes/kubernetes/pull/27970/files/f19a1148db1b7584be6b6b60abaf8c0bd1503ed3#r76925660,
or mute the thread
https://github.com/notifications/unsubscribe-auth/ASSNxZ2PPiHykY7mWc1u4e7Uagxqpgsgks5qlQ3WgaJpZM4I9M0t
.

  • Jing

shyamjvs pushed a commit to shyamjvs/kubernetes that referenced this pull request Dec 1, 2016
…ck-of-#27970-kubernetes#30724-upstream-release-1.3

Automatic merge from submit-queue

Automated cherry pick of kubernetes#27970 kubernetes#30724

Cherry pick of kubernetes#27970 kubernetes#30724 on release-1.3.
dims pushed a commit to dims/kubernetes that referenced this pull request Feb 8, 2018
Automatic merge from submit-queue

Add volume reconstruct/cleanup logic in kubelet volume manager

Currently kubelet volume management works on the concept of desired
and actual world of states. The volume manager periodically compares the
two worlds and perform volume mount/unmount and/or attach/detach
operations. When kubelet restarts, the cache of those two worlds are
gone. Although desired world can be recovered through apiserver, actual
world can not be recovered which may cause some volumes cannot be cleaned
up if their information is deleted by apiserver. This change adds the
reconstruction of the actual world by reading the pod directories from
disk. The reconstructed volume information is added to both desired
world and actual world if it cannot be found in either world. The rest
logic would be as same as before, desired world populator may clean up
the volume entry if it is no longer in apiserver, and then volume
manager should invoke unmount to clean it up.

Fixes kubernetes#27653
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. release-note-none Denotes a PR that doesn't merit a release note. sig/storage Categorizes an issue or PR as relevant to SIG Storage. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.