Perform resize of mounted volume if necessary #58794

Merged


gnufied
Member

@gnufied gnufied commented Jan 25, 2018

Under certain conditions, we must resize a volume even while it is mounted. This lets us work around the problem of resizing volumes used by deployments and similar workloads.

Allow expanding mounted volumes

@k8s-ci-robot k8s-ci-robot added do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jan 25, 2018
@gnufied
Member Author

gnufied commented Jan 25, 2018

/sig storage

@k8s-ci-robot k8s-ci-robot added the sig/storage Categorizes an issue or PR as relevant to SIG Storage. label Jan 25, 2018
@gnufied
Member Author

gnufied commented Jan 25, 2018

/assign @jsafrane @rootfs

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Jan 25, 2018
Member

@jsafrane jsafrane left a comment

I see what you're trying to do, but IMO we should resize even if a pod is Running: both ext4 and xfs allow online resize, and ext3 allows it in most cases too. Why should the user restart pods?

[Note that ext3 without the resize_inode option will also fail with this PR; the volume must be fully unmounted first.]
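
The resize_inode caveat above can be checked ahead of time. What follows is a minimal sketch, not code from this PR: it assumes the text being parsed is the output of `tune2fs -l <device>` (or `dumpe2fs -h`), which includes a `Filesystem features:` line, and the helper name `hasResizeInode` is hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// hasResizeInode reports whether the "Filesystem features:" line of
// tune2fs/dumpe2fs output lists the resize_inode feature.
// Hypothetical helper for illustration; it is not part of this PR.
func hasResizeInode(tune2fsOutput string) bool {
	const prefix = "Filesystem features:"
	scanner := bufio.NewScanner(strings.NewReader(tune2fsOutput))
	for scanner.Scan() {
		line := scanner.Text()
		if !strings.HasPrefix(line, prefix) {
			continue
		}
		// The features are a whitespace-separated list after the prefix.
		for _, feature := range strings.Fields(strings.TrimPrefix(line, prefix)) {
			if feature == "resize_inode" {
				return true
			}
		}
		return false
	}
	return false
}

func main() {
	// Sample text shaped like tune2fs output; the values are made up.
	sample := "Filesystem volume name:   <none>\n" +
		"Filesystem features:      has_journal ext_attr resize_inode dir_index filetype\n"
	fmt.Println("resize_inode enabled:", hasResizeInode(sample))
}
```

A caller could use such a check to decide up front whether an ext3 volume needs to be fully unmounted before resizing, rather than discovering it from a failed resize.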

 		return false, deviceOpenErr
 	}

 	if deviceOpened {
-		deviceAlreadyOpenErr := fmt.Errorf("the device %s is already in use", devicePath)
-		return false, deviceAlreadyOpenErr
+		glog.Warningf("ResizeFS.Resize - Expanding mounted volume %s", devicePath)
Member

Why is this a Warning? It should be a normal operation.

Member

Can you please fix the level? I'm inclined to approve this PR then.

Member Author

fixed

deployment, err := framework.CreateDeployment(c, int32(1), map[string]string{"test": "app"}, nodeKeyValueLabel, ns, pvcClaims, "")
defer c.ExtensionsV1beta1().Deployments(ns).Delete(deployment.Name, &metav1.DeleteOptions{})

By("Expanding current pvc")
Member

Should you wait for a pod to be Running before the resize? The expand controller may be faster than the pod.

Member Author

The CreateDeployment function waits for a pod to be running.

Member

ack

@gnufied
Member Author

gnufied commented Jan 25, 2018

@jsafrane the proposal for implementing online resize is in kubernetes/community#1535, so we will get there, but there are design problems to solve first.

@gnufied gnufied force-pushed the perform-online-resize-if-mounted branch from 3fd411c to 85d17d0 Compare January 25, 2018 14:29
@gnufied
Member Author

gnufied commented Jan 25, 2018

@jsafrane also, I checked RHEL, Fedora 26 and 27, Ubuntu, Debian, and Google COS: they all have the resize_inode option enabled in mke2fs.conf. For now, maybe we can document this (until we get out of alpha). Not all PVCs are resizable anyway, only those explicitly enabled by the kube admin. I am not sure what else we can do in this release.

@jsafrane
Member

> I am not sure what else we can do in this release

Just throw a sensible event on the PVC saying that online resize failed, with the error from stderr. Let's hope that is enough for users to either try an offline resize or nag the system admin to resize the volume manually (e.g. because it has errors).

@gnufied gnufied force-pushed the perform-online-resize-if-mounted branch from 85d17d0 to 3e269eb Compare January 26, 2018 15:24
@gnufied
Member Author

gnufied commented Jan 26, 2018

/test pull-kubernetes-unit

@gnufied
Member Author

gnufied commented Jan 26, 2018

The failure looks like an unrelated flake or breakage. I did not see an existing ticket for it, so I have filed #58881.

@gnufied
Member Author

gnufied commented Jan 26, 2018

@jsafrane PTAL, I changed the log level of the resize message.

@jsafrane
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 27, 2018
@gnufied
Member Author

gnufied commented Jan 27, 2018

Assigning more reviewers because some changes spilled into pkg/ and test/.

/assign @smarterclayton @brendandburns

@@ -84,7 +90,7 @@ func (resizefs *ResizeFs) Resize(devicePath string) (bool, error) {
}
return resizefs.xfsResize(devicePath)
Contributor

I think the resize operation may not work correctly for XFS file systems. The resizeFileSystem method runs between attaching the device and mounting it, which means the resize may be performed before the device is mounted. But xfs_growfs requires the filesystem to be mounted. So if we try to resize a volume with an XFS file system, we may receive an error event like this:

MountVolume.resizeFileSystem failed for volume "pvc-60265674-0406-11e8-9c91-0800273c9701" : resize of device /dev/rbd0 failed: exit status 1. xfs_growfs output: xfs_growfs: /dev/rbd0 is not a mounted XFS filesystem
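
The asymmetry described here is that resize2fs operates on the block device while xfs_growfs operates on a mounted filesystem. A small sketch makes it concrete; it is an illustration only and does not show how the PR actually fixed the problem. The function name `resizeCommand` and the example mount path are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// resizeCommand returns the tool and arguments needed to grow a filesystem.
// ext3/ext4 are grown via the device path; XFS is grown via its mount point,
// so an empty mountPath is an error for XFS. Hypothetical helper.
func resizeCommand(fsType, devicePath, mountPath string) (string, []string, error) {
	switch fsType {
	case "ext3", "ext4":
		return "resize2fs", []string{devicePath}, nil
	case "xfs":
		if mountPath == "" {
			return "", nil, fmt.Errorf("xfs volume on %s must be mounted before it can be grown", devicePath)
		}
		// -d grows the data section to the maximum size the device allows.
		return "xfs_growfs", []string{"-d", mountPath}, nil
	default:
		return "", nil, fmt.Errorf("resize of filesystem type %q is not supported", fsType)
	}
}

func main() {
	// "/mnt/data" is a made-up mount path for this example.
	for _, fsType := range []string{"ext4", "xfs"} {
		name, args, err := resizeCommand(fsType, "/dev/rbd0", "/mnt/data")
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(name, strings.Join(args, " "))
	}
}
```

Under this model, a resize step that runs before the mount step has no mount point to hand to xfs_growfs, which matches the error event quoted above.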

Member Author

Thank you for catching this. It is fixed now.

@gnufied gnufied force-pushed the perform-online-resize-if-mounted branch from 3e269eb to 8beaa2f Compare January 29, 2018 21:35
@k8s-github-robot k8s-github-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 29, 2018
Add e2e test for mounted volume resize
@gnufied gnufied force-pushed the perform-online-resize-if-mounted branch from 8beaa2f to afeb53e Compare January 29, 2018 22:50
@gnufied
Member Author

gnufied commented Jan 30, 2018

@jsafrane PTAL.

@jsafrane
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 30, 2018
@jsafrane
Member

/lgtm

@childsb
Contributor

childsb commented Jan 31, 2018

/approve

@gnufied
Member Author

gnufied commented Feb 2, 2018

/assign @smarterclayton

@smarterclayton
Contributor

/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: childsb, gnufied, jsafrane, smarterclayton

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 2, 2018
@k8s-github-robot

/test all [submit-queue is verifying that this PR is safe to merge]

@k8s-github-robot

Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here.

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/storage Categorizes an issue or PR as relevant to SIG Storage. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.