Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Changed admission controller to allow volume expansion for all volume plugins #66780

Merged
merged 1 commit into from
Aug 14, 2018

Conversation

kangarlou
Copy link
Contributor

@kangarlou kangarlou commented Jul 30, 2018

What this PR does / why we need it:
There are two motivations for this change:

  1. CSI plugins are soon going to support volume expansion. For such plugins, admission controller doesn't know whether the plugins are capabale of supporting volume expansion or not.
  2. Currently, admission controller rejects PVC updates for in-tree plugins that don't support volume expansion (e.g., NFS, iSCSI). This change allows external controllers to expand volumes similar to how external provisioners are accommodated.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #

Special notes for your reviewer:
This PR mimics the behavior of the PV controller when PVs are provisioned externally by logging and setting a new event for PVs that are being expanded externally. As SIG Storage is planning new types of operations on PVs, it may make more sense to a have a single event for all actions taken by external controllers.

Release note:

The check for unsupported plugins during volume resize has been moved from the admission controller to the two controllers that handle volume resize.

/sig storage
/assign @gnufied @jsafrane @wongma7

… plugins

There are two motivations for this change:
(1) CSI plugins are soon going to support volume expansion. For such
plugins, admission controller doesn't know whether the plugins are
capabale of supporting volume expansion or not.
(2) Currently, admission controller rejects PVC updates for in-tree plugins
that don't support volume expansion (e.g., NFS, iSCSI). This change allows
external controllers to expand volumes similar to how external provisioners
operate.
@k8s-ci-robot k8s-ci-robot added the release-note Denotes a PR that will be considered when it comes time to generate release notes. label Jul 30, 2018
@k8s-ci-robot k8s-ci-robot added the sig/storage Categorizes an issue or PR as relevant to SIG Storage. label Jul 30, 2018
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jul 30, 2018
@k8s-ci-robot
Copy link
Contributor

Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please follow instructions at https://git.k8s.io/community/CLA.md#the-contributor-license-agreement to sign the CLA.

It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot added cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jul 30, 2018
@k8s-ci-robot k8s-ci-robot requested review from deads2k and rootfs July 30, 2018 14:44
@wongma7
Copy link
Contributor

wongma7 commented Jul 30, 2018

/ok-to-test

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Jul 30, 2018
Copy link
Contributor

@wongma7 wongma7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm
i dont like how we have duplciate the event from two places, seems impossible to avoid with how the expand controller is written though...

@wongma7
Copy link
Contributor

wongma7 commented Jul 30, 2018

you made need to run hack/update-bazel.sh

@gnufied
Copy link
Member

gnufied commented Jul 30, 2018

@wongma7 @kangarlou it should be possible to reject the volume resize request in AddPVCUpdate to avoid code duplication in two places? the VolumeResizeMap may need VolumePluginMgr but it should not be too hard.

@kangarlou
Copy link
Contributor Author

kangarlou commented Jul 30, 2018

@gnufied That's actually how I implemented the code first; however, that complicated the unit tests for AddPVCUpdate. More specifically, initializing VolumePluginMgr for unit tests is not straightforward.
@wongma7 I rebased this morning and it seems the bazel tests for some recent PRs are failing. I'll look into it.

@kangarlou
Copy link
Contributor Author

/retest

@kangarlou
Copy link
Contributor Author

kangarlou commented Jul 31, 2018

@gnufied @wongma7 The tests seem to be passing now. I'll squash the commits if everyone is ok with the current implementation.

I agree having the same logic to check valid plugins in two controllers isn't ideal, but short of a major refactoring, I don't see a better solution. Implementing this logic inside AddPVCUpdate also results in code duplication of a different kind and complicates unit tests.

One outstanding question is whether we should use a single generic event (e.g., ExternalController) for all actions handled by external controllers (e.g., provisioning, resize, snapshot). Having an event for every action supported on PVs will lead to unnecessary proliferation of events.

Copy link
Member

@gnufied gnufied left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do no see why we can't just stop volume/pod from being added to resizemap inside AddPVCUpdate. Initializing volume plugin manager in tests is not very difficult. For example - https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/kubelet_test.go#L322

Also - preferably - we should not implement VolumeHost interface again for pvc populator.

kubeClient: kubeClient,
}
return populator
if err := populator.volumePluginMgr.InitPlugins(plugins, nil, populator); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any advantage over initializing plugin manager here again rather than just passing the one that already has been initialzied in expand_controller ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gnufied govet doesn't like passing volumePluginMgr as an argument (see the first commit in my PR) because it results in passing a lock by value:

2018/07/30 19:02:56 Code for OpenAPI definitions generated
# k8s.io/kubernetes/pkg/controller/volume/expand
pkg/controller/volume/expand/expand_controller.go:145: call of NewPVCPopulator copies lock value: k8s.io/kubernetes/pkg/volume.VolumePluginMgr contains sync.Mutex
pkg/controller/volume/expand/pvc_populator.go:64: NewPVCPopulator passes lock by value: k8s.io/kubernetes/pkg/volume.VolumePluginMgr contains sync.Mutex
pkg/controller/volume/expand/pvc_populator.go:71: literal copies lock value from volumePluginMgr: k8s.io/kubernetes/pkg/volume.VolumePluginMgr contains sync.Mutex
make[1]: *** [vet] Error 1

PVC populator implementing the VolumeHost interface is precisely to handle this situation. As I said, I'm not a fan of code duplication, but this is a side-effect of the original implementation where we have two controllers more or less doing the same job.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this not be worked around by passing the volume plugin manager as a pointer rather than by value?

@@ -30,4 +30,5 @@ const (
ProvisioningCleanupFailed = "ProvisioningCleanupFailed"
ProvisioningSucceeded = "ProvisioningSucceeded"
WaitForFirstConsumer = "WaitForFirstConsumer"
ExternalExpanding = "ExternalExpanding"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think ExternalExpanding is bit vague. Can we rename this to something like VolumeResizePending or VolumeExpansionPending ? Once a external controller picks up the PVC it can go ahead and finish the process.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ExternalProvisioning is the same, I prefer ExternalExpanding for consistency. VolumeResizePending wouldn't tell you kubernetes is just waiting for an external actor to do something, so the user wouldn't know who to blame

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I see your point. It can go either way. I am not sure if ExternalProvisioning name makes sense either. Kubernetes can not know for sure that a volume is being externally provisioned or resized. All it knows is - it can't do this itself.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also went for consistency. This is how events stack up for externally provisioned volumes after my changes:

Events:
  Type       Reason                Age   From                         Message
  ----       ------                ----  ----                         -------
  Normal     ExternalProvisioning  44s   persistentvolume-controller  waiting for a volume to be created, either by external provisioner "netapp.io/trident" or manually created by system administrator
  Normal     ProvisioningSuccess   36s   netapp.io/trident            Kubernetes frontend provisioned a volume and a PV for the PVC.
  Normal     ExternalExpanding     14s   volume_expand                Ignoring the PVC: didn't find a plugin capable of expanding the volume; waiting for an external controller to process this PVC.
Mounted By:  <none>

@kangarlou
Copy link
Contributor Author

kangarlou commented Aug 2, 2018

@gnufied Good idea regarding passing *VolumePluginMgr. I made the change. I didn't change AddPVCUpdate as expand controller already needs a recorder to set events and already has to initialize VolumePluginMgr. I also don't think it makes sense to set events inside AddPVCUpdate as it unnecessarily complicates unit tests.

glog.V(3).Infof("Ignoring the PVC %q (uid: %q) : %v.",
strings.JoinQualifiedName(pvc.GetNamespace(), pvc.GetName()),
pvc.UID, err)
continue
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make a util function or something out of this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Please see the third commit.

@kangarlou
Copy link
Contributor Author

/retest

@gnufied
Copy link
Member

gnufied commented Aug 3, 2018

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 3, 2018
@gnufied
Copy link
Member

gnufied commented Aug 3, 2018

@kangarlou can you please squash your commits?

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 3, 2018
@kangarlou
Copy link
Contributor Author

@gnufied squashed and rebased. Thanks!

@kangarlou
Copy link
Contributor Author

/retest

@kangarlou
Copy link
Contributor Author

/assign @derekwaynecarr

@gnufied
Copy link
Member

gnufied commented Aug 8, 2018

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 8, 2018
@kangarlou
Copy link
Contributor Author

/retest

@jsafrane
Copy link
Member

jsafrane commented Aug 9, 2018

/approve

/assign @deads2k @derekwaynecarr
for admission approval.

@kangarlou
Copy link
Contributor Author

/retest

@deads2k
Copy link
Contributor

deads2k commented Aug 14, 2018

The admission chain change removes the protection for plugins which don't have this ability. I'd like to see that in the release note. Otherwise, assuming reasonable status reporting in the controller, this ought to be ok.

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: deads2k, gnufied, jsafrane, kangarlou

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 14, 2018
@k8s-github-robot
Copy link

/test all [submit-queue is verifying that this PR is safe to merge]

@kangarlou
Copy link
Contributor Author

Thanks @deads2k. Yes, the PR moves the checks for unsupported plugins from admission controller to the two controllers that handle volume resize. These controllers log proper error messages and set the right events on PVCs.

@kangarlou
Copy link
Contributor Author

/retest

@k8s-github-robot
Copy link

Automatic merge from submit-queue (batch tested with PRs 66780, 67330). If you want to cherry-pick this change to another branch, please follow the instructions here.

@k8s-github-robot k8s-github-robot merged commit 76434bd into kubernetes:master Aug 14, 2018
@k8s-ci-robot
Copy link
Contributor

@kangarlou: The following test failed, say /retest to rerun them all:

Test name Commit Details Rerun command
pull-kubernetes-e2e-kops-aws ee747b8 link /test pull-kubernetes-e2e-kops-aws

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

aniket-s-kulkarni pushed a commit to aniket-s-kulkarni/kubernetes that referenced this pull request Aug 16, 2018
aniket-s-kulkarni pushed a commit to aniket-s-kulkarni/kubernetes that referenced this pull request Aug 20, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/storage Categorizes an issue or PR as relevant to SIG Storage. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants