Validate PVC requested AccessModes against StorageClass's underlying plugin's supported AccessModes #46540

Closed
wongma7 opened this issue May 26, 2017 · 17 comments
Labels: lifecycle/rotten, sig/storage

Comments

wongma7 (Contributor) commented May 26, 2017

Currently, when a user creates a PVC asking for a certain StorageClass, and dynamic provisioning gets invoked, the AccessModes field is effectively ignored: the provisioned PV is created to have AccessModes matching the PVC's regardless of whether the underlying plugin actually supports it.

e.g. in an AWS cluster, a user creates a PVC asking for a "standard" class aws-ebs volume and RWM (ReadWriteMany) AccessModes. They'll get an RWM PV, and if/when the multiple pods attempting to use it fail (i.e. the volume fails to attach to multiple nodes), they'll wonder what went wrong and may have no way of finding out.

There is no way for a StorageClass (admin) to communicate to users that it only supports creating PVs with lesser AccessModes than what the user may need.

Mount options set something of a precedent by validating at PV creation time whether the specified plugin actually supports mount options. IMO we should do something similar here, validate at PVC creation time whether the specified class's plugin actually supports the requested AccessModes according to the documentation here: https://kubernetes.io/docs/concepts/storage/persistent-volumes/#access-modes.

We should also (automatically) put plugin's supported AccessModes in every StorageClass so that users can determine which StorageClasses support what AccessModes.
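
For illustration, a minimal sketch of the proposed check (the helper below is hypothetical, not existing Kubernetes validation code): the validation reduces to a subset test between the AccessModes the PVC requests and the AccessModes the class's plugin can actually provide.

```go
package main

import "fmt"

// validateAccessModes is a hypothetical helper: it fails when the claim
// requests any access mode that the class's underlying plugin does not support.
func validateAccessModes(requested, supported []string) error {
	supportedSet := make(map[string]bool, len(supported))
	for _, m := range supported {
		supportedSet[m] = true
	}
	for _, m := range requested {
		if !supportedSet[m] {
			return fmt.Errorf("requested access mode %q is not supported by this class's plugin (supported: %v)", m, supported)
		}
	}
	return nil
}

func main() {
	// aws-ebs only supports ReadWriteOnce, so a ReadWriteMany request fails fast.
	fmt.Println(validateAccessModes([]string{"ReadWriteMany"}, []string{"ReadWriteOnce"}))
}
```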

Thoughts please?

@kubernetes/sig-storage-feature-requests @kubernetes/sig-storage-bugs @gnufied

k8s-ci-robot added the sig/storage label May 26, 2017
k8s-ci-robot (Contributor):

@wongma7: Reiterating the mentions to trigger a notification:
@kubernetes/sig-storage-feature-requests, @kubernetes/sig-storage-bugs.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

wongma7 (Contributor, Author) commented May 26, 2017

Also, sorry in advance if this discussion already happened somewhere; I couldn't find it. It's somewhat similar to the fstype stuff discussed today. (And IMO orthogonal to the discussion about the "actual meaning" of AccessModes wrt scheduling, mount vs. attach; I am talking purely about user expectations, i.e. what is already documented per plugin.)

wongma7 (Contributor, Author) commented May 26, 2017

also ping @jsafrane

gnufied (Member) commented May 26, 2017

There is some related discussion about AccessModes here: kubernetes-retired/external-storage#110 (comment)

gnufied (Member) commented May 26, 2017

I think it is doable to perform this validation if we introduce AccessMode as a field on StorageClass, but general validation of PVCs against what the volume plugin actually supports isn't backward compatible.

For example, a user may rely on the current behaviour to use two pods with the same PVC (as long as both pods are scheduled on the same node).

wongma7 (Contributor, Author) commented May 27, 2017

Yeah, I mean preventing administrators from creating EBS RWM PVs might be overreaching. But in our position acting as a PV-creating administrator, we should provide the best UX possible, even if that means breaking people who are currently aware they can treat the PVC AccessModes as a "don't care" when using dynamic provisioning. Because, to be clear, introducing new validation obviously does break backward compatibility, just not to the extent you're saying.

The issue of enforcing AccessModes at mount/attach time is, I think, being worked on elsewhere, and IMO, while it's difficult not to conflate that issue with this one, we can safely ignore it while we work on this if we just say the docs are 100% correct. Like this: EBS != RWM, so users, whatever their understanding of RWO vs. RWM, definitely don't want EBS when they ask for RWM. So let's error out early instead of giving them a bad PV.

smarterclayton (Contributor):

Generally we do not strengthen validation for any reason other than security, even if we think it's a better experience. The risk of breaking existing clusters is too high.

wongma7 (Contributor, Author) commented May 27, 2017

What if we made the validation opt-in by making this hypothetical StorageClass AccessModes field optional? kube won't automatically set it; admins can weigh the risk and set it themselves. If empty, do nothing and maintain current behavior; if filled in, validate PVCs against it.
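
A minimal sketch of how the opt-in semantics could look, assuming a hypothetical SupportedAccessModes field and simplified types (not a real Kubernetes API): an empty field preserves today's behavior, a populated one turns the check on.

```go
package main

import "fmt"

// StorageClass is a simplified stand-in; SupportedAccessModes is the
// hypothetical optional field an admin could choose to set.
type StorageClass struct {
	Name                 string
	SupportedAccessModes []string // empty = no validation, today's behavior
}

// validateClaim skips validation when the admin has not opted in, and
// otherwise requires every requested mode to appear in the class's list.
func validateClaim(requestedModes []string, sc StorageClass) error {
	if len(sc.SupportedAccessModes) == 0 {
		return nil // no opt-in: preserve current behavior
	}
	allowed := map[string]bool{}
	for _, m := range sc.SupportedAccessModes {
		allowed[m] = true
	}
	for _, m := range requestedModes {
		if !allowed[m] {
			return fmt.Errorf("storage class %q does not support access mode %q", sc.Name, m)
		}
	}
	return nil
}

func main() {
	optedIn := StorageClass{Name: "standard", SupportedAccessModes: []string{"ReadWriteOnce"}}
	fmt.Println(validateClaim([]string{"ReadWriteMany"}, optedIn)) // rejected
	legacy := StorageClass{Name: "legacy"}
	fmt.Println(validateClaim([]string{"ReadWriteMany"}, legacy)) // <nil>: behavior unchanged
}
```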

smarterclayton (Contributor) commented May 28, 2017 via email

jsafrane (Member):

I think that adding SupportedAccessModes to StorageClass could work.

wongma7 (Contributor, Author) commented Jun 1, 2017

Implementation question: validation code would need to Get a PVC's StorageClass every time a PVC comes in. Is that even "allowed"? If not, would this validation be better implemented as some sort of admission plugin? The issue with that is that it would then have to run after the existing DefaultStorageClass admission plugin. :/ Any thoughts on this?

I don't suppose there's any other example of validation where one object's validity depends on another?
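
For illustration, a rough sketch of the admission-style approach being asked about; the PVC, StorageClass, and ClassGetter types below are simplified stand-ins rather than the real Kubernetes admission or lister APIs. The point is that the check has to look up a second object (the class named by the claim), which is also why ordering after the DefaultStorageClass plugin matters.

```go
package admission

import "fmt"

// PVC and StorageClass are simplified stand-ins, not the real API types;
// SupportedAccessModes is the hypothetical field discussed above.
type PVC struct {
	StorageClassName string   // assumed to be already defaulted
	AccessModes      []string // e.g. ["ReadWriteMany"]
}

type StorageClass struct {
	Name                 string
	SupportedAccessModes []string
}

// ClassGetter abstracts "Get the PVC's StorageClass"; in a real admission
// plugin this would typically be backed by an informer/lister rather than
// a live API call on every PVC.
type ClassGetter interface {
	Get(name string) (*StorageClass, error)
}

// AdmitPVC rejects a claim whose requested modes are not supported by the
// named class. It has to run after class defaulting, otherwise
// StorageClassName may still be empty.
func AdmitPVC(pvc PVC, classes ClassGetter) error {
	if pvc.StorageClassName == "" {
		return nil // nothing to validate against
	}
	sc, err := classes.Get(pvc.StorageClassName)
	if err != nil {
		return fmt.Errorf("looking up storage class %q: %v", pvc.StorageClassName, err)
	}
	if len(sc.SupportedAccessModes) == 0 {
		return nil // class declares no restriction
	}
	allowed := map[string]bool{}
	for _, m := range sc.SupportedAccessModes {
		allowed[m] = true
	}
	for _, m := range pvc.AccessModes {
		if !allowed[m] {
			return fmt.Errorf("storage class %q does not support access mode %q", sc.Name, m)
		}
	}
	return nil
}
```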

smarterclayton (Contributor) commented Jun 1, 2017 via email

wongma7 (Contributor, Author) commented Jun 2, 2017

Yes, that's the only way provisioners can bubble up errors now (leave the PVC pending and write e.g. a 'ProvisioningFailed' event on the PVC), but I have to consider the possibility of introducing create-time validation.

Because it would be nice to have, because it was the original RFE behind this issue... and also because we have other features that might benefit from getting this pattern done. E.g. resize: StorageClasses could maybe have a 'Resizable' field just like 'SupportedAccessModes', and then PVCs could be validated against that if they ask for a resizable PV. @gnufied does that sound right?

This issue has a lot of overlap with the unresolved discussion about combining class + selector.

It also has a lot of overlap with discussions over people's (my, at least) UX distaste/confusion over the 'my PVC is stuck in pending and I don't know why' situation.

All this to say, I'd personally like create-time validation of some kind, something better than updating the PVC's events. Anyway, we have an entire release cycle to figure this out, so I'll try to bring this up in the storage SIG and solicit more opinions on whether it's worth thinking about.

smarterclayton (Contributor) commented Jun 2, 2017 via email

k8s-github-robot pushed a commit that referenced this issue Jun 13, 2017
Automatic merge from submit-queue (batch tested with PRs 46929, 47391, 47399, 47428, 47274)

Don't provision for PVCs with AccessModes unsupported by plugin

Fail early in case the user actually expects e.g. RWM from AWS when in reality that isn't possible.
@eparis @gnufied 

edit: this needs a release note because it's a breaking bugfix; will write one.

#46540
```release-note
Fix dynamic provisioning of PVs with inaccurate AccessModes by refusing to provision when PVCs ask for AccessModes that can't be satisfied by the PVs' underlying volume plugin
```
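
To illustrate the shape of the merged fix, a simplified sketch (the names below are illustrative, not the actual controller code): before provisioning, the controller compares the claim's requested modes against what the volume plugin reports it can provide, and refuses to provision when the request can't be satisfied, leaving the claim Pending with an event.

```go
package controller

import "fmt"

// provisionablePlugin is a stand-in for the volume plugin interface: the
// plugin reports which access modes its volumes can actually provide.
type provisionablePlugin interface {
	SupportedAccessModes() []string
}

// checkAccessModes fails early when the claim asks for modes the plugin
// cannot provide (e.g. ReadWriteMany from aws-ebs), instead of creating a
// PV that will later fail to attach to multiple nodes.
func checkAccessModes(claimModes []string, plugin provisionablePlugin) error {
	supported := map[string]bool{}
	for _, m := range plugin.SupportedAccessModes() {
		supported[m] = true
	}
	for _, m := range claimModes {
		if !supported[m] {
			return fmt.Errorf("invalid AccessModes %v: only %v are supported by this plugin",
				claimModes, plugin.SupportedAccessModes())
		}
	}
	return nil
}

// In the provisioning path the error would surface as a 'ProvisioningFailed'
// event on the claim, which then stays Pending:
//
//	if err := checkAccessModes(claim.Spec.AccessModes, plugin); err != nil {
//		recorder.Eventf(claim, "Warning", "ProvisioningFailed", err.Error())
//		return err
//	}
```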
dims pushed a commit to dims/openstack-cloud-controller-manager that referenced this issue Sep 6, 2017
fejta-bot:

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale

k8s-ci-robot added the lifecycle/stale label Dec 26, 2017
dims pushed a commit to dims/openstack-cloud-controller-manager that referenced this issue Jan 13, 2018
fejta-bot:

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle rotten
/remove-lifecycle stale

k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label Jan 25, 2018
fejta-bot:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close
