[DO NOT MERGE] Persistent Storage #4055
Conversation
HostPath
EmptyDir
GitRepo
GCEPersistentDisk { modes: ReadWriteOnce | ReadOnlyMany }
What do ReadWriteOnce/ReadOnlyMany/ReadWriteMany mean? Why do we need to distinguish Once and Many?
VolumeSources can be mounted in different ways.
A GCE disk can be attached once as ReadWrite or attached to many VMs as ReadOnly, while an AWS EBS volume can only be mounted once as ReadWrite.
I am using VolumeSource.modes as a bitmask of a volume's mounting capabilities. It is important for a pod author to know and rely on the expected behavior of the storage device.
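For illustration, here is a minimal sketch of the bitmask idea in Go; all names here (AccessModeType, the mode constants, gcePDModes) are hypothetical, not taken from this PR:

```go
package main

import "fmt"

type AccessModeType int

const (
	ReadWriteOnce AccessModeType = 1 << iota // attachable read-write by a single node
	ReadOnlyMany                             // attachable read-only by many nodes
	ReadWriteMany                            // attachable read-write by many nodes
)

// Modes for a GCE PD: one writer, or many readers.
const gcePDModes = ReadWriteOnce | ReadOnlyMany

// supports reports whether a volume's capability mask allows the requested mode.
func supports(capabilities, requested AccessModeType) bool {
	return capabilities&requested == requested
}

func main() {
	fmt.Println(supports(gcePDModes, ReadOnlyMany))  // true
	fmt.Println(supports(gcePDModes, ReadWriteMany)) // false: a GCE PD cannot have many writers
}
```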
Force-pushed from dcba44e to 78ec395
Thanks! Will take a look more quickly this time.
Force-pushed from 3da03dd to f22d1b9
Name
Spec
VolumeSource
GCEPersistentDisk { modes: ReadWriteOnce | ReadOnlyMany, resources: 5gb 30iops, .48 throughput }
How are you going to materialize resources? There are many vendor-specific capabilities around QoS and HA, and they don't always use the same terms. Some can be found in Cinder volume types.
The PersistentStorageController defined in the abstract mentions an interface that every provider must implement in order to be used within the controller. Implementations of this interface supply the provider's functionality and allow different plugins per provider. Once exposed this way, all types of storage management can eventually be automated, even for providers that don't support dynamic creation out of the box.
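A rough sketch of what such a provider interface could look like; these names and signatures are assumptions for illustration, not from the PR (AccessModeType is the capability bitmask sketched earlier):

```go
// Hypothetical plugin interface a storage provider might implement so the
// controller can manage its volumes.
type StorageProvider interface {
	// Name identifies the provider, e.g. "gce-pd" or "aws-ebs".
	Name() string

	// Create provisions a new volume of the given size and returns an
	// identifier for it; providers without dynamic creation can return
	// an "unsupported" error.
	Create(sizeBytes int64) (volumeID string, err error)

	// Delete releases a previously created volume.
	Delete(volumeID string) error

	// Modes reports the access modes this provider's volumes support.
	Modes() AccessModeType
}
```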
This looks great. I especially like that the VolumeSources are pluggable. Hopefully that provides some incentive and an easy path for the broader community to create a wide variety of implementations. Object stores and filesystems have very different I/O semantics. Perhaps I missed it, but I didn't see where the TYPE of a VolumeSource is articulated when a pod author queries Kubernetes for the available storage services. It would be bad if someone thought they were getting a block device and instead got an object store.
// Manages one type of PersistentStorage
PersistentStorageController
I assume this will need TypeMeta and ObjectMeta if it's an API resource?
Yes, that's terrible pseudo-go. I am implementing the API now with real structs.
Force-pushed from f22d1b9 to 61bae42
I changed PersistentStorage to PersistentStore, because the plural of storage is ... I think this is a better noun.
Force-pushed from 61bae42 to e28a09f
`PersistentStoreBinder` watches for pods with persistence requirements and matches the request with what is available in the pool.

`PersistenceRequirements` expresses the capabilities and performance attributes of the underlying VolumeSource. It allows discovery of storage by its capabilities and uses those capabilities as pod storage requirements.
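As a sketch, the matching described above could be expressed with a struct like the following; the field names are hypothetical, inferred from the quoted description:

```go
// Hypothetical shape of PersistenceRequirements, for illustration only.
type PersistenceRequirements struct {
	// AccessModes the pod needs, e.g. ReadWriteOnce.
	AccessModes AccessModeType

	// Minimum capacity and performance, mirroring the "5gb 30iops"
	// annotation in the earlier example.
	SizeBytes int64
	IOPS      int64

	// Selector restricts matching to stores carrying these labels,
	// analogous to how services select pods.
	Selector map[string]string
}
```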
I like that these all seem to have nice analogs in the world of pods :)
PersistentStore -> Pod
PersistentStoreController -> ReplicationController
PersistentStoreBinder -> Scheduler
PersistenceRequirements -> ResourceRequirementSpec
Thanks. That was the revelation after @bgrant0607's suggestions, left in a comment on #3318.
The approach described here actually paralleled Nodes rather than Pods. An approach that more closely parallels Pods is described below.
Force-pushed from e28a09f to 0bb4d35
Labels - labeled like Volume.PersistenceRequirements.Selector
Spec
VolumeSource -- backed by one of these:
HostPath
Why does this include HostPath and EmptyDir and GitRepo - I thought this was exclusively about network attached storage. Does this include some notion of host-binding now? What are the semantics of a persistent HostDir vs a persistent EmptyDir?
That's terrible pseudo-go intended to show what could back the VolumeSource, with the implied knowledge that a VolumeSource currently works by having exactly one of those types set.
A PersistentStore would have one VolumeSource with one backing implementation of actual storage in the cluster. The PersistentStore gives longevity to the VolumeSource.
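A sketch of that relationship; TypeMeta, ObjectMeta, and VolumeSource stand in for the existing API types, and PersistentStoreSpec is a hypothetical name:

```go
// PersistentStore wraps exactly one VolumeSource, giving it a lifecycle
// independent of any pod.
type PersistentStore struct {
	TypeMeta   `json:",inline"`
	ObjectMeta `json:"metadata,omitempty"`

	Spec PersistentStoreSpec `json:"spec,omitempty"`
}

type PersistentStoreSpec struct {
	// Exactly one field of the VolumeSource is set (GCEPersistentDisk,
	// HostPath, etc.), just as in a pod's volumes today.
	Source VolumeSource `json:"source"`
}
```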
My question was whether you're proposing that local storage be managed in this same way, and if so that the act of binding to a PS with local storage would influence scheduling? That's a somewhat different proposal than network attached storage. Or maybe not, given things like max PDs per VM... ?
Yes, this should only be about network volumes.
)

type PersistentVolumeClaimVolumeSource struct {
	AccessMode AccessModeType `json:"accessMode,omitempty"`
This is a struct that users will specify - let's make it as simple as possible and document every field.
Why does the user need to spec AccessMode? Presumably it's so that the IaaS knows that a multi-reader mount is OK vs a single-writer? Document it.
Do we need a full ObjectReference? That object itself is under-specced. Can a user reference a claim in a different namespace than their own? Do they need to know the name and UID or just name? In other words, could we get away with just "ClaimName string" here?
I would prefer "ClaimName string". The user is referencing something they created and named themselves.
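Adopting that suggestion, the struct could shrink to something like this; a sketch of the discussion above, not the final API:

```go
// PersistentVolumeClaimVolumeSource, simplified per the review: reference
// the claim by name only, scoped to the pod's own namespace.
type PersistentVolumeClaimVolumeSource struct {
	// ClaimName names a PersistentVolumeClaim in the same namespace as
	// the pod using this volume; no UID or cross-namespace reference.
	ClaimName string `json:"claimName"`

	// AccessMode tells the infrastructure whether, for example, a
	// multi-reader mount is acceptable versus a single writer.
	AccessMode AccessModeType `json:"accessMode,omitempty"`
}
```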
}
	return nil
},
func(in *newer.PersistentVolumeList, out *PersistentVolumeList, s conversion.Scope) error {
Is this needed at all? It seems to be the default behavior, no?
Same for almost all of these. Conversion functions should be the exception, not the rule - especially for new structs. The default field-name matching logic should just work.
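For context, a hand-written conversion like the one in the diff typically just mirrors what the default field-name matching already does. A sketch, assuming the conversion.Scope API shown in the quoted diff:

```go
// What the hand-written conversion boils down to; with identical field
// names, deleting it and letting the scheme's reflection-based converter
// run produces the same result.
func(in *newer.PersistentVolumeList, out *PersistentVolumeList, s conversion.Scope) error {
	// Delegates the field back to the generic converter anyway.
	if err := s.Convert(&in.Items, &out.Items, 0); err != nil {
		return err
	}
	return nil
}
```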
Mark, you're a trooper and I'm gonna buy you a yard of beer one of these days... The biggest problem I have with this PR at this point is the scope. OMG the review is painful. After 2 hours I have to go do other stuff. Please think about what we can do to peel off logically whole changes and get those committed. Even if the changes are not "useful" on their own. Even if they are going to evolve over a series of PRs, getting them out of this monolith means we can have small focused discussions. The biggest reason I put off reviewing this is that I need a contiguous 4-hour block just to review it.

For example, the work needed to make plugin-wrapper volume plugins can be extracted and done on its own. Please think hard about what else we can do to make this easier, or I am afraid it will languish for lack of review time. It's seriously daunting :)
The resource design can be a separate pull; the controllers that use them can be another.
Thanks @thockin for the review and for the future yard of beer. I intend to make one of the developer summits and I might have to bring a printed version of your comment as a "beer coupon". I'll start breaking this up once I get a working POC through the plugin. Once it correctly mounts something, I'll have proved the concept and will feel good about doing the necessary work of carving up this beast. It is, indeed, huge. I've been watching @pmorie's Secrets proposal and subsequent PRs, so I have a good idea of what you mean.
@markturansky The soft-reset trick we recently discussed could be applied to carve this PR into more manageable chunks. Let me know if you want to discuss further.
Is this PR still active? If not, please close it according to https://github.com/GoogleCloudPlatform/kubernetes/blob/master/docs/devel/pull-requests.md
Closing in favor of #5105.
Continuing the conversation from #3318 ...
@bgrant0607 Based on your feedback, I arrived at another design with just a single new top-level object. All of the concepts you addressed in your comments can be found in the new design, and quite a bit more is expressed with fewer moving parts.
@thockin I think several of your questions and concerns from #3318 are addressed by the redesign.
Overall, this design has fewer moving pieces, a complete story of volume re-use from a pod to a replication controller, a way to create and replenish available storage, and an expansion of volumes in @thockin's new framework.
Your feedback is appreciated!
/cc @smarterclayton