add Ceph rados block device (rbd) volume plugin #6578

Merged: 1 commit into kubernetes:master from wip-rbd on May 21, 2015

Conversation

@rootfs (Contributor) commented Apr 8, 2015

@thockin @vishh

Updated. Ready for your review. Thank you!

@vishh self-assigned this on Apr 8, 2015
@rootfs force-pushed the wip-rbd branch 7 times, most recently from ba9c29f to 792a9af on Apr 8, 2015
// RBDVolumeSource represents a Rados Block Device Mount that lasts the lifetime of a pod
type RBDVolumeSource struct {
    // Required: CephMonitor is the ip:port of the ceph cluster monitor
    CephMonitor string `json:"monitor" description:"ip:port of the ceph cluster monitor"`
Review comment:

There are normally N monitors; is this a comma separated list, or a json list? Alternatively ceph can also take a dns name and it'll form a list from the A or AAAA records, but most users don't have DNS set up that way for their clusters.
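
One way to make the multi-monitor case explicit is a JSON list. A sketch only; CephMonitors is a hypothetical field name, not what this revision of the PR uses:

// Sketch: express the monitor set as a JSON list instead of a single
// "ip:port" string, so N monitors need no ad-hoc separator convention.
type RBDVolumeSource struct {
    // CephMonitors lists the "ip:port" addresses of the Ceph monitors.
    CephMonitors []string `json:"monitors"`
}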

rootfs (Contributor, Author) replied:

This is the center of another topic. In Gluster, we create an Endpoints object (i.e., an array of hosts) to represent the cluster, but the endpoints concept is being debated now.

Adding @thockin @smarterclayton

@rootfs (Contributor, Author) commented Apr 9, 2015

@liewegas

Sage: the Ceph monitors are now represented by endpoints. Passing a secret is also allowed. If both keyring and secret are passed, the secret overrides the keyring.
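
A minimal sketch of that precedence rule; authSource is a hypothetical helper, not the plugin's actual code:

// authSource picks the credential source: an explicit secret wins,
// then an explicit keyring path, then the conventional default.
func authSource(secret, keyring string) string {
    if secret != "" {
        return secret // the secret overrides the keyring when both are set
    }
    if keyring != "" {
        return keyring
    }
    return "/etc/ceph/keyring" // documented default keyring path
}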

// RBDVolumeSource represents a Rados Block Device Mount that lasts the lifetime of a pod
type RBDVolumeSource struct {
    // Required: EndpointsName is the endpoint name that represents a collection of Ceph monitors
    EndpointsName string `json:"endpoints" description:"a collection of Ceph monitors"`
A contributor replied:

@thockin: Given that we are reconsidering using endpoints to represent external entities, should we switch to using a headless service here instead of endpoints?

@thockin (Member) replied:

In general I think a headless service is preferred to raw Endpoints, even if just to prevent collisions. But the discussion never concluded. My last comment was essentially "As written, the endpoints must be in the same namespace as the pod. I don't think this is what you want to do." We need to rethink how we want to describe connecting to these sorts of services, and I don't think we should commit this (and frankly neither gluster) until we have an answer. Gluster will now have to be backwards compatible or else break compat.

A contributor replied:

So at a minimum so far it sounds like we should have used an object reference. If we did that, and stated that we will for a period of time support endpoints as the Kind while we solve this, does that reduce the concern about API stability?


A member replied:

I'm not super concerned about API compat with gluster - we could PROBABLY take the hit and just break compat. Or make a new field that is an ObjectRef, deprecate the old field, but use it preferentially if specified. That's not too hard, just need to pick a name that you don't hate, and be consistent across all of these scenarios.
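
A sketch of that compatibility pattern; the field names and the reference type here are illustrative, standing in for whatever is eventually chosen:

// Sketch: keep the deprecated string field but use the new
// reference preferentially when it is specified.
type RBDVolumeSource struct {
    // EndpointsName is the old field, kept only for backward compatibility.
    EndpointsName string `json:"endpoints,omitempty"`
    // EndpointsRef, if set, takes precedence over EndpointsName.
    EndpointsRef *ObjectReference `json:"endpointsRef,omitempty"`
}

// endpointsFor resolves which value callers should use.
func endpointsFor(v RBDVolumeSource) string {
    if v.EndpointsRef != nil {
        return v.EndpointsRef.Name // new field wins when specified
    }
    return v.EndpointsName // fall back to the deprecated field
}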

Another member replied:

@smarterclayton @thockin Why not use an object reference to a headless service? That would give you ns flexibility.

@vishh (Contributor) commented Apr 9, 2015

Curious: Is it possible to run a Ceph cluster inside a kube cluster?

@smarterclayton (Contributor) replied:

Should be. Same for gluster.


@vishh (Contributor) commented Apr 9, 2015

Being able to set up a Ceph/Gluster cluster in kube will help set up e2e testing for these plugins.

@rootfs (Contributor, Author) commented Apr 13, 2015

@vishh I am working on containerizing the Ceph cluster.

// RBDVolumeSource represents a Rados Block Device Mount that lasts the lifetime of a pod
type RBDVolumeSource struct {
    // Required: EndpointsName is the endpoint name that represents a collection of Ceph monitors
    EndpointsName string `json:"endpoints"`
A member commented:

This is subject to the same discussion as gluster: the current implication is that it can only access in-namespace Endpoints. Is that really sufficient?

@rootfs (Contributor, Author) commented Apr 14, 2015

@thockin please correct me if my understanding is wrong:

  • For this PR, convert Secret from a plain-text string to a Secret object reference.
  • Convert EndpointsName from a string to an EndpointObjectReference, for both Ceph and Gluster (the Gluster change will be in a separate PR).

@smarterclayton (Contributor) replied:

Ugh, cross-namespace secrets. @pmorie, we need to decide whether we should force every consumer of a cluster volume to have their own copy of the secrets (proving they have access), whether it's better to embed the secret directly in the volume definition, or whether we want to allow cross-namespace secrets.

I really don't think we can do the last, because in order to secure that we need a way to say that repo X has access to the secret. However, if we take the first option, then when you want to use a persistent volume defined by a cluster admin (which presumably has secrets attached to it), the assignment process also has to copy the secrets into your namespace (which is possible, but somewhat ugly). The middle solution, embedding the secret directly, is the least impactful now, but leaves secrets coupled to volumes.


@pmorie (Member) commented Apr 14, 2015

@smarterclayton @erictune @thockin @rootfs

I think this is a new use-case for secrets, distinct from all of the ones we called out in the proposal. It's close to the 'secret used on behalf of the pod by the kubelet' in the sense that it doesn't touch the container; i.e., the container never knows about it. However, it's different from the cases I had personally imagined for secrets used by the kubelet on behalf of the container in a couple of ways:

  1. The use-cases we had in mind for the kubelet performing actions on behalf of the pod were focused on service accounts and things like private docker registry pull, LOAS daemon, etc.
  2. This is a separate volume type which needs to use a secret to initialize the volume, but which does not actually expose the secret data in said volume.

So, that said, should we update the proposal to include this variant of use-case?

@smarterclayton (Contributor) replied:

Before we force Huamin and Sage to go redesign everything, yes. This is definitely a secret the infrastructure may have access to, which the user doesn't really need in the container. Should the rules be different than for regular uses of secrets? Probably not. It may just mean that the persistent volume controller, upon binding a claim into the namespace, also needs to bind and update a secret.


@pmorie (Member) commented Apr 14, 2015

@smarterclayton Yep, getting to your comments, three ways were offered:

  1. Force consumers to have their own copy of secrets
  2. Allow cross-namespace secrets
  3. Allow embedding the secret where you need it

I don't like (2) for the same reason you don't: you would have to define policies that express that user X has access to secret Y in a different ns.

I don't think (3) is great because one of the points of secrets is to keep the secret data orthogonal to where it's used.

I think we should go with (1) and facilitate copying secrets into a namespace on things like PV claim binding.

Any thoughts on the above, @markturansky?

@rootfs force-pushed the wip-rbd branch 4 times, most recently from 5ee14a9 to 4899463 on May 20, 2015
@rootfs (Contributor, Author) commented May 20, 2015

@pmorie @vishh @smarterclayton

Converted the secret to a LocalObjectReference, as suggested by @bgrant0607.

// Optional: Keyring is the path to key ring for RBDUser, default is /etc/ceph/keyring
Keyring string `json:"keyring"`
// Optional: SecretRef is name of the authentication secret for RBDUser, default is empty.
SecretRef LocalObjectReference `json:"secretName"`
A contributor replied:

This should probably be a pointer if it is optional. If it is not, then the godoc comment should be updated.

@smarterclayton (Contributor) replied:

It looks like some of my comments weren't addressed; specifically, the logging is unnecessary if you return errors (and if you want the errors to be better, wrap them with fmt.Errorf("specific message: %v", err)).

Also, remove the cnt--
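
A sketch of the requested error-wrapping pattern, assuming the fmt package is imported; attachDisk and the mapDevice callback are hypothetical stand-ins, not functions from this PR:

// Wrap and return the error instead of logging inside the helper,
// so the caller keeps the context and decides what to log.
func attachDisk(image string, mapDevice func(string) error) error {
    if err := mapDevice(image); err != nil {
        return fmt.Errorf("rbd: failed to map image %q: %v", image, err)
    }
    return nil
}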

@rootfs (Contributor, Author) commented May 21, 2015

@smarterclayton I see what you mean now; please see the latest. Thanks.

// Optional: Keyring is the path to key ring for RBDUser, default is /etc/ceph/keyring
Keyring string `json:"keyring" description:"Keyring is the path to key ring for rados user. default is /etc/ceph/keyring"`
// Optional: SecretRef is name of the authentication secret for RBDUser, default is empty.
SecretRef *LocalObjectReference `json:"secretName" description:"The name of secret. secret is authentication secret. If provided, it overrides keyring"`
A contributor replied:

'secretRef' not 'secretName'
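
That is, the JSON tag should name the field as a reference. A sketch of the corrected line; the description wording here is illustrative:

// Optional: SecretRef is the name of the authentication secret for RBDUser.
SecretRef *LocalObjectReference `json:"secretRef" description:"name of the authentication secret; if provided, it overrides keyring"`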

@smarterclayton (Contributor) replied:

One last set of comments; after that, this is LGTM. @vishh, do you have any additional feedback?

Signed-off-by: Huamin Chen <hchen@redhat.com>
@smarterclayton (Contributor) replied:

LGTM

@smarterclayton added the lgtm label on May 21, 2015
smarterclayton added a commit that referenced this pull request on May 21, 2015: add Ceph rados block device (rbd) volume plugin
@smarterclayton merged commit f3dd404 into kubernetes:master on May 21, 2015
Labels: lgtm ("Looks good to me"; indicates that a PR is ready to be merged.)

9 participants