Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement CanMount() for gfsMounter for linux #36686

Merged
merged 1 commit into from
Nov 14, 2016

Conversation

rkouj
Copy link
Contributor

@rkouj rkouj commented Nov 12, 2016

What this PR does / why we need it:
To implement CanMount() check for glusterfs. If mount binaries are not present on the underlying node, the mount will not proceed and return an error message stating so.

Related to issue : #36098

Related to similar change for NFS :
#36280

Release note:
Check binaries for GlusterFS on the underlying node before doing mount


This change is Reviewable

Sample output from testing in GCE/GCI:

rkouj@rkouj0:~/go/src/k8s.io/kubernetes$ kubectl describe pods
Name: glusterfs
Namespace: default
Node: e2e-test-rkouj-minion-group-kjq3/10.240.0.3
Start Time: Fri, 11 Nov 2016 17:22:04 -0800
Labels:
Status: Pending
IP:
Controllers:
Containers:
glusterfs:
Container ID:
Image: gcr.io/google_containers/busybox
Image ID:
Port:
QoS Tier:
cpu: Burstable
memory: BestEffort
Requests:
cpu: 100m
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Environment Variables:
Conditions:
Type Status
Initialized True
Ready False
PodScheduled True
Volumes:
glusterfs:
Type: Glusterfs (a Glusterfs mount on the host that shares a pod's lifetime)
EndpointsName: glusterfs-cluster
Path: kube_vol
ReadOnly: true
default-token-2zcao:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-2zcao
Events:
FirstSeen LastSeen Count From SubobjectPath Type Reason Message


8s 8s 1 {default-scheduler } Normal Scheduled Successfully assigned glusterfs to e2e-test-rkouj-minion-group-kjq3
7s 4s 4 {kubelet e2e-test-rkouj-minion-group-kjq3} Warning FailedMount Unable to mount volume kubernetes.io/glusterfs/6bb04587-a876-11e6-a712-42010af00002-glusterfs (spec.Name: glusterfs) on pod glusterfs (UID: 6bb04587-a876-11e6-a712-42010af00002). Verify that your node machine has the required components before attempting to mount this volume type. Required binary /sbin/mount.glusterfs is missing

@k8s-github-robot k8s-github-robot added size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. release-note-label-needed labels Nov 12, 2016
@rkouj
Copy link
Contributor Author

rkouj commented Nov 12, 2016

@saad-ali

@rootfs
Copy link
Contributor

rootfs commented Nov 12, 2016

does it work for ns mount?

@rkouj
Copy link
Contributor Author

rkouj commented Nov 12, 2016

Edit : I misread it as nfs mount. The following comment is for nfs mount

Yes we have enabled the check for only GCI and only when the containerized mount flag is turned off.
I have linked the PR in description with sample output.

cc: @saad-ali

exe := exec.New()
switch runtime.GOOS {
case "linux":
if _, err := exe.Command("/bin/ls", "/sbin/mount.glusterfs").CombinedOutput(); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please define the mount client binary path as constant. It would also be good if you can add a comment about what distros are verified under linux .

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack

@saad-ali
Copy link
Member

does it work for ns mount?

@rootfs Can you clarify what you mean by this?

@rootfs
Copy link
Contributor

rootfs commented Nov 14, 2016

@rkouj @saad-ali In ns_mount.go, the mount binary is under hostRootFsPath, rather than /sbin.

@rkouj
Copy link
Contributor Author

rkouj commented Nov 14, 2016

@rootfs : For now the change is enabled only in GCI where we don't run kubelet in container.

Once we start doing that or if we need to make the change take effect elsewhere, we need to add additional code to support it.

@rkouj rkouj force-pushed the check-gluster-fs-binaries branch from 06733df to b85ac95 Compare November 14, 2016 20:18
@k8s-github-robot k8s-github-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Nov 14, 2016
@rootfs
Copy link
Contributor

rootfs commented Nov 14, 2016

@rkouj Unfortunately, there are use cases to run kubelet in containers.

I think a clean way to do this is to have a mounter interface to detect mount binaries. @saad-ali

@rkouj
Copy link
Contributor Author

rkouj commented Nov 14, 2016

As discussed on slack with @rootfs
the CanMount() is part of the mounter inteface which means we can detect where the mounter runs.

@rootfs
Copy link
Contributor

rootfs commented Nov 14, 2016

@rkouj checkout util mounter and ns mounter

@saad-ali saad-ali assigned saad-ali and rootfs and unassigned rootfs Nov 14, 2016
@rkouj
Copy link
Contributor Author

rkouj commented Nov 14, 2016

@rootfs : The CanMount() check that we are doing is specific to each volume plugin and the intent of this change was to make CanMount() isolated to each plugin.

Later on we can refactor this when we want this change to turned on on other platforms and make it more generic.

@saad-ali
Copy link
Member

LGTM

@saad-ali saad-ali added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-label-needed labels Nov 14, 2016
@saad-ali saad-ali added this to the v1.4 milestone Nov 14, 2016
@saad-ali saad-ali added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 14, 2016
@k8s-ci-robot
Copy link
Contributor

@rkouj: you can't LGTM your own PR.

In response to [this comment](https://github.com//pull/36686#issuecomment-260484357):

/lgtm

If you have questions or suggestions related to this bot's behavior, please file an issue against the kubernetes/test-infra repository.

@rkouj
Copy link
Contributor Author

rkouj commented Nov 14, 2016

Post applying the lgtm label, the Submit Queue is not getting updated to pick up the change.

@saad-ali
Copy link
Member

/lgtm

@k8s-github-robot
Copy link

@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]

@k8s-github-robot
Copy link

Automatic merge from submit-queue

@k8s-github-robot k8s-github-robot merged commit 9be1191 into kubernetes:master Nov 14, 2016
@eparis
Copy link
Contributor

eparis commented Nov 17, 2016

Can someone @saad-ali @rootfs help me understand. Does this break containerized kubelet and gluster? If so, it seems like a regression that should not have been merged...

@rootfs
Copy link
Contributor

rootfs commented Nov 17, 2016

The code checks /sbin/mount.glusterfs for linux (btw, gluster client also runs on Mac), while containerized kubelet actually executes /rootfs/sbin/mount.glusterfs, there is a mismatch: if containerized kubelet doesn't have /sbin/mount.glusterfs, while host has, then we get a false negative

@eparis
Copy link
Contributor

eparis commented Nov 17, 2016

@saad-ali I'm removing the cherrypick-candidate label. This is a regression breaking working setups. Personally I think we should revert until it and look for a non-regressing path, but I'd love to hear your take.

@rkouj
Copy link
Contributor Author

rkouj commented Nov 17, 2016

@eparis I'm not sure if this helps but this check is enabled only when we don't do a containerized mount.

Right now the flag is enabled only in GCI (where we don't do containerized kubelet yet I believe). https://github.com/kubernetes/kubernetes/pull/36280/files#diff-6e3b476b9225d1213dc6ad13e453fc16R477

If we are doing a containerized mount, this check will not run. https://github.com/kubernetes/kubernetes/pull/36280/files#diff-bf28da68f62a8df6e99e447c4351122dR719

Once we start exploring more avenues, we will need to refactor/add more code to handle different cases.

@rootfs
Copy link
Contributor

rootfs commented Nov 17, 2016

@rkouj As I understand it, experimental-check-node-capabilities-before-mount is a user visible flag. Although it is currently only set on gci image, there is nothing preventing other kubernetes distribution to set it, and there is no words around or doc update to alert the potential issue for containerized kubelet.

@rkouj
Copy link
Contributor Author

rkouj commented Nov 17, 2016

Right. What do you suggest ? I can create an issue and link it if it helps. Or add a comment around the flag in options.go .

@rootfs
Copy link
Contributor

rootfs commented Nov 17, 2016

@eparis what's the best practice?

@saad-ali
Copy link
Member

Sorry for the delay in response--overwhelmed with CCs so I just saw this thread.

Can someone @saad-ali @rootfs help me understand. Does this break containerized kubelet and gluster? If so, it seems like a regression that should not have been merged...

@eparis No, it does not (or at least should not). The CanMount() check is only enabled if you are deploying a GCI OS image on GCE (see https://github.com/kubernetes/kubernetes/blob/master/cluster/gce/gci/configure-helper.sh#L486). It is disabled by default in all other cases. This PR was a follow up to #36280 which has the bulk of the logic. On GCE/GKE the new default OS image, GCI, did not have the correct binaries for NFS and GlusterFS and those mounts were failing for users with cryptic mount errors. This was a way for us to at least give users better error message (while we work on adding support back). While the check is designed to be extensible per plugin, it is not, as @rootfs mentioned, extensible based on underlying platform. I like the suggestion of extending the mounter interface so that the check is custom per platform, but that also gets messy since the check has to also be custom per plugin. We will iterate on the current design in the future and refactor (we plan to invest heavily in this area of better user messaging). But rest assured that this should not affect any platforms other than GCE on GCI.

@saad-ali I'm removing the cherrypick-candidate label. This is a regression breaking working setups. Personally I think we should revert until it and look for a non-regressing path, but I'd love to hear your take.

It should not have broken anything. If it did, our bad, please file a bug and we'll take a look ASAP. Any break here would be a release blocker.

@rkouj As I understand it, experimental-check-node-capabilities-before-mount is a user visible flag. Although it is currently only set on gci image, there is nothing preventing other kubernetes distribution to set it, and there is no words around or doc update to alert the potential issue for containerized kubelet.

The flag is explicitly marked experimental because it is not guaranteed to work on all platforms. It is also explicitly disabled by default for this reason.

We need this in GKE for 1.4, I'm going to re-add the cherrypick candidate label. If you have any other concerns, I'll be happy to discuss them. Feel free to ping me here, or directly on Slack, and I'm happy to discuss this during the Storage SIG meeting as well.

@jessfraz jessfraz added the cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. label Dec 1, 2016
jessfraz added a commit that referenced this pull request Dec 7, 2016
…upstream-release-1.4

Automated cherry pick of #36686
@k8s-cherrypick-bot
Copy link

Commit found in the "release-1.4" branch appears to be this PR. Removing the "cherrypick-candidate" label. If this is an error find help to get your PR picked.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants