Unable to mount volume gcePersistentDisk with readOnly: true for multiple pods (timeout expired waiting for volumes to attach/mount) #31176
@ceefour Could you please share the full log from when this error occurred, and also the spec for your RC? Are you using a PVC/PV or the PD name directly? Is it possible that your PD mongo-conf is attached to some node in RW mode? Thanks!
It's possible that this problem was masked because I ran into #29358.
@ceefour Thank you for your response. We fixed a few things recently in the storage-related code. If it's OK, could you please give the readOnly PD another try and let us know if you still have any problems? Thanks!
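For reference, a minimal sketch of the kind of pod-spec fragment being asked about, with the PD name taken from the question above and the mount path assumed; readOnly is set both on the volumeMount and on the gcePersistentDisk source so that several pods can attach the same disk:
# Hypothetical pod-spec fragment; every pod sharing the disk needs readOnly: true.
containers:
  - name: app
    image: mongo
    volumeMounts:
      - name: mongo-conf
        mountPath: /etc/mongo-conf
        readOnly: true
volumes:
  - name: mongo-conf
    gcePersistentDisk:
      pdName: mongo-conf   # pre-existing GCE PD mentioned above
      fsType: ext4
      readOnly: true       # required in every pod that shares this PD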
The situation improves on 1.4.6, but there is a minor annoyance: there is still a lag while it retries:
Hi Hendy,
Thank you very much for your feedback. Yes, it will take a couple more minutes for the volume manager to detect that the node has been deleted and the volume is already detached. There is a trade-off between quick response and low overhead. We can adjust some parameters to balance the two, and your comments are appreciated!
Best,
Jing
…On Sun, Nov 27, 2016 at 2:30 AM, Hendy Irawan ***@***.***> wrote:
Situation improves on 1.4.6, but minor annoyance because there is still
lag while it's retrying:
>kubectl get po
NAME READY STATUS RESTARTS AGE
mongo-arb-3311699171-5yc0t 1/1 Running 0 3m
mongo0-375173819-7jtxa 0/1 ContainerCreating 0 3m
mongo1-1101050559-y1s10 1/1 Running 0 9m
>kubectl describe po mongo0-375173819-7jtxa
Name: mongo0-375173819-7jtxa
Namespace: default
Node: gke-fatih-g1-d2fb3adc-nl4a/10.142.0.4
Start Time: Sun, 27 Nov 2016 17:23:06 +0700
Labels: instance=fatih0
mongo-rs-name=bippo
pod-template-hash=375173819
Status: Pending
IP:
Controllers: ReplicaSet/mongo0-375173819
Containers:
mongo:
Container ID:
Image: mongo
Image ID:
Port: 27017/TCP
Command:
/bin/sh
-c
Args:
cp /etc/mongo-keyfile/mongo.keyfile /etc/mongo.keyfile && chmod 600 /etc/mongo.keyfile && chown mongodb:mongodb /etc/mongo.keyfile && mongod --replSet bippo --keyFile /etc/mongo.keyfile
Requests:
cpu: 100m
memory: 700Mi
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Volume Mounts:
/data/db from mongo-persistent-storage0 (rw)
/etc/mongo-keyfile from mongo-keyfile (ro)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-i0lox (ro)
Environment Variables: <none>
Conditions:
Type Status
Initialized True
Ready False
PodScheduled True
Volumes:
mongo-keyfile:
Type: Secret (a volume populated by a Secret)
SecretName: mongo-keyfile
mongo-persistent-storage0:
Type: GCEPersistentDisk (a Persistent Disk resource in Google Compute Engine)
PDName: mongodb-disk0
FSType: ext4
Partition: 0
ReadOnly: false
default-token-i0lox:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-i0lox
QoS Class: Burstable
Tolerations: <none>
Events:
FirstSeen LastSeen Count From SubobjectPath Type Reason Message
--------- -------- ----- ---- ------------- ---- ------ -------
2m 2m 1 {default-scheduler } Normal Scheduled Successfully assigned mongo0-375173819-7jtxa to gke-fatih-g1-d2fb3adc-nl4a
2m 1m 8 {controller-manager } Warning FailedMount Failed to attach volume "mongo-persistent-storage0" on node "gke-fatih-g1-d2fb3adc-nl4a" with: googleapi: Error 400: The disk resource 'mongodb-disk0' is already being used by 'gke-fatih-pool-1-2efbbabb-kfxk'
47s 47s 1 {kubelet gke-fatih-g1-d2fb3adc-nl4a} Warning FailedMount Unable to mount volumes for pod "mongo0-375173819-7jtxa_default(7cd25e46-b48b-11e6-a2cb-42010af000ba)": timeout expired waiting for volumes to attach/mount for pod "mongo0-375173819-7jtxa"/"default". list of unattached/unmounted volumes=[mongo-persistent-storage0]
47s 47s 1 {kubelet gke-fatih-g1-d2fb3adc-nl4a} Warning FailedSync Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "mongo0-375173819-7jtxa"/"default". list of unattached/unmounted volumes=[mongo-persistent-storage0]
The disk resource 'mongodb-disk0' is already being used by 'gke-fatih-pool-1-2efbbabb-kfxk', but that instance no longer exists now:
>kubectl get node
NAME STATUS AGE
gke-fatih-g1-d2fb3adc-l0ad Ready 5m
gke-fatih-g1-d2fb3adc-nl4a Ready 5m
C:\Users\ceefour\git\yoopabot>kubectl get po
NAME READY STATUS RESTARTS AGE
mongo-arb-3311699171-5yc0t 1/1 Running 0 5m
mongo0-375173819-7jtxa 1/1 Running 0 5m
mongo1-1101050559-y1s10 1/1 Running 0 11m
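The adjustable parameters mentioned above live on the controller manager rather than in any pod spec. As a rough sketch only (a self-managed control plane is assumed, since these flags cannot be changed on GKE, and the flag shown is just one of the relevant knobs), the attach/detach reconciler interval is exposed as a kube-controller-manager flag:
# Hypothetical excerpt from a self-managed kube-controller-manager manifest;
# the value is illustrative and this is not adjustable on GKE.
spec:
  containers:
    - name: kube-controller-manager
      command:
        - kube-controller-manager
        - --attach-detach-reconcile-sync-period=1m0s   # how often the attach/detach reconciler re-syncs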
I am seeing the same or a similar issue, but it happens after the pod has been running for quite some time.
Note also that the pods were not sharing the same GCEPersistentDisk. The nodes are currently at v1.4.5, but I see v1.4.7 is available on GKE; I can upgrade if that will fix the problem.
Can this be closed?
I seem to be running into the same problem. I have a backend API service which mounts a volume with readOnly: false and a worker which mounts it with readOnly: true. This results in the following:
helm chart (for both, except the readOnly flag):
volumes:
  - name: datadir-{{ template "trackableappname" . }}
    gcePersistentDisk:
      pdName: datadir-{{ template "trackableappname" . }}
      fsType: ext4
    persistentVolumeClaim:
      readOnly: true
      claimName: {{ .Values.existingClaim | default (include "fullname" .) }}
Hi @patvdleer, there are a few problems with your helm chart:
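One clear issue is that the single volume entry mixes a gcePersistentDisk source with a persistentVolumeClaim source, and a Kubernetes volume can only have one source. A minimal sketch of the PVC-only form, reusing the template names above (an illustration, not necessarily the exact fix that was suggested):
volumes:
  - name: datadir-{{ template "trackableappname" . }}
    persistentVolumeClaim:
      claimName: {{ .Values.existingClaim | default (include "fullname" .) }}
      readOnly: true   # set only in the worker chart; the API service mounts read-write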
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Stale issues rot after 30d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Rotten issues close after 30d of inactivity. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
I hit a similar issue several times, and after some digging in the node's
What keywords did you search in Kubernetes issues before filing this one? (If you have found any duplicates, you should instead reply there.):
timeout expired waiting for volumes to attach/mount gcePersistentDisk readOnly
Is this a BUG REPORT or FEATURE REQUEST? (choose one):
Bug report
Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"3", GitVersion:"v1.3.4", GitCommit:"dd6b458ef8dbf24aff55795baa68f83383c9b3a9", GitTreeState:"clean", BuildDate:"2016-08-01T16:45:16Z", GoVersion:"go1.6.2", Compiler:"gc", Platform:"windows/amd64"}
Server Version: version.Info{Major:"1", Minor:"3", GitVersion:"v1.3.4", GitCommit:"dd6b458ef8dbf24aff55795baa68f83383c9b3a9", GitTreeState:"clean", BuildDate:"2016-08-01T16:38:31Z", GoVersion:"go1.6.2", Compiler:"gc", Platform:"linux/amd64"}
Environment:
Kernel (e.g. uname -a): Linux gke-fatih-small-pool-59881027-k909 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt25-2 (2016-04-08) x86_64 GNU/Linux
What happened:
A readOnly persistent disk won't mount multiple times.
What you expected to happen:
It can be mounted multiple times (once per pod that references it read-only).
How to reproduce it (as minimally and precisely as possible):
Create 3 RCs (with 1 replica each) that mount the same persistent disk as read-only.
The first pod always succeeds in mounting.
The second pod mounts intermittently. The third pod (below) always fails.
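A minimal sketch of one such RC, with names, image, and mount path assumed since the original manifests are not included in the report (the other two RCs differ only in their metadata names):
apiVersion: v1
kind: ReplicationController
metadata:
  name: conf-reader-0              # conf-reader-1 and conf-reader-2 are identical apart from the name
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: conf-reader-0
    spec:
      containers:
        - name: reader
          image: busybox
          command: ["sh", "-c", "ls /data && sleep 3600"]
          volumeMounts:
            - name: shared-pd
              mountPath: /data
              readOnly: true
      volumes:
        - name: shared-pd
          gcePersistentDisk:
            pdName: mongo-conf     # the pre-existing GCE PD shared read-only by all three RCs
            fsType: ext4
            readOnly: true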
Anything else do we need to know:
Relevant /var/log/kubelet.log on the node (last 50 lines):
for 50 last lines:The text was updated successfully, but these errors were encountered: