GCE persistent disk sometimes gets mounted read-only in a single container #32929
Description
Is this a request for help? (If yes, you should use our troubleshooting guide and community support channels, see http://kubernetes.io/docs/troubleshooting/.): No
What keywords did you search in Kubernetes issues before filing this one? (If you have found any duplicates, you should instead reply there.): "read-only"
Is this a BUG REPORT or FEATURE REQUEST? (choose one): Bug report. I am not sure whether this is an issue with Kubernetes, Docker, or Google Cloud, but I'm posting here to get this started.
I've looked at #29358, #29166, #28750, and #29903, but they all appear to describe different issues.
Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"3", GitVersion:"v1.3.6", GitCommit:"ae4550cc9c89a593bcda6678df201db1b208133b", GitTreeState:"clean", BuildDate:"2016-08-26T18:13:23Z", GoVersion:"go1.6.2", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"3", GitVersion:"v1.3.6", GitCommit:"ae4550cc9c89a593bcda6678df201db1b208133b", GitTreeState:"clean", BuildDate:"2016-08-26T18:06:06Z", GoVersion:"go1.6.2", Compiler:"gc", Platform:"linux/amd64"}
Environment:
- Cloud provider or hardware configuration: Google Cloud (GKE)
- OS (e.g. from /etc/os-release):
In container:
NAME="Alpine Linux"
ID=alpine
VERSION_ID=3.4.0
PRETTY_NAME="Alpine Linux v3.4"
HOME_URL="http://alpinelinux.org"
BUG_REPORT_URL="http://bugs.alpinelinux.org"
- Kernel (e.g. uname -a):
In container:
Linux redis-cache 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt25-2 (2016-04-08) x86_64 Linux
- Install tools: Google Cloud SDK
- Others:
What happened:
When Kubernetes moves pods around, pods with gcePersistentDisk volumes sometimes fail to come up ("timeout expired waiting for volumes to attach/mount for pod"). When this doesn't resolve itself, I delete the pod. Kubernetes recreates it and eventually mounts the disk, but intermittently the disk ends up mounted read-only, even though only a single container is attached to it and kubectl describe pod shows the volume as "ReadWrite".
To resolve this, I have to delete the deployment and wait for the disk to auto-detach, or use gcloud compute instances detach-disk to remove it from the instance. After doing this, re-creating the deployment launches the pod correctly and the disk gets attached read-write, as it should.
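In shell form, the workaround looks roughly like this (a sketch; the node name and manifest file are placeholders for our actual resources):
$ kubectl delete deployment redis-cache
# If the PD doesn't auto-detach after a few minutes, force it:
$ gcloud compute instances detach-disk gke-app-name-pool-1-b5918e16-rqip --disk redis-cache
# Re-creating the deployment then attaches the disk read-write as expected:
$ kubectl apply -f redis-cache-deployment.yaml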
I've talked with another team that has seen the same behavior (@davidewatson).
What you expected to happen:
A persistent disk attached to a single node should be mounted read-write inside the container.
How to reproduce it (as minimally and precisely as possible):
It's intermittent. :/
In our case we see this when attaching disks to redis instances. The disks are not huge (200G, of which maybe 15G is used), but the redis instance (PID 1 in the container) launches immediately and tries to read the disk, so the volume must be mounted by the time the container starts.
We have a fairly high rate of mount timeouts that never resolve, perhaps 20% of the time. I've let some of them run for as long as an hour or a day, and they just keep trying to remount. Sometimes deleting the pod resolves this; sometimes deleting the pod results in this read-only mount.
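To catch the stuck mounts as they happen, watching the pods and events works well enough; a minimal sketch (the label selector matches our deployment):
$ kubectl get pods -l purpose=cache -w
$ kubectl get events -w | grep -i FailedMount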
Deployment spec:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: redis-cache
spec:
  template:
    metadata:
      labels:
        purpose: cache
        service: redis
    spec:
      volumes:
      - name: data
        gcePersistentDisk:
          pdName: redis-cache
          fsType: ext4
      containers:
      - name: redis
        image: us.gcr.io/xxxxxxxx/redis-cache:latest
        resources:
          requests:
            memory: "10G"
        ports:
        - containerPort: 6379
        volumeMounts:
        - mountPath: /data
          name: data
        args: ["redis-server", "/etc/redis.conf"]
Dockerfile:
FROM redis:3.2-alpine
ADD ./deploy/redis-cache.conf /etc/redis.conf
Anything else we need to know:
Here's a full log of the behavior.
First we saw that the redis instance was down. Inspecting the cluster showed that Kubernetes had rescheduled the pod.
$ kubectl get pods -l purpose=cache
NAME READY STATUS RESTARTS AGE
redis-cache-3629295551-h5zig 1/1 Terminating 1 2d
redis-cache-3629295551-mqkm2 0/1 ContainerCreating 0 2m
About an hour later the pod still had not come up. Inspecting the pod shows that it has been retrying the mount for the past hour without success. Note that it was re-launched on the same node the evicted pod had been running on.
$ kubectl describe pod redis-cache-3629295551-abfoh
Name: redis-cache-3629295551-abfoh
Namespace: default
Node: gke-app-name-pool-1-b5918e16-rqip/10.138.0.6
...
Events:
FirstSeen LastSeen Count From SubobjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
59m 59m 1 {default-scheduler } Normal Scheduled Successfully assigned redis-cache-3629295551-abfoh to gke-app-name-pool-1-b5918e16-rqip
57m 14s 27 {kubelet gke-app-name-pool-1-b5918e16-rqip} Warning FailedMount Unable to mount volumes for pod "redis-cache-3629295551-abfoh_default(72768d65-7c4a-11e6-a198-42010a8a000d)": timeout expired waiting for volumes to attach/mount for pod "redis-cache-3629295551-abfoh"/"default". list of unattached/unmounted volumes=[data]
57m 14s 27 {kubelet gke-app-name-pool-1-b5918e16-rqip} Warning FailedSync Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "redis-cache-3629295551-abfoh"/"default". list of unattached/unmounted volumes=[data]
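The kubelet log on the affected node is probably the next place to look; a sketch, assuming these GKE nodes write it to /var/log/kubelet.log:
$ gcloud compute ssh gke-app-name-pool-1-b5918e16-rqip \
    --command "grep -i redis-cache /var/log/kubelet.log | tail -20"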
I deleted the pod and let the deployment create a new one:
$ kubectl delete pod redis-cache-3629295551-abfoh
pod "redis-cache-3629295551-abfoh" deleted
$ kubectl get pods -l purpose=cache
NAME READY STATUS RESTARTS AGE
redis-cache-3629295551-abfoh 0/1 Terminating 0 1h
redis-cache-3629295551-o7mm9 0/1 ContainerCreating 0 7s
The replacement pod eventually comes up, on another node:
$ kubectl describe pod redis-cache-3629295551-o7mm9
Name: redis-cache-3629295551-o7mm9
Namespace: default
Node: gke-app-name-pool-1-b5918e16-8cdb/10.138.0.4
...
Volumes:
data:
Type: GCEPersistentDisk (a Persistent Disk resource in Google Compute Engine)
PDName: redis-cache
FSType: ext4
Partition: 0
ReadOnly: false
...
Events:
FirstSeen LastSeen Count From SubobjectPath Type Reason Message
--------- -------- ----- ---- ------------- ---- ------ -------
7m 7m 1 {default-scheduler } Normal Scheduled Successfully assigned redis-cache-3629295551-o7mm9 to gke-app-name-pool-1-b5918e16-8cdb
5m 3m 2 {kubelet gke-app-name-pool-1-b5918e16-8cdb} Warning FailedMount Unable to mount volumes for pod "redis-cache-3629295551-o7mm9_default(d6fb24b0-7c52-11e6-a198-42010a8a000d)": timeout expired waiting for volumes to attach/mount for pod "redis-cache-3629295551-o7mm9"/"default". list of unattached/unmounted volumes=[data]
5m 3m 2 {kubelet gke-app-name-pool-1-b5918e16-8cdb} Warning FailedSync Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "redis-cache-3629295551-o7mm9"/"default". list of unattached/unmounted volumes=[data]
1m 1m 1 {kubelet gke-app-name-pool-1-b5918e16-8cdb} spec.containers{redis} Normal Pulled Container image "us.gcr.io/xxxxxxx/redis-cache:latest" already present on machine
1m 1m 1 {kubelet gke-app-name-pool-1-b5918e16-8cdb} spec.containers{redis} Normal Created Created container with docker id 896b189ef167
1m 1m 1 {kubelet gke-app-name-pool-1-b5918e16-8cdb} spec.containers{redis} Normal Started Started container with docker id 896b189ef167
But the volume isn't writable inside the container:
$ kubectl exec redis-cache-3629295551-o7mm9 -- mount | grep /data
/dev/sdb on /data type ext4 (ro,relatime,data=ordered)
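To rule out a read-only attachment at the GCE level, the mode the API reports for the disk is worth comparing against the mount; a sketch using the node and pod names from above:
$ gcloud compute instances describe gke-app-name-pool-1-b5918e16-8cdb --format="json(disks)"
$ kubectl exec redis-cache-3629295551-o7mm9 -- touch /data/write-test
The describe output lists each attached disk's deviceName and mode (READ_WRITE vs. READ_ONLY), and the touch should fail with a read-only file system error if the mount itself is the problem.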