GCE persistent disk sometimes gets mounted read-only in a single container #32929
Description
Is this a request for help? (If yes, you should use our troubleshooting guide and community support channels, see http://kubernetes.io/docs/troubleshooting/.): No
What keywords did you search in Kubernetes issues before filing this one? (If you have found any duplicates, you should instead reply there.): "read-only"
Is this a BUG REPORT or FEATURE REQUEST? (choose one): Bug report. I am not sure whether this is an issue with Kubernetes, Docker, or Google Cloud, but I'm posting here to get this started.
I've looked at #29358, #29166, #28750, and #29903, but they all appear to describe different issues.
Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"3", GitVersion:"v1.3.6", GitCommit:"ae4550cc9c89a593bcda6678df201db1b208133b", GitTreeState:"clean", BuildDate:"2016-08-26T18:13:23Z", GoVersion:"go1.6.2", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"3", GitVersion:"v1.3.6", GitCommit:"ae4550cc9c89a593bcda6678df201db1b208133b", GitTreeState:"clean", BuildDate:"2016-08-26T18:06:06Z", GoVersion:"go1.6.2", Compiler:"gc", Platform:"linux/amd64"}
Environment:
- Cloud provider or hardware configuration: Google Cloud (GKE)
- OS (e.g. from /etc/os-release):
In container:
NAME="Alpine Linux"
ID=alpine
VERSION_ID=3.4.0
PRETTY_NAME="Alpine Linux v3.4"
HOME_URL="http://alpinelinux.org"
BUG_REPORT_URL="http://bugs.alpinelinux.org"
- Kernel (e.g. uname -a):
In container:
Linux redis-cache 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt25-2 (2016-04-08) x86_64 Linux
- Install tools: Google Cloud SDK
- Others:
What happened:
When Kubernetes moves pods around, pods with gcePersistentDisk volumes sometimes fail to come up ("timeout expired waiting for volumes to attach/mount for pod"). When this doesn't resolve itself, I delete the pod. Kubernetes recreates it and eventually mounts the disk, but intermittently the disk ends up mounted read-only, even though only a single container is attached to it and kubectl describe pod shows the volume as "ReadWrite".
To resolve this, I have to delete the deployment and wait for the disk to auto-detach, or use gcloud compute instances detach-disk to remove it from the instance. After doing this, re-creating the deployment launches the pod correctly and the disk gets attached read-write, as it should.
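In shell form, the workaround looks roughly like this (a sketch; the node name and manifest file are placeholders for our actual resources):
$ kubectl delete deployment redis-cache
# If the PD doesn't auto-detach after a few minutes, force it:
$ gcloud compute instances detach-disk gke-app-name-pool-1-b5918e16-rqip --disk redis-cache
# Re-creating the deployment then attaches the disk read-write as expected:
$ kubectl apply -f redis-cache-deployment.yaml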
I've talked with another team that has seen the same behavior (@davidewatson).
What you expected to happen:
A persistent disk attached to a single node should be mounted read-write inside the container.
How to reproduce it (as minimally and precisely as possible):
It's intermittent. :/
In our case we see this when attaching disks to redis instances. The disks are not huge (200G, of which maybe 15G is used), but the redis instance (PID 1 in the container) launches immediately and tries to read the disk, so the volume must be mounted by the time the container starts.
We have a fairly high rate of mount timeouts that never resolve, perhaps 20% of the time. I've let some of them run for as long as an hour or a day, and they just keep trying to remount. Sometimes deleting the pod resolves this; sometimes deleting the pod results in this read-only mount.
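To catch the stuck mounts as they happen, watching the pods and events works well enough; a minimal sketch (the label selector matches our deployment):
$ kubectl get pods -l purpose=cache -w
$ kubectl get events -w | grep -i FailedMount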
Deployment spec:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: redis-cache
spec:
  template:
    metadata:
      labels:
        purpose: cache
        service: redis
    spec:
      volumes:
      - name: data
        gcePersistentDisk:
          pdName: redis-cache
          fsType: ext4
      containers:
      - name: redis
        image: us.gcr.io/xxxxxxxx/redis-cache:latest
        resources:
          requests:
            memory: "10G"
        ports:
        - containerPort: 6379
        volumeMounts:
        - mountPath: /data
          name: data
        args: ["redis-server", "/etc/redis.conf"]
Dockerfile:
FROM redis:3.2-alpine
ADD ./deploy/redis-cache.conf /etc/redis.conf
Anything else we need to know:
Here's a full log of the behavior.
First we saw that the redis instance was down. Inspecting the cluster showed that Kubernetes had rescheduled the pod.
$ kubectl get pods -l purpose=cache
NAME READY STATUS RESTARTS AGE
redis-cache-3629295551-h5zig 1/1 Terminating 1 2d
redis-cache-3629295551-mqkm2 0/1 ContainerCreating 0 2m
About an hour later the pod still had not come up. Inspecting the pod shows that it has been retrying the mount for the past hour without success. Note that it was re-launched on the same node the evicted pod had been running on.
$ kubectl describe pod redis-cache-3629295551-abfoh
Name: redis-cache-3629295551-abfoh
Namespace: default
Node: gke-app-name-pool-1-b5918e16-rqip/10.138.0.6
...
Events:
FirstSeen LastSeen Count From SubobjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
59m 59m 1 {default-scheduler } Normal Scheduled Successfully assigned redis-cache-3629295551-abfoh to gke-app-name-pool-1-b5918e16-rqip
57m 14s 27 {kubelet gke-app-name-pool-1-b5918e16-rqip} Warning FailedMount Unable to mount volumes for pod "redis-cache-3629295551-abfoh_default(72768d65-7c4a-11e6-a198-42010a8a000d)": timeout expired waiting for volumes to attach/mount for pod "redis-cache-3629295551-abfoh"/"default". list of unattached/unmounted volumes=[data]
57m 14s 27 {kubelet gke-app-name-pool-1-b5918e16-rqip} Warning FailedSync Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "redis-cache-3629295551-abfoh"/"default". list of unattached/unmounted volumes=[data]
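The kubelet log on the affected node is probably the next place to look; a sketch, assuming these GKE nodes write it to /var/log/kubelet.log:
$ gcloud compute ssh gke-app-name-pool-1-b5918e16-rqip \
    --command "grep -i redis-cache /var/log/kubelet.log | tail -20"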
I deleted the pod and let the deployment create a new one:
$ kubectl delete pod redis-cache-3629295551-abfoh
pod "redis-cache-3629295551-abfoh" deleted
$ kubectl get pods -l purpose=cache
NAME READY STATUS RESTARTS AGE
redis-cache-3629295551-abfoh 0/1 Terminating 0 1h
redis-cache-3629295551-o7mm9 0/1 ContainerCreating 0 7s
The replacement pod eventually comes up, on another node:
$ kubectl describe pod redis-cache-3629295551-o7mm9
Name: redis-cache-3629295551-o7mm9
Namespace: default
Node: gke-app-name-pool-1-b5918e16-8cdb/10.138.0.4
...
Volumes:
data:
Type: GCEPersistentDisk (a Persistent Disk resource in Google Compute Engine)
PDName: redis-cache
FSType: ext4
Partition: 0
ReadOnly: false
...
Events:
FirstSeen LastSeen Count From SubobjectPath Type Reason Message
--------- -------- ----- ---- ------------- ---- ------ -------
7m 7m 1 {default-scheduler } Normal Scheduled Successfully assigned redis-cache-3629295551-o7mm9 to gke-app-name-pool-1-b5918e16-8cdb
5m 3m 2 {kubelet gke-app-name-pool-1-b5918e16-8cdb} Warning FailedMount Unable to mount volumes for pod "redis-cache-3629295551-o7mm9_default(d6fb24b0-7c52-11e6-a198-42010a8a000d)": timeout expired waiting for volumes to attach/mount for pod "redis-cache-3629295551-o7mm9"/"default". list of unattached/unmounted volumes=[data]
5m 3m 2 {kubelet gke-app-name-pool-1-b5918e16-8cdb} Warning FailedSync Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "redis-cache-3629295551-o7mm9"/"default". list of unattached/unmounted volumes=[data]
1m 1m 1 {kubelet gke-app-name-pool-1-b5918e16-8cdb} spec.containers{redis} Normal Pulled Container image "us.gcr.io/xxxxxxx/redis-cache:latest" already present on machine
1m 1m 1 {kubelet gke-app-name-pool-1-b5918e16-8cdb} spec.containers{redis} Normal Created Created container with docker id 896b189ef167
1m 1m 1 {kubelet gke-app-name-pool-1-b5918e16-8cdb} spec.containers{redis} Normal Started Started container with docker id 896b189ef167
But the volume isn't writable inside the container:
$ kubectl exec redis-cache-3629295551-o7mm9 -- mount | grep /data
/dev/sdb on /data type ext4 (ro,relatime,data=ordered)
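To rule out a read-only attachment at the GCE level, the mode the API reports for the disk is worth comparing against the mount; a sketch using the node and pod names from above:
$ gcloud compute instances describe gke-app-name-pool-1-b5918e16-8cdb --format="json(disks)"
$ kubectl exec redis-cache-3629295551-o7mm9 -- touch /data/write-test
The describe output lists each attached disk's deviceName and mode (READ_WRITE vs. READ_ONLY), and the touch should fail with a read-only file system error if the mount itself is the problem.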