Persistent storage volume data is deleted when a node becomes unresponsive #13003
Description
We have a cluster with 3 masters, 3 etcd members and 3 nodes.
We are running WordPress apps (a WordPress app and a MySQL DB) backed by persistent storage (Amazon EFS) on OpenShift.
On one of the three nodes (node1), CPU usage spiked and the load average increased sharply. That node was running two WordPress containers and one DB container.
Since node1 was unresponsive, OpenShift rescheduled those containers onto the other healthy nodes.
The problem started when node1 began responding again.
The three containers that had run on node1 used three persistent volumes. Suddenly all three volumes became empty.
I tried adding the data again, but something kept deleting it again and again.
The deletion did not stop until I deleted that directory in EFS and created it again.
Until node1 started responding again, the containers that had been rescheduled onto healthy nodes were running fine.
After the data was deleted, the containers started restarting.
We have observed this twice in our environment.
Version
oc v1.3.1
kubernetes v1.3.0+52492b4
features: Basic-Auth GSSAPI Kerberos SPNEGO
docker version 1.9.1
Steps To Reproduce
- Run apps with persistent storage from Amazon EFS.
- Put heavy load on a node that runs some containers with persistent storage.
- When the node becomes unresponsive, those containers are rescheduled onto healthy nodes.
- When the unresponsive node starts responding again, the data in the persistent storage that was attached to that node vanishes.
- If we add data again, it is deleted as well.
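To capture when the data disappears during the steps above, a small polling script along these lines could be left running on any machine that mounts the same EFS directory. This is only a sketch, not something we ran during the incident: the function names are made up for illustration, the mount path in the usage note is hypothetical, and polling only shows when files vanish, not which process removed them.

```python
import os
import time


def snapshot(root):
    """Return the set of file paths (relative to root) currently under root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            found.add(os.path.relpath(os.path.join(dirpath, name), root))
    return found


def removed_between(before, after):
    """Return the sorted paths present in `before` but missing from `after`."""
    return sorted(before - after)


def watch(root, interval=5.0, rounds=None):
    """Poll root and print a timestamped line for every file that disappears.

    rounds=None polls forever; an integer limits the number of polls.
    """
    previous = snapshot(root)
    done = 0
    while rounds is None or done < rounds:
        time.sleep(interval)
        current = snapshot(root)
        stamp = time.strftime("%Y-%m-%dT%H:%M:%S")
        for path in removed_between(previous, current):
            print(f"{stamp} removed: {path}")
        previous = current
        done += 1
```

For example, `watch("/mnt/efs/wordpress")` (a hypothetical mount point) would log each removal with a timestamp that can be compared against the time node1 started responding again.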
Current Result
All contents of that directory are deleted.
Expected Result
When containers are moved to healthy nodes, the persistent volume should move with them.
Additional Information
I can provide further details if required.