Attach/detach controller does not recover from missed pod deletion

We run OpenShift in a master-slave setup, and our master crashes once in a while (for an unrelated reason). When a new master starts, it does not detach the volumes of pods that were deleted while no controller was running.

Steps to reproduce on AWS with standard Kubernetes:
1. run an AWS-aware cluster; hack/local-up-cluster.sh is fine
2. create several pods that use claims that point to AWS PVs
3. kill controller-manager process
4. delete all pods
5. start a new controller manager

Result: the volumes stay attached forever (or at least for the next 30 minutes).
It should also be reproducible on GCE. Shouldn't there be a periodic sync that ensures the controller finds deleted pods? This comment looks scary: https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/volume/attachdetach/attach_detach_controller.go#L76

Affected version: kubernetes-1.3.8

@saad-ali @jingxu97 @kubernetes/sig-storage 

Attach/detach controller does not recover from missed pod deletion #34242