Attach/detach controller does not recover from missed pod deletion #34242
Closed
Description
We run OpenShift in a master-slave setup, and our master crashes once in a while (for unrelated reasons). When a new master starts, it does not detach volumes that should be detached.
Steps to reproduce on AWS with standard Kubernetes:
- run an AWS-aware cluster; hack/local-up-cluster.sh is fine
- create several pods that use claims that point to AWS PVs
- kill the controller-manager process
- delete all pods
- start a new controller manager
Result: the volumes stay attached forever (or at least for the next 30 minutes).
It should also be reproducible on GCE. Shouldn't there be a periodic sync that ensures the controller finds pods that were deleted while it was not running? This comment looks scary: https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/volume/attachdetach/attach_detach_controller.go#L76
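To make the periodic sync idea concrete, here is a minimal, language-neutral sketch in Python. It is not the controller's actual code: `Attachment`, `Pod`, `desired_attachments`, and `resync` are hypothetical names, and the sketch assumes the set of currently attached volumes can be recovered from some durable source after a restart (for example node status or the cloud provider). The point is only that a resync works from a full listing of existing pods rather than from delete events, so a deletion missed during downtime is still noticed.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Attachment:
    """A volume attached to a node (hypothetical record, not a Kubernetes type)."""
    volume: str
    node: str


@dataclass
class Pod:
    """Minimal stand-in for a pod: where it runs and which volumes it uses."""
    name: str
    node: str
    volumes: list = field(default_factory=list)


def desired_attachments(pods):
    """Attachments that the currently existing pods still need."""
    return {Attachment(v, p.node) for p in pods for v in p.volumes}


def resync(existing_pods, attached):
    """Return attachments that no existing pod needs, i.e. detach candidates.

    Works from a full listing instead of delete events, so it still finds
    volumes of pods that were deleted while the controller was down.
    """
    return set(attached) - desired_attachments(existing_pods)


# Before the crash: two pods, two attached volumes (assumed recoverable state).
attached = {Attachment("vol-a", "node-1"), Attachment("vol-b", "node-1")}
# While the controller was down, pod-b was deleted; only pod-a is listed now.
pods_after_restart = [Pod("pod-a", "node-1", ["vol-a"])]

stale = resync(pods_after_restart, attached)
for a in sorted(stale, key=lambda a: (a.volume, a.node)):
    print(f"detach {a.volume} from {a.node}")
```

Running such a comparison once at startup and then on a timer would cover the reproduction steps above, provided the attached set is rebuilt from a source that survives the restart.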
Affected version: kubernetes-1.3.8