[BUG] RWX workload gets stuck in ContainerCreating after cluster restart #6924
Labels
area/resilience
System or volume resilience
area/volume-rwx
Volume RWX related
backport/1.4.4
backport/1.5.2
component/longhorn-share-manager
Longhorn share manager (control plane for NFS server, RWX)
kind/bug
priority/0
Must be implemented or fixed in this release (managed by PO)
require/backport
Require backport. Only used when the specific versions to backport have not been defined.
require/qa-review-coverage
Require QA to review coverage
Describe the bug (🐛 if you encounter this issue)
When running the cluster restart negative test case on v1.4.4-rc1, an issue happened. After a cluster restart (rebooting all nodes, including the control plane), a deployment workload with an RWX volume gets stuck in ContainerCreating:
But the corresponding volume is already healthy:
There are some errors in this pod, test-deployment-rwx-69c97d4ffb-778xp:
And there is no share-manager running on the same node as this pod, but I'm not sure whether that is necessary:
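The checks described above (which pods are stuck in ContainerCreating, and which nodes host a running share-manager) could be scripted roughly as follows. This is a minimal sketch, assuming the pod list comes from `kubectl get pods -A -o json` parsed into a dict. The function names `stuck_pods` and `share_manager_nodes` are hypothetical, the `share-manager-` name prefix is an assumption about how the share-manager pods are named, and the sample data (node names, share-manager pod name) is made up for illustration.

```python
def stuck_pods(pod_list):
    """Return (pod name, node name) pairs for pods that have a container
    waiting with reason ContainerCreating."""
    result = []
    for pod in pod_list.get("items", []):
        statuses = pod.get("status", {}).get("containerStatuses", [])
        for cs in statuses:
            waiting = cs.get("state", {}).get("waiting") or {}
            if waiting.get("reason") == "ContainerCreating":
                node = pod.get("spec", {}).get("nodeName")
                result.append((pod["metadata"]["name"], node))
                break  # one stuck container is enough to report the pod
    return result


def share_manager_nodes(pod_list, prefix="share-manager-"):
    """Return the set of nodes hosting a Running pod whose name starts
    with `prefix` (assumed share-manager naming)."""
    return {
        pod.get("spec", {}).get("nodeName")
        for pod in pod_list.get("items", [])
        if pod["metadata"]["name"].startswith(prefix)
        and pod.get("status", {}).get("phase") == "Running"
    }


# Illustrative data only: node names and the share-manager pod name are made up.
sample = {
    "items": [
        {
            "metadata": {"name": "test-deployment-rwx-69c97d4ffb-778xp"},
            "spec": {"nodeName": "node-a"},
            "status": {
                "phase": "Pending",
                "containerStatuses": [
                    {"state": {"waiting": {"reason": "ContainerCreating"}}}
                ],
            },
        },
        {
            "metadata": {"name": "share-manager-pvc-example"},
            "spec": {"nodeName": "node-b"},
            "status": {"phase": "Running", "containerStatuses": [{"state": {"running": {}}}]},
        },
    ]
}

for name, node in stuck_pods(sample):
    colocated = node in share_manager_nodes(sample)
    print(f"{name} on {node}: share-manager on same node = {colocated}")
```

Against a real cluster, `sample` would be replaced by `json.load(...)` of the saved `kubectl get pods -A -o json` output. The sketch only reports whether a share-manager shares a node with the stuck pod. It does not establish whether that is required.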
To Reproduce
Run the Restart Cluster While Workload Heavy Writing negative test case:
Expected behavior
Support bundle for troubleshooting
supportbundle_0ed39870-8d1a-483c-a645-ae056e81ace5_2023-10-19T08-09-38Z.zip
Environment
Additional context