[BUG] Duplicated default instance manager leads to engine/replica cannot be started #3000

Closed
@timmy59100

Description

Describe the bug
None of the existing RWX volumes attach anymore after upgrading to Longhorn 1.2.

To Reproduce
Existing volumes do not attach to redeployed pods, not even after scaling the workload to zero. Restarting the Longhorn components and the Longhorn nodes did not help.

Expected behavior
Volumes should attach.

Log
A Longhorn support bundle has been sent.

AttachVolume.Attach failed for volume "pvc-93aad038-6dda-482f-a8f6-d237a0414561" : rpc error: code = DeadlineExceeded desc = volume pvc-93aad038-6dda-482f-a8f6-d237a0414561 failed to attach to node pax-p-95
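
When triaging many events like the one above, it can help to pull the volume name, gRPC status code, and target node out of each line. The following is a minimal sketch, not part of Longhorn: the regular expression and the helper name `parse_attach_error` are hypothetical, and the message layout is assumed from the single sample quoted in this report.

```python
import re

# Pattern for the attach failure line quoted in this report. The layout is
# assumed from that one sample and may differ in other versions.
ATTACH_ERROR = re.compile(
    r'AttachVolume\.Attach failed for volume "(?P<volume>[^"]+)"\s*:\s*'
    r'rpc error: code = (?P<code>\w+) desc = .*'
    r'failed to attach to node (?P<node>\S+)'
)


def parse_attach_error(line):
    """Return a dict with volume, code and node, or None if the line does not match."""
    match = ATTACH_ERROR.search(line)
    return match.groupdict() if match else None


# The event text from this report, split across string literals for readability.
sample = (
    'AttachVolume.Attach failed for volume "pvc-93aad038-6dda-482f-a8f6-d237a0414561" : '
    'rpc error: code = DeadlineExceeded desc = volume pvc-93aad038-6dda-482f-a8f6-d237a0414561 '
    'failed to attach to node pax-p-95'
)

info = parse_attach_error(sample)
print(info)
```

Grouping the parsed results by node or by status code would show whether the failures are concentrated on one node or spread across the cluster.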

Environment:

  • Longhorn version: 1.2
  • Installation method (e.g. Rancher Catalog App/Helm/Kubectl): Rancher Catalog App
  • Kubernetes distro (e.g. RKE/K3s/EKS/OpenShift) and version: Rancher Kubernetes v1.20.10
    • Number of management node in the cluster: 3
    • Number of worker node in the cluster: 3
  • Node config
    • OS type and version: Ubuntu 20.04.3 LTS
    • CPU per node: 12
    • Memory per node: 32G
    • Disk type(e.g. SSD/NVMe):
    • Network bandwidth between the nodes:
  • Underlying Infrastructure (e.g. on AWS/GCE, EKS/GKE, VMWare/KVM, Baremetal): Xen
  • Number of Longhorn volumes in the cluster: 80

Metadata

Labels

  • backport/1.2.7
  • backport/1.3.3
  • component/longhorn-instance-manager: Longhorn instance manager (interface between control and data plane)
  • component/longhorn-manager: Longhorn manager (control plane)
  • investigation-needed: Identified the issue but require further investigation for resolution (won't be stale)
  • kind/bug
  • priority/1: Highly recommended to implement or fix in this release (managed by PO)

