Volume Attaching Timeout - Failed to attach #7629
Comments
@derekbit Can you please help? It took us 3 hours to get this up and running.
@aljabertareq cc @james-munson, the community coordinator for this sprint.
Just to be clear, it did eventually attach, correct? The issue is about the time it took?

currentNodeID: node1
expansionRequired: false
frontendDisabled: false
isStandby: false
kubernetesStatus:
  lastPVCRefAt: "null"
  lastPodRefAt: "null"
  namespace: avs
  pvName: pvc-e9b4c5e1-1c5d-4428-bd0e-a02167136cbf
  pvStatus: Bound
  pvcName: minio-pv-claim-cloned
  workloadsStatus:
  - podName: minio-6dd56bd767-qp67r
    podStatus: Running
    workloadName: minio-6dd56bd767
    workloadType: ReplicaSet
lastBackup: "null"
lastBackupAt: "null"
lastDegradedAt: "null"
ownerID: node1
pendingNodeID: "null"
remountRequestedAt: "2024-01-10T22:17:01Z"
restoreInitiated: false
restoreRequired: false
robustness: healthy
shareEndpoint: "null"
shareState: "null"
state: attached

And cluster/storage.k8s.io/v1/volumeattachments.yaml shows it attached as well:

- apiVersion: storage.k8s.io/v1
  kind: VolumeAttachment
  metadata:
    annotations:
      csi.alpha.kubernetes.io/node-id: node1
    creationTimestamp: "2024-01-10T21:59:34Z"
    finalizers:
    - external-attacher/driver-longhorn-io
    managedFields:
      ...
    name: csi-baef1bd4fbd4f963e849e1345359e542eaf2bab7a7e33135f546f14a7d9a5949
    resourceVersion: "124700040"
    uid: 3a412db5-daaa-43bf-9f67-39302c839d46
  spec:
    attacher: driver.longhorn.io
    nodeName: node1
    source:
      persistentVolumeName: pvc-e9b4c5e1-1c5d-4428-bd0e-a02167136cbf
  status:
    attached: true

By the way, the volume also shows numberOfReplicas: 1, which is a dangerous way to operate.
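For anyone hitting the same thing: with numberOfReplicas: 1 there is no redundancy, so any trouble on the single replica's node blocks the whole volume. A hedged sketch for raising the count on the existing volume, assuming the default longhorn-system namespace and the spec.numberOfReplicas field on the Longhorn Volume CR (the Longhorn UI exposes the same setting on the volume):

  # Sketch: raise the replica count on the existing Longhorn volume.
  # Assumes the longhorn-system namespace; list volumes first to confirm the name.
  kubectl -n longhorn-system get volumes.longhorn.io
  kubectl -n longhorn-system patch volumes.longhorn.io \
    pvc-e9b4c5e1-1c5d-4428-bd0e-a02167136cbf \
    --type merge -p '{"spec":{"numberOfReplicas":3}}'

Longhorn rebuilds the additional replicas in the background, so the volume can stay attached while this takes effect.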
It did have some trouble getting there though. longhorn-manager-h82kr/longhorn-manager.log shows
[log excerpt not captured in this copy of the thread]
and it seems to recover here:
[log excerpt not captured]
And instance-manager-r-1e8d2b6ac4bdab53558aa36fa56425b5/replica-manager.log logs this
[log excerpt not captured]
and then thousands of these between 22:12 and 22:17:
[log excerpt not captured]
@james-munson Not just trouble. We had to try so many things to make it attach, and it took us around 4 hours. These are production environments that are supposed to run 24/7, so "eventually" is an issue. Any help replicating the issue and suggesting a solution would be much appreciated. @4mohamedalaa please add the detailed steps we tried to resolve this issue.
@james-munson here's the command that they used: "systemctl restart iscsi"
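For the record, a sketch of that workaround with the checks worth doing around it; the unit name is an assumption here, since some distributions ship the daemon as iscsid rather than iscsi:

  # Assumed recovery sequence; adjust the unit name to your distro (iscsi vs. iscsid).
  sudo systemctl status iscsid       # confirm the daemon state before touching it
  sudo iscsiadm -m session           # list active iSCSI sessions
  sudo systemctl restart iscsid      # the thread used: systemctl restart iscsi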
@james-munson @derekbit any thoughts on this?
If this happens again, it would be interesting to see the full output from [command not captured in this copy of the thread].
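Until then, hedged examples of the kind of state worth capturing when an attach hangs — standard kubectl and open-iscsi commands, not the specific output requested above:

  # Attachment state as Kubernetes and Longhorn see it
  kubectl get volumeattachments
  kubectl -n longhorn-system get volumes.longhorn.io -o wide
  # iSCSI session detail on the affected node
  sudo iscsiadm -m session -P 3
  # Recent events around the stuck workload (namespace/pod from this thread)
  kubectl -n avs describe pod minio-6dd56bd767-qp67r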
This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.
@james-munson any findings here?
This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.
Describe the bug
Volume attaching timeout. Kafka and MinIO instances failed to attach.
To Reproduce
Expected behavior
Support bundle for troubleshooting
https://drive.google.com/file/d/15Y3qSLjWtKADKL0NWE9HMkMCjaD8VO7u/view?usp=sharing
Environment
Additional context