Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: lvmvolume is inconsistent with the actual allocated lv size #316

Open
bxy4543 opened this issue Jun 20, 2024 · 29 comments
Open

BUG: lvmvolume is inconsistent with the actual allocated lv size #316

bxy4543 opened this issue Jun 20, 2024 · 29 comments
Assignees
Labels
Backlog question Further information is requested

Comments

@bxy4543
Copy link

bxy4543 commented Jun 20, 2024

What steps did you take and what happened:
Due to insufficient pvc resources, the pod keeps restarting. After expanding the pvc, it is found that the lv has not been successfully expanded. The remaining vg resources of the node are sufficient. Since the underlying lv has not been successfully expanded, the pod keeps restarting.

root@test-node-003:~# lvdisplay /dev/lvmvg/pvc-bcc852f8-7266-48c4-aedc-34adb523d224
  --- Logical volume ---
  LV Path                /dev/lvmvg/pvc-bcc852f8-7266-48c4-aedc-34adb523d224
  LV Name                pvc-bcc852f8-7266-48c4-aedc-34adb523d224
  VG Name                lvmvg
  LV UUID                vjfkb2-ls7c-YzO7-nZYf-Ds70-3CVP-7EA5jl
  LV Write Access        read/write
  LV Creation host, time openebs-lvm-node-d8krs, 2024-06-19 14:35:24 +0800
  LV Status              available
  # open                 1
  LV Size                1.00 GiB
  Current LE             256
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:151

root@test-node-003:~# kubectl get lvmvolume -n openebs pvc-bcc852f8-7266-48c4-aedc-34adb523d224 -o yaml
apiVersion: local.openebs.io/v1alpha1
kind: LVMVolume
metadata:
  creationTimestamp: "2024-06-19T06:35:24Z"
  finalizers:
  - lvm.openebs.io/finalizer
  generation: 4
  labels:
    kubernetes.io/nodename: test-node-003
  name: pvc-bcc852f8-7266-48c4-aedc-34adb523d224
  namespace: openebs
  resourceVersion: "241412658"
  uid: 00e27140-261e-449f-a048-af94b1b312f3
spec:
  capacity: "17179869184"
  ownerNodeID: test-node-003
  shared: "yes"
  thinProvision: "no"
  vgPattern: ^lvmvg$
  volGroup: lvmvg
status:
  state: Ready

lvSize: 1Gib , lvmvo: 16Gib

kubectl get pvc -n ns-gewclvtg data-test-ytxrw6b0n6-mongodb-2 -o yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
    volume.beta.kubernetes.io/storage-provisioner: local.csi.openebs.io
    volume.kubernetes.io/selected-node: test-node-003
    volume.kubernetes.io/storage-provisioner: local.csi.openebs.io
  creationTimestamp: "2024-06-19T06:35:23Z"
  finalizers:
  - kubernetes.io/pvc-protection
  labels:
    app.kubernetes.io/instance: test-ytxrw6b0n6
    app.kubernetes.io/managed-by: kubeblocks
    app.kubernetes.io/name: mongodb
    apps.kubeblocks.io/component-name: mongodb
    apps.kubeblocks.io/vct-name: data
    kubeblocks.io/volume-type: data
  name: data-test-ytxrw6b0n6-mongodb-2
  namespace: ns-gewclvtg
  ownerReferences:
  - apiVersion: apps.kubeblocks.io/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: Component
    name: test-ytxrw6b0n6-mongodb
    uid: f131a510-755d-4ae2-a828-9d878b82e121
  resourceVersion: "241412684"
  uid: bcc852f8-7266-48c4-aedc-34adb523d224
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 16Gi
  storageClassName: openebs-backup
  volumeMode: Filesystem
  volumeName: pvc-bcc852f8-7266-48c4-aedc-34adb523d224
status:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 16Gi
  phase: Bound

What did you expect to happen:

lv Size = LvmVolume cr size
The output of the following commands will help us better understand what's going on:
(Pasting long output into a GitHub gist or other Pastebin is fine.)

  • kubectl logs -f openebs-lvm-localpv-controller-7b6d6b4665-fk78q -n openebs -c openebs-lvm-plugin
    | no error
  • kubectl logs -f openebs-lvm-localpv-node-[xxxx] -n openebs -c openebs-lvm-plugin

| I0620 02:51:50.488174 1 grpc.go:72] GRPC call: /csi.v1.Node/NodePublishVolume requests {"target_path":"/var/lib/kubelet/pods/77cf54bb-be60-44e4-8ae0-bb3744b2f10e/volumes/kuberne
tes.io~csi/pvc-bcc852f8-7266-48c4-aedc-34adb523d224/mount","volume_capability":{"AccessType":{"Mount":{"fs_type":"ext4"}},"access_mode":{"mode":1}},"volume_context":{"csi.storage.k8s.
io/ephemeral":"false","csi.storage.k8s.io/pod.name":"xxx-mongodb-2","csi.storage.k8s.io/pod.namespace":"ns-gewclvtg","csi.storage.k8s.io/pod.uid":"77cf54bb-be60-44e4-8ae
0-bb3744b2f10e","csi.storage.k8s.io/serviceAccount.name":"xxx","openebs.io/cas-type":"localpv-lvm","openebs.io/volgroup":"lvmvg","storage.kubernetes.io/cs
iProvisionerIdentity":"1718646815547-5148-local.csi.openebs.io"},"volume_id":"pvc-bcc852f8-7266-48c4-aedc-34adb523d224"}
I0620 02:51:50.534603 1 mount_linux.go:312] fsck error fsck from util-linux 2.38.1
/dev/mapper/lvmvg-pvc--bcc852f8--7266--48c4--aedc--34adb523d224 is mounted.
e2fsck: Cannot continue, aborting.

  • kubectl get pods -n openebs
    | health

  • kubectl get lvmvol -A -o yaml

Anything else you would like to add:
[Miscellaneous information that will assist in solving the issue.]

Environment:

  • LVM Driver version
  • Kubernetes version (use kubectl version):
  • Kubernetes installer & version:
  • Cloud provider or hardware configuration:
  • OS (e.g. from /etc/os-release):
@bxy4543 bxy4543 changed the title Why is the lvmvolume size inconsistent with the actual allocated lv size BUG: lvmvolume is inconsistent with the actual allocated lv size Jun 20, 2024
@bxy4543
Copy link
Author

bxy4543 commented Jun 20, 2024

kubectl logs -n kube-system openebs-lvm-node-d8krs -c openebs-lvm-plugin

I0619 06:35:24.036535       1 volume.go:85] Got add event for Vol pvc-bcc852f8-7266-48c4-aedc-34adb523d224
I0619 06:35:24.036547       1 volume.go:86] lvmvolume object to be enqueued by Add handler: &{{LVMVolume local.openebs.io/v1alpha1} {pvc-bcc852f8-7266-48c4-aedc-34adb523d224  openebs  00e27140-261e-449f-a048-af94b1b312f3 241412558 1 2024-06-19 06:35:24 +0000 UTC <nil> <nil> map[] map[] [] [] [{lvm-driver Update local.openebs.io/v1alpha1 2024-06-19 06:35:24 +0000 UTC FieldsV1 {"f:spec":{".":{},"f:capacity":{},"f:ownerNodeID":{},"f:shared":{},"f:thinProvision":{},"f:vgPattern":{},"f:volGroup":{}},"f:status":{".":{},"f:state":{}}} }]} {test-node-003  ^lvmvg$ 1073741824 yes no} {Pending <nil>}}
I0619 06:35:24.036609       1 volume.go:55] Getting lvmvol object name:pvc-bcc852f8-7266-48c4-aedc-34adb523d224, ns:openebs from cache
I0619 06:35:24.396811       1 lvm_util.go:308] lvm: created volume lvmvg/pvc-bcc852f8-7266-48c4-aedc-34adb523d224
I0619 06:35:24.406693       1 volume.go:367] Successfully synced 'openebs/pvc-bcc852f8-7266-48c4-aedc-34adb523d224'
I0619 06:35:26.217611       1 grpc.go:72] GRPC call: /csi.v1.Node/NodePublishVolume requests {"target_path":"/var/lib/kubelet/pods/e128a99f-28ea-4ed2-b098-900b886744bb/volumes/kubernetes.io~csi/pvc-bcc852f8-7266-48c4-aedc-34adb523d224/mount","volume_capability":{"AccessType":{"Mount":{"fs_type":"ext4"}},"access_mode":{"mode":1}},"volume_context":{"csi.storage.k8s.io/ephemeral":"false","csi.storage.k8s.io/pod.name":"test-ytxrw6b0n6-mongodb-2","csi.storage.k8s.io/pod.namespace":"ns-gewclvtg","csi.storage.k8s.io/pod.uid":"e128a99f-28ea-4ed2-b098-900b886744bb","csi.storage.k8s.io/serviceAccount.name":"test-mongodb-test-ytxrw6b0n6","openebs.io/cas-type":"localpv-lvm","openebs.io/volgroup":"lvmvg","storage.kubernetes.io/csiProvisionerIdentity":"1718646815547-5148-local.csi.openebs.io"},"volume_id":"pvc-bcc852f8-7266-48c4-aedc-34adb523d224"}
I0619 06:35:26.240614       1 mount_linux.go:367] Disk "/dev/lvmvg/pvc-bcc852f8-7266-48c4-aedc-34adb523d224" appears to be unformatted, attempting to format as type: "ext4" with options: [-F -m0 /dev/lvmvg/pvc-bcc852f8-7266-48c4-aedc-34adb523d224]
I0619 06:35:26.264351       1 mount_linux.go:377] Disk successfully formatted (mkfs): ext4 - /dev/lvmvg/pvc-bcc852f8-7266-48c4-aedc-34adb523d224 /var/lib/kubelet/pods/e128a99f-28ea-4ed2-b098-900b886744bb/volumes/kubernetes.io~csi/pvc-bcc852f8-7266-48c4-aedc-34adb523d224/mount
I0619 06:35:26.280532       1 mount.go:209] lvm: volume lvmvg/pvc-bcc852f8-7266-48c4-aedc-34adb523d224 mounted /var/lib/kubelet/pods/e128a99f-28ea-4ed2-b098-900b886744bb/volumes/kubernetes.io~csi/pvc-bcc852f8-7266-48c4-aedc-34adb523d224/mount fs ext4
I0620 02:51:50.488174       1 grpc.go:72] GRPC call: /csi.v1.Node/NodePublishVolume requests {"target_path":"/var/lib/kubelet/pods/77cf54bb-be60-44e4-8ae0-bb3744b2f10e/volumes/kubernetes.io~csi/pvc-bcc852f8-7266-48c4-aedc-34adb523d224/mount","volume_capability":{"AccessType":{"Mount":{"fs_type":"ext4"}},"access_mode":{"mode":1}},"volume_context":{"csi.storage.k8s.io/ephemeral":"false","csi.storage.k8s.io/pod.name":"test-ytxrw6b0n6-mongodb-2","csi.storage.k8s.io/pod.namespace":"ns-gewclvtg","csi.storage.k8s.io/pod.uid":"77cf54bb-be60-44e4-8ae0-bb3744b2f10e","csi.storage.k8s.io/serviceAccount.name":"test-mongodb-test-ytxrw6b0n6","openebs.io/cas-type":"localpv-lvm","openebs.io/volgroup":"lvmvg","storage.kubernetes.io/csiProvisionerIdentity":"1718646815547-5148-local.csi.openebs.io"},"volume_id":"pvc-bcc852f8-7266-48c4-aedc-34adb523d224"}
I0620 02:51:50.534603       1 mount_linux.go:312] `fsck` error fsck from util-linux 2.38.1
/dev/mapper/lvmvg-pvc--bcc852f8--7266--48c4--aedc--34adb523d224 is mounted.
e2fsck: Cannot continue, aborting.


I0620 02:51:50.535623       1 mount.go:209] lvm: volume lvmvg/pvc-bcc852f8-7266-48c4-aedc-34adb523d224 mounted /var/lib/kubelet/pods/77cf54bb-be60-44e4-8ae0-bb3744b2f10e/volumes/kubernetes.io~csi/pvc-bcc852f8-7266-48c4-aedc-34adb523d224/mount fs ext4
I0620 02:51:50.535640       1 grpc.go:81] GRPC response: {}
I0620 02:51:57.684960       1 lvmnode.go:306] Successfully synced 'openebs/test-node-003

@dsharma-dc
Copy link
Contributor

@bxy4543 Could you share the storage class and pvc yaml files here, or in Github gist?

@bxy4543
Copy link
Author

bxy4543 commented Jun 20, 2024

controller version:

        image: registry.k8s.io/sig-storage/csi-resizer:v1.8.0
        image: registry.k8s.io/sig-storage/csi-snapshotter:v6.2.2
        image: registry.k8s.io/sig-storage/snapshot-controller:v6.2.2
        image: registry.k8s.io/sig-storage/csi-provisioner:v3.5.0
registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.8.0
openebs/lvm-driver:ci

and i found error:

{"log":"I0619 06:35:25.755602       1 event.go:298] Event(v1.ObjectReference{Kind:\"PersistentVolumeClaim\", Namespace:\"ns-gewclvtg\", Name:\"data-test-ytxrw6b0n6-mongodb-2\", UID:\"bcc852f8-7266-48c4-aedc-34adb523d224\", APIVersion:\"v1\", ResourceVersion:\"241412655\", FieldPath:\"\"}): type: 'Warning' reason: 'VolumeResizeFailed' Mark PVC \"ns-gewclvtg/data-test-ytxrw6b0n6-mongodb-2\" as file system resize required failed: can't patch status of  PVC ns-gewclvtg/data-test-ytxrw6b0n6-mongodb-2 with Operation cannot be fulfilled on persistentvolumeclaims \"data-test-ytxrw6b0n6-mongodb-2\": the object has been modified; please apply your changes to the latest version and try again","logtag":"F"} |  
-- | --
  |   | {"log":"E0619 06:35:25.755515       1 controller.go:286] Error syncing PVC: Mark PVC \"ns-gewclvtg/data-test-ytxrw6b0n6-mongodb-2\" as file system resize required failed: can't patch status of  PVC ns-gewclvtg/data-test-ytxrw6b0n6-mongodb-2 with Operation cannot be fulfilled on persistentvolumeclaims \"data-test-ytxrw6b0n6-mongodb-2\": the object has been modified; please apply your changes to the latest version and try again","logtag":"F"}

@bxy4543
Copy link
Author

bxy4543 commented Jun 20, 2024

allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: openebs-backup
parameters:
  shared: "yes"
  storage: lvm
  volgroup: lvmvg
provisioner: local.csi.openebs.io
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
    volume.beta.kubernetes.io/storage-provisioner: local.csi.openebs.io
    volume.kubernetes.io/selected-node: test-node-003
    volume.kubernetes.io/storage-provisioner: local.csi.openebs.io
  creationTimestamp: "2024-06-19T06:35:23Z"
  finalizers:
  - kubernetes.io/pvc-protection
  labels:
    app.kubernetes.io/instance: test-ytxrw6b0n6
    app.kubernetes.io/managed-by: kubeblocks
    app.kubernetes.io/name: mongodb
    apps.kubeblocks.io/component-name: mongodb
    apps.kubeblocks.io/vct-name: data
    kubeblocks.io/volume-type: data
  name: data-test-ytxrw6b0n6-mongodb-2
  namespace: ns-gewclvtg
  ownerReferences:
  - apiVersion: apps.kubeblocks.io/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: Component
    name: test-ytxrw6b0n6-mongodb
    uid: f131a510-755d-4ae2-a828-9d878b82e121
  resourceVersion: "241412684"
  uid: bcc852f8-7266-48c4-aedc-34adb523d224
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 16Gi
  storageClassName: openebs-backup
  volumeMode: Filesystem
  volumeName: pvc-bcc852f8-7266-48c4-aedc-34adb523d224
status:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 16Gi
  phase: Bound

@dsharma-dc
Copy link
Contributor

The filesystem resize path seems to have run into error while doing fsck and that left it there. I'll check this for some more clues.

@bxy4543
Copy link
Author

bxy4543 commented Jun 20, 2024

Thank you for your support. After the underlying lv is expanded, the file system expansion step is reached. However, it can be seen that the lv of the test-node-003 node has not been successfully expanded.

@dsharma-dc
Copy link
Contributor

Thank you for your support. After the underlying lv is expanded, the file system expansion step is reached. However, it can be seen that the lv of the test-node-003 node has not been successfully expanded.

I see that the storage class mentions shared, are there actually two applications currently using the volume when this happened?

@bxy4543
Copy link
Author

bxy4543 commented Jun 20, 2024

And I see fsck comes from mount_linux.formatAndMountSensitive which is referenced in NodePublishVolume.MountFilesystem, but my pod is still able to mount the pvc successfully

@bxy4543
Copy link
Author

bxy4543 commented Jun 20, 2024

I see that the storage class mentions shared, are there actually two applications currently using the volume when this happened?

It was there for a while for backup, but it's gone now

@dsharma-dc
Copy link
Contributor

@bxy4543 If possible please upload all log files here as zipped.

@bxy4543
Copy link
Author

bxy4543 commented Jun 20, 2024

log.tar.gz
@dsharma-dc

@dsharma-dc dsharma-dc added the bug Something isn't working label Jun 20, 2024
@dsharma-dc
Copy link
Contributor

dsharma-dc commented Jun 20, 2024

The openebs-lvm-node-003 log is missing the logs from earlier timestamp for volume pvc-bcc852f8-7266-48c4-aedc-34adb523d224. I would also like to see output of kubectl describe pvc <pvcname>
I notice that I also see fsck error when I try this scenario locally, but that doesn' impact anything w.r.t expanding the pvc. Expand succeeds.

@bxy4543
Copy link
Author

bxy4543 commented Jun 20, 2024

kubectl describe pvc -n ns-gewclvtg data-test-ytxrw6b0n6-mongodb-2
Name:          data-test-ytxrw6b0n6-mongodb-2
Namespace:     ns-gewclvtg
StorageClass:  openebs-backup
Status:        Bound
Volume:        pvc-bcc852f8-7266-48c4-aedc-34adb523d224
Labels:        app.kubernetes.io/instance=test-ytxrw6b0n6
               app.kubernetes.io/managed-by=kubeblocks
               app.kubernetes.io/name=mongodb
               apps.kubeblocks.io/component-name=mongodb
               apps.kubeblocks.io/vct-name=data
               kubeblocks.io/volume-type=data
Annotations:   pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
               volume.beta.kubernetes.io/storage-provisioner: local.csi.openebs.io
               volume.kubernetes.io/selected-node: test-node-003
               volume.kubernetes.io/storage-provisioner: local.csi.openebs.io
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      16Gi
Access Modes:  RWO
VolumeMode:    Filesystem
Used By:       test-ytxrw6b0n6-mongodb-2
Events:        <none>

@dsharma-dc
Copy link
Contributor

Two questions:

  1. Can you share the describe of pv, lvmvolume and o/p of lvdisplay /dev/lvmvg/pvc-bcc852f8-7266-48c4-aedc-34adb523d224 again.
  2. Where was this error seen? If you can provide that log:
    controller.go:286] Error syncing PVC: Mark PVC \"ns-gewclvtg/data-test-ytxrw6b0n6-mongodb-2\" as file system resize required failed: can't patch status of PVC

@bxy4543
Copy link
Author

bxy4543 commented Jun 21, 2024

root@test-node-003:~# lvdisplay /dev/lvmvg/pvc-bcc852f8-7266-48c4-aedc-34adb523d224
  --- Logical volume ---
  LV Path                /dev/lvmvg/pvc-bcc852f8-7266-48c4-aedc-34adb523d224
  LV Name                pvc-bcc852f8-7266-48c4-aedc-34adb523d224
  VG Name                lvmvg
  LV UUID                vjfkb2-ls7c-YzO7-nZYf-Ds70-3CVP-7EA5jl
  LV Write Access        read/write
  LV Creation host, time openebs-lvm-node-d8krs, 2024-06-19 14:35:24 +0800
  LV Status              available
  # open                 1
  LV Size                1.00 GiB
  Current LE             256
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:151
kubectl describe pv pvc-bcc852f8-7266-48c4-aedc-34adb523d224
Name:              pvc-bcc852f8-7266-48c4-aedc-34adb523d224
Labels:            <none>
Annotations:       pv.kubernetes.io/provisioned-by: local.csi.openebs.io
                 volume.kubernetes.io/provisioner-deletion-secret-name:
                 volume.kubernetes.io/provisioner-deletion-secret-namespace:
Finalizers:        [kubernetes.io/pv-protection]
StorageClass:      openebs-backup
Status:            Bound
Claim:             ns-gewclvtg/data-test-ytxrw6b0n6-mongodb-2
Reclaim Policy:    Delete
Access Modes:      RWO
VolumeMode:        Filesystem
Capacity:          16Gi
Node Affinity:
Required Terms:
  Term 0:        openebs.io/nodename in [test-node-003]
Message:
Source:
  Type:              CSI (a Container Storage Interface (CSI) volume source)
  Driver:            local.csi.openebs.io
  FSType:            ext4
  VolumeHandle:      pvc-bcc852f8-7266-48c4-aedc-34adb523d224
  ReadOnly:          false
  VolumeAttributes:      openebs.io/cas-type=localpv-lvm
                         openebs.io/volgroup=lvmvg
                         storage.kubernetes.io/csiProvisionerIdentity=1718646815547-5148-local.csi.openebs.io
Events:                <none>
kubectl describe lvmvolume -n openebs pvc-bcc852f8-7266-48c4-aedc-34adb523d224
Name:         pvc-bcc852f8-7266-48c4-aedc-34adb523d224
Namespace:    openebs
Labels:       kubernetes.io/nodename=test-node-003
Annotations:  <none>
API Version:  local.openebs.io/v1alpha1
Kind:         LVMVolume
Metadata:
  Creation Timestamp:  2024-06-19T06:35:24Z
  Finalizers:
    lvm.openebs.io/finalizer
  Generation:        4
  Resource Version:  241412658
  UID:               00e27140-261e-449f-a048-af94b1b312f3
Spec:
  Capacity:        17179869184
  Owner Node ID:   test-node-003
  Shared:          yes
  Thin Provision:  no
  Vg Pattern:      ^lvmvg$
  Vol Group:       lvmvg
Status:
  State:  Ready
Events:   <none>

And the Error syncing PVC log comes from the controller csi-resizer module. The old log has been cleaned up, but I found a new similar log:

Common labels: {"app":"openebs-lvm-controller","container":"csi-resizer","instance":"openebs-lvm-controller-546768b57c-tlnbf","job":"fluent-bit","namespace":"kube-system","node":"test-node-001","stream":"stderr"}
Line limit: "1000 (45 returned)"
Total bytes processed: "228  kB"


2024-06-20T22:23:55+08:00	{"log":"I0620 14:23:55.005706       1 leaderelection.go:276] successfully renewed lease kube-system/external-resizer-local-csi-openebs-io","logtag":"F"}
2024-06-20T22:23:49+08:00	{"log":"I0620 14:23:49.999362       1 leaderelection.go:276] successfully renewed lease kube-system/external-resizer-local-csi-openebs-io","logtag":"F"}
2024-06-20T22:23:44+08:00	{"log":"I0620 14:23:44.994267       1 leaderelection.go:276] successfully renewed lease kube-system/external-resizer-local-csi-openebs-io","logtag":"F"}
2024-06-20T22:23:39+08:00	{"log":"I0620 14:23:39.989566       1 leaderelection.go:276] successfully renewed lease kube-system/external-resizer-local-csi-openebs-io","logtag":"F"}
2024-06-20T22:23:34+08:00	{"log":"I0620 14:23:34.984822       1 leaderelection.go:276] successfully renewed lease kube-system/external-resizer-local-csi-openebs-io","logtag":"F"}
2024-06-20T22:23:29+08:00	{"log":"I0620 14:23:29.979466       1 leaderelection.go:276] successfully renewed lease kube-system/external-resizer-local-csi-openebs-io","logtag":"F"}
2024-06-20T22:23:24+08:00	{"log":"I0620 14:23:24.975015       1 leaderelection.go:276] successfully renewed lease kube-system/external-resizer-local-csi-openebs-io","logtag":"F"}
2024-06-20T22:23:19+08:00	{"log":"I0620 14:23:19.969413       1 leaderelection.go:276] successfully renewed lease kube-system/external-resizer-local-csi-openebs-io","logtag":"F"}
2024-06-20T22:23:14+08:00	{"log":"I0620 14:23:14.963766       1 leaderelection.go:276] successfully renewed lease kube-system/external-resizer-local-csi-openebs-io","logtag":"F"}
2024-06-20T22:23:09+08:00	{"log":"I0620 14:23:09.957577       1 leaderelection.go:276] successfully renewed lease kube-system/external-resizer-local-csi-openebs-io","logtag":"F"}
2024-06-20T22:23:05+08:00	{"log":"I0620 14:23:05.200975       1 controller.go:343] No need to resize PVC \"ns-3m5m56ni/data-teable-uylicrfk-pg-postgresql-0\"","logtag":"F"}
2024-06-20T22:23:05+08:00	{"log":"I0620 14:23:05.200931       1 controller.go:295] Started PVC processing \"ns-3m5m56ni/data-teable-uylicrfk-pg-postgresql-0\"","logtag":"F"}
2024-06-20T22:23:04+08:00	{"log":"I0620 14:23:04.952653       1 leaderelection.go:276] successfully renewed lease kube-system/external-resizer-local-csi-openebs-io","logtag":"F"}
2024-06-20T22:23:04+08:00	{"log":"I0620 14:23:04.217272       1 event.go:298] Event(v1.ObjectReference{Kind:\"PersistentVolumeClaim\", Namespace:\"ns-3m5m56ni\", Name:\"data-teable-uylicrfk-pg-postgresql-0\", UID:\"b781dc96-49e1-4391-b09f-3da5974f2f52\", APIVersion:\"v1\", ResourceVersion:\"408970568\", FieldPath:\"\"}): type: 'Normal' reason: 'VolumeResizeSuccessful' Resize volume succeeded","logtag":"F"}
2024-06-20T22:23:04+08:00	{"log":"I0620 14:23:04.217271       1 controller.go:343] No need to resize PVC \"ns-3m5m56ni/data-teable-uylicrfk-pg-postgresql-0\"","logtag":"F"}
2024-06-20T22:23:04+08:00	{"log":"I0620 14:23:04.217262       1 controller.go:295] Started PVC processing \"ns-3m5m56ni/data-teable-uylicrfk-pg-postgresql-0\"","logtag":"F"}
2024-06-20T22:23:04+08:00	{"log":"I0620 14:23:04.217231       1 controller.go:546] Resize PVC \"ns-3m5m56ni/data-teable-uylicrfk-pg-postgresql-0\" finished","logtag":"F"}
2024-06-20T22:23:04+08:00	{"log":"I0620 14:23:04.211897       1 controller.go:489] Update capacity of PV \"pvc-b781dc96-49e1-4391-b09f-3da5974f2f52\" to 8Gi succeeded","logtag":"F"}
2024-06-20T22:23:04+08:00	{"log":"I0620 14:23:04.208677       1 controller.go:590] Resize volume succeeded for volume \"pvc-b781dc96-49e1-4391-b09f-3da5974f2f52\", start to update PV's capacity","logtag":"F"}
2024-06-20T22:23:04+08:00	{"log":"I0620 14:23:04.208671       1 controller.go:483] Resize volume succeeded for volume \"pvc-b781dc96-49e1-4391-b09f-3da5974f2f52\", start to update PV's capacity","logtag":"F"}
2024-06-20T22:23:04+08:00	{"log":"I0620 14:23:04.208665       1 connection.go:201] GRPC error: \u003cnil\u003e","logtag":"F"}
2024-06-20T22:23:04+08:00	{"log":"I0620 14:23:04.208658       1 connection.go:200] GRPC response: {\"capacity_bytes\":8589934592}","logtag":"F"}
2024-06-20T22:23:04+08:00	{"log":"I0620 14:23:04.203509       1 connection.go:194] GRPC request: {\"capacity_range\":{\"required_bytes\":8589934592},\"volume_capability\":{\"AccessType\":{\"Mount\":{\"fs_type\":\"ext4\"}},\"access_mode\":{\"mode\":1}},\"volume_id\":\"pvc-b781dc96-49e1-4391-b09f-3da5974f2f52\"}","logtag":"F"}
2024-06-20T22:23:04+08:00	{"log":"I0620 14:23:04.203507       1 connection.go:193] GRPC call: /csi.v1.Controller/ControllerExpandVolume","logtag":"F"}
2024-06-20T22:23:04+08:00	{"log":"I0620 14:23:04.203495       1 connection.go:201] GRPC error: \u003cnil\u003e","logtag":"F"}
2024-06-20T22:23:04+08:00	{"log":"I0620 14:23:04.203488       1 connection.go:200] GRPC response: {\"capabilities\":[{\"Type\":{\"Rpc\":{\"type\":1}}},{\"Type\":{\"Rpc\":{\"type\":9}}},{\"Type\":{\"Rpc\":{\"type\":5}}},{\"Type\":{\"Rpc\":{\"type\":4}}}]}","logtag":"F"}
2024-06-20T22:23:04+08:00	{"log":"I0620 14:23:04.203149       1 event.go:298] Event(v1.ObjectReference{Kind:\"PersistentVolumeClaim\", Namespace:\"ns-3m5m56ni\", Name:\"data-teable-uylicrfk-pg-postgresql-0\", UID:\"b781dc96-49e1-4391-b09f-3da5974f2f52\", APIVersion:\"v1\", ResourceVersion:\"408970568\", FieldPath:\"\"}): type: 'Normal' reason: 'Resizing' External resizer is resizing volume pvc-b781dc96-49e1-4391-b09f-3da5974f2f52","logtag":"F"}
2024-06-20T22:23:04+08:00	{"log":"I0620 14:23:04.203094       1 connection.go:194] GRPC request: {}","logtag":"F"}
2024-06-20T22:23:04+08:00	{"log":"I0620 14:23:04.203087       1 connection.go:193] GRPC call: /csi.v1.Controller/ControllerGetCapabilities","logtag":"F"}
2024-06-20T22:23:04+08:00	{"log":"I0620 14:23:04.200034       1 event.go:298] Event(v1.ObjectReference{Kind:\"PersistentVolumeClaim\", Namespace:\"ns-3m5m56ni\", Name:\"data-teable-uylicrfk-pg-postgresql-0\", UID:\"b781dc96-49e1-4391-b09f-3da5974f2f52\", APIVersion:\"v1\", ResourceVersion:\"408970566\", FieldPath:\"\"}): type: 'Warning' reason: 'VolumeResizeFailed' Mark PVC \"ns-3m5m56ni/data-teable-uylicrfk-pg-postgresql-0\" as file system resize required failed: can't patch status of  PVC ns-3m5m56ni/data-teable-uylicrfk-pg-postgresql-0 with Operation cannot be fulfilled on persistentvolumeclaims \"data-teable-uylicrfk-pg-postgresql-0\": the object has been modified; please apply your changes to the latest version and try again","logtag":"F"}
2024-06-20T22:23:04+08:00	{"log":"I0620 14:23:04.199989       1 controller.go:295] Started PVC processing \"ns-3m5m56ni/data-teable-uylicrfk-pg-postgresql-0\"","logtag":"F"}
2024-06-20T22:23:04+08:00	{"log":"E0620 14:23:04.199959       1 controller.go:286] Error syncing PVC: Mark PVC \"ns-3m5m56ni/data-teable-uylicrfk-pg-postgresql-0\" as file system resize required failed: can't patch status of  PVC ns-3m5m56ni/data-teable-uylicrfk-pg-postgresql-0 with Operation cannot be fulfilled on persistentvolumeclaims \"data-teable-uylicrfk-pg-postgresql-0\": the object has been modified; please apply your changes to the latest version and try again","logtag":"F"}
2024-06-20T22:23:04+08:00	{"log":"I0620 14:23:04.197199       1 controller.go:489] Update capacity of PV \"pvc-b781dc96-49e1-4391-b09f-3da5974f2f52\" to 8Gi succeeded","logtag":"F"}
2024-06-20T22:23:04+08:00	{"log":"I0620 14:23:04.192885       1 controller.go:590] Resize volume succeeded for volume \"pvc-b781dc96-49e1-4391-b09f-3da5974f2f52\", start to update PV's capacity","logtag":"F"}
2024-06-20T22:23:04+08:00	{"log":"I0620 14:23:04.192878       1 controller.go:483] Resize volume succeeded for volume \"pvc-b781dc96-49e1-4391-b09f-3da5974f2f52\", start to update PV's capacity","logtag":"F"}
2024-06-20T22:23:04+08:00	{"log":"I0620 14:23:04.192871       1 connection.go:201] GRPC error: \u003cnil\u003e","logtag":"F"}
2024-06-20T22:23:04+08:00	{"log":"I0620 14:23:04.192861       1 connection.go:200] GRPC response: {\"capacity_bytes\":8589934592,\"node_expansion_required\":true}","logtag":"F"}
2024-06-20T22:23:04+08:00	{"log":"I0620 14:23:04.178124       1 connection.go:194] GRPC request: {\"capacity_range\":{\"required_bytes\":8589934592},\"volume_capability\":{\"AccessType\":{\"Mount\":{\"fs_type\":\"ext4\"}},\"access_mode\":{\"mode\":1}},\"volume_id\":\"pvc-b781dc96-49e1-4391-b09f-3da5974f2f52\"}","logtag":"F"}
2024-06-20T22:23:04+08:00	{"log":"I0620 14:23:04.178121       1 connection.go:193] GRPC call: /csi.v1.Controller/ControllerExpandVolume","logtag":"F"}
2024-06-20T22:23:04+08:00	{"log":"I0620 14:23:04.178106       1 connection.go:201] GRPC error: \u003cnil\u003e","logtag":"F"}
2024-06-20T22:23:04+08:00	{"log":"I0620 14:23:04.178098       1 connection.go:200] GRPC response: {\"capabilities\":[{\"Type\":{\"Rpc\":{\"type\":1}}},{\"Type\":{\"Rpc\":{\"type\":9}}},{\"Type\":{\"Rpc\":{\"type\":5}}},{\"Type\":{\"Rpc\":{\"type\":4}}}]}","logtag":"F"}
2024-06-20T22:23:04+08:00	{"log":"I0620 14:23:04.177597       1 event.go:298] Event(v1.ObjectReference{Kind:\"PersistentVolumeClaim\", Namespace:\"ns-3m5m56ni\", Name:\"data-teable-uylicrfk-pg-postgresql-0\", UID:\"b781dc96-49e1-4391-b09f-3da5974f2f52\", APIVersion:\"v1\", ResourceVersion:\"408970566\", FieldPath:\"\"}): type: 'Normal' reason: 'Resizing' External resizer is resizing volume pvc-b781dc96-49e1-4391-b09f-3da5974f2f52","logtag":"F"}
2024-06-20T22:23:04+08:00	{"log":"I0620 14:23:04.177454       1 connection.go:194] GRPC request: {}","logtag":"F"}
2024-06-20T22:23:04+08:00	{"log":"I0620 14:23:04.177445       1 connection.go:193] GRPC call: /csi.v1.Controller/ControllerGetCapabilities","logtag":"F"}
2024-06-20T22:23:04+08:00	{"log":"I0620 14:23:04.165365       1 controller.go:295] Started PVC processing \"ns-3m5m56ni/data-teable-uylicrfk-pg-postgresql-0\"","logtag":"F"}

same problem:

root@test-node-004:~# kubectl describe pvc -n ns-3m5m56ni data-teable-uylicrfk-pg-postgresql-0
Name:          data-teable-uylicrfk-pg-postgresql-0
Namespace:     ns-3m5m56ni
StorageClass:  openebs-lvmpv
Status:        Bound
Volume:        pvc-b781dc96-49e1-4391-b09f-3da5974f2f52
Labels:        app.kubernetes.io/instance=teable-uylicrfk-pg
               app.kubernetes.io/managed-by=kubeblocks
               app.kubernetes.io/name=postgresql
               apps.kubeblocks.io/component-name=postgresql
               apps.kubeblocks.io/vct-name=data
               kubeblocks.io/volume-type=data
Annotations:   pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
               volume.beta.kubernetes.io/storage-provisioner: local.csi.openebs.io
               volume.kubernetes.io/selected-node: test-node-004
               volume.kubernetes.io/storage-provisioner: local.csi.openebs.io
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      8Gi
Access Modes:  RWO
VolumeMode:    Filesystem
Used By:       teable-uylicrfk-pg-postgresql-0
Events:        <none>
root@test-node-004:~#
root@test-node-004:~# kubectl describe lvmvolume -n openebs pvc-bcf9096f-b5d7-4a2d-8425-0b3ea2fce0a3
Name:         pvc-bcf9096f-b5d7-4a2d-8425-0b3ea2fce0a3
Namespace:    openebs
Labels:       kubernetes.io/nodename=test-node-004
Annotations:  <none>
API Version:  local.openebs.io/v1alpha1
Kind:         LVMVolume
Metadata:
  Creation Timestamp:  2024-06-18T06:38:28Z
  Finalizers:
    lvm.openebs.io/finalizer
  Generation:        3
  Resource Version:  398423326
  UID:               1540acdc-5c42-449a-82f7-dcc7831024e9
Spec:
  Capacity:        1073741824
  Owner Node ID:   test-node-004
  Shared:          yes
  Thin Provision:  no
  Vg Pattern:      ^lvmvg$
  Vol Group:       lvmvg
Status:
  State:  Ready
Events:   <none>
root@test-node-004:~# lvdisplay /dev/lvmvg/pvc-bcf9096f-b5d7-4a2d-8425-0b3ea2fce0a3
  --- Logical volume ---
  LV Path                /dev/lvmvg/pvc-bcf9096f-b5d7-4a2d-8425-0b3ea2fce0a3
  LV Name                pvc-bcf9096f-b5d7-4a2d-8425-0b3ea2fce0a3
  VG Name                lvmvg
  LV UUID                MhTTYf-XHAw-RhGc-WXfF-Uyeu-rNMS-KmMqd6
  LV Write Access        read/write
  LV Creation host, time openebs-lvm-node-c66l4, 2024-06-18 14:38:29 +0800
  LV Status              available
  # open                 1
  LV Size                1.00 GiB
  Current LE             256
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     8192
  Block device           253:352

@dsharma-dc
Copy link
Contributor

dsharma-dc commented Jun 21, 2024

There are few things that are observed here:

  1. The LVM LV didn't get expanded because the csi node plugin didn't receive NodeExpandVolume call. The LV is expanded as part of this gRPC and filesystem is natively expanded by LVM(except btrfs which we do from the plugin). I'm assuming the application pod was running and attached to the pvc here.
  2. The NodeExpandVolume call was not sent to csi-node plugin because csi-resizer failed to mark the pvc with node expansion required as seen from the error logs from csi-resizer.
  3. Why csi-resizer failed to patch pvc status correctly is not clear.

It would be good if you can share the version of all the things involved i.e. kubernetes, openebs plugin, host OS details etc.
Also if it is reproducible easily every time, then request to repro and provide the csi-controller logs (openebs-lvm-plugin + csi-resizer containers), and csi-node logs(openebs-lvm-plugin container) from that timestamp.

NOTE: In the last comment, the pvc and lvmvolume look unrelated, they are having different IDs.

@bxy4543
Copy link
Author

bxy4543 commented Jun 25, 2024

Thanks, I'll share it later

@dsharma-dc
Copy link
Contributor

It would also be interesting to know how the edits are being made to the PVC. Also, whether a retry worked successfully or not. The error doesn't indicate any issue with this provisioner till now. The failure happened in kubernetes API calls during resize due to PVC edit conflicting with some other conflicting operation on PVC.

@bxy4543
Copy link
Author

bxy4543 commented Jun 25, 2024

Yes, but I wonder why the csi controller failed and the retry succeeded, but the LVMVolume was inconsistent with the physical LV

@dsharma-dc
Copy link
Contributor

but the LVMVolume was inconsistent with the physical LV

this was because physical LV is expanded when NodeExpandVolume gRPC is sent by csi-resizer sidecar, which in this case never arrived due to the pvc patching error in sidecar.

@dsharma-dc dsharma-dc added question Further information is requested wontfix This will not be worked on and removed bug Something isn't working wontfix This will not be worked on labels Jun 25, 2024
@bxy4543
Copy link
Author

bxy4543 commented Jun 26, 2024

How to avoid this situation? why does patching the error pvc not re-trigger grpc

@dsharma-dc
Copy link
Contributor

dsharma-dc commented Jun 27, 2024

How to avoid this situation? why does patching the error pvc not re-trigger grpc

I'd guess kubernetes expects us to retry manually as the message says please apply your changes to the latest version and try again. We can check with k8s folks w.r.t this behaviour on Slack. All the details that I requested earlier would be required, or if you'd like you can go ahead posting the query.

@bxy4543
Copy link
Author

bxy4543 commented Jun 27, 2024

please apply your changes to the latest version and try again does not mean that we need to create it manually. This is entirely due to the inconsistency of pvc resources. I can trigger it 100% and just keep adding pvc tags when expanding pvc:

cat rechange-pvc.sh
#!/bin/bash

NAMESPACE="ns-hwhbg4vf"
PVC_NAME="data-dfsd-mongodb-0"
LABEL_KEY="test"

while true; do
  CURRENT_VALUE=$(kubectl get pvc $PVC_NAME -n $NAMESPACE -o jsonpath="{.metadata.labels.$LABEL_KEY}")

  if [[ -z "$CURRENT_VALUE" ]]; then
    NEXT_VALUE=1
  elif [[ "$CURRENT_VALUE" =~ ^[0-9]+$ ]]; then
    NEXT_VALUE=$((CURRENT_VALUE + 1))
  else
    echo "tag $LABEL_KEY Current value is not an integer: $CURRENT_VALUE"
    exit 1
  fi

  echo "Update Tags $LABEL_KEY 到 $NEXT_VALUE"
  kubectl label pvc $PVC_NAME -n $NAMESPACE $LABEL_KEY=$NEXT_VALUE --overwrite

  sleep 0.5
done
lvdisplay /dev/mapper/lvmvg-pvc--c745ae87--6b56--4684--9cfb--9e21ca9517e5
  --- Logical volume ---
  LV Path                /dev/lvmvg/pvc-c745ae87-6b56-4684-9cfb-9e21ca9517e5
  LV Name                pvc-c745ae87-6b56-4684-9cfb-9e21ca9517e5
  VG Name                lvmvg
  LV UUID                40ZI9j-4f2C-M1AW-CVrs-ZcSk-wR53-NwMUCF
  LV Write Access        read/write
  LV Creation host, time openebs-lvm-node-nddtz, 2024-06-27 02:49:15 +0000
  LV Status              available
  # open                 1
  LV Size                3.00 GiB
  Current LE             768
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:13

root@dev-56:~# kubectl get pvc -n ns-hwhbg4vf data-dfsd-mongodb-0 -o yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
    volume.beta.kubernetes.io/storage-provisioner: local.csi.openebs.io
    volume.kubernetes.io/selected-node: dev-56
    volume.kubernetes.io/storage-provisioner: local.csi.openebs.io
  creationTimestamp: "2024-06-27T10:10:31Z"
  finalizers:
  - kubernetes.io/pvc-protection
  labels:
    app.kubernetes.io/instance: dfsd
    app.kubernetes.io/managed-by: kubeblocks
    app.kubernetes.io/name: mongodb
    apps.kubeblocks.io/component-name: mongodb
    apps.kubeblocks.io/vct-name: data
    kubeblocks.io/volume-type: data
    test: "98"
  name: data-dfsd-mongodb-0
  namespace: ns-hwhbg4vf
  ownerReferences:
  - apiVersion: apps.kubeblocks.io/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: Component
    name: dfsd-mongodb
    uid: 4cc9d636-2084-4f81-871b-0ee30567fa52
  resourceVersion: "99144306"
  uid: ce0a06d8-53e5-46a9-b529-a79f0fa101b5
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 4Gi
  storageClassName: openebs-lvmpv
  volumeMode: Filesystem
  volumeName: pvc-c745ae87-6b56-4684-9cfb-9e21ca9517e5
status:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 4Gi
  phase: Bound

pvc-c745ae87-6b56-4684-9cfb-9e21ca9517e5: 4Gi
Physical size: 3Gi

@bxy4543
Copy link
Author

bxy4543 commented Jun 28, 2024

I think ControllerExpandVolume needs to add a state to modify the LVMVolume resource. Otherwise, it is not advisable to just modify LVMVolume.spec.capacity to represent the success of ControllerExpandVolume execution. Otherwise, even if NodeExpandVolume execution fails, ControllerExpandVolume will not retry

@dsharma-dc
Copy link
Contributor

dsharma-dc commented Jun 28, 2024

The one key difference here in this LVM provisioner from other e.g openebs/Mayastor is that ControllerExpandVolume actually expands the volume there but not in case of LVM. This is because with LVM we can't split the volume expand operation into two step - 1. to expand underlying volume , 2. expand the filesystem.
LVM natively expands filesystem too.
That said, I see there could be some scope of improvement for LVM expansion workflow. Also investigate the ControllerExpandVolume return path as suggested above. Will keep it under track.

I haven't been able to reproduce the error yet using the script provided by you though.

  1. created pvc
  2. published application
  3. ran label add script
  4. do kubectl pvc edit to resize

@bxy4543
Copy link
Author

bxy4543 commented Jul 1, 2024

Got it, looking forward to improving this problem. I can always reproduce it in my environment:

uname -a
Linux dev-55 5.15.0-113-generic #123-Ubuntu SMP Mon Jun 10 08:16:17 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
cat /etc/os-release
PRETTY_NAME="Ubuntu 22.04.3 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.3 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy
kubectl get no -owide
NAME               STATUS   ROLES           AGE    VERSION    INTERNAL-IP    EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION       CONTAINER-RUNTIME
dev-55             Ready    control-plane   152d   v1.27.13   192.168.0.55   <none>        Ubuntu 22.04.3 LTS   5.15.0-113-generic   containerd://1.6.23
dev-56             Ready    <none>          152d   v1.27.13   192.168.0.56   <none>        Ubuntu 22.04.3 LTS   5.15.0-113-generic   containerd://1.6.23
dev-58             Ready    <none>          152d   v1.27.13   192.168.0.58   <none>        Ubuntu 22.04.3 LTS   5.15.0-113-generic   containerd://1.6.23

use version:

registry.k8s.io/sig-storage/csi-resizer:v1.8.0
registry.k8s.io/sig-storage/csi-snapshotter:v6.2.2
registry.k8s.io/sig-storage/snapshot-controller:v6.2.2
registry.k8s.io/sig-storage/csi-provisioner:v3.5.0
registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.8.0
openebs/lvm-driver:ci

@avishnu
Copy link
Member

avishnu commented Sep 12, 2024

@datacore-tilangovan can you help in reproducing this issue in your setup where you are benchmarking for postgresql?

@avishnu
Copy link
Member

avishnu commented Sep 19, 2024

Does this PR #260 help to address this issue?

@dsharma-dc
Copy link
Contributor

I had run the script to keep modifying the pvc while a resize is attempted. I can hit the error that fails to patch the PVC and underlying LV doesn't get expanded.

E0110 06:18:46.748421       1 controller.go:286] Error syncing PVC: Mark PVC "default/csi-lvmpv-ds-1" as file system resize required failed: can't patch status of  PVC default/csi-lvmpv-ds-1 with Operation cannot be fulfilled on persistentvolumeclaims "csi-lvmpv-ds-1": the object has been modified; please apply your changes to the latest version and try again
I0110 06:18:46.748521       1 event.go:298] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"csi-lvmpv-ds-1", UID:"cad90bad-68de-4e8d-a374-f7c83335cfd0", APIVersion:"v1", ResourceVersion:"9462", FieldPath:""}): type: 'Warning' reason: 'VolumeResizeFailed' Mark PVC "default/csi-lvmpv-ds-1" as file system resize required failed: can't patch status of  PVC default/csi-lvmpv-ds-1 with Operation cannot be fulfilled on persistentvolumeclaims "csi-lvmpv-ds-1": the object has been modified; please apply your changes to the latest version and try again

However, after sometime I see that the sync happens and underlying LV gets expanded as expected. This could be explained by the external resizer behaviour that retries the sync so eventually we see the LVs get expanded.

	if err := ctrl.syncPVC(key); err != nil {
		// Put PVC back to the queue so that we can retry later.
		klog.ErrorS(err, "Error syncing PVC")
		ctrl.claimQueue.AddRateLimited(key)
	} else {
		ctrl.claimQueue.Forget(key)
	}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Backlog question Further information is requested
Projects
None yet
Development

No branches or pull requests

4 participants