[BUG] Disk IO queue (aqu-sz) is high. Disk saturation #9685

Open
@frit0-rb

Description

Describe the bug

Disk saturation is very high while the daily snapshots run.

This always happens on the dm- / sda devices.
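
The queue metric in the title presumably corresponds to the `aqu-sz` column of `iostat -x`. As a minimal sketch for capturing the same signal during the snapshot window, the following reads the kernel counters in `/proc/diskstats` and derives an average queue size and utilization per device. The function names are hypothetical, and the values should only roughly match what `iostat` reports.

```python
import time


def parse_diskstats(text):
    """Map device name -> (io_ticks_ms, weighted_io_ms) from /proc/diskstats text."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 14:
            continue  # skip blank or truncated lines
        # parts[2] is the device name, parts[12] is the time spent doing I/O (ms),
        # parts[13] is the weighted time spent doing I/O (ms).
        stats[parts[2]] = (int(parts[12]), int(parts[13]))
    return stats


def queue_stats(before, after, interval_ms):
    """Return device -> (avg_queue_size, util_percent) between two samples."""
    result = {}
    for dev, (ticks1, weighted1) in after.items():
        if dev not in before:
            continue  # device appeared between samples; no delta available
        ticks0, weighted0 = before[dev]
        avg_queue = (weighted1 - weighted0) / interval_ms
        util = 100.0 * (ticks1 - ticks0) / interval_ms
        result[dev] = (avg_queue, util)
    return result


def sample_once(interval_s=5.0, path="/proc/diskstats"):
    """Take two samples interval_s apart and return the per-device stats."""
    with open(path) as f:
        before = parse_diskstats(f.read())
    start = time.monotonic()
    time.sleep(interval_s)
    with open(path) as f:
        after = parse_diskstats(f.read())
    elapsed_ms = (time.monotonic() - start) * 1000.0
    return queue_stats(before, after, elapsed_ms)
```

Calling `sample_once(5)` on an affected worker and keeping only the device names that start with `dm-` or `sd` would give one data point. Running it in a loop across the snapshot schedule would show whether the queue grows only while snapshots are taken.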

image

To Reproduce

RKE2 cluster (version 1.27.15) with Longhorn installed via Rancher.

The cluster has 3 control plane nodes and 5 workers. A daily snapshot job was created via the Longhorn UI, retaining 3 snapshots.

Expected behavior

Low disk saturation while snapshots are taken.

Support bundle for troubleshooting

Environment

  • Longhorn version: 1.5.5
  • Installation method (e.g. Rancher Catalog App/Helm/Kubectl): Rancher
  • Kubernetes distro (e.g. RKE/K3s/EKS/OpenShift) and version: RKE2 1.27.15
    • Number of control plane nodes in the cluster: 3
    • Number of worker nodes in the cluster: 5
  • Node config
    • OS type and version: RHEL 8.10
    • Kernel version: 4.18.0-553.22.1.el8_10.x86_64
    • CPU per node: 24 vCPU
    • Memory per node: 120 GB
    • Disk type (e.g. SSD/NVMe/HDD): SSD
    • Network bandwidth between the nodes (Gbps):
  • Underlying Infrastructure (e.g. on AWS/GCE, EKS/GKE, VMWare/KVM, Baremetal): Baremetal / Hyper-V
  • Number of Longhorn volumes in the cluster: 45

Additional context

Sometimes this saturation causes some workers to freeze, leaving many pods stuck in the Terminating state.
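
To correlate the stuck pods with the affected workers, a small helper like the one below could list every pod that is being deleted together with the node it runs on. This is a sketch, not part of Longhorn or Rancher: the function names are hypothetical, and it assumes `kubectl` is on the PATH and configured for the cluster. It relies on the fact that a pod shown as Terminating has `metadata.deletionTimestamp` set.

```python
import json
import subprocess


def terminating_pods(pod_list):
    """Return (namespace, name, node, deletionTimestamp) for pods being deleted."""
    found = []
    for pod in pod_list.get("items", []):
        meta = pod.get("metadata", {})
        deleted_at = meta.get("deletionTimestamp")
        if deleted_at is None:
            continue  # pod is not being deleted
        node = pod.get("spec", {}).get("nodeName", "")
        found.append((meta.get("namespace", ""), meta.get("name", ""), node, deleted_at))
    # Oldest deletion requests first.
    return sorted(found, key=lambda row: row[3])


def fetch_pod_list():
    """Fetch all pods as parsed JSON (assumes kubectl is installed and configured)."""
    out = subprocess.run(
        ["kubectl", "get", "pods", "--all-namespaces", "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return json.loads(out)
```

Running `terminating_pods(fetch_pod_list())` during an incident and grouping the rows by node would show whether the stuck pods are concentrated on the workers with saturated disks.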

Workaround and Mitigation

I don't have any way to

Metadata

Assignees

No one assigned

    Labels

    • area/performance: System, volume performance
    • kind/bug
    • require/backport: Require backport. Only used when the specific versions to backport have not been defined.
    • require/qa-review-coverage: Require QA to review coverage
    • stale

    Projects

    • Status

      In Progress
    • Status

      New Issues

