[BUG] Replica rebuilding gets triggered if network bandwidth is restricted below 80mbit #2882

Closed
@khushboo-rancher

Description

Describe the bug
Replica rebuilding is triggered when the network bandwidth between the engine and a replica is restricted to below 80mbit.

To Reproduce
Steps to reproduce the behavior:

  1. Create a volume with 3 replicas.
  2. Go to the engine and run the commands below to restrict the network bandwidth:
     tc qdisc del dev eth0 root
     tc qdisc add dev eth0 root tbf rate 50mbit latency 0.1ms burst 50mbit
  3. Attach the volume to a node and run the job below to write data:
     fio -filename=/dev/longhorn/vol -name=write-test -ioengine=libaio -direct=1 -iodepth=32 -rw=write -numjobs=2 -runtime=60 -group_reporting -bs=1m
  4. Observe that replicas are rebuilt (either all of them or one of them).
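
The bandwidth restriction in the steps above could be wrapped in a small helper so the limit is easy to apply and to remove again after the test. This is only a sketch: the `IFACE`, `RATE`, and `DRY_RUN` variables and the function names are hypothetical and not part of the original report, and it assumes the same `eth0` interface and `tbf` parameters used in the steps. By default it only prints the commands; setting `DRY_RUN=0` runs them, which requires root privileges wherever the engine is running.

```shell
#!/bin/sh
# Sketch only: wraps the tc commands from the reproduction steps.
# IFACE, RATE, and DRY_RUN are hypothetical variables introduced for illustration.
IFACE="${IFACE:-eth0}"
RATE="${RATE:-50mbit}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands, 0 = execute them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

limit_bandwidth() {
    # Remove any existing root qdisc first; ignore the error if none is set.
    run tc qdisc del dev "$IFACE" root 2>/dev/null || true
    # Same token bucket filter settings as in the report, with the rate parameterized.
    run tc qdisc add dev "$IFACE" root tbf rate "$RATE" latency 0.1ms burst "$RATE"
}

restore_bandwidth() {
    # Drop the tbf qdisc so the interface returns to its default behavior.
    run tc qdisc del dev "$IFACE" root
}
```

After sourcing the script, calling `limit_bandwidth` before the fio job and `restore_bandwidth` afterwards keeps the restriction scoped to the test. Changing `RATE` (for example to `80mbit`) would make it easier to check the threshold mentioned in the title.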

Expected behavior
Replica rebuilding should not happen.

Environment:

  • Longhorn version: Longhorn v1.2.0-preview1
  • Installation method (e.g. Rancher Catalog App/Helm/Kubectl): Kubectl
  • Kubernetes distro (e.g. RKE/K3s/EKS/OpenShift) and version: RKE 1.21.3
    • Number of management nodes in the cluster: 1
    • Number of worker nodes in the cluster: 3
  • Node config
    • OS type and version: Ubuntu 20.04
    • CPU per node: 4
    • Memory per node: 8 Gi
    • Disk type(e.g. SSD/NVMe): SSD
    • Network bandwidth between the nodes: Up to 10 Gi
  • Underlying Infrastructure (e.g. on AWS/GCE, EKS/GKE, VMWare/KVM, Baremetal): DO
  • Number of Longhorn volumes in the cluster: 1

Metadata

Labels

  • backport/1.1.3: Require to backport to 1.1.3 release branch
  • kind/bug
  • severity/3: Function working but has a major issue w/ workaround
