[BUG] allow volume migration when volume is degraded (harvester vm) #2805

Closed
guangbochen opened this issue Jul 17, 2021 · 7 comments
Labels
backport/1.1.3 Require to backport to 1.1.3 release branch backport/1.2.1 Require to backport to 1.2.1 release branch component/longhorn-manager Longhorn manager (control plane) kind/bug priority/0 Must be implement or fixed in this release (managed by PO) require/auto-e2e-test Require adding/updating auto e2e test cases if they can be automated
@guangbochen
Contributor

Is your feature request related to a problem? Please describe.
During a Harvester VM live migration on a cluster consisting of 2 nodes, all Longhorn volumes are degraded because the Longhorn replica count is set to 3, so the migration target pod fails to start correctly.

unable to attach volume pvc-83b575fa-4229-46be-a6a0-186561a903cc to bm-harv2: volume must be healthy to start migration, code=Server Error, detail=] from [http://longhorn-backend:9500/v1/volumes/pvc-83b575fa-4229-46be-a6a0-186561a903cc?action=attach

NAME                                     READY   STATUS              RESTARTS   AGE
virt-launcher-lawr-1-j75dc               1/1     Running             0          116m
virt-launcher-lawr-1-tcxvq               0/1     ContainerCreating   0          8m47s


Warning  FailedAttachVolume  3s (x3 over 9s)    attachdetach-controller  AttachVolume.Attach failed for volume "pvc-83b575fa-4229-46be-a6a0-186561a903cc" : rpc error: code = Internal desc = Bad response statusCode [500]. Status [500 Internal Server Error]. Body: [message=unable to attach volume pvc-83b575fa-4229-46be-a6a0-186561a903cc to bm-harv2: volume must be healthy to start migration, code=Server Error, detail=] from [http://longhorn-backend:9500/v1/volumes/pvc-83b575fa-4229-46be-a6a0-186561a903cc?action=attach]
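
To illustrate why the volume cannot become healthy in this setup, here is a minimal sketch. It assumes that each replica must land on a distinct node and that a volume counts as healthy only when the number of healthy replicas equals the configured replica count. The function name `volume_robustness` and the state strings are hypothetical and are not Longhorn's API.

```python
def volume_robustness(desired_replicas: int, schedulable_nodes: int) -> str:
    """Classify a volume, assuming at most one replica per node (an assumption)."""
    # With one replica per node, the node count caps the healthy replicas.
    healthy = min(desired_replicas, schedulable_nodes)
    if healthy == 0:
        return "faulted"
    if healthy < desired_replicas:
        return "degraded"
    return "healthy"


# Replica count 3 on a 2-node cluster, as described in this issue.
print(volume_robustness(desired_replicas=3, schedulable_nodes=2))
```

Under these assumptions the volume stays degraded for as long as the cluster has fewer than 3 nodes, so a check that requires a healthy volume blocks every migration.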

Describe the solution you'd like
We need a feasible way to relax the restriction on VM migration when the volume is degraded.

Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.

Additional context
harvester/harvester#798

@guangbochen guangbochen added the kind/feature Feature request, new feature label Jul 17, 2021
@PhanLe1010 PhanLe1010 removed the kind/feature Feature request, new feature label Jul 17, 2021
@yasker yasker added this to the v1.2.0 milestone Jul 17, 2021
@yasker yasker added priority/1 Highly recommended to implement or fix in this release (managed by PO) component/longhorn-manager Longhorn manager (control plane) kind/bug labels Jul 17, 2021
@yasker yasker added priority/0 Must be implement or fixed in this release (managed by PO) and removed priority/1 Highly recommended to implement or fix in this release (managed by PO) labels Aug 1, 2021
@yasker
Member

yasker commented Aug 1, 2021

Raising the priority because this has a high impact when the user has only 3 nodes and loses one.

@innobead innobead assigned PhanLe1010 and unassigned shuo-wu Aug 2, 2021
@joshimoo joshimoo changed the title [FEATURE] feasible way to relax the restriction on VM migration [FEATURE] allow volume migration when volume is degraded (harvester vm) Aug 2, 2021
@innobead innobead assigned shuo-wu and unassigned PhanLe1010 Aug 3, 2021
@innobead innobead changed the title [FEATURE] allow volume migration when volume is degraded (harvester vm) [BUG] allow volume migration when volume is degraded (harvester vm) Aug 11, 2021
@yasker yasker modified the milestones: v1.2.0, v1.2.1 Aug 17, 2021
@yasker
Member

yasker commented Aug 17, 2021

We can fix this in v1.2.1.

@longhorn-io-github-bot

longhorn-io-github-bot commented Sep 9, 2021

Pre Ready-For-Testing Checklist

@shuo-wu shuo-wu added require/auto-e2e-test Require adding/updating auto e2e test cases if they can be automated backport/1.2.1 Require to backport to 1.2.1 release branch labels Sep 9, 2021
@yasker yasker modified the milestones: v1.2.1, v1.3.0 Sep 10, 2021
@shuo-wu
Contributor

shuo-wu commented Sep 13, 2021

The live upgrade feature doesn't work correctly. Will have a PR fixing this regression later.

@shuo-wu
Contributor

shuo-wu commented Sep 16, 2021

The regression is fixed. Please verify it during testing.

@khushboo-rancher khushboo-rancher self-assigned this Sep 20, 2021
@khushboo-rancher
Contributor

khushboo-rancher commented Sep 22, 2021

Validated with v1.2.1-rc1 and Longhorn master (09/21/2021)

Validation - Pass

Validated the scenarios below on a Harvester setup with upgraded Longhorn.

  1. The migration works with a degraded volume.
  2. Migration while rebuilding is in progress: the rebuild completes first, and only then does the migration start and complete.
  3. Migration with a failed replica: only healthy replicas get created first for the migration.
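
The three validated behaviours can be summarised as a small decision sketch. This only illustrates the behaviour observed during validation and is not the longhorn-manager implementation. The names `Replica` and `migration_plan`, the state strings, and the `reject` outcome for a volume with no healthy replica are all hypothetical assumptions.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Replica:
    name: str
    state: str  # "healthy", "rebuilding", or "failed" (hypothetical labels)


def migration_plan(replicas: List[Replica]) -> Tuple[str, List[str]]:
    """Return (action, names of the replicas the migration would use)."""
    if any(r.state == "rebuilding" for r in replicas):
        # Scenario 2: the rebuild completes first, then the migration starts.
        return "wait_for_rebuild", []
    healthy = [r.name for r in replicas if r.state == "healthy"]
    if not healthy:
        # Assumption: with no healthy replica there is nothing to migrate from.
        return "reject", []
    # Scenarios 1 and 3: a degraded volume migrates using only healthy replicas.
    return "migrate", healthy


degraded = [Replica("r1", "healthy"), Replica("r2", "healthy"), Replica("r3", "failed")]
print(migration_plan(degraded))
```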

Note: Observed the live upgrade failure on the setup, as mentioned in #2805 (comment). @shuo-wu Do we have an issue to track it?

@khushboo-rancher
Contributor

Created #3052 for the upgrade problem.
Observed another problem: #3053.

@innobead innobead added backport-needed/1.1.x and removed backport/1.1.3 Require to backport to 1.1.3 release branch labels Oct 11, 2021
@innobead innobead added the backport/1.1.3 Require to backport to 1.1.3 release branch label Dec 10, 2021
@github-project-automation github-project-automation bot moved this to New Issues in Longhorn Sprint Aug 4, 2024
@derekbit derekbit moved this from New Issues to Closed in Longhorn Sprint Aug 4, 2024
7 participants