Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEATURE] v2 supports volume expansion #8022

Open
derekbit opened this issue Feb 26, 2024 · 16 comments
Open

[FEATURE] v2 supports volume expansion #8022

derekbit opened this issue Feb 26, 2024 · 16 comments
Assignees
Labels
area/v2-data-engine v2 data engine (SPDK) highlight Important feature/issue to highlight kind/feature Feature request, new feature priority/0 Must be implement or fixed in this release (managed by PO) require/auto-e2e-test Require adding/updating auto e2e test cases if they can be automated require/doc Require updating the longhorn.io documentation require/lep Require adding/updating enhancement proposal require/manual-test-plan Require adding/updating manual test cases if they can't be automated
Milestone

Comments

@derekbit
Copy link
Member

Is your feature request related to a problem? Please describe (👍 if you like this request)

Describe the solution you'd like

Describe alternatives you've considered

Additional context

@derekbit derekbit added kind/feature Feature request, new feature require/auto-e2e-test Require adding/updating auto e2e test cases if they can be automated require/doc Require updating the longhorn.io documentation require/lep Require adding/updating enhancement proposal require/manual-test-plan Require adding/updating manual test cases if they can't be automated area/v2-data-engine v2 data engine (SPDK) labels Feb 26, 2024
@derekbit
Copy link
Member Author

derekbit commented Feb 26, 2024

@AaronDewes
Thanks for your contribution. I created the feature request ticket for tracking the status of the v2 volume expansion and resize. Feel free to discuss and update the status here. Thank you.

cc @innobead @shuo-wu @DamiaSan

@derekbit derekbit added this to the v1.7.0 milestone Feb 26, 2024
@innobead innobead added highlight Important feature/issue to highlight priority/0 Must be implement or fixed in this release (managed by PO) labels Feb 26, 2024
@AaronDewes
Copy link

Thanks!

As I've mentioned before, I have not yet tested this properly yet. I'm planning to run this locally today and make improvements or let you know if I have questions. Because I've only used Longhorn before, but not contributed yet, I'm not familiar with the internal architecture.

@innobead
Copy link
Member

@AaronDewes as this is an upcoming feature in 1.7.0, so I would expect you can actively work on the feature. Please work with @derekbit who is our maintainer as well to try to make the feature happen eventually.

Assigned to you first, as you are working on that. However, if you can't make it, just let us know and we will arrange the assignee. Thanks for your contribution!

@derekbit
Copy link
Member Author

Thanks!

As I've mentioned before, I have not yet tested this properly yet. I'm planning to run this locally today and make improvements or let you know if I have questions. Because I've only used Longhorn before, but not contributed yet, I'm not familiar with the internal architecture.

@AaronDewes
Feel free to ping me on the Slack channel or leave messages here if you have any questions.
BTW, please let us know what docs or materials we can improve for external contributors.
Thank you.

@derekbit
Copy link
Member Author

derekbit commented May 4, 2024

Just found the related PR in spdk side
https://review.spdk.io/gerrit/c/spdk/spdk/+/22619

cc @AaronDewes @DamiaSan

@DamiaSan
Copy link
Contributor

DamiaSan commented May 6, 2024

Just found the related PR in spdk side https://review.spdk.io/gerrit/c/spdk/spdk/+/22619

I think this patch will be merged soon on SPDK master branch.

@derekbit
Copy link
Member Author

@DamiaSan Could you collaborate with @AaronDewes to land the implementation in Longhorn? Thanks.

cc @innobead @shuo-wu

@DamiaSan
Copy link
Contributor

@DamiaSan Could you collaborate with @AaronDewes to land the implementation in Longhorn? Thanks.

cc @innobead @shuo-wu

Yes, sure. @AaronDewes do you already have some questions or something I can do for you?

@DamiaSan
Copy link
Contributor

Hi @AaronDewes , I have just had a look at the work to do:

  • SPDK side, we have simply to start using the upcoming release 24.09, which include the raid1 resize functionality
  • go-spdk-helper side, the bdev_lvol_resize has already been wrapped into the API BdevLvolResize
  • go-spdk-engine side, the work has to be done. We must implement the expand functionality at replica and engine level
  • longhorn-instance-manager, we must implement the function VolumeExpand for the V2DataEngineProxyOps.

I will create a new SPDK branch as soon as the new 24.09 release will be available, meanwhile you could start working on longhorn-instance-manager and longhorn-spdk-engine.
A good way could be to take as example how the function VolumeSnapshot is implemented for both V1DataEngineProxyOps and V2DataEngineProxyOps of longhorn-instance-manager/pkg/proxy/snapshot.go and make something similar for the VolumeExpand in longhorn-instance-manager/pkg/proxy/volume.go. Also in longhorn-spdk-engine, you can "copy" the way a snapshot is done to implement the expansion functionality, the chain of calls is the following:

  • EngineSnapshotCreate of SPDKClient, this is called by longhorn-instance-manager
  • EngineSnapshotCreate of spdkrpc.SPDKServiceClient
  • EngineSnapshotCreate of Server, which is a spdkrpc.UnimplementedSPDKServiceServer
  • SnapshotCreate of Engine
  • snapshotOperation and snapshotOperationWithoutLock of Engine
  • replicaSnapshotOperation of Engine for all the replicas of the volume
  • ReplicaSnapshotCreate of SPDKClient
  • ReplicaSnapshotCreate of spdkrpc.SPDKServiceClient
  • ReplicaSnapshotCreate of Server, which is a spdkrpc.UnimplementedSPDKServiceServer
  • SnapshotCreate of Replica
  • BdevLvolSnapshot, which is the API of go-spdk-helper

WDYT? Tell me if you need other informations or any kind of help.

cc @innobead @derekbit

@shuo-wu
Copy link
Contributor

shuo-wu commented Sep 17, 2024

2 questions:

  1. For a lvol, should the size keep the same as that of its parent or its children?
  2. After resizing a lvol, should we somehow reload/refresh its nvmf used by a RAID?

@DamiaSan
Copy link
Contributor

2 questions:

  1. For a lvol, should the size keep the same as that of its parent or its children?

When you resize a lvol, its snapshot parent isn't affected by the resize operation, only lvol change. The size of a snapshot can't be changed.

  1. After resizing a lvol, should we somehow reload/refresh its nvmf used by a RAID?

No, the nvmf layer is automatically refreshed with the new size.

@shuo-wu
Copy link
Contributor

shuo-wu commented Sep 17, 2024

Then the v2 volume expansion implementation looks less complicated.

@DamiaSan
Copy link
Contributor

Hi @derekbit , what about the suspension of the IO over the volume before to proceed with its expansion, it is not needed. Because SPDK, before to resize a lvol, freeze IO over it. Then, raid resize happens only when all its base bdevs have resized, but in this case the resize is simply a notification, it doesn't need any new space allocation.
I have just made a test resizing a raid1 made of 2 lvols meanwhile IOs were running over it and all worked fine.

@DamiaSan
Copy link
Contributor

Found an issue in SPDK: spdk/spdk#3540

@DamiaSan
Copy link
Contributor

DamiaSan commented Oct 31, 2024

The development in go-spdk-helper is here: https://github.com/DamiaSan/go-spdk-helper/tree/lvol_resize
The development in types is here: https://github.com/DamiaSan/types/tree/volume_expand
The development in longhorn-spdk-engine is here: https://github.com/DamiaSan/longhorn-spdk-engine/tree/volume_expansion
The work to solve the issue in SPDK is here: https://github.com/DamiaSan/spdk/commits/nvmf_lvol_resize/
Working only with SPDK all seems ok, but executing the new test in longhorn-spdk-engine sometimes SPDK get stuck in lvstore creation, so something is still to be fixed.

@DamiaSan DamiaSan moved this from Implement to Icebox in Longhorn Sprint Oct 31, 2024
@innobead innobead modified the milestones: v1.8.0, v1.9.0 Nov 18, 2024
@DamiaSan
Copy link
Contributor

Working only with SPDK all seems ok, but executing the new test in longhorn-spdk-engine sometimes SPDK get stuck in lvstore creation, so something is still to be fixed.

This stuck is not related to this development and has been resolved in ther new longhorn-v24.09 branch with this commit: longhorn/spdk@10463b5
So I can proceed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/v2-data-engine v2 data engine (SPDK) highlight Important feature/issue to highlight kind/feature Feature request, new feature priority/0 Must be implement or fixed in this release (managed by PO) require/auto-e2e-test Require adding/updating auto e2e test cases if they can be automated require/doc Require updating the longhorn.io documentation require/lep Require adding/updating enhancement proposal require/manual-test-plan Require adding/updating manual test cases if they can't be automated
Projects
Status: Icebox
Development

No branches or pull requests

5 participants