[BUG] Backing Image Data Inconsistency if it's Exported from a Backing Image Backed Volume #6899
Comments
cc @shuo-wu |
Some observations with the following steps:
Thanks |
Note: from the Longhorn side, we can't reproduce the issue. Longhorn-side reproduce steps:
|
@WebberHuang1118 Please work with @ChanYiLin to reproduce the issue. |
For the above reproduce steps (#6899 (comment)): after step 6 in #6899 (comment), as I described in step 9 in #6899 (comment), if we create a new volume from the synthesized backing image and compare the new volume's content with the source volume, the content is inconsistent. However, if the source volume is not a backing-image-backed volume, there is no such data inconsistency. Thanks |
I found a method to reproduce a similar case without Harvester-specific operations (no VM operations, only creating a PVC from a backing image and doing I/O). Detailed steps:
PS:
|
I can reproduce the issue; I will look into how we export the BackingImage. cc @shuo-wu |
BackingImage Export Mechanism: A Volume is composed of a series of snapshots and a BackingImage.
When exporting a Volume to another BackingImage (code):
Volume Cloning Mechanism: It is exactly the same as exporting a BackingImage, except that it won't load the BackingImage when doing `preload()`. |
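To make the mechanism above concrete, here is a minimal Go sketch of a preload-style pass (the names buildLocation, snapshotOwner, backingOwns, and the constants are illustrative assumptions, not Longhorn's actual data structures): it records, per sector, which layer owns the data, and the cloning path described above corresponds to includeBackingImage = false, so sectors that exist only in the backing image stay 0 and are later treated as holes.

```go
package sketch

// Toy layer indices, mirroring the DiskIndex convention discussed later in
// this thread: 0 means "no data" (a hole), 1 is the backing image, and 2+
// are snapshots / the volume head.
const (
	holeIdx         = byte(0)
	backingImageIdx = byte(1)
)

// buildLocation is a toy preload: for every sector it records which layer
// owns the data. Snapshot layers always win over the backing image. When
// includeBackingImage is false (the cloning path), backing-image-only
// sectors keep index 0, so a later sync pass treats them as holes and never
// copies backing image data.
func buildLocation(sectors int, snapshotOwner func(sector int) (byte, bool),
	backingOwns func(sector int) bool, includeBackingImage bool) []byte {

	location := make([]byte, sectors) // every entry starts as holeIdx (0)
	for s := 0; s < sectors; s++ {
		if idx, ok := snapshotOwner(s); ok {
			location[s] = idx
			continue
		}
		if includeBackingImage && backingOwns(s) {
			location[s] = backingImageIdx
		}
	}
	return location
}
```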
Volume Cloning Test - PASSED: Both Volumes are consistent.
|
BackingImage Export Test (without resize2fs) - FAILED: Following the same steps mentioned here, but without applying
Still inconsistent |
BackingImage Export Test (without
|
Considering that Volume Clone (with partition recreation) works but BackingImage export (with partition recreation) doesn't: the only difference between Volume Clone and BackingImage export is that when
Volume Clone won't consider the BackingImage index in this case, so the export won't export the BackingImage data |
Sorry...? Do you mean that VolumeExport won't retrieve the backing image fiemap/sector list hence the BackingImage data won't be exported? But why does |
Hi @shuo-wu, it will create a replica and preload the replica to build the r.volume.location index map.
In the test, we mount the BackingImage to the Pod. If we just do a Volume Clone, then the new target Volume + the original BackingImage will be consistent. I am guessing something went wrong when doing preload() and syncContent(). |
We also found that
So maybe the partition information was corrupted when exporting.
Do the volume clone test and the failed backing image test mean that the new partitioning info can be cloned correctly via |
Yes, that is what I observed here. And the only difference between the two is preload() |
I am wondering if it is because
In SyncContent() and GetDataLayout(), we actually check whether a sector is Data or a Hole by checking if the DiskIndex is 0 or non-0. So in Case 1, we still punch a hole for sectors 2 and 3. Note: if we don't recreate the partition, the exported BackingImage would be consistent... but why? TestCase |
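A rough sketch of that Data/Hole classification, assuming toy stand-ins (punchHole and sendData are hypothetical callbacks, not the real SyncContent()/GetDataLayout() code): contiguous runs with index 0 are punched as holes on the target, everything else is sent as data.

```go
package sketch

// syncIntervals walks the per-sector location map and classifies contiguous
// runs: index 0 means Hole (punched at the target), non-zero means Data
// (sent to the target). This is why, in "Case 1" above, sectors mapped to 0
// still get holes punched even though another layer may hold older data.
func syncIntervals(location []byte, punchHole, sendData func(start, end int)) {
	start := 0
	for start < len(location) {
		end := start
		isHole := location[start] == 0
		for end < len(location) && (location[end] == 0) == isHole {
			end++
		}
		if isHole {
			punchHole(start, end)
		} else {
			sendData(start, end)
		}
		start = end
	}
}
```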
The question is, in case 2, |
Yeah so the problem is still |
This example assumes that the logical (virtual) size of the backing image is equal to the logical size of the snapshot:
In reality, they are not:
Revisiting 2 cases:
|
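For illustration only (these sizes are assumed, not measured from the failing test): suppose the backing image's logical (virtual) size is 3MB while the snapshot/volume is 16MB. The 0-3MB range can legitimately resolve to either layer, but the 3-16MB range exists only in the snapshot layer; any lookup that resolves part of that range (e.g. 3MB-4MB) to the backing image layer would read past the backing image's end and hand back zeros instead of the snapshot's data, which matches the inconsistency pattern discussed below.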
Another testing idea I am thinking about is:
|
This actually fits the testing results from @WebberHuang1118's testing. |
Awesome. It means that somehow the region outside of the backing image boundary (the 3MB to 4MB offset region) in the backing image layer is overwriting the snapshot layer. Maybe the next step would be adding some logs to the preload() function to see if we can identify the faulty logic (i.e. why the backing image layer is chosen instead of the snapshot layer for the 3MB-4MB region)? |
Hi @PhanLe1010, as you can see, I think the preload location is correct: there is a series of 2, which means the data comes from the snapshot. One misleading point: the 4096 here means 4096 blocks, and each block is 4096 bytes. Note: I added the log message here; this function is executed by multiple goroutines in parallel when sending segments. Note:
|
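As a quick sanity check of those units: 4096 blocks × 4096 bytes/block = 16,777,216 bytes = 16MiB, i.e. the full size of the 16MB test volume, so the logged value covers the whole volume rather than a single 4KB region.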
Sorry @ChanYiLin. I will have to continue tomorrow. |
sure no problem, thanks for your help! |
It seems the preload result and the sending-segment intervals are correct.
The sending batch size is 2MB, so for the 16MB Volume there are 8 sending batches. Here is the log: lhim2_new2.txt |
Interesting finding, @ChanYiLin!
What about when cloning: does ReadAt() also return all 0 in the region [2MB, 4MB]? By the way, how do you know that ReadAt() returns 0 in the region [2MB, 4MB]? From https://github.com/longhorn/longhorn/files/13905510/lhim2_new2.txt I only see the
|
@PhanLe1010 you can search for it in the log. Since the sending batch size is 2MB and the sector size is 4KB, the [512:1024] interval means [2MB:4MB]. |
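Spelling out that conversion: one 2MB sending batch at a 4KB sector size is 2MB / 4KB = 512 sectors, so the sector interval [512:1024] corresponds to byte offsets 512 × 4KB = 2MB through 1024 × 4KB = 4MB.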
Hi @PhanLe1010. But here is the result of using clone; I can't see where it goes wrong... Note: it also contains the syncing of the .meta file, so just look at the syncing of the image. |
I think I found the bug
https://github.com/longhorn/longhorn-engine/blob/deb8b18a1558ddf4f359146b289d5469bef7cc6d/pkg/replica/diff_disk.go#L202-L241
I wonder whether we really need to cut it with count? |
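As a side note on the read semantics in play here, this is a small, self-contained Go example of plain standard-library behavior (not the Longhorn code): ReadAt on a region that extends past EOF returns a short count together with io.EOF, so any layer that trims the request to a boundary, or forwards the shortened count upward, can leave the tail of the caller's buffer unfilled without an obvious error.

```go
package main

import (
	"fmt"
	"io"
	"os"
)

func main() {
	// A 3 MiB sparse file standing in for a small backing image.
	f, err := os.CreateTemp("", "backing-*.img")
	if err != nil {
		panic(err)
	}
	defer os.Remove(f.Name())
	defer f.Close()
	if err := f.Truncate(3 << 20); err != nil {
		panic(err)
	}

	// Request 2 MiB starting at offset 2 MiB: the last 1 MiB lies past EOF.
	buf := make([]byte, 2<<20)
	n, err := f.ReadAt(buf, 2<<20)

	// ReadAt reports a short count plus io.EOF; a caller that only checks for
	// a nil error, or keeps using the trimmed count, silently loses the tail
	// of the requested range.
	fmt.Printf("requested=%d read=%d eof=%v\n", len(buf), n, err == io.EOF)
}
```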
TEST (By directly returning
|
Agreeing with removing the cut. On the other hand, I am wondering whether we need to improve |
From my perspective, I think we should still return
So it has no knowledge about the snapshot chain underneath. If it reads outside of the boundary of the |
After discussing with @ChanYiLin, we agree that Longhorn should return the length the caller wants to read if there is no error. |
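A minimal sketch of the semantics agreed on above, using a hypothetical boundedLayer wrapper (the type and field names are assumptions for illustration, not the Longhorn implementation): when there is no real error, ReadAt hands back exactly the requested length, zero-filling everything past the backing layer's logical size instead of shortening the count.

```go
package layers

import "io"

// boundedLayer wraps a backing layer whose logical (virtual) size is known.
type boundedLayer struct {
	backing io.ReaderAt
	size    int64 // logical size of the backing layer
}

// ReadAt returns len(p) bytes whenever there is no real error: bytes inside
// the layer come from the backing reader, and bytes past the layer's end are
// zero-filled rather than reported as a short read.
func (b *boundedLayer) ReadAt(p []byte, off int64) (int, error) {
	if off >= b.size {
		for i := range p {
			p[i] = 0
		}
		return len(p), nil
	}
	inRange := p
	if off+int64(len(p)) > b.size {
		inRange = p[:b.size-off]
	}
	n, err := b.backing.ReadAt(inRange, off)
	if err != nil && err != io.EOF {
		return n, err
	}
	// Zero-fill whatever the backing layer did not provide (a short read at
	// its EOF and/or the region beyond the layer boundary).
	for i := n; i < len(p); i++ {
		p[i] = 0
	}
	return len(p), nil
}
```

Whether zero-fill is the right choice for every region is a separate design question; the point of the sketch is only the full-length return contract.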
Pre Ready-For-Testing Checklist
|
Hi @ChanYiLin, following the test steps, this issue verified passed on master-head, but on v1.6.x-head the two backing image data sets were still not consistent. The master-head engine image's build time is later than v1.6.x-head's, so I'm not sure whether this means that the v1.6.x engine does not contain your fix; could you help to check? Thank you.
> docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
longhornio/longhorn-engine master-head 16d097e2f0e6 4 hours ago 343MB
longhornio/longhorn-engine v1.6.x-head 1c7202bbd25e 5 hours ago 343MB
|
@chriscchien The 1.6 PR was just merged, so please wait for https://drone-publish.longhorn.io/longhorn/longhorn-engine/810 to finish. |
Verified passed on longhorn master (longhorn-engine). I could reproduce this issue before the fix; after the fix, the data of the two volumes is identical. |
Describe the bug (🐛 if you encounter this issue)
A volume encounters a data integrity issue if it is created from a backing image and that backing image was exported from a volume backed by another backing image.
To Reproduce
$ sudo sync
e2fsck -v on vol2: fsck will report filesystem errors and require data repair.
Expected behavior
The data of vol1 and vol2 should be consistent.
Support bundle for troubleshooting
supportbundle_066aa8ed-a12b-4ccc-971c-59404ee44cd4_2023-10-16T07-46-44Z.zip
Environment
Additional context