ZIL fails to claim when switching machine endianness #5256

Closed
tcaputi opened this issue Oct 10, 2016 · 5 comments

Comments

@tcaputi
Contributor

tcaputi commented Oct 10, 2016

The following output was obtained by running the first half of ziltest.sh (everything before the pool is reimported) on a big endian machine and then transferring the pool files to a little endian machine for import. The debugging was added to zil_read_log_block() just before the checksum comparison.

[ 2077.023433] cksum    : 5703dddd619f1dcc:ba6d7fe57491bf4b:4d:2
[ 2077.023434] blk_cksum: cc1d9f61dddd0357:4bbf9174e57f6dba:4d00000000000000:200000000000000

It seems that the stored checksum is actually correct, but it has not been byteswapped to match the calculated checksum. The fix here should be fairly simple (just adding the byteswap), but this kind of error might be present in more places after the claim is finished.
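As a quick sanity check of that reading, the sketch below takes the four 64-bit words from the debug output above and byteswaps each stored word. It is plain Python for illustration, not ZFS code; `bswap64` and `bswap_cksum` are hypothetical helper names, and the only assumption is that each printed field is one 64-bit word of the checksum.

```python
# Words copied verbatim from the debug output above.
computed = [0x5703dddd619f1dcc, 0xba6d7fe57491bf4b, 0x4d, 0x2]
stored = [0xcc1d9f61dddd0357, 0x4bbf9174e57f6dba,
          0x4d00000000000000, 0x200000000000000]


def bswap64(word: int) -> int:
    """Reverse the byte order of an unsigned 64-bit integer."""
    return int.from_bytes(word.to_bytes(8, "big"), "little")


def bswap_cksum(words):
    """Byteswap every 64-bit word of a checksum (hypothetical helper)."""
    return [bswap64(w) for w in words]


# Compared as-is, the stored and computed checksums differ.
matches_raw = stored == computed
# After swapping each stored word, all four words agree.
matches_swapped = bswap_cksum(stored) == computed

print(matches_raw, matches_swapped)
```

With these values the raw comparison fails and the byteswapped comparison succeeds, which is consistent with the stored checksum being correct but in the other byte order.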

@tuxoko
Contributor

tuxoko commented Oct 12, 2016

@tcaputi
Are you saying the image will produce errors when imported on a little endian machine?
I tried importing it on a little endian machine and it imported fine.

@tcaputi
Contributor Author

tcaputi commented Oct 12, 2016

@tuxoko

The pool will import fine because the end of the ZIL is simply defined as "the block where the checksum stops matching." However, any records that are in the ZIL (and therefore that ZFS promised were synced) will be lost.
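To illustrate why a mismatch of this kind would not surface as an import error, here is a toy model of a log chain whose end is "the first block whose checksum does not match". This is not ZFS code: `toy_cksum` is a stand-in byte sum rather than the checksum ZFS uses, and `walk_chain`, `good_chain`, and `swapped_chain` are hypothetical names made up for the sketch.

```python
def toy_cksum(payload: bytes) -> int:
    """Stand-in checksum: sum of the payload bytes, kept to 64 bits."""
    return sum(payload) & 0xFFFFFFFFFFFFFFFF


def bswap64(word: int) -> int:
    """Reverse the byte order of an unsigned 64-bit integer."""
    return int.from_bytes(word.to_bytes(8, "big"), "little")


def walk_chain(blocks):
    """Return the payloads that would be replayed.

    `blocks` is a list of (stored_cksum, payload) pairs. The walk stops
    at the first mismatch and treats it as the end of the log, not as
    an error.
    """
    replayable = []
    for stored, payload in blocks:
        if stored != toy_cksum(payload):
            break
        replayable.append(payload)
    return replayable


records = [b"write A", b"write B", b"write C"]

# Checksums stored in the reader's byte order: every record is found.
good_chain = [(toy_cksum(p), p) for p in records]

# Same checksums stored in the opposite byte order and never swapped back.
swapped_chain = [(bswap64(toy_cksum(p)), p) for p in records]

print(len(walk_chain(good_chain)), len(walk_chain(swapped_chain)))
```

In this model the swapped chain looks empty from its very first block, so the walk finishes without complaint while none of the records are replayed.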

If you add some debugging of zil_read_log_block() you should be able to see the byteswapped checksum that I was talking about above.

@tuxoko
Contributor

tuxoko commented Oct 12, 2016

@tcaputi
No, the checksums (actually they are not checksums, they are sequence numbers) all matched perfectly. Are you running with any patches?

@tcaputi
Contributor Author

tcaputi commented Oct 12, 2016

I apologize. This was a mistake on my part. I found this while I was working through byteswap problems in my encryption patch. I thought I had tested it on upstream as well and found the same issue but it seems I did not. Sorry for the noise.

@tcaputi tcaputi closed this as completed Oct 12, 2016
@behlendorf
Contributor

On the bright side we now have a big endian pool easily available for testing.
