-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Shield compiled modules from their appended metadata #4609
Conversation
This commit fixes bytecodealliance#4600 in a somewhat roundabout fashion. Currently the `main` branch of Wasmtime exhibits unusual behavior: * If `./ci/run-tests.sh` is run then the `cache_accounts_for_opt_level` test does not fail. * If `cargo test -p wasmtime --lib` is run, however, then the test fails. This test is indeed being run as part of `./ci/run-tests.sh` and it's also passing in CI. The exact failure is that part of the debuginfo support we have takes an existing ELF image, copies it, and then appends some information to inform profilers/gdb about the image. This code is all quite old at this point and not 100% optimal, but that's at least where we're at. The problem is that the appended `ProgramHeader64` is not aligned correctly during `cargo test -p wasmtime --lib`, which is the panic that happens causing the test to fail. The reason, however, that this test passes with `./ci/run-tests.sh` is that the alignment of `ProgramHeader64` is 1 instead of 8. The reason for that is that the `object` crate has an `unaligned` feature which forcibly unaligns all primitives to 1 byte instead of their natural alignment. During `cargo test -p wasmtime --lib` this feature is not enabled but during `./ci/run-tests.sh` this feature is enabled. The feature is currently enabled through inclusion of the `backtrace` crate which only happens for some tests in some crates. The alignment issue explains why the test fails on a single crate test but fails on the whole workspace tests. The next issue I investigated was if this test ever passed. It turns out that on v0.39.0 this test passed, and the regression to main was introduced during bytecodealliance#4571. That PR, however, has nothing to do with any of this! The reason that this showed up as causing a "regression" however is because it changed cranelift settings which changed the size of serialized metadata at the end of a Wasmtime cache object. Wasmtime compiled artifacts are ELF images with Wasmtime-specific metadata appended after them. This appended metadata was making its way all the way through to the gdbjit image itself which mean that while the end of the ELF file itself was properly aligned the space after the Wasmtime metadata was not aligned. This metadata changes in size over time as Cranelift settings change which explains why bytecodealliance#4571 was the "source" of the regression. The fix in this commit is to discard the extra Wasmtime metadata when creating an `MmapVec` representing the underlying ELF image. This is already supported with `MmapVec::drain` so it was relatively easy to insert that. This means that the gdbjit image starts with just the ELF file itself which is always aligned at the end, which gets the test passing with/without the `unaligned` feature in the `object` crate.
Subscribe to Label Actioncc @peterhuene
This issue or pull request has been labeled: "wasmtime:api"
Thus the following users have been cc'd because of the following labels:
To subscribe or unsubscribe from this label, edit the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I needed several read-throughs of both the commit message and the patch to understand this, but eventually I was able to convince myself that this patch does what you said it needed to: limit the returned SerializedModule::artifacts
to refer only to the ELF object, and not any trailers.
I think it was MmapVec::drain
that caused me the most trouble as it only loosely resembles the corresponding method on Vec
. Perhaps a future improvement would be to model that method after Vec::split_off
instead?
As suggested on bytecodealliance#4609
This commit fixes #4600 in a somewhat roundabout fashion. Currently the
main
branch of Wasmtime exhibits unusual behavior:./ci/run-tests.sh
is run then thecache_accounts_for_opt_level
test does not fail.
cargo test -p wasmtime --lib
is run, however, then the testfails.
This test is indeed being run as part of
./ci/run-tests.sh
and it'salso passing in CI. The exact failure is that part of the debuginfo
support we have takes an existing ELF image, copies it, and then appends
some information to inform profilers/gdb about the image. This code is
all quite old at this point and not 100% optimal, but that's at least
where we're at.
The problem is that the appended
ProgramHeader64
is not alignedcorrectly during
cargo test -p wasmtime --lib
, which is the panic thathappens causing the test to fail. The reason, however, that this test
passes with
./ci/run-tests.sh
is that the alignment ofProgramHeader64
is 1 instead of 8. The reason for that is that theobject
crate has anunaligned
feature which forcibly unaligns allprimitives to 1 byte instead of their natural alignment. During
cargo test -p wasmtime --lib
this feature is not enabled but during./ci/run-tests.sh
this feature is enabled. The feature is currentlyenabled through inclusion of the
backtrace
crate which only happensfor some tests in some crates.
The alignment issue explains why the test fails on a single crate test
but fails on the whole workspace tests. The next issue I investigated
was if this test ever passed. It turns out that on v0.39.0 this test
passed, and the regression to main was introduced during #4571. That
PR, however, has nothing to do with any of this! The reason that this
showed up as causing a "regression" however is because it changed
cranelift settings which changed the size of serialized metadata at the
end of a Wasmtime cache object.
Wasmtime compiled artifacts are ELF images with Wasmtime-specific
metadata appended after them. This appended metadata was making its way
all the way through to the gdbjit image itself which mean that while the
end of the ELF file itself was properly aligned the space after the
Wasmtime metadata was not aligned. This metadata changes in size over
time as Cranelift settings change which explains why #4571 was the
"source" of the regression.
The fix in this commit is to discard the extra Wasmtime metadata when
creating an
MmapVec
representing the underlying ELF image. This isalready supported with
MmapVec::drain
so it was relatively easy toinsert that. This means that the gdbjit image starts with just the ELF
file itself which is always aligned at the end, which gets the test
passing with/without the
unaligned
feature in theobject
crate.