Skip to content

Commit

Permalink
Shield compiled modules from their appended metadata (#4609)
Browse files Browse the repository at this point in the history
This commit fixes #4600 in a somewhat roundabout fashion. Currently the
`main` branch of Wasmtime exhibits unusual behavior:

* If `./ci/run-tests.sh` is run then the `cache_accounts_for_opt_level`
  test does not fail.
* If `cargo test -p wasmtime --lib` is run, however, then the test
  fails.

This test is indeed being run as part of `./ci/run-tests.sh` and it's
also passing in CI. The exact failure is that part of the debuginfo
support we have takes an existing ELF image, copies it, and then appends
some information to inform profilers/gdb about the image. This code is
all quite old at this point and not 100% optimal, but that's at least
where we're at.

The problem is that the appended `ProgramHeader64` is not aligned
correctly during `cargo test -p wasmtime --lib`, which is the panic that
happens causing the test to fail. The reason, however, that this test
passes with `./ci/run-tests.sh` is that the alignment of
`ProgramHeader64` is 1 instead of 8. The reason for that is that the
`object` crate has an `unaligned` feature which forcibly unaligns all
primitives to 1 byte instead of their natural alignment. During `cargo
test -p wasmtime --lib` this feature is not enabled but during
`./ci/run-tests.sh` this feature is enabled. The feature is currently
enabled through inclusion of the `backtrace` crate which only happens
for some tests in some crates.

The alignment issue explains why the test fails on a single crate test
but fails on the whole workspace tests. The next issue I investigated
was if this test ever passed. It turns out that on v0.39.0 this test
passed, and the regression to main was introduced during #4571. That
PR, however, has nothing to do with any of this! The reason that this
showed up as causing a "regression" however is because it changed
cranelift settings which changed the size of serialized metadata at the
end of a Wasmtime cache object.

Wasmtime compiled artifacts are ELF images with Wasmtime-specific
metadata appended after them. This appended metadata was making its way
all the way through to the gdbjit image itself which mean that while the
end of the ELF file itself was properly aligned the space after the
Wasmtime metadata was not aligned. This metadata changes in size over
time as Cranelift settings change which explains why #4571 was the
"source" of the regression.

The fix in this commit is to discard the extra Wasmtime metadata when
creating an `MmapVec` representing the underlying ELF image. This is
already supported with `MmapVec::drain` so it was relatively easy to
insert that. This means that the gdbjit image starts with just the ELF
file itself which is always aligned at the end, which gets the test
passing with/without the `unaligned` feature in the `object` crate.
  • Loading branch information
alexcrichton authored Aug 10, 2022
1 parent 7fa89c4 commit fd28d94
Showing 1 changed file with 7 additions and 6 deletions.
13 changes: 7 additions & 6 deletions crates/wasmtime/src/module/serialization.rs
Original file line number Diff line number Diff line change
Expand Up @@ -264,15 +264,15 @@ impl<'a> SerializedModule<'a> {
)
}

pub fn from_mmap(mmap: MmapVec, version_strat: &ModuleVersionStrategy) -> Result<Self> {
pub fn from_mmap(mut mmap: MmapVec, version_strat: &ModuleVersionStrategy) -> Result<Self> {
// First validate that this is at least somewhat an elf file within
// `mmap` and additionally skip to the end of the elf file to find our
// metadata.
let metadata = data_after_elf(&mmap)?;
let elf = take_first_elf(&mut mmap)?;

// The metadata has a few guards up front which we process first, and
// eventually this bottoms out in a `bincode::deserialize` call.
let metadata = metadata
let metadata = mmap
.strip_prefix(HEADER)
.ok_or_else(|| anyhow!("bytes are not a compatible serialized wasmtime module"))?;
if metadata.is_empty() {
Expand Down Expand Up @@ -309,20 +309,21 @@ impl<'a> SerializedModule<'a> {
.context("deserialize compilation artifacts")?;

return Ok(SerializedModule {
artifacts: MyCow::Owned(mmap),
artifacts: MyCow::Owned(elf),
metadata,
});

/// This function will return the trailing data behind the ELF file
/// parsed from `data` which is where we find our metadata section.
fn data_after_elf(data: &[u8]) -> Result<&[u8]> {
fn take_first_elf(mmap: &mut MmapVec) -> Result<MmapVec> {
use object::NativeEndian as NE;
// There's not actually a great utility for figuring out where
// the end of an ELF file is in the `object` crate. In lieu of that
// we build our own which leverages the format of ELF files, which
// is that the header comes first, that tells us where the section
// headers are, and for our ELF files the end of the file is the
// end of the section headers.
let data = &mmap[..];
let mut bytes = Bytes(data);
let header = bytes
.read::<object::elf::FileHeader64<NE>>()
Expand All @@ -334,7 +335,7 @@ impl<'a> SerializedModule<'a> {
.section_headers(NE, data)
.context("failed to read section headers")?;
let range = subslice_range(object::bytes_of_slice(sections), data);
Ok(&data[range.end..])
Ok(mmap.drain(..range.end))
}
}

Expand Down

0 comments on commit fd28d94

Please sign in to comment.