Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cranelift: Drop unused arguments before regalloc #8438

Merged
merged 1 commit into from
Apr 22, 2024

Conversation

jameysharp
Copy link
Contributor

Before this, Cranelift ABI code would emit a stack-load instruction for every stack argument and add all register arguments to the args pseudo-instruction, whether those arguments were used or not.

However, we already know which arguments are used at that point because we need the analysis for load-sinking, so it's easy to filter the unused arguments out.

This avoids generating loads that are immediately dead, which is good for the generated code. It also slightly reduces the size of the register allocation problem, which is a small win in compile time.

This also changes which registers RA2 chooses in some cases because it no longer considers unused defs from the args pseudo-instruction.

There was an existing method named arg_is_needed_in_body which sounded like it should be the right place to implement this. However, that method was only used for Baldrdash integration and has been a stub since that integration was removed in #4571. Also it didn't have access to the value_ir_uses map needed here. But the place where that method was called does have access to that map and was perfect for this.

Thanks to @elliottt for doing the initial investigation of this change with me.

@jameysharp jameysharp requested review from a team as code owners April 22, 2024 19:27
@jameysharp jameysharp requested review from abrown and alexcrichton and removed request for a team April 22, 2024 19:27
Copy link
Member

@cfallin cfallin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Super-clean precision change for a nice result -- thanks very much for thinking through this! I'm excited to see the useless loads go away in a bunch of functions.

@github-actions github-actions bot added cranelift Issues related to the Cranelift code generator cranelift:area:machinst Issues related to instruction selection and the new MachInst backend. labels Apr 22, 2024
@jameysharp
Copy link
Contributor Author

So cargo test -p wasmtime engine::serialization::test::cache_accounts_for_opt_level fails with this PR. The panic message is:

thread '<unnamed>' panicked at cranelift/codegen/src/machinst/compile.rs:76:14:
register allocation: SSA(VReg(vreg = 192, class = Int), Inst(1))

This test simply compiles (module (func)) several times with different optimization settings and with Wasmtime's cache enabled.

From experimentation, this problem only seems to occur if Wasmtime's debug-info is enabled. I guess this is the only test that runs in that mode.

I thought emit_value_label_markers_for_block_args, or the special case to call emit_value_label_marks_for_value on the vmctx parameter, seemed like likely candidates for telling RA2 to use an argument that's not otherwise used. But commenting out those calls doesn't fix this.

I'm not sure how else to investigate this; any suggestions?

@cfallin
Copy link
Member

cfallin commented Apr 22, 2024

Can you dump the vcode from the failure? The most pertinent bit is figuring out what is still using the argument, I guess (as you suggest)...

Before this, Cranelift ABI code would emit a stack-load instruction for
every stack argument and add all register arguments to the `args`
pseudo-instruction, whether those arguments were used or not.

However, we already know which arguments are used at that point because
we need the analysis for load-sinking, so it's easy to filter the unused
arguments out.

This avoids generating loads that are immediately dead, which is good
for the generated code. It also slightly reduces the size of the
register allocation problem, which is a small win in compile time.

This also changes which registers RA2 chooses in some cases because it
no longer considers unused defs from the `args` pseudo-instruction.

There was an existing method named `arg_is_needed_in_body` which sounded
like it should be the right place to implement this. However, that
method was only used for Baldrdash integration and has been a stub since
that integration was removed in bytecodealliance#4571. Also it didn't have access to the
`value_ir_uses` map needed here. But the place where that method was
called does have access to that map and was perfect for this.

Also, don't emit a `dummy_use` pseudo-instruction for the vmctx if it's
otherwise unused everywhere, as we want to drop it from the `args`
instruction in that case and then RA2 complains that it's used without
being defined.

Furthermore, don't emit debug info specially for the vmctx parameter,
because it's already emitted for all block parameters including vmctx.

Thanks to @elliottt for doing the initial investigation of this change
with me, and to @cfallin for helping me track down the `dummy_use` false
dependency.
@jameysharp jameysharp added this pull request to the merge queue Apr 22, 2024
Merged via the queue into bytecodealliance:main with commit 3befbe5 Apr 22, 2024
21 checks passed
@jameysharp jameysharp deleted the drop-unused-args branch April 22, 2024 21:49
jameysharp added a commit to jameysharp/wasmtime that referenced this pull request May 20, 2024
In bytecodealliance#8438 we stopped emitting register bindings for unused arguments,
based on the use-counts from `compute_use_states`. However, that doesn't
count the use of the struct-return argument that's automatically added
after lowering when the `rets` instruction is generated in the epilogue.
As a result, using a struct-return argument caused register allocation
to panic due to the VReg not being defined anywhere.

This commit adds a use to the struct-return argument so that it's always
available in the epilogue.

Fixes bytecodealliance#8659
github-merge-queue bot pushed a commit that referenced this pull request May 20, 2024
* cranelift: Always consider sret arguments used

In #8438 we stopped emitting register bindings for unused arguments,
based on the use-counts from `compute_use_states`. However, that doesn't
count the use of the struct-return argument that's automatically added
after lowering when the `rets` instruction is generated in the epilogue.
As a result, using a struct-return argument caused register allocation
to panic due to the VReg not being defined anywhere.

This commit adds a use to the struct-return argument so that it's always
available in the epilogue.

Fixes #8659

* Review comments
alexcrichton pushed a commit to alexcrichton/wasmtime that referenced this pull request May 20, 2024
* cranelift: Always consider sret arguments used

In bytecodealliance#8438 we stopped emitting register bindings for unused arguments,
based on the use-counts from `compute_use_states`. However, that doesn't
count the use of the struct-return argument that's automatically added
after lowering when the `rets` instruction is generated in the epilogue.
As a result, using a struct-return argument caused register allocation
to panic due to the VReg not being defined anywhere.

This commit adds a use to the struct-return argument so that it's always
available in the epilogue.

Fixes bytecodealliance#8659

* Review comments
alexcrichton added a commit that referenced this pull request May 20, 2024
* cranelift: Always consider sret arguments used

In #8438 we stopped emitting register bindings for unused arguments,
based on the use-counts from `compute_use_states`. However, that doesn't
count the use of the struct-return argument that's automatically added
after lowering when the `rets` instruction is generated in the epilogue.
As a result, using a struct-return argument caused register allocation
to panic due to the VReg not being defined anywhere.

This commit adds a use to the struct-return argument so that it's always
available in the epilogue.

Fixes #8659

* Review comments

Co-authored-by: Jamey Sharp <jsharp@fastly.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cranelift:area:machinst Issues related to instruction selection and the new MachInst backend. cranelift Issues related to the Cranelift code generator
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants