-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cranelift: Drop unused arguments before regalloc #8438
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Super-clean precision change for a nice result -- thanks very much for thinking through this! I'm excited to see the useless loads go away in a bunch of functions.
So
This test simply compiles From experimentation, this problem only seems to occur if Wasmtime's debug-info is enabled. I guess this is the only test that runs in that mode. I thought I'm not sure how else to investigate this; any suggestions? |
Can you dump the vcode from the failure? The most pertinent bit is figuring out what is still using the argument, I guess (as you suggest)... |
Before this, Cranelift ABI code would emit a stack-load instruction for every stack argument and add all register arguments to the `args` pseudo-instruction, whether those arguments were used or not. However, we already know which arguments are used at that point because we need the analysis for load-sinking, so it's easy to filter the unused arguments out. This avoids generating loads that are immediately dead, which is good for the generated code. It also slightly reduces the size of the register allocation problem, which is a small win in compile time. This also changes which registers RA2 chooses in some cases because it no longer considers unused defs from the `args` pseudo-instruction. There was an existing method named `arg_is_needed_in_body` which sounded like it should be the right place to implement this. However, that method was only used for Baldrdash integration and has been a stub since that integration was removed in bytecodealliance#4571. Also it didn't have access to the `value_ir_uses` map needed here. But the place where that method was called does have access to that map and was perfect for this. Also, don't emit a `dummy_use` pseudo-instruction for the vmctx if it's otherwise unused everywhere, as we want to drop it from the `args` instruction in that case and then RA2 complains that it's used without being defined. Furthermore, don't emit debug info specially for the vmctx parameter, because it's already emitted for all block parameters including vmctx. Thanks to @elliottt for doing the initial investigation of this change with me, and to @cfallin for helping me track down the `dummy_use` false dependency.
d3c3011
to
d1c5ff3
Compare
In bytecodealliance#8438 we stopped emitting register bindings for unused arguments, based on the use-counts from `compute_use_states`. However, that doesn't count the use of the struct-return argument that's automatically added after lowering when the `rets` instruction is generated in the epilogue. As a result, using a struct-return argument caused register allocation to panic due to the VReg not being defined anywhere. This commit adds a use to the struct-return argument so that it's always available in the epilogue. Fixes bytecodealliance#8659
* cranelift: Always consider sret arguments used In #8438 we stopped emitting register bindings for unused arguments, based on the use-counts from `compute_use_states`. However, that doesn't count the use of the struct-return argument that's automatically added after lowering when the `rets` instruction is generated in the epilogue. As a result, using a struct-return argument caused register allocation to panic due to the VReg not being defined anywhere. This commit adds a use to the struct-return argument so that it's always available in the epilogue. Fixes #8659 * Review comments
* cranelift: Always consider sret arguments used In bytecodealliance#8438 we stopped emitting register bindings for unused arguments, based on the use-counts from `compute_use_states`. However, that doesn't count the use of the struct-return argument that's automatically added after lowering when the `rets` instruction is generated in the epilogue. As a result, using a struct-return argument caused register allocation to panic due to the VReg not being defined anywhere. This commit adds a use to the struct-return argument so that it's always available in the epilogue. Fixes bytecodealliance#8659 * Review comments
* cranelift: Always consider sret arguments used In #8438 we stopped emitting register bindings for unused arguments, based on the use-counts from `compute_use_states`. However, that doesn't count the use of the struct-return argument that's automatically added after lowering when the `rets` instruction is generated in the epilogue. As a result, using a struct-return argument caused register allocation to panic due to the VReg not being defined anywhere. This commit adds a use to the struct-return argument so that it's always available in the epilogue. Fixes #8659 * Review comments Co-authored-by: Jamey Sharp <jsharp@fastly.com>
Before this, Cranelift ABI code would emit a stack-load instruction for every stack argument and add all register arguments to the
args
pseudo-instruction, whether those arguments were used or not.However, we already know which arguments are used at that point because we need the analysis for load-sinking, so it's easy to filter the unused arguments out.
This avoids generating loads that are immediately dead, which is good for the generated code. It also slightly reduces the size of the register allocation problem, which is a small win in compile time.
This also changes which registers RA2 chooses in some cases because it no longer considers unused defs from the
args
pseudo-instruction.There was an existing method named
arg_is_needed_in_body
which sounded like it should be the right place to implement this. However, that method was only used for Baldrdash integration and has been a stub since that integration was removed in #4571. Also it didn't have access to thevalue_ir_uses
map needed here. But the place where that method was called does have access to that map and was perfect for this.Thanks to @elliottt for doing the initial investigation of this change with me.