x64Emitter: Emit shorter MOVs for 32-bit immediates #8133
Prior to this commit, the emitter would emit a 7-byte instruction when loading a 32-bit immediate to a 64-bit register.

    0: 48 c7 c0 ff ff ff 7f    mov rax,0x7fffffff

With this change, it will check if it can instead emit a load to a 32-bit register, which takes only 5 or 6 bytes.

    0: b8 ff ff ff 7f    mov eax,0x7fffffff
This should work for the full 32-bit range since sign-extension only happens when the operand size is 64 bits, I think?
32-bit immediates are always sign extended when the destination is 64-bit, so we can only apply this when the immediate's highest bit is zero. Or perhaps I'm misunderstanding and you're actually referring to something else entirely?
That's the thing though, the destination isn't 64-bit.
Doesn't
But that's not what is actually emitted inside the
Well yes, that's the point of the optimization; we take advantage of x86-64 zero-extending values written to 32-bit registers when the immediate values allow for this. There's three similar optimizations:
What I'm saying is the optimization replaces a 64-bit MOV with a 32-bit MOV that moves into a 32-bit register, so there is no 64-bit operand, so there is no sign-extension: https://godbolt.org/z/XZ42cy
Yes, which is why we can only apply this optimization when the sign-extension (64-bit instruction before optimization) and the zero-extension (32-bit instruction after optimization) give the same result.
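To make the condition above concrete, here is a minimal sketch in C++ that models what each instruction leaves in the 64-bit register. The function names are hypothetical and are not taken from Dolphin's emitter; they only illustrate that sign-extension and zero-extension agree exactly when the immediate's highest bit is zero.

```cpp
#include <cstdint>

// Models the 7-byte `mov r64, imm32`: the immediate is sign-extended to 64 bits.
constexpr std::uint64_t SignExtended(std::uint32_t imm)
{
  return static_cast<std::uint64_t>(
      static_cast<std::int64_t>(static_cast<std::int32_t>(imm)));
}

// Models the 5-byte `mov r32, imm32`: writing a 32-bit register zeroes the
// upper 32 bits of the full 64-bit register.
constexpr std::uint64_t ZeroExtended(std::uint32_t imm)
{
  return imm;
}

// Hypothetical predicate: the shorter encoding is only a valid replacement
// when both instructions would produce the same 64-bit value.
constexpr bool CanUse32BitMov(std::uint32_t imm)
{
  return SignExtended(imm) == ZeroExtended(imm);
}

static_assert(CanUse32BitMov(0x7FFFFFFF));   // highest bit clear: same result
static_assert(!CanUse32BitMov(0x80000000));  // highest bit set: results differ
```

For 0x80000000 the sign-extended result is 0xFFFFFFFF80000000 while the zero-extended result is 0x0000000080000000, which is why the replacement is not valid for the full 32-bit range.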
I'd be happy to discuss this on IRC or something, because I think we're talking past each other here. :/
Oh, I see what you mean now.
Expands what we did in #7414 so it works for 32-bit immediates as well. We can emit a shorter 32-bit instruction for 32-bit immediates in [0, 0x7FFFFFFF].
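As a rough illustration of the rule above, the following sketch picks between the two encodings for a sign-extended 32-bit immediate. `EncodeMovImm32To64` is a hypothetical standalone helper written for this example, not a function from Dolphin's x64Emitter; it assumes registers are numbered 0 to 15 in the usual x86-64 order (0 = rax, 8 = r8).

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper (not Dolphin's API): encodes a load of a sign-extended
// 32-bit immediate into the 64-bit register `reg` (0..15).
std::vector<std::uint8_t> EncodeMovImm32To64(unsigned reg, std::int32_t imm)
{
  std::vector<std::uint8_t> out;
  const auto low3 = static_cast<std::uint8_t>(reg & 7);
  const bool extended = reg >= 8;  // r8..r15 need a REX prefix bit

  if (imm >= 0)
  {
    // Immediate is in [0, 0x7FFFFFFF]: use `mov r32, imm32` (B8+rd id).
    // The CPU zero-extends into the upper half, matching sign-extension here.
    if (extended)
      out.push_back(0x41);  // REX.B
    out.push_back(static_cast<std::uint8_t>(0xB8 | low3));
  }
  else
  {
    // Negative immediate: keep `mov r64, imm32` (REX.W C7 /0 id).
    out.push_back(extended ? 0x49 : 0x48);  // REX.W, plus REX.B for r8..r15
    out.push_back(0xC7);
    out.push_back(static_cast<std::uint8_t>(0xC0 | low3));  // ModRM: mod=11, /0
  }

  // Append the 32-bit immediate in little-endian byte order.
  const auto bits = static_cast<std::uint32_t>(imm);
  for (int shift = 0; shift < 32; shift += 8)
    out.push_back(static_cast<std::uint8_t>(bits >> shift));
  return out;
}
```

With this sketch, `EncodeMovImm32To64(0, 0x7FFFFFFF)` yields the 5-byte sequence `b8 ff ff ff 7f`, while a negative immediate such as `-1` still takes the 7-byte form. The 6-byte case arises for r8 to r15, where the short form needs a REX prefix.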
We hit this in cmpXX, and possibly some other places too.
Before:
After: