JIT: Improve x86 unsigned to floating cast codegen #111595

saucecontrol · 2025-01-19T19:46:13Z

This improves codegen mostly for unsigned integer to floating-point casts, and also catches a few other redundant conversions.

Adds support for using AVX-512 vcvtusi2s[sd] for uint -> float/double (only ulong was handled previously) on both x64 and x86.

-       mov      eax, edx
        vxorps   xmm0, xmm0, xmm0
-       vcvtsi2sd xmm0, xmm0, rax
-       vcvtsd2ss xmm0, xmm0, xmm0
+       vcvtusi2ss xmm0, edx

-       mov      eax, dword ptr [rbp-0x04]
-       mov      eax, eax
        vxorps   xmm0, xmm0, xmm0
-       vcvtsi2sd xmm0, xmm0, rax
+       vcvtusi2sd xmm0, dword ptr [rbp-0x04]

Improves codegen for uint -> float conversions on x64 without AVX-512, removing the intermediate conversion to double.

        mov      eax, edi
        xorps    xmm0, xmm0
-       cvtsi2sd xmm0, rax
-       cvtsd2ss xmm0, xmm0
+       cvtsi2ss xmm0, rax
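
As a rough illustration of why the intermediate double is unnecessary here, the following Python sketch models the zero-extend-then-signed-convert pattern. The helper names (`to_f32`, `reinterpret_as_int32`, `zero_extend_to_int64`, `uint_to_float`) are hypothetical and only model the arithmetic; they are not JIT code.

```python
import struct

def to_f32(x: float) -> float:
    """Round a binary64 value to binary32 (nearest-even) and widen it back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def reinterpret_as_int32(u: int) -> int:
    """What a 32-bit signed convert would see for the same bit pattern."""
    return struct.unpack("<i", struct.pack("<I", u))[0]

def zero_extend_to_int64(u: int) -> int:
    """Model of `mov eax, edi`: the upper 32 bits of rax become zero."""
    return struct.unpack("<q", struct.pack("<Q", u & 0xFFFFFFFF))[0]

def uint_to_float(u: int) -> float:
    """Model of `cvtsi2ss xmm0, rax` applied to a zero-extended uint."""
    wide = zero_extend_to_int64(u)  # always non-negative as a signed 64-bit value
    # Every uint32 is exactly representable in binary64, so float(wide) does
    # not round; the only rounding is the final narrowing to binary32.
    return to_f32(float(wide))

if __name__ == "__main__":
    u = 0xFFFFFFFF
    print(reinterpret_as_int32(u))  # a 32-bit signed convert would see a negative value
    print(uint_to_float(u))         # the 64-bit signed convert sees the intended value
```

Because the uint -> double step is exact, converting through double and converting directly round only once and give the same float, so dropping `cvtsd2ss` should not change results.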

Adds support for a direct ulong -> float cast in the x64 SSE2 fallback. This resolves a difference in behavior between hardware with and without AVX-512, and saves an extra cvtsd2ss instruction.

        xorps    xmm0, xmm0
        mov      rax, rdi
        shr      rax, 1
        mov      esi, edi
        and      rsi, 1
        or       rsi, rax
        test     rdi, rdi
        cmovns   rsi, rdi
-       cvtsi2sd xmm0, rsi
+       cvtsi2ss xmm0, rsi
        jns      SHORT G_M37561_IG56
-       addsd    xmm0, xmm0
+       addss    xmm0, xmm0
 G_M37561_IG56:
-       cvtsd2ss xmm0, xmm0
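
The behavior difference can be sketched in Python under a few assumptions: `cvtsi2ss_model` is a hypothetical stand-in for a correctly rounded signed 64-bit to float conversion, `ulong_to_float_direct` models the new sequence (halve with a sticky low bit, convert, double), and `ulong_to_float_via_double` models the old sequence that converted to double first. None of these names come from the JIT.

```python
import struct

def to_f32(x: float) -> float:
    """Round a binary64 value to binary32 (nearest-even) and widen it back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def cvtsi2ss_model(n: int) -> float:
    """Correctly rounded integer -> binary32, done with integer arithmetic."""
    sign = -1.0 if n < 0 else 1.0
    m = abs(n)
    excess = m.bit_length() - 24        # bits beyond the 24-bit significand
    if excess > 0:
        half = 1 << (excess - 1)
        rem = m & ((1 << excess) - 1)
        m >>= excess
        if rem > half or (rem == half and (m & 1)):
            m += 1                      # round to nearest, ties to even
        m <<= excess
    return sign * float(m)              # exact: at most 24 significant bits remain

def halve_with_sticky_bit(u: int) -> int:
    """Model of shr/and/or: halve, but keep the shifted-out bit as a sticky bit."""
    return (u >> 1) | (u & 1)

def ulong_to_float_direct(u: int) -> float:
    """Model of the new sequence: cvtsi2ss, then addss when the sign bit was set."""
    if u < 2**63:                       # cmovns keeps the original value
        return cvtsi2ss_model(u)
    f = cvtsi2ss_model(halve_with_sticky_bit(u))
    return f + f                        # doubling is exact

def ulong_to_float_via_double(u: int) -> float:
    """Model of the old sequence: cvtsi2sd, addsd, then cvtsd2ss (two roundings)."""
    if u < 2**63:
        d = float(u)
    else:
        d = float(halve_with_sticky_bit(u))
        d = d + d
    return to_f32(d)

if __name__ == "__main__":
    u = 2**63 + 2**39 + 1               # just above a binary32 rounding midpoint
    print(ulong_to_float_direct(u) == cvtsi2ss_model(u))
    print(ulong_to_float_via_double(u) == cvtsi2ss_model(u))
```

In this model, the input `2**63 + 2**39 + 1` shows the double-rounding hazard: rounding to double first discards the low bit, leaving an exact tie that then rounds down, whereas the direct path rounds up. The sticky bit matters for the same reason, since a plain `u >> 1` would also turn this input into a tie.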

Removes some redundant float -> double -> float casts.

-       vcvtss2sd xmm1, xmm1, xmm1
-       vcvtsd2ss xmm1, xmm1, xmm1
        vbroadcastss xmm1, xmm1

SPMI Diffs

The only code size regressions come from inserting xorps to clear the upper elements of the target reg for the AVX-512 unsigned conversion instructions. These were previously omitted but should have been present, since the unsigned conversions behave the same as the signed ones (i.e. they preserve/copy the upper elements) and are subject to the same false dependency penalties.

GCC emits the xorps for all conversions; Clang skips it for all conversions in simple examples but may emit it in more complex scenarios.
https://godbolt.org/z/6aY7fdE3d

saucecontrol · 2025-01-19T20:06:57Z

@MihuBot

saucecontrol · 2025-01-19T22:37:45Z

cc @dotnet/jit-contrib this is ready for review

improve x86 integral to floating cast codegen

496e50f

dotnet-issue-labeler bot added the area-CodeGen-coreclr label Jan 19, 2025

dotnet-policy-service bot added the community-contribution label Jan 19, 2025

MihuBot mentioned this pull request Jan 19, 2025

[JitDiff X64] [saucecontrol] JIT: Improve x86 integral to floating cast codegen MihuBot/runtime-utils#910

Open

saucecontrol marked this pull request as ready for review January 19, 2025 22:17

saucecontrol changed the title ~~JIT: Improve x86 integral to floating cast codegen~~ JIT: Improve x86 unsigned to floating cast codegen Jan 19, 2025

This was referenced Jan 20, 2025

slow macOS - "##[error]The job running on agent Azure Pipelines 9 ran longer than the maximum time of 60 minutes." dotnet/dnceng#1883

Open

The Operation will be canceled. The next steps may not contain expected logs. dotnet/dnceng#3008

Open
