[naga] Prove to downstream shader compilers that loops terminate #6572

Open
@teoxoy

Description

The Metal compiler and DXC are based on clang and inherit the "infinite loop without side-effects is UB" rule from C++. SPIR-V also requires shader invocations to terminate.

Downstream shader compilers require that all loops terminate, but at runtime a loop might not, and that mismatch gets us into trouble. The compilers are allowed to assume that loops do terminate, which has far-reaching consequences. See the comments in #6528 for the whole background.

WebGPU requires loops to terminate (gpuweb/gpuweb#3126); otherwise the user agent might lose the device. The issue is that it's statically unprovable that a loop terminates (in all cases), so this can't be a check we do. We must emit loops that might not terminate, but when we do, we trigger UB in downstream shader compilers.

To avoid triggering UB in downstream shader compilers we must prove to them that loops terminate or that they have side-effects.

The only way we've found to avoid the UB via side-effects is to loop based on a volatile bool (originally implemented in tint). Open question: Are there other ways we could artificially introduce side-effects that prevent the UB?
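To make the volatile-bool idea concrete, here is a minimal sketch of a code generator that wraps a loop body in a guard re-read from a volatile flag on every iteration. The helper `emit_volatile_guarded_loop` and the flag name are hypothetical and chosen for illustration; this is not the exact text that naga or tint emits, only the general shape of the technique.

```rust
/// Hypothetical helper (not part of naga's API): wraps `body` in an
/// MSL-style loop that reads a volatile flag on every iteration.
/// Because the compiler cannot elide a volatile read, the loop has an
/// observable side effect and cannot be assumed to terminate "for free".
fn emit_volatile_guarded_loop(flag_name: &str, body: &str) -> String {
    let mut out = String::new();
    // The flag is always false at runtime, but the compiler must not assume so.
    out.push_str(&format!("volatile bool {} = false;\n", flag_name));
    out.push_str("while (true) {\n");
    // Volatile read on each iteration: this is the artificial side effect.
    out.push_str(&format!("    if ({}) {{ break; }}\n", flag_name));
    // Indent the caller-provided loop body by one level.
    for line in body.lines() {
        out.push_str("    ");
        out.push_str(line);
        out.push('\n');
    }
    out.push_str("}\n");
    out
}
```

Calling `emit_volatile_guarded_loop("unpredictable_break", "x += 1;")` yields a `while (true)` loop whose first statement is the volatile check, followed by the indented body.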

This was done for Metal in #6545, but it prevents other meaningful optimizations like inlining. A previous iteration, in which the check only happened before the loop, was found to be very slow (#6518 (comment)); the new check is probably going to be extremely slow since it happens on every loop iteration.

I'm proposing that we inject a counter that puts an upper bound on the number of loop iterations so that downstream shader compilers will see that the loop does terminate (even if it will take a really long time). We can start with an upper bound of u64::MAX (using 2 u32s as outlined in #6528 (comment)) and see if we can get away with a single u32 later. We can have this limit even if it's not part of the WGSL spec, since drivers will end up terminating the invocations and losing the device after a certain amount of time has passed, which will certainly happen before we loop u64::MAX times.
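The counter logic can be modeled on the CPU as follows. This is a sketch under assumptions: `LoopBound`, `tick`, and `count_iterations` are hypothetical names, and the decrement-with-borrow scheme is one plausible way to emulate a 64-bit counter with two u32s, not necessarily the one outlined in #6528. Starting from `LoopBound::max()`, `tick` returns `true` exactly u64::MAX times before the bound is exhausted.

```rust
/// Hypothetical model of the injected loop bound: a 64-bit down-counter
/// stored as two u32 halves, for targets without native 64-bit integers.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct LoopBound {
    hi: u32,
    lo: u32,
}

impl LoopBound {
    fn new(hi: u32, lo: u32) -> Self {
        LoopBound { hi, lo }
    }

    /// Upper bound equivalent to u64::MAX iterations.
    fn max() -> Self {
        LoopBound::new(u32::MAX, u32::MAX)
    }

    /// Returns false once the bound is exhausted; otherwise decrements
    /// the counter by one and returns true.
    fn tick(&mut self) -> bool {
        if self.hi == 0 && self.lo == 0 {
            return false;
        }
        // Borrow from the high half when the low half is about to wrap.
        if self.lo == 0 {
            self.hi -= 1;
        }
        self.lo = self.lo.wrapping_sub(1);
        true
    }
}

/// Models a user loop that never breaks on its own: only the injected
/// bound check ends it. Returns the number of iterations that ran.
fn count_iterations(mut bound: LoopBound) -> u64 {
    let mut iterations = 0u64;
    loop {
        // Injected check, placed before the original loop body.
        if !bound.tick() {
            break;
        }
        // Original loop body would go here.
        iterations += 1;
    }
    iterations
}
```

With a small bound such as `LoopBound::new(0, 5)`, `count_iterations` stops after five iterations, which makes the behavior easy to check without waiting for u64::MAX iterations.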

Doing it this way should be much faster than reading a volatile on every loop iteration, and it still allows other optimizations to see that the loop might terminate a lot earlier, so that it can even be inlined; see #6528 (comment).


Checklist

  • MSL
  • HLSL
  • SPIR-V
  • GLSL?

Labels

  • area: naga back-end (Outputs of naga shader conversion)
  • naga (Shader Translator)
