autovec specialization framework (#3393)
Summary:

X-link: facebookresearch/FBGEMM#481

For auto-vectorized code to be competitive with asmjit, we need to specialize the generic code for some fixed parameters. We cannot specialize at runtime, so this diff introduces a framework to specialize for a given set of parameters at compile time and choose between the existing specializations at runtime.

The framework added here lets you declare specializations for a given function with lines like the following.
Each parameter can be set to `var` to leave it unspecialized, or to `fixed(C)` to create a specialized version with that parameter set to the constant value `C`. Example:

```
SPECIALIZE(
      /*BIT_RATE=*/fixed(2),
      /*BLOCK_SIZE=*/var,
      /*HAS_WEIGHT=*/fixed(true),
      /*NORMALIZE_BY_LENGTHS=*/var,
      /*PREFETCH=*/var,
      /*IS_WEIGHT_POSITIONAL=*/var,
      /*USE_OFFSETS=*/var,
      /*OUTPUT_STRIDE=*/fixed(int64_t{-1}),
      /*INPUT_STRIDE=*/fixed(int64_t{-1}),
      /*SCALE_BIAS_LAST=*/fixed(true),
      /*NO_BAG=*/fixed(false),
      /*IS_BF16_OUT=*/var,
      /*IS_BF16_IN=*/var)
```
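
The general idea of "specialize at compile time, select at runtime" can be sketched in a few lines of C++17. This is only an illustration, not the framework added by this diff: `Var`, `Fixed`, `matches`, `resolve`, `pooled_sum`, `specialized`, `try_select` and `select_kernel` are all hypothetical names, and the two-parameter kernel is a toy stand-in for the real embedding functions.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-ins for the `var` / `fixed(C)` markers.
struct Var {};
template <auto C>
struct Fixed {
  static constexpr auto value = C;
};

// True if the runtime value is compatible with this specialization slot.
template <typename Spec, typename T>
constexpr bool matches(T runtime_value) {
  if constexpr (std::is_same_v<Spec, Var>) {
    (void)runtime_value;
    return true;
  } else {
    return runtime_value == Spec::value;
  }
}

// Returns the compile-time constant for Fixed<C>, the runtime value for Var.
template <typename Spec, typename T>
constexpr T resolve(T runtime_value) {
  if constexpr (std::is_same_v<Spec, Var>) {
    return runtime_value;
  } else {
    (void)runtime_value;
    return Spec::value;
  }
}

// Toy generic kernel; every parameter is a runtime value here.
inline float pooled_sum(
    const float* in, int64_t block_size, bool has_weight, float weight) {
  assert(block_size >= 0);
  float acc = 0.0f;
  for (int64_t i = 0; i < block_size; ++i) {
    acc += has_weight ? in[i] * weight : in[i];
  }
  return acc;
}

// One instantiation per specialization. If the generic kernel is inlined
// here, the optimizer sees constants wherever a Fixed<C> slot was used.
template <typename BlockSizeSpec, typename HasWeightSpec>
float specialized(
    const float* in, int64_t block_size, bool has_weight, float weight) {
  return pooled_sum(
      in,
      resolve<BlockSizeSpec>(block_size),
      resolve<HasWeightSpec>(has_weight),
      weight);
}

using KernelFn = float (*)(const float*, int64_t, bool, float);

template <typename BlockSizeSpec, typename HasWeightSpec>
bool try_select(int64_t block_size, bool has_weight, KernelFn& out) {
  if (matches<BlockSizeSpec>(block_size) &&
      matches<HasWeightSpec>(has_weight)) {
    out = &specialized<BlockSizeSpec, HasWeightSpec>;
    return true;
  }
  return false;
}

// Runtime dispatch: the first matching specialization wins, otherwise
// fall back to the fully generic version.
inline KernelFn select_kernel(int64_t block_size, bool has_weight) {
  KernelFn fn = nullptr;
  if (try_select<Fixed<int64_t{32}>, Fixed<true>>(block_size, has_weight, fn)) {
    return fn;
  }
  if (try_select<Fixed<int64_t{64}>, Var>(block_size, has_weight, fn)) {
    return fn;
  }
  return &specialized<Var, Var>;
}
```

Every specialization returns the same result as the generic kernel for the inputs it accepts; only the generated code differs. Adding or removing a `try_select` line trades speed on that parameter combination against code size.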

This diff introduces some example specializations for `GenerateEmbeddingSpMDMWithStrides_autovec` and `GenerateEmbeddingSpMDMNBitWithStrides_autovec`, specializing them for bit_rate 2 and 4 and block sizes 32, 64, and 128.

This framework should make it easy to tune for common production use cases by specializing the commonly used parameters, or to conserve code size by removing specializations.

Differential Revision: D62984408
MatzeB authored and facebook-github-bot committed Nov 19, 2024
1 parent d823097 commit 3d01bc8
Showing 2 changed files with 569 additions and 102 deletions.
8 changes: 8 additions & 0 deletions include/fbgemm/FbgemmBuild.h
@@ -81,6 +81,14 @@
#define NO_SANITIZE(what)
#endif

// Ignore __builtin_assume() when not supported by compiler.
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif
#if !__has_builtin(__builtin_assume)
#define __builtin_assume(x) (static_cast<void>(0))
#endif

// Macro for silencing warnings
#ifdef __clang__
// clang-format off
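
As a rough illustration of how the `__builtin_assume` fallback above might be used, the sketch below repeats the guard from the hunk and applies it in a small loop. `sum_block` and the specific assumptions about `block_size` are hypothetical and not taken from this commit. On Clang the builtin passes the hints to the optimizer; on compilers without it, the macro expands to a no-op and the condition is not evaluated.

```cpp
#include <cstdint>

// Same guard as in the hunk above: make __builtin_assume() a no-op
// when the compiler does not provide it.
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif
#if !__has_builtin(__builtin_assume)
#define __builtin_assume(x) (static_cast<void>(0))
#endif

// Hypothetical kernel: the caller promises block_size is a positive
// multiple of 8. Violating the promise is undefined behavior where
// the builtin is supported.
inline float sum_block(const float* in, int64_t block_size) {
  __builtin_assume(block_size > 0);
  __builtin_assume(block_size % 8 == 0);
  float acc = 0.0f;
  for (int64_t i = 0; i < block_size; ++i) {
    acc += in[i];
  }
  return acc;
}
```

Because the fallback discards its argument, the asserted condition should be free of side effects so that behavior is the same with and without the builtin.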