autovec specialization framework #3393

Closed

Conversation

@MatzeB (Contributor) commented Nov 19, 2024

Summary:
To make auto-vectorized code competitive with asmjit we need to specialize the generic code for certain fixed parameters. We cannot specialize at runtime, so this introduces a framework to specialize for a given set of parameters at compile time and to choose between the existing specializations at runtime.

The framework added here allows specifying lines like the following for a given function.
Each parameter can be set to `var` to leave it unspecialized, or to `fixed(C)` to create a specialized version with that parameter set to the constant value `C`. Example:

```
SPECIALIZE(
      /*BIT_RATE=*/fixed(2),
      /*BLOCK_SIZE=*/var,
      /*HAS_WEIGHT=*/fixed(true),
      /*NORMALIZE_BY_LENGTHS=*/var,
      /*PREFETCH=*/var,
      /*IS_WEIGHT_POSITIONAL=*/var,
      /*USE_OFFSETS=*/var,
      /*OUTPUT_STRIDE=*/fixed(int64_t{-1}),
      /*INPUT_STRIDE=*/fixed(int64_t{-1}),
      /*SCALE_BIAS_LAST=*/fixed(true),
      /*NO_BAG=*/fixed(false),
      /*IS_BF16_OUT=*/var,
      /*IS_BF16_IN=*/var)
```
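
To make the mechanism concrete, here is a minimal sketch of the pattern such a framework expands to. The names and the use of -1 as a `var` sentinel are hypothetical, not the actual FBGEMM implementation: each `fixed` parameter becomes a compile-time template argument, and a runtime chooser routes calls whose arguments match a specialization to that instantiation, falling back to a fully generic one otherwise.

```
// Hypothetical sketch of what SPECIALIZE expands to; not FBGEMM's actual
// generated code. A negative template argument stands in for `var` on the
// integer parameters; the boolean parameter is shown fully fixed.
template <int BIT_RATE, int BLOCK_SIZE, bool HAS_WEIGHT>
void embedding_kernel_impl(int bit_rate, int block_size) {
  // Use the runtime argument when the parameter is "var" (-1); otherwise
  // the value is a compile-time constant the optimizer can exploit.
  const int br = (BIT_RATE == -1) ? bit_rate : BIT_RATE;
  const int bs = (BLOCK_SIZE == -1) ? block_size : BLOCK_SIZE;
  // ... generic loop over bs elements at bit rate br, with HAS_WEIGHT
  // branches; with the constants baked in, the compiler can unroll and
  // auto-vectorize this loop ...
  (void)br;
  (void)bs;
}

// Runtime chooser: prefer a matching specialization, else fall back to the
// fully generic instantiation.
void embedding_kernel(int bit_rate, int block_size, bool has_weight) {
  if (bit_rate == 2 && block_size == 64 && has_weight) {
    embedding_kernel_impl<2, 64, true>(bit_rate, block_size);   // specialized
  } else if (has_weight) {
    embedding_kernel_impl<-1, -1, true>(bit_rate, block_size);  // generic
  } else {
    embedding_kernel_impl<-1, -1, false>(bit_rate, block_size); // generic
  }
}
```

Because the fixed values are compile-time constants inside each instantiation, the compiler can constant-fold the related branches and auto-vectorize the inner loop, which is what lets the specialized versions approach the asmjit kernels.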

This diff introduces exemplary specializations for `GenerateEmbeddingSpMDMWithStrides_autovec` and `GenerateEmbeddingSpMDMNBitWithStrides_autovec`, specializing them for bit rates 2 and 4 and block sizes 32, 64, and 128 (see the sketch below).
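
For illustration, in the `GenerateEmbeddingSpMDMNBitWithStrides_autovec` case that bit-rate/block-size grid would correspond to one `SPECIALIZE` line per combination, roughly as follows (a sketch of the pattern only, with the remaining parameters elided; the exact lines are in the diff):

```
SPECIALIZE(/*BIT_RATE=*/fixed(2), /*BLOCK_SIZE=*/fixed(32),  /* remaining parameters as above */)
SPECIALIZE(/*BIT_RATE=*/fixed(2), /*BLOCK_SIZE=*/fixed(64),  /* remaining parameters as above */)
SPECIALIZE(/*BIT_RATE=*/fixed(2), /*BLOCK_SIZE=*/fixed(128), /* remaining parameters as above */)
SPECIALIZE(/*BIT_RATE=*/fixed(4), /*BLOCK_SIZE=*/fixed(32),  /* remaining parameters as above */)
SPECIALIZE(/*BIT_RATE=*/fixed(4), /*BLOCK_SIZE=*/fixed(64),  /* remaining parameters as above */)
SPECIALIZE(/*BIT_RATE=*/fixed(4), /*BLOCK_SIZE=*/fixed(128), /* remaining parameters as above */)
```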

This framework should make it easy to tune for common production use cases by specializing the commonly used parameter combinations, or to remove specializations to conserve code size.

Differential Revision: D62984408

netlify bot commented Nov 19, 2024

Deploy Preview for pytorch-fbgemm-docs ready!

| Name | Link |
| --- | --- |
| 🔨 Latest commit | 53aec86 |
| 🔍 Latest deploy log | https://app.netlify.com/sites/pytorch-fbgemm-docs/deploys/673e2cd71b55570008d4b105 |
| 😎 Deploy Preview | https://deploy-preview-3393--pytorch-fbgemm-docs.netlify.app |

@facebook-github-bot (Contributor)

This pull request was exported from Phabricator. Differential Revision: D62984408

MatzeB added a commit to MatzeB/FBGEMM that referenced this pull request Nov 19, 2024

@facebook-github-bot (Contributor)

This pull request has been merged in 7c35026.
