[CUDA] Build nhwc ops by default #22648

tianleiwu · 2024-10-29T22:15:11Z

Description

Build CUDA NHWC ops by default.
Deprecate --enable_cuda_nhwc_ops in build.py and add a --disable_cuda_nhwc_ops option.

Note that NHWC ops require cuDNN 9.x. If you build with cuDNN 8, NHWC ops are disabled automatically.
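The flag behavior described above can be sketched as a small decision function. This is only an illustration, not the code in build.py: the function name `nhwc_ops_enabled` and the `cudnn_major` parameter are hypothetical, and it assumes the deprecated `--enable_cuda_nhwc_ops` flag is still accepted but has no effect.

```python
import argparse


def nhwc_ops_enabled(argv, cudnn_major):
    """Return True if CUDA NHWC ops would be built under the rules in this PR.

    Hypothetical helper for illustration; the real logic lives in build.py
    and the CMake files and may differ.
    """
    parser = argparse.ArgumentParser()
    # Deprecated flag: assumed to be accepted for compatibility and ignored,
    # since NHWC ops are now built by default.
    parser.add_argument("--enable_cuda_nhwc_ops", action="store_true")
    # New opt-out flag introduced by this PR.
    parser.add_argument("--disable_cuda_nhwc_ops", action="store_true")
    args = parser.parse_args(argv)

    if args.disable_cuda_nhwc_ops:
        return False
    # NHWC ops require cuDNN 9.x; with cuDNN 8 they are disabled automatically.
    return cudnn_major >= 9
```

Under these assumptions, a default build against cuDNN 9 gets NHWC ops, while passing `--disable_cuda_nhwc_ops` or building against cuDNN 8 leaves them out.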

Motivation and Context

In general, NHWC is faster than NCHW for convolution on NVIDIA GPUs with Tensor Cores, so this could improve performance for vision models.

This is the first step toward preferring NHWC for CUDA in the 1.21 release. The next step is to test popular vision models. If NHWC helps on most models and devices, the plan is to make prefer_nhwc=1 the default CUDA provider option.
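Until prefer_nhwc=1 becomes the default, it can be requested per session through the CUDA provider options. A minimal sketch, assuming the ONNX Runtime Python API and a build that includes the NHWC ops; the helper name `cuda_providers` and the model path are placeholders, not part of this PR:

```python
def cuda_providers(prefer_nhwc=True):
    """Build a providers list that opts in to (or out of) NHWC on CUDA.

    Hypothetical helper: it only assembles the list in the shape that
    onnxruntime.InferenceSession accepts for its `providers` argument.
    """
    cuda_options = {"prefer_nhwc": "1" if prefer_nhwc else "0"}
    # Fall back to the CPU provider if CUDA is unavailable.
    return [("CUDAExecutionProvider", cuda_options), "CPUExecutionProvider"]


# Usage (requires a CUDA-enabled onnxruntime; "model.onnx" is a placeholder):
# import onnxruntime as ort
# session = ort.InferenceSession("model.onnx", providers=cuda_providers())
```

Keeping the option explicit makes it easy to run the same model with both settings and compare latency, which is the kind of testing the next step calls for.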

build cuda nhwc ops by default

cf4460e

tianleiwu requested a review from a team as a code owner October 29, 2024 22:15

tianleiwu marked this pull request as draft October 29, 2024 22:15

tianleiwu added 2 commits October 29, 2024 16:18

update doc

4c40c23

update test

fb77a65

tianleiwu marked this pull request as ready for review October 31, 2024 02:39

tianleiwu changed the title ~~Build cuda nhwc ops by default~~ [CUDA] Build nhwc ops by default Oct 31, 2024

tianleiwu requested review from jywu-msft and snnn October 31, 2024 17:54

jywu-msft requested a review from chilo-ms November 5, 2024 18:08

jywu-msft approved these changes Nov 6, 2024

tianleiwu merged commit 72186bb into main Nov 6, 2024
90 of 91 checks passed

tianleiwu deleted the tlwu/build_cuda_nhwc_ops_by_default branch November 6, 2024 17:54
