Update CUDA custom op unit tests to account for recent ORT change #6971
Conversation
Assuming the docs at https://www.onnxruntime.ai/docs/how-to/add-custom-op.html will be updated separately to describe what is necessary to create a working CUDA-based custom op, and why.
Yes, that will come in a separate change.
@@ -319,7 +319,9 @@ set (onnxruntime_shared_lib_test_SRC
  ${ONNXRUNTIME_SHARED_LIB_TEST_SRC_DIR}/test_model_loading.cc
  ${ONNXRUNTIME_SHARED_LIB_TEST_SRC_DIR}/test_ort_format_models.cc
  ${ONNXRUNTIME_SHARED_LIB_TEST_SRC_DIR}/utils.h
  ${ONNXRUNTIME_SHARED_LIB_TEST_SRC_DIR}/utils.cc
  ${ONNXRUNTIME_SHARED_LIB_TEST_SRC_DIR}/custom_op_utils.h
nit: 2 tabs
Will combine with another PR
Description: #3723 moved the hand-crafted ORT CUDA kernels from the per-thread default stream to a compute stream that ORT either creates by default or receives from the user. That change will eventually require exposing an interface for custom ops to obtain ORT's compute stream; at the very least, the CUDA custom op unit tests should demonstrate that a user must use the same compute stream for both the ORT kernels (passed in via session options) and their custom kernels so that the two stay synchronized. This PR updates the unit tests accordingly.
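The shared-stream pattern the description refers to can be sketched as follows. This is a minimal illustration, not the PR's actual test code: it assumes the ONNX Runtime C/C++ API with the CUDA execution provider (the `OrtCUDAProviderOptions` fields `has_user_compute_stream` / `user_compute_stream` are used to pass a user stream), and `MyCustomOp` is a hypothetical user-defined custom op that stores the stream and launches its kernels on it.

```cpp
#include <cuda_runtime.h>
#include "onnxruntime_cxx_api.h"

int main() {
  // Create one CUDA stream to be shared by ORT's built-in kernels
  // and the custom op's kernels.
  cudaStream_t compute_stream = nullptr;
  cudaStreamCreateWithFlags(&compute_stream, cudaStreamNonBlocking);

  // Hand the stream to ORT via the CUDA execution provider options so the
  // built-in kernels run on it rather than on a stream ORT creates itself.
  OrtCUDAProviderOptions cuda_options{};
  cuda_options.has_user_compute_stream = 1;
  cuda_options.user_compute_stream = compute_stream;

  Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "custom_op_stream_demo");
  Ort::SessionOptions session_options;
  session_options.AppendExecutionProvider_CUDA(cuda_options);

  // The custom op must be constructed with the SAME stream so that its
  // kernel launches are ordered relative to ORT's kernels on that stream.
  // MyCustomOp is an illustrative Ort::CustomOpBase subclass, not ORT API:
  // MyCustomOp custom_op{compute_stream};
  // Ort::CustomOpDomain domain{"test.customop"};
  // domain.Add(&custom_op);
  // session_options.Add(domain);

  // ... create an Ort::Session with session_options and run ...

  cudaStreamDestroy(compute_stream);
  return 0;
}
```

If the custom op instead launched on the default stream while ORT's kernels ran on the user stream, nothing would order the launches, which is exactly the missing-synchronization problem the description names.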
Motivation and Context
The unit tests pass as-is because they are single-node toy models. However, if users adapted the existing custom op implementations for models that interleave custom kernels with ORT's built-in CUDA kernels, the kernels would lack synchronization and produce garbage output. Hence this change, for illustrative purposes.
(Takeaway from #6952)