Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SYCL] Implementation of fallback assert #3767

Merged
merged 205 commits into from
Sep 1, 2021

Conversation

s-kanaev
Copy link
Contributor

@s-kanaev s-kanaev commented May 17, 2021

Review documentation in #3461
Dependencies:

Regressions are workarounded/fixed in intel/llvm-test-suite#332, intel/llvm-test-suite#357, intel/llvm-test-suite#360
Exceptions report by host-task is included here from #4087

Sergey Kanaev added 30 commits March 31, 2021 17:07
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
@s-kanaev
Copy link
Contributor Author

@vzakhari , @jinge90 , could you, please, review devicelib changes?

@dm-vodopyanov , @romanovvlad , could you, please, review DPCPP RT changes?

vzakhari
vzakhari previously approved these changes Aug 27, 2021
Sergey Kanaev added 2 commits August 30, 2021 14:51
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
@s-kanaev s-kanaev dismissed stale reviews from vzakhari and gmlueck via a2f2473 August 30, 2021 11:56
@s-kanaev
Copy link
Contributor Author

I couldn't reproduce the spec-consts caching regression locally. Hence, intel/llvm-test-suite#427.

@s-kanaev s-kanaev requested review from gmlueck and vzakhari August 30, 2021 11:57
@gmlueck
Copy link
Contributor

gmlueck commented Aug 30, 2021

I couldn't reproduce the spec-consts caching regression locally. Hence, intel/llvm-test-suite#427.

Can you explain more about the rationale here? It looks like there is a test that fails because of the new assertion fall-back code, and our solution is to disable assertion checks in that test? Doesn't this failing test point to some problem that needs to be fixed?

@s-kanaev
Copy link
Contributor Author

Can you explain more about the rationale here? It looks like there is a test that fails because of the new assertion fall-back code, and our solution is to disable assertion checks in that test? Doesn't this failing test point to some problem that needs to be fixed?

Disabling of fallback assert in the test is temporary and failure is under active investigation at the moment.

@s-kanaev
Copy link
Contributor Author

Failure on CUDA isn't related with the changes as it's observed during pull-down also: #4388

@s-kanaev
Copy link
Contributor Author

@gmlueck , @vzakhari , @jinge90 , @bader , do you have any objections for this patch to be merged?

FYI, failure on CUDA isn't relevant to this patch and seems to be known issue.

@@ -6,19 +6,19 @@ int main() {
sycl::queue Q;

Q.single_task([] {
// CHECK: tail call spir_func void @_Z21__spirv_MemoryBarrierjj(i32 2, i32 896) #2
// CHECK: tail call spir_func void @_Z21__spirv_MemoryBarrierjj(i32 2, i32 896) #{{.*}}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// CHECK: tail call spir_func void @_Z21__spirv_MemoryBarrierjj(i32 2, i32 896) #{{.*}}
// CHECK: tail call spir_func void @_Z21__spirv_MemoryBarrierjj(i32 2, i32 896)

@bader bader merged commit 56c9ec4 into intel:sycl Sep 1, 2021
aelovikov-intel added a commit to aelovikov-intel/llvm that referenced this pull request Jan 29, 2024
intel#6837 enabled asynchronous
buffer destruction for buffers constructed without host data. However,
initial fallback assert implementation in
intel#3767 predates it and as such had to
place the buffer inside `queue_impl` to avoid unintended synchronization
point. I don't know if there was the same crash observed on the
end-to-end test added as part of this PR prior to
intel#3767, but it doesn't even matter
because the "new" implementation is both simpler and doesn't result in a
crash.

I suspect that without it (with the buffer for fallback assert
implementation being a data member of `sycl::queue_impl`) we had a
cyclic dependency somewhere leading to resource leak and ultimately to
the assert in `DeviceGlobalUSMMem::~DeviceGlobalUSMMem()`.
aelovikov-intel added a commit that referenced this pull request Jan 31, 2024
#6837 enabled asynchronous buffer
destruction for buffers constructed without host data. However, initial
fallback assert implementation in
#3767 predates it and as such had to
place the buffer inside `queue_impl` to avoid unintended synchronization
point. I don't know if there was the same crash observed on the
end-to-end test added as part of this PR prior to
#3767, but it doesn't even matter
because the "new" implementation is both simpler and doesn't result in a
crash.

I suspect that without it (with the buffer for fallback assert
implementation being a data member of `sycl::queue_impl`) we had a
cyclic dependency somewhere leading to resource leak and ultimately to
the assert in `DeviceGlobalUSMMem::~DeviceGlobalUSMMem()`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants