-
Notifications
You must be signed in to change notification settings - Fork 787
[SYCL][CUDA] Update program manager and queue to resolve multi-targeting issues #4921
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
PIDeviceHandle, &DevBin, | ||
/*num bin images = */ (cl_uint)1, &SuitableImageID); | ||
if (Error != PI_SUCCESS && Error != PI_INVALID_BINARY) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are effects of allowing invalid binary for the caller?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PI_INVALID_BINARY
is returned when piextDeviceSelectBinary
cannot find any suitable image within the passed list. This is a valid response, in this case, as it is checking if the binary is suitable for the plugin.
@@ -67,7 +67,7 @@ | |||
|
|||
// Helper macro to identify if fallback assert is needed | |||
// FIXME remove __NVPTX__ condition once devicelib supports CUDA | |||
#if !defined(SYCL_DISABLE_FALLBACK_ASSERT) && !defined(__NVPTX__) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This condition shouldn't be modified at the moment as it's due CUDA native support of assertions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was something I was struggling with for a while. When building for both cuda and opencl it can fail, based on the target order (#3631), because there is a kernel in spirv64
which is not in nvptx64
_ZTSN2cl4sycl6detail23__sycl_service_kernel__16AssertInfoCopierE
. The program_manager requires binary images to either be identical or disjoint. So I am proposing removing this to create a version of this kernel which does not perform any action; similar to amdgcn
which will support native asserts but generates this kernel. I am not seeing any failing tests as a result of this change. Please let me know if there is a better way of achieving this. Many thanks!
@s-kanaev, are you ok with the responses to your comments? |
@s-kanaev, ping. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with the changes.
PR to add device testing to llvm-test-suite: intel/llvm-test-suite#593 |
This PR makes two changes, the first is it moves the macro which prevents
__devicelib_assert_read
being used fornvptx64
devices. This is done to resolve an issue where the binary images ofspirv64
andnvptx64
are neither identical nor disjoint (have no kernels in common). The program manager needs binary images to be identical or disjoint, it fails otherwise. This creates a kernel of the same name as when building forspirv64
but it does not use__devicelib_assert_read
.The second it prevents errors being thrown in the program manager when the binaries compatibility check returns false. This is to allow for multi-targeting to be used with module splitting.
A cuda and hip only regression test is added to check for successful compilation with multi-targeting and module splitting options.
Proposed solution to: #3631