Skip to content

[SYCL][CUDA] Update program manager and queue to resolve multi-targeting issues #4921

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Nov 30, 2021

Conversation

AidanBeltonS
Copy link
Contributor

This PR makes two changes, the first is it moves the macro which prevents __devicelib_assert_read being used for nvptx64 devices. This is done to resolve an issue where the binary images of spirv64 and nvptx64 are neither identical nor disjoint (have no kernels in common). The program manager needs binary images to be identical or disjoint, it fails otherwise. This creates a kernel of the same name as when building for spirv64 but it does not use __devicelib_assert_read.
The second it prevents errors being thrown in the program manager when the binaries compatibility check returns false. This is to allow for multi-targeting to be used with module splitting.
A cuda and hip only regression test is added to check for successful compilation with multi-targeting and module splitting options.

Proposed solution to: #3631

PIDeviceHandle, &DevBin,
/*num bin images = */ (cl_uint)1, &SuitableImageID);
if (Error != PI_SUCCESS && Error != PI_INVALID_BINARY)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are effects of allowing invalid binary for the caller?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PI_INVALID_BINARY is returned when piextDeviceSelectBinary cannot find any suitable image within the passed list. This is a valid response, in this case, as it is checking if the binary is suitable for the plugin.

@@ -67,7 +67,7 @@

// Helper macro to identify if fallback assert is needed
// FIXME remove __NVPTX__ condition once devicelib supports CUDA
#if !defined(SYCL_DISABLE_FALLBACK_ASSERT) && !defined(__NVPTX__)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This condition shouldn't be modified at the moment as it's due CUDA native support of assertions.

Copy link
Contributor Author

@AidanBeltonS AidanBeltonS Nov 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was something I was struggling with for a while. When building for both cuda and opencl it can fail, based on the target order (#3631), because there is a kernel in spirv64 which is not in nvptx64 _ZTSN2cl4sycl6detail23__sycl_service_kernel__16AssertInfoCopierE. The program_manager requires binary images to either be identical or disjoint. So I am proposing removing this to create a version of this kernel which does not perform any action; similar to amdgcn which will support native asserts but generates this kernel. I am not seeing any failing tests as a result of this change. Please let me know if there is a better way of achieving this. Many thanks!

@vladimirlaz vladimirlaz requested a review from s-kanaev November 17, 2021 08:38
@vladimirlaz
Copy link
Contributor

@s-kanaev, are you ok with the responses to your comments?

@bader
Copy link
Contributor

bader commented Nov 30, 2021

@s-kanaev, ping.

Copy link
Contributor

@s-kanaev s-kanaev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with the changes.

@AidanBeltonS
Copy link
Contributor Author

PR to add device testing to llvm-test-suite: intel/llvm-test-suite#593

@AidanBeltonS AidanBeltonS deleted the multi-targeting branch December 1, 2021 10:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants