[SYCL][ROCm] Add HIP NVIDIA support #4049

Merged
merged 3 commits into intel:sycl on Jul 14, 2021

Conversation

npmiller
Contributor

@npmiller npmiller commented Jul 2, 2021

This PR enables building the PI ROCm plugin for NVIDIA rather than AMD, as ROCm has some NVIDIA support.

It should make it possible to do some testing of the ROCm PI plugin on NVIDIA hardware.

This PR includes:

  • Refactoring of the ROCm PI plugin CMake and wiring for ROCm NVIDIA (see the CMake sketch at the end of this description)
  • Buildbot configure script update to add a --rocm-platform flag
  • Changes to places in the PI ROCm plugin that were using HIP in an AMD-specific way

We are still running some testing on this; however, it looks similar to the current ROCm plugin on AMD, and a bit better on NVIDIA, since the SYCL compiler for that target is more mature.
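For illustration, here is a rough sketch of what the platform selection wiring might look like on the CMake side. The SYCL_BUILD_PI_ROCM_PLATFORM variable, the target name, and the linked libraries are assumptions made for this sketch rather than the exact contents of the patch:

```cmake
# Hypothetical sketch of selecting the HIP platform the PI ROCm plugin targets.
# Variable, target, and library names are illustrative assumptions.
set(SYCL_BUILD_PI_ROCM_PLATFORM "AMD" CACHE STRING
    "HIP platform the PI ROCm plugin is built for (AMD or NVIDIA)")

add_library(pi_rocm SHARED pi_rocm.cpp)  # illustrative source list

if(SYCL_BUILD_PI_ROCM_PLATFORM STREQUAL "AMD")
  # On AMD, HIP sits on top of the ROCm runtime.
  target_compile_definitions(pi_rocm PRIVATE __HIP_PLATFORM_HCC__)
  target_link_libraries(pi_rocm PRIVATE rocmdrv)
elseif(SYCL_BUILD_PI_ROCM_PLATFORM STREQUAL "NVIDIA")
  # On NVIDIA, HIP is a thin layer over CUDA, so link the CUDA driver instead.
  target_compile_definitions(pi_rocm PRIVATE __HIP_PLATFORM_NVCC__)
  target_link_libraries(pi_rocm PRIVATE cuda)
else()
  message(FATAL_ERROR
    "Unsupported SYCL_BUILD_PI_ROCM_PLATFORM: ${SYCL_BUILD_PI_ROCM_PLATFORM}")
endif()
```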

@npmiller npmiller requested review from bader, pvchupin, smaslov-intel and a team as code owners July 2, 2021 16:19
bader previously approved these changes Jul 5, 2021
Contributor

@bader bader left a comment

Changes in buildbot/configure.py and sycl/doc/GetStartedGuide.md look good to me.

@bader
Contributor

bader commented Jul 5, 2021

@malixian, FYI.

pvchupin previously approved these changes Jul 6, 2021
@bader
Contributor

bader commented Jul 7, 2021

@smaslov-intel, @intel/llvm-reviewers-runtime, ping.

@bader
Contributor

bader commented Jul 7, 2021

@npmiller, how can I run the sycl/test/on-device tests using the ROCm plug-in on NVIDIA HW?
NOTE: there is a special CMake target for running in-tree LIT tests on the CUDA back-end - https://github.com/intel/llvm/blob/sycl/sycl/test/CMakeLists.txt#L66-L79 (on-device tests are added to this target here: https://github.com/intel/llvm/blob/sycl/sycl/test/on-device/CMakeLists.txt#L30-L42).

I would expect there to be a CMake target like this for running LIT tests with the ROCm plug-in on AMD/NVIDIA HW. Right?
Is there another way to run these tests?

@npmiller
Contributor Author

npmiller commented Jul 7, 2021

This is not wired up for the ROCm backend yet. It's definitely something we need; however, this PR was already getting pretty large, so I was thinking of adding it in a follow-up PR. It will take a bit of time, but if you prefer I can add it here instead.
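As a rough idea, such a target could be modeled on the check-sycl-cuda target linked above; the target name, LIT parameter, and dependency list in this sketch are assumptions, not the eventual implementation:

```cmake
# Hypothetical LIT target for running the on-device tests through the ROCm
# plugin, analogous to the existing CUDA one. Names and parameters below are
# illustrative assumptions.
add_lit_testsuite(check-sycl-rocm-on-device
  "Running the SYCL on-device tests with the ROCm plugin"
  ${CMAKE_CURRENT_BINARY_DIR}
  PARAMS "SYCL_PLUGIN=rocm"
  DEPENDS ${SYCL_TEST_DEPS}
  EXCLUDE_FROM_CHECK_ALL
)
```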

@bader
Contributor

bader commented Jul 7, 2021

Separate PR sounds good to me. I just wanted to make sure that there is some way to test the ROCm backend. Thanks!

@alexbatashev
Contributor

Just curious, @npmiller: would it make sense to build the ROCm plugin for both platforms at the same time and present them as different backends?

@npmiller
Contributor Author

npmiller commented Jul 8, 2021

It could be, although I think that would make the CMake more awkward and require a bunch of glue in the C++ as well; for example, all the rocm_pi functions would probably need different prefixes depending on the platform.

Ultimately the NVIDIA support through ROCm is mostly useful for debugging and testing at the moment; for running actual applications on NVIDIA GPUs the CUDA backend should definitely be preferred, since HIP maps straight back to CUDA anyway.

And building both the CUDA backend and the ROCm backend (for either platform) in the same build works fine, so you can already have a build that targets both.

I guess being able to build for both platforms at the same time would make it easier to validate that a patch doesn't break either build, but overall I'm not sure it's worth the extra complexity in the plugin or in the CMake.
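For the sake of the discussion, a hypothetical sketch of what building the plugin for both platforms in one build could look like; none of this is proposed code, and all names here are made up:

```cmake
# Hypothetical: compile the same plugin sources twice, once per HIP platform,
# and ship them as two separate plugin libraries. All names are made up.
set(PI_ROCM_SOURCES pi_rocm.cpp)  # illustrative source list

add_library(pi_rocm_amd SHARED ${PI_ROCM_SOURCES})
target_compile_definitions(pi_rocm_amd PRIVATE __HIP_PLATFORM_HCC__)

add_library(pi_rocm_nvidia SHARED ${PI_ROCM_SOURCES})
target_compile_definitions(pi_rocm_nvidia PRIVATE __HIP_PLATFORM_NVCC__)

# As noted above, the rocm_pi entry points would then likely need
# per-platform prefixes or similar glue on the C++ side so the runtime
# could tell the two backends apart.
```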

@bader
Contributor

bader commented Jul 9, 2021

@smaslov-intel, @intel/llvm-reviewers-runtime, ping.

@npmiller npmiller dismissed stale reviews from pvchupin and bader via fc486ab July 9, 2021 09:28
The double quotes around AMD and NVIDIA were incorrect; they were getting processed as part of the string, which broke the string equality check.
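Assuming the failing check was a CMake STREQUAL comparison, the problem can be illustrated with a minimal example (the variable name is only illustrative):

```cmake
# Hypothetical illustration: if the value handed to CMake carries literal
# quote characters, they become part of the string and the equality check
# fails even though the platform "looks" correct.
set(SYCL_BUILD_PI_ROCM_PLATFORM "\"AMD\"")  # value is "AMD" including the quotes

if(SYCL_BUILD_PI_ROCM_PLATFORM STREQUAL "AMD")
  message(STATUS "AMD platform selected")   # never reached with the quoted value
endif()
```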
@bader bader requested a review from vladimirlaz July 9, 2021 10:55
@bader
Contributor

bader commented Jul 13, 2021

@smaslov-intel, ping.

@smaslov-intel
Contributor

This PR includes:

  • Refactoring of the ROCm PI plugin CMake and wiring for ROCm NVIDIA
  • Buildbot configure script update to add a --rocm-platform flag
  • Changes to places in the PI ROCm plugin that were using HIP in an AMD-specific way

I approve the changes as is, but would prefer these be broken into multiple change-sets.

@bader bader merged commit bfbd5af into intel:sycl Jul 14, 2021
bader pushed a commit that referenced this pull request Jul 20, 2021
I'm not sure why these were left at `1`, but this patch fixes some of the tests in [oneAPI-DirectProgramming](https://github.com/zjin-lcf/oneAPI-DirectProgramming), such as the matrix multiply and mandelbrot samples.

With this patch the samples now give correct results both on AMD GPUs and on NVIDIA GPUs with the ROCm backend (using #4049).
@bader bader added the hip label (Issues related to execution on HIP backend) Aug 4, 2021