[SYCL][ROCm] Add HIP NVIDIA support #4049

Merged
merged 3 commits into intel:sycl on Jul 14, 2021

Conversation

npmiller
Contributor

@npmiller npmiller commented Jul 2, 2021

This PR enables building the PI ROCm plugin for NVIDIA rather than AMD, as ROCm has some NVIDIA support.

It should make it possible to do some testing of the ROCm PI plugin on NVIDIA hardware.

This PR includes:

  • Refactoring of the ROCm PI plugin CMake and wiring for ROCm NVIDIA (see the CMake sketch at the end of this description)
  • Buildbot configure script update to add a --rocm-platform flag
  • Changes to places in the PI ROCm plugin that were using HIP in an AMD-specific way

We are still running some testing on this; however, it looks similar to the current ROCm plugin on AMD, and a bit better on NVIDIA, since the SYCL compiler for that target is more mature.
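For illustration, here is a rough sketch of what the platform selection wiring might look like on the CMake side. The SYCL_BUILD_PI_ROCM_PLATFORM variable, the target name, and the linked libraries are assumptions made for this sketch rather than the exact contents of the patch:

```cmake
# Hypothetical sketch of selecting the HIP platform the PI ROCm plugin targets.
# Variable, target, and library names are illustrative assumptions.
set(SYCL_BUILD_PI_ROCM_PLATFORM "AMD" CACHE STRING
    "HIP platform the PI ROCm plugin is built for (AMD or NVIDIA)")

add_library(pi_rocm SHARED pi_rocm.cpp)  # illustrative source list

if(SYCL_BUILD_PI_ROCM_PLATFORM STREQUAL "AMD")
  # On AMD, HIP sits on top of the ROCm runtime.
  target_compile_definitions(pi_rocm PRIVATE __HIP_PLATFORM_HCC__)
  target_link_libraries(pi_rocm PRIVATE rocmdrv)
elseif(SYCL_BUILD_PI_ROCM_PLATFORM STREQUAL "NVIDIA")
  # On NVIDIA, HIP is a thin layer over CUDA, so link the CUDA driver instead.
  target_compile_definitions(pi_rocm PRIVATE __HIP_PLATFORM_NVCC__)
  target_link_libraries(pi_rocm PRIVATE cuda)
else()
  message(FATAL_ERROR
    "Unsupported SYCL_BUILD_PI_ROCM_PLATFORM: ${SYCL_BUILD_PI_ROCM_PLATFORM}")
endif()
```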

@npmiller npmiller requested review from bader, pvchupin, smaslov-intel and a team as code owners July 2, 2021 16:19
bader previously approved these changes Jul 5, 2021
Contributor

@bader bader left a comment

Changes in buildbot/configure.py and sycl/doc/GetStartedGuide.md look good to me.

@bader
Contributor

bader commented Jul 5, 2021

@malixian, FYI.

pvchupin previously approved these changes Jul 6, 2021
@bader
Contributor

bader commented Jul 7, 2021

@smaslov-intel, @intel/llvm-reviewers-runtime, ping.

@bader
Contributor

bader commented Jul 7, 2021

@npmiller, how can I run the sycl/test/on-device tests using the ROCm plug-in on NVIDIA HW?
NOTE: there is a special CMake target for running in-tree LIT tests on the CUDA back-end - https://github.com/intel/llvm/blob/sycl/sycl/test/CMakeLists.txt#L66-L79 (on-device tests are added to this target here: https://github.com/intel/llvm/blob/sycl/sycl/test/on-device/CMakeLists.txt#L30-L42).

I would expect there to be a CMake target like this for running LIT tests with the ROCm plug-in on AMD/NVIDIA HW. Right?
Is there another way to run these tests?

@npmiller
Contributor Author

npmiller commented Jul 7, 2021

This is not wired up for the ROCm backend yet. It's definitely something we need; however, this PR was already getting pretty large, so I was thinking of adding it in a follow-up PR. It will take a bit of time, but if you prefer I can add it here instead.
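As a rough idea, such a target could be modeled on the check-sycl-cuda target linked above; the target name, LIT parameter, and dependency list in this sketch are assumptions, not the eventual implementation:

```cmake
# Hypothetical LIT target for running the on-device tests through the ROCm
# plugin, analogous to the existing CUDA one. Names and parameters below are
# illustrative assumptions.
add_lit_testsuite(check-sycl-rocm-on-device
  "Running the SYCL on-device tests with the ROCm plugin"
  ${CMAKE_CURRENT_BINARY_DIR}
  PARAMS "SYCL_PLUGIN=rocm"
  DEPENDS ${SYCL_TEST_DEPS}
  EXCLUDE_FROM_CHECK_ALL
)
```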

@bader
Contributor

bader commented Jul 7, 2021

Separate PR sounds good to me. I just wanted to make sure that there is some way to test the ROCm backend. Thanks!

@alexbatashev
Contributor

Just curious, @npmiller: would it make sense to build the ROCm plugin for both platforms at the same time and present them as different backends?

@npmiller
Contributor Author

npmiller commented Jul 8, 2021

It could be, although I think that would make the CMake more awkward and require a bunch of glue in the C++ as well; for example, all the rocm_pi functions would probably need different prefixes depending on the platform.

Ultimately the NVIDIA support through ROCm is mostly useful for debugging and testing at the moment; for running actual applications on NVIDIA GPUs the CUDA backend should definitely be preferred, since HIP maps straight back to CUDA anyway.

And building both the CUDA backend and the ROCm backend (for either platform) in the same build works fine, so you can already have a build that targets both.

I guess being able to build for both platforms at the same time would make it easier to validate that a patch doesn't break either build, but overall I'm not sure it's worth the extra complexity in the plugin or in the CMake.
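For the sake of the discussion, a hypothetical sketch of what building the plugin for both platforms in one build could look like; none of this is proposed code, and all names here are made up:

```cmake
# Hypothetical: compile the same plugin sources twice, once per HIP platform,
# and ship them as two separate plugin libraries. All names are made up.
set(PI_ROCM_SOURCES pi_rocm.cpp)  # illustrative source list

add_library(pi_rocm_amd SHARED ${PI_ROCM_SOURCES})
target_compile_definitions(pi_rocm_amd PRIVATE __HIP_PLATFORM_HCC__)

add_library(pi_rocm_nvidia SHARED ${PI_ROCM_SOURCES})
target_compile_definitions(pi_rocm_nvidia PRIVATE __HIP_PLATFORM_NVCC__)

# As noted above, the rocm_pi entry points would then likely need
# per-platform prefixes or similar glue on the C++ side so the runtime
# could tell the two backends apart.
```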

@bader
Contributor

bader commented Jul 9, 2021

@smaslov-intel, @intel/llvm-reviewers-runtime, ping.

@npmiller npmiller dismissed stale reviews from pvchupin and bader via fc486ab July 9, 2021 09:28
The double quotes around AMD and NVIDIA were incorrect; they were getting processed as part of the string, which broke the string equality check.
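Assuming the failing check was a CMake STREQUAL comparison, the problem can be illustrated with a minimal example (the variable name is only illustrative):

```cmake
# Hypothetical illustration: if the value handed to CMake carries literal
# quote characters, they become part of the string and the equality check
# fails even though the platform "looks" correct.
set(SYCL_BUILD_PI_ROCM_PLATFORM "\"AMD\"")  # value is "AMD" including the quotes

if(SYCL_BUILD_PI_ROCM_PLATFORM STREQUAL "AMD")
  message(STATUS "AMD platform selected")   # never reached with the quoted value
endif()
```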
@bader bader requested a review from vladimirlaz July 9, 2021 10:55
@bader
Contributor

bader commented Jul 13, 2021

@smaslov-intel, ping.

@smaslov-intel
Contributor

This PR includes:

  • Refactoring of the ROCm PI plugin CMake and wiring for ROCm NVIDIA
  • Buildbot configure script update to add a --rocm-platform flag
  • Changes to places in the PI ROCm plugin that were using HIP in an AMD-specific way

I approve the changes as is, but would prefer these be broken into multiple change-sets.

@bader bader merged commit bfbd5af into intel:sycl Jul 14, 2021
bader pushed a commit that referenced this pull request Jul 20, 2021
I'm not sure why these were left at `1`, but this patch fixes some of the tests in [oneAPI-DirectProgramming](https://github.com/zjin-lcf/oneAPI-DirectProgramming), such as the matrix multiply and mandelbrot samples.

With this patch the samples now give correct results both on AMD GPUs and on NVIDIA GPUs with the ROCm backend (using #4049).
@bader bader added the hip label (Issues related to execution on HIP backend) Aug 4, 2021