[SYCL][CUDA] Add a CUDA compatibility mode #12757

Naghasan · 2024-02-19T11:41:27Z

This patch enables CUDA mode alongside SYCL mode, which lets SYCL code interact more closely with CUDA code:

- A user can call a CUDA device function from a SYCL device function (follow-up to [SYCL][CUDA] Support CUDA and SYCL in the same TU #7352).
  - The PR fixes overload resolution, whose ranking was ambiguous in some cases.
  - The PR fixes error reporting: some CUDA-specific delayed diagnostics were filtered out and never reported.
- The compiler defines __CUDA_ARCH__, so functions can assume NVPTX is the target.

To enable the mode, the user adds -fsycl-cuda-compat to the command line. The mode is off by default, and the flag is only used for the NVPTX backend.

The intent is to ease the transition from CUDA to SYCL. This mode enables a SYCL application to reuse CUDA functionality, especially fast paths that are guarded by __CUDA_ARCH__.
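As an illustration of the kind of code this targets, here is a minimal sketch of a fast path guarded by __CUDA_ARCH__. The helper names `inv_sqrt` and `uses_cuda_fast_path` are hypothetical and do not come from the patch. The sketch assumes that, with -fsycl-cuda-compat, the CUDA math function `rsqrtf` is visible during NVPTX device compilation; without the mode (or on the host) the portable branch is selected.

```cpp
#include <cmath>

// Hypothetical helper (not part of the patch): reciprocal square root with a
// CUDA-specific fast path selected by __CUDA_ARCH__.
inline float inv_sqrt(float x) {
#if defined(__CUDA_ARCH__)
  // NVPTX device compilation with __CUDA_ARCH__ defined: use the CUDA math
  // function. Assumption: its declaration is visible in this mode.
  return rsqrtf(x);
#else
  // Host compilation, or any compilation where __CUDA_ARCH__ is not defined:
  // portable fallback.
  return 1.0f / std::sqrt(x);
#endif
}

// Hypothetical helper: reports which branch this compilation pass selected.
constexpr bool uses_cuda_fast_path() {
#if defined(__CUDA_ARCH__)
  return true;
#else
  return false;
#endif
}
```

Built with an ordinary host compiler, `inv_sqrt(4.0f)` takes the fallback branch and `uses_cuda_fast_path()` is false. The point of the compatibility mode is that the same source, called from a SYCL kernel and compiled for NVPTX, could take the guarded branch instead.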

This patch enables CUDA mode at the same time as the SYCL mode. This allows the compiler to define CUDA macros and add implicit defines. To enable the mode, the user adds -fsycl-cuda-compat to the command line. The mode is off by default, and the flag is only used for the NVPTX backend. The intent is to ease the transition from CUDA to SYCL. This mode enables a SYCL application to reuse CUDA functionality, especially fast paths that are guarded by __CUDA_ARCH__. Signed-off-by: Victor Lomuller <[email protected]>

jinz2014 · 2024-02-19T21:30:37Z

So users may call CUDA kernels in a SYCL program, is that right? Thanks.

Naghasan · 2024-02-19T21:37:36Z

Part of the idea is to allow users to call CUDA device functions from a SYCL kernel. The underlying motivation is actually to have a mode that supports the definition of __CUDA_ARCH__.

jinz2014 · 2024-02-20T01:16:11Z

Okay. Would the compiler allow a SYCL program to mix CUDA and SYCL APIs? Some CUDA APIs have no SYCL equivalents.

github-actions · 2024-11-18T02:04:44Z

This pull request is stale because it has been open 180 days with no activity. Remove stale label or comment or this will be automatically closed in 30 days.

github-actions · 2024-12-19T02:03:34Z

This pull request was closed because it has been stalled for 30 days with no activity.

tahonermann

Completed code review. I haven't looked at the last few tests that exercise overload resolution. I'll look at those early tomorrow.

clang/lib/Basic/Targets/NVPTX.cpp

clang/lib/CodeGen/CodeGenFunction.cpp

clang/lib/Frontend/InitPreprocessor.cpp

clang/lib/Sema/SemaCUDA.cpp

clang/lib/Sema/SemaDecl.cpp

clang/test/SemaSYCL/CudaCompat/delayed-diags.cu

clang/lib/Driver/ToolChains/Clang.cpp

clang/lib/Sema/SemaCUDA.cpp

clang/test/SemaSYCL/CudaCompat/overloads.h

tahonermann

I've completed review. Just waiting to resolve open comments before approving.

tahonermann

Looks good. Thank you for sticking with me through all the comments and helping me to understand the changes!

Naghasan · 2025-03-19T09:04:45Z

@mdtoguchi any more comments, or does it look good to you?

@elizabethandrews @premanandrao Can you approve? Tom doesn't have the actual power yet :)

mdtoguchi

LGTM

Naghasan · 2025-03-20T09:12:17Z

@intel/llvm-gatekeepers this is ready to be merged. The Jenkins job failure is unrelated to the patch (it can't create a node ...)

Set nvptx for the host side as well as the sdk version

2219d9a

Naghasan added 5 commits March 25, 2024 14:02

Merge branch 'sycl' into victor/sycl-cuda-compat

b0ee79a

Merge branch 'sycl' into victor/sycl-cuda-compat

7ee4507

Fix emission status check ordering

505d87e

When both SYCL and CUDA are enabled, we need to favor SYCL checks in order to avoid false positives. Signed-off-by: Victor Lomuller <[email protected]>

Merge branch 'sycl' into victor/sycl-cuda-compat

a2b7961

Merge branch 'sycl' into victor/sycl-cuda-compat

435af57

Naghasan added 2 commits May 21, 2024 22:03

Fix merge conflict resolution issue

0ab3529

clang format

fe6b2cc

github-actions bot added the Stale label Nov 18, 2024

github-actions bot closed this Dec 19, 2024

Naghasan added 3 commits January 13, 2025 15:34

Merge branch 'sycl' into victor/sycl-cuda-compat

e038f12

fix build after rebase

e63fa5d

Merge branch 'sycl' into victor/sycl-cuda-compat

d0019f5

Naghasan mentioned this pull request Jan 15, 2025

Enable cuda compatibilty mode with SYCL codeplaysoftware/cutlass-sycl#185

Closed

Naghasan added 2 commits January 20, 2025 13:43

Merge branch 'sycl' into victor/sycl-cuda-compat

1129db5

fix broken cuda build mode

958c48a