[SYCL][CUDA] Add a CUDA compatibility mode #12757

Merged
merged 53 commits into from
Mar 20, 2025

@Naghasan Naghasan commented Feb 19, 2024

This patch enables CUDA mode at the same time as the SYCL mode. This allows SYCL code to interact with CUDA code more closely:

  • A user can call a CUDA device function from a SYCL device one (follow up of [SYCL][CUDA] Support CUDA and SYCL in the same TU #7352)
    • The PR fixes overload resolution as the resolution ranking was ambiguous in some cases
    • The PR fixes error reporting: some CUDA-specific delayed diagnostics weren't reported (they were filtered out)
  • Defines __CUDA_ARCH__, enabling functions to assume NVPTX is the target

To enable the mode the user adds -fsycl-cuda-compat to the command line. By default this mode is set to off. The flag is only used for the NVPTX backend.

The intent is to help the transition from CUDA to SYCL. Using this mode enables a SYCL application to reuse CUDA functionality, especially fast paths that are guarded by __CUDA_ARCH__.
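To illustrate the kind of guarded fast path described above, here is a minimal sketch. The helper `inverse_sqrt` is hypothetical and not part of the patch, and the sketch assumes that CUDA's `rsqrtf` is declared for device code when the compatibility mode is on. The build line in the comment uses the flag named in this description; the remaining options are the usual DPC++ ones for the CUDA target and are an assumption here.

```cpp
// Possible build line (only -fsycl-cuda-compat comes from this PR; the
// other options are assumed):
//   clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda -fsycl-cuda-compat app.cpp
#include <cmath>

// Hypothetical helper showing a fast path guarded by __CUDA_ARCH__.
inline float inverse_sqrt(float x) {
#if defined(__CUDA_ARCH__)
  // Device pass with __CUDA_ARCH__ defined: the code may assume NVPTX.
  // Assumption: the CUDA math function rsqrtf is visible here.
  return rsqrtf(x);
#else
  // Host pass, or any build where __CUDA_ARCH__ is not defined.
  return 1.0f / std::sqrt(x);
#endif
}
```

Without the compatibility mode, `__CUDA_ARCH__` stays undefined in SYCL device code, so a helper like this would fall back to the portable branch even when targeting NVPTX.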

This patch enables CUDA mode at the same time as the SYCL mode.
This allows the compiler to define CUDA macros and add implicit defines.

To enable the mode the user adds -fsycl-cuda-compat to the command line.
By default this mode is set to off. The flag is only used for the NVPTX backend.

The intent is to help the transition from CUDA to SYCL. Using this mode enables
a SYCL application to reuse CUDA functionality, especially fast paths that
are guarded by __CUDA_ARCH__.

Signed-off-by: Victor Lomuller <[email protected]>
@jinz2014
Contributor

So users may call CUDA kernels in a SYCL program, is that right? Thanks.

@Naghasan
Contributor Author

Part of the idea is to allow users to call CUDA device functions from a SYCL kernel. The underlying motivation is actually to have a mode that supports the definition of __CUDA_ARCH__.

@jinz2014
Contributor

Okay. Would the compiler allow a SYCL program to mix CUDA and SYCL APIs? Some CUDA APIs have no SYCL equivalents.

Contributor

This pull request is stale because it has been open 180 days with no activity. Remove stale label or comment or this will be automatically closed in 30 days.

@github-actions github-actions bot added the Stale label Nov 18, 2024
Contributor

This pull request was closed because it has been stalled for 30 days with no activity.

@github-actions github-actions bot closed this Dec 19, 2024
Contributor

@tahonermann tahonermann left a comment

Completed code review. I haven't looked at the last few tests that exercise overload resolution. I'll look at those early tomorrow.

Contributor

@tahonermann tahonermann left a comment

I've completed review. Just waiting to resolve open comments before approving.

Contributor

@tahonermann tahonermann left a comment

Looks good. Thank you for sticking with me through all the comments and helping me to understand the changes!

@Naghasan
Contributor Author

@mdtoguchi any more comments, or is it good for you?

@elizabethandrews @premanandrao Can you approve? Tom doesn't have the actual power yet :)

Contributor

@mdtoguchi mdtoguchi left a comment

LGTM

@Naghasan
Contributor Author

@intel/llvm-gatekeepers this is ready to be merged. The Jenkins job failure is unrelated to the patch (it can't create a node ...)

@sommerlukas sommerlukas merged commit 37a7b45 into intel:sycl Mar 20, 2025
20 of 22 checks passed

7 participants