Skip to content

[CI] Run build on Ubuntu 22, pre-commit if CUDA adapter changes. #17757

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 4, 2025

Conversation

uditagarwal97
Copy link
Contributor

@uditagarwal97 uditagarwal97 commented Mar 31, 2025

Our Ubuntu 22.04 container has CUDA 12.1 installed while Ubuntu 24.04 image has CUDA 12.6.1 installed.
If a PR changes CUDA adapter, we should test the change with both CUDA versions.

@uditagarwal97 uditagarwal97 self-assigned this Mar 31, 2025
@uditagarwal97 uditagarwal97 requested a review from a team as a code owner March 31, 2025 20:38
@@ -52,6 +52,22 @@ jobs:
changes: ${{ needs.detect_changes.outputs.filters }}
e2e_binaries_artifact: sycl_e2e_bin_default

# If a PR changes CUDA adapter, run the build on Ubuntu 22.04 as well.
# Ubuntu 22.04 container has CUDA 12.1 installed while Ubuntu 24.0 image
# has CUDA 12.6.1 installed.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I get it right, patches with changes in CUDA adaptor will be tested with both CUDA 12.6.1 and CUDA 12.1 (at least build + LIT?).
In addition, I saw CUDA 12.8 is installed on (at least) one of Windows runners.
Do intentionally use different CUDA versions in different workflows (or different machines)?

@npmiller, FYI.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I get it right, patches with changes in CUDA adaptor will be tested with both CUDA 12.6.1 and CUDA 12.1 (at least build + LIT?).

Yes, for Linux.

In addition, I saw CUDA 12.8 is installed on (at least) one of Windows runners.
Do intentionally use different CUDA versions in different workflows (or different machines)?

@sarnex Since you installed CUDA on Windows runners, do we plan to have CUDA 12.8 uniformly on all Windows runners? CodePlay's UR CUDA adapter workflow uses CUDA 12.4, so I'm not sure if CUDA 12.8 is even extensively tested

Copy link
Contributor

@sarnex sarnex Mar 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah im installing cuda 12.8 on all windows runners, but we dont have any nvidia windows gpu machines, so its basically build only testing

@uditagarwal97 uditagarwal97 merged commit 5b35190 into sycl Apr 4, 2025
24 checks passed
@aarongreig
Copy link
Contributor

I'm seeing a fail that I think might be related to this change, this error state https://github.com/intel/llvm/actions/runs/14263534349/job/39981394795?pr=17571#step:17:323 is reached based on the definition of CUDA_VERSION

if (Limits->maxPoolableSize > 0) {
which seems to differ from what's reported by the device at runtime https://github.com/intel/llvm/actions/runs/14263534349/job/39981394795?pr=17571#step:14:25

@Seanst98
Copy link
Contributor

Seanst98 commented Apr 4, 2025

I'm seeing a fail that I think might be related to this change, this error state https://github.com/intel/llvm/actions/runs/14263534349/job/39981394795?pr=17571#step:17:323 is reached based on the definition of CUDA_VERSION

if (Limits->maxPoolableSize > 0) {

which seems to differ from what's reported by the device at runtime https://github.com/intel/llvm/actions/runs/14263534349/job/39981394795?pr=17571#step:14:25

We decided that in this circumstance, it would be simplest to warn the user that the property will not be used instead of throwing an error. See this PR which changes to a warning: #17863

@bader bader deleted the cuda_adapter_ci branch April 4, 2025 22:43
kbenzie pushed a commit that referenced this pull request Apr 7, 2025
…7863)

Warn users over setting the memory pool maximum size property instead of
error.

This should aid CI which is affected by the memory_pool.cpp e2e test
which expects to be able to set this property despite DPC++ being built
with a CUDA version that is too old.

See this CI error:
#17757 (comment)

This PR relies on #17095 to be merged
for the printing of the warning to be enabled.
kbenzie pushed a commit to oneapi-src/unified-runtime that referenced this pull request Apr 8, 2025
Warn users over setting the memory pool maximum size property instead of
error.

This should aid CI which is affected by the memory_pool.cpp e2e test
which expects to be able to set this property despite DPC++ being built
with a CUDA version that is too old.

See this CI error:
intel/llvm#17757 (comment)

This PR relies on intel/llvm#17095 to be merged
for the printing of the warning to be enabled.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants