Skip to content

[E2E][CUDA][HIP][level_zero] Make P2P tests portable. #14095

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jun 19, 2024

Conversation

JackAKirk
Copy link
Contributor

This enables the p2p tests on hip and level_zero.

Now that cuda also supports the multi-device context, P2P programming should be 100% portable across these backends.

@JackAKirk JackAKirk requested a review from nrspruit June 7, 2024 15:47
@JackAKirk JackAKirk marked this pull request as ready for review June 7, 2024 17:05
@JackAKirk JackAKirk requested a review from a team as a code owner June 7, 2024 17:05
@JackAKirk JackAKirk requested a review from dm-vodopyanov June 7, 2024 17:05
@ldrumm ldrumm merged commit ab6c0f5 into intel:sycl Jun 19, 2024
14 checks passed
Copy link
Contributor

@aelovikov-intel aelovikov-intel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JackAKirk , we have internal report of flaky failures on level_zero for this test. I'd like to understand if there is some UB happening here...

for (const auto &value : input) {
h_sum += value;
}

double *d_sum = malloc_shared<double>(1, Queues[0]);
double *d_in = malloc_device<double>(N, Queues[0]);
int *d_sum = malloc_shared<int>(1, Queues[0]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is P2P allowed for Shared USM allocations? The extensions says

In particular, this allows one device to access USM Device allocations
for a peer device. This extension does not apply to USM Shared allocations.
so either this code is wrong or the extension wasn't updated.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, the extension says

In particular, this allows one device to directly access USM Device
allocations for a peer device in the same context.
, yet we use sycl::queue{D} on line 32. Are we relying on default context here?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is P2P allowed for Shared USM allocations? The extensions says

In particular, this allows one device to access USM Device allocations
for a peer device. This extension does not apply to USM Shared allocations.

so either this code is wrong or the extension wasn't updated.

You're right, I should have used device memory. shared memory works in the circumstance for nvidia and amd, but like you point out the extension doesn't support it. I imagine this might cause flaky failures on l0 if it doesn't support it.

Jack

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, the extension says

In particular, this allows one device to directly access USM Device
allocations for a peer device in the same context.

, yet we use sycl::queue{D} on line 32. Are we relying on default context here?

Yeah this test relies on the default context and in this sense I believe it is valid code?
Also I think that in real world codes the default context is used virtually always.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants