-
Notifications
You must be signed in to change notification settings - Fork 787
[E2E][CUDA][HIP][level_zero] Make P2P tests portable. #14095
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: JackAKirk <[email protected]>
Signed-off-by: JackAKirk <[email protected]>
This reverts commit 9c84333.
Signed-off-by: JackAKirk <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JackAKirk , we have internal report of flaky failures on level_zero for this test. I'd like to understand if there is some UB happening here...
for (const auto &value : input) { | ||
h_sum += value; | ||
} | ||
|
||
double *d_sum = malloc_shared<double>(1, Queues[0]); | ||
double *d_in = malloc_device<double>(N, Queues[0]); | ||
int *d_sum = malloc_shared<int>(1, Queues[0]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is P2P allowed for Shared USM allocations? The extensions says
In particular, this allows one device to access USM Device allocations | |
for a peer device. This extension does not apply to USM Shared allocations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, the extension says
In particular, this allows one device to directly access USM Device | |
allocations for a peer device in the same context. |
sycl::queue{D}
on line 32. Are we relying on default context here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is P2P allowed for Shared USM allocations? The extensions says
In particular, this allows one device to access USM Device allocations for a peer device. This extension does not apply to USM Shared allocations.
so either this code is wrong or the extension wasn't updated.
You're right, I should have used device memory. shared memory works in the circumstance for nvidia and amd, but like you point out the extension doesn't support it. I imagine this might cause flaky failures on l0 if it doesn't support it.
Jack
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, the extension says
In particular, this allows one device to directly access USM Device allocations for a peer device in the same context.
, yet we usesycl::queue{D}
on line 32. Are we relying on default context here?
Yeah this test relies on the default context and in this sense I believe it is valid code?
Also I think that in real world codes the default context is used virtually always.
This enables the p2p tests on hip and level_zero.
Now that cuda also supports the multi-device context, P2P programming should be 100% portable across these backends.