Skip to content

[SYCL][E2E] Fix invalid discard_write in atomic_memory_order_seq_cst #10900

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Aug 22, 2023

Conversation

steffenlarsen
Copy link
Contributor

The atomic_memory_order_seq_cst incorrectly uses discard_write access after initializing its memory. This causes failures on some hardware. This commit fixes this invalid access mode.

The atomic_memory_order_seq_cst incorrectly uses discard_write access
after initializing its memory. This causes failures on some hardware.
This commit fixes this invalid access mode.

Signed-off-by: Larsen, Steffen <[email protected]>
@steffenlarsen steffenlarsen requested a review from a team as a code owner August 21, 2023 11:26
@@ -40,7 +37,7 @@ void check(queue &q, buffer<int, 2> &res_buf, size_t N_iters) {
q.submit([&](handler &cgh) {
auto res = res_buf.template get_access<access::mode::read>(cgh);
auto checked =
checked_buf.template get_access<access::mode::discard_write>(cgh);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it a temporary problem with discard_write caused by, e.g., some bug in GPU driver, or we should simply always use write instead of discard_write in such scenarios? If this is temporary problem in GPU driver, should we wait until GPU driver is updated on CI machines?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it is more of a misunderstanding of what discard_write does. It is now deprecated, but is the same as write and no_init. The problem here is that since we tell the device allocation that we don't care about the initial value of the memory, it is free to write back whatever value we don't write, while for this test we actually initialize the memory ourselves earlier.

Signed-off-by: Larsen, Steffen <[email protected]>
@steffenlarsen
Copy link
Contributor Author

HIP failures reported in #10460.

@steffenlarsen steffenlarsen merged commit f69dae7 into intel:sycl Aug 22, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants