-
Notifications
You must be signed in to change notification settings - Fork 787
[SYCL][HIP] Change atomic sub to atomic add (-val) #11251
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCL][HIP] Change atomic sub to atomic add (-val) #11251
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
making this change allows atomic_ref::fetch_sub to work in a wider variety of cases
Does that mean that there is still a subset of devices for which the bug exists.
Also, if I remember it right, it's the first commit message (not the PR info) that gets added to intel/llvm, so I would reword it to be more like your PR's title and the explanation for the change.
First comment again: When using HIP atomics with PCIe support for |
@jchlanda Atomic add is guaranteed to work for PCIe 3. And that dates from 2010 I think. See here https://www.intel.sg/content/dam/doc/white-paper/pci-express3-accelerator-white-paper.pdf |
Ping @intel/llvm-gatekeepers |
Testing is old and the change is AMD GPU specific. We seem to have it broken currently, and I'm not sure if anybody is looking into it. I wouldn't want to merge until CI is fixed. |
@hdelan I have a question. Is double-precision floating-point subtract implemented elsewhere ? |
Hi @jinz2014 that's a good question. |
No problem. I didn't catch it. So fp32 and fp64 are implemented in another file in libclc. |
Well they aren't actually implemented at all for floating point. If floating point sub is needed then you need to rely on |
@aelovikov-intel it looks like CI is working again? Can this be merged? |
Quick question first. Any chance |
So it's only different for unsigned types where |
The link gives me some light of the HIP implementation. Thanks. |
This PR changes the use of
atomic_sub
toatomic_add
with a negatedval
. Doing so extends the number of cases in which the atomic operation will work, since support foratomic_add
is more common thanatomic_sub
, especially at the PCIe level.