[SYCL][ESIMD] Support 64-bit offsets for stateless accessor gather/scatter #9462
Conversation
scatter(AccessorTy acc, simd<uint32_t, N> offsets, simd<T, N> vals,
        uint32_t glob_offset = 0, simd_mask<N> mask = 1) {
scatter(AccessorTy acc,
#ifdef __ESIMD_FORCE_STATELESS_MEM
The test below compiled for me without the `-fsycl-esimd-force-stateless-mem` switch. I believe the intention was to have 64-bit offsets for `-fsycl-esimd-force-stateless-mem` and 32-bit for pure accessors. If so, why not have the offset type as a template parameter and impose the conditions using a `static_assert`? That way you would also allow 32-bit offsets with the `-fsycl-esimd-force-stateless-mem` switch. It seems `simd` just silently converts the data, so you don't actually impose any conditions on the offset type.
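The suggestion above can be sketched with a minimal host-side mock (the `simd` struct and `OffsetT` parameter name here are hypothetical stand-ins, not the real ESIMD implementation): take the offset element type as a template parameter and reject unsupported widths at compile time instead of letting a conversion hide them.

```cpp
#include <cstdint>
#include <type_traits>

// Minimal stand-in for esimd::simd, just to illustrate the pattern.
template <typename T, int N> struct simd { T data[N] = {}; };

// Sketch: templated offset type constrained by static_assert.
// In stateful (accessor) mode, 64-bit offsets become a compile-time
// error rather than a silent truncation.
template <typename T, typename OffsetT, int N>
void scatter(T *p, simd<OffsetT, N> offsets, simd<T, N> vals) {
  static_assert(std::is_integral_v<OffsetT>,
                "offsets must be an integral type");
#ifdef __ESIMD_FORCE_STATELESS_MEM
  // Stateless mode: both 32- and 64-bit offsets are acceptable.
  static_assert(sizeof(OffsetT) <= 8, "offset element too wide");
#else
  // Stateful mode: reject 64-bit offsets at compile time.
  static_assert(sizeof(OffsetT) <= 4,
                "stateful scatter requires 32-bit offsets");
#endif
  for (int i = 0; i < N; ++i)
    p[offsets.data[i] / sizeof(T)] = vals.data[i];
}
```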
Originally I thought that 32-bit offsets for the stateless variant could be more efficient, but a 32-bit offset vector still hits this line: https://github.com/intel/llvm/blob/sycl/sycl/include/sycl/ext/intel/esimd/memory.hpp#L136 for gathers and https://github.com/intel/llvm/blob/sycl/sycl/include/sycl/ext/intel/esimd/memory.hpp#L214 for scatters.
If so, then having `simd<uint64_t, N>` as the offset type in stateless mode seems reasonable to me.
Maybe it was a previous mistake to have `simd<Toffset, N>` as the type of `offsets` here: https://github.com/intel/llvm/blob/sycl/sycl/include/sycl/ext/intel/esimd/memory.hpp#L131
If the conversion to a vector of `uint64_t` happens anyway, then why not simplify that prototype and specify it there as:

template <typename Tx, int N>
__ESIMD_API simd<Tx, N> gather(const Tx *p, simd<uint64_t, N> offsets,
                               simd_mask<N> mask = 1) {

and then remove the line https://github.com/intel/llvm/blob/sycl/sycl/include/sycl/ext/intel/esimd/memory.hpp#L136
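For context, the current templated prototype accepts any integral offset type and converts internally, which is the silent-conversion behavior being discussed. A minimal host-side mock of that shape (hypothetical `simd` stand-in, mirroring the conversion at memory.hpp#L136, not the real ESIMD code):

```cpp
#include <cstdint>

// Tiny stand-in for esimd::simd with a converting constructor, which
// is what makes the silent 32->64-bit offset conversion possible.
template <typename T, int N> struct simd {
  T data[N] = {};
  simd() = default;
  template <typename U> simd(const simd<U, N> &other) {
    for (int i = 0; i < N; ++i)
      data[i] = static_cast<T>(other.data[i]);
  }
};

// Mock of the current templated prototype: Toffset is deduced from the
// caller's vector, then widened/narrowed to 64-bit internally.
template <typename Tx, typename Toffset, int N>
simd<Tx, N> gather(const Tx *p, simd<Toffset, N> offsets) {
  simd<std::uint64_t, N> offsets_i = offsets; // internal conversion
  simd<Tx, N> res;
  for (int i = 0; i < N; ++i)
    res.data[i] = *reinterpret_cast<const Tx *>(
        reinterpret_cast<const char *>(p) + offsets_i.data[i]);
  return res;
}
```

Note that fixing the parameter to `simd<uint64_t, N>` instead would also change how `N` is deduced for callers passing 32-bit offset vectors, since template argument deduction does not consider user-defined conversions.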
Gregory, do you mean by "It seems simd just silently converts data so you don't actually impose any conditions on offset type" that a 64-bit offset passed to stateful gather is silently truncated to a 32-bit offset, which doesn't happen for stateless gather?
The approach used here is consistent with the previous PR: #9160
AFAIR, we discussed this during the teams meeting and it did not meet objections. Do you think it is a mistake that 9160 does not report a static error on an attempt to pass a `uint64_t` offset to stateful block load?
thanks for the review.
In our discussion we thought that, for the simd-vector-of-offsets case, using the stateful API with 64-bit offsets would be a compile-time error because of template conversion rules, but it seems this API does convert, so IMO we should make sure some error/warning is thrown. The `glob_offset` parameter does warn as expected, though.
Note the stateful case goes to `gather_impl`, which does not hit the uint64 conversion linked above.
Let me add a commit that does this; I think it does what Gregory is suggesting.
> Do you think it is a mistake in 9160 not reporting a static error on an attempt to pass a uint64_t offset to stateful block load?

We throw a warning in that case, so IMO it's fine. The problem is that we don't throw any warning or error in this case. My latest commit adds an error.
If we look at the CI failure, we see we actually rely on the conversion today:
https://github.com/intel/llvm/blob/sycl/sycl/test-e2e/ESIMD/accessor_gather_scatter.hpp#L37
The result of

simd<uint32_t, VL> offsets(0, STRIDE);
...
offsets * sizeof(T)

is `simd<unsigned long, N>`, so my change causes an error where this used to work.
Maybe we should just update the test, but maybe users are relying on this behavior working and we should not error?
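The widening in `offsets * sizeof(T)` comes from ordinary C++ arithmetic conversions: `sizeof(T)` is a `size_t`, so multiplying a `uint32_t` element by it yields a result with `size_t`'s width, and the simd element type widens accordingly (per the `simd<unsigned long, N>` result observed above). A minimal host-side illustration of the scalar rule:

```cpp
#include <cstdint>
#include <cstddef>
#include <type_traits>

// uint32_t * size_t follows the usual arithmetic conversions: the
// result takes size_t's width (64-bit on LP64 targets, which is why
// `offsets * sizeof(T)` becomes a vector of unsigned long there).
using elem_t = decltype(std::uint32_t{1} * sizeof(int));

static_assert(std::is_same_v<elem_t, std::size_t>,
              "uint32_t * size_t widens to size_t");
```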
> Maybe it was a previous mistake to have simd<Toffset, N> as the type of 'offsets' here: https://github.com/intel/llvm/blob/sycl/sycl/include/sycl/ext/intel/esimd/memory.hpp#L131
> If the conversion to a vector of uint64_t happens anyway, then why not simplify that prototype and specify it there as:
> template <typename Tx, int N> __ESIMD_API simd<Tx, N> gather(const Tx *p, simd<uint64_t, N> offsets, simd_mask<N> mask = 1) {
> and then remove the line https://github.com/intel/llvm/blob/sycl/sycl/include/sycl/ext/intel/esimd/memory.hpp#L136

The pleasant side effect of using a templated offset type is catching errors where users flip the values and offsets parameters and end up passing float offsets, which the API would otherwise not be able to catch. There were actual tests with mistakes like that, discovered only after the templated offset parameter was introduced.
> Gregory, do you mean by this: "It seems simd just silently converts data so you don't actually impose any conditions on offset type" that a 64-bit offset passed to stateful gather is silently truncated to a 32-bit offset, which doesn't happen for stateless gather?
> The approach used here is consistent with the previous PR: #9160. AFAIR, we discussed this during the teams meeting and it did not meet objections. Do you think it is a mistake that 9160 does not report a static error on an attempt to pass a uint64_t offset to stateful block load?

As far as I remember, the reason it was done that way in the previous PR was to preserve a warning we were emitting when passing 64-bit offsets to the stateful version. Without getting into a discussion about the correctness of or need for that approach, we don't have that issue here: no warning is emitted when we pass 64-bit offsets, so technically we don't have to follow the same pattern.
> As far as I remember the reason why it was done in the previous PR was to preserve some warning we were emitting when passing 64-bit offsets to the stateful version.

Yes, that's right.
Here there are two items to look at:
- The global offset: in the current version of the PR, we preserve the warning in stateful mode with a 64-bit offset because we use an `ifdef` to make the parameter type 32-bit. Note that we already have code that passes a 64-bit global offset in stateful mode, in `copy_to` for accessors.
- The per-element offset: in the current version of the PR, we do not error or warn about 32 vs 64 bit. This is because the conversion happened anyway (32->64 and 64->32) in both stateful and stateless mode today, and we have at least one test that does this (in stateful mode), so my intuition is that users may be doing it too.
q.submit([&](handler &cgh) {
  auto PA = bufa.get_access<access::mode::read_write>(cgh);
  cgh.single_task<class Test>([=]() SYCL_ESIMD_KERNEL {
    uint64_t offsetStart = (Size - VL) * sizeof(uint64_t);
I believe here and two lines below it should be `sizeof(uint32_t)`, since you are manipulating 32-bit data, unless the intent is to skip some data.
Yeah, I did intend to skip some data. Right now the test verifies the beginning and end of the allocated `uint64_t` memory, and if the `uint64_t` offset isn't applied, we write to the beginning and the test fails because that location is supposed to hold a different value.
We can't use 64-bit vectors, so I'm loading/storing 4 bytes and keeping the 64-bit offset to get the behavior above. In the verify part of the test we cast the data to `uint32_t` before doing the mathematical operation the device was supposed to do.
If there's a simpler way to do this that still makes the test fail when the `uint64_t` offset doesn't work, please let me know.
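The failure mode the test guards against can be reproduced on the host: if a 64-bit byte offset is silently truncated to 32 bits, the access lands back near the start of the allocation, which is exactly the region the test's begin/end verification checks. A small sketch with hypothetical helper names and an illustrative offset value:

```cpp
#include <cstdint>

// Correct 64-bit path: the full offset is applied.
std::uint64_t apply_offset(std::uint64_t base, std::uint64_t off) {
  return base + off;
}

// What a 32-bit stateful path would silently do: the high bits of
// the offset are dropped before the add.
std::uint64_t apply_offset_truncated(std::uint64_t base,
                                     std::uint64_t off) {
  return base + static_cast<std::uint32_t>(off);
}
```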
…atter After this change, I'll add support for more APIs in a single PR. Signed-off-by: Sarnie, Nick <[email protected]>
Signed-off-by: Sarnie, Nick <[email protected]>
Signed-off-by: Sarnie, Nick <[email protected]>
Signed-off-by: Sarnie, Nick <[email protected]>
Signed-off-by: Sarnie, Nick <[email protected]>
done thx
I manually ran the test on PVC.
After this change, I'll add support for more APIs per-PR.