-
Notifications
You must be signed in to change notification settings - Fork 787
[SYCL][ESIMD] Implement unified memory API - block_store(acc,...) #11830
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
b8fa875
to
1ac4095
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Nick, I finished the 1st code-review pass. Please see my comments.
|
||
auto surf_ind = __esimd_get_surface_index( | ||
detail::AccessorPrivateProxy::getQualifiedPtrOrImageObj(acc)); | ||
__esimd_oword_st<Tx, N>(surf_ind, byte_offset >> 4, vals.data()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
block_store(acc) has some difference comparing to block_load(acc).
The load may be lowered to __esimd_oword_ld, or __esimd_oword_ld_unaligned, while store can only be stored using aligned store.
This code expects from user that byte_offset
is 16-byte aligned as it was previously (it was not explicitly told in the old comment section).
This place is a bit tricky.
We can choose to say that if (!(b) && (c)) then the function assumes the default alignment is 16 if no alignment props passed. Also, if 'alignment' property 4 or 8 is passed, then this function generates code requiring DG2 or PVC device.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the in-depth review. I updated the function and doc to expect 16-byte alignment if the property is not passed and !b && C, and if alignment of < 16 is passed we require PVC/DG2, hopefully I did it correctly, please take a look
|
||
if constexpr (sizeof(T) >= 4) | ||
Passed &= | ||
testACC<T, 4, !CheckMask, CheckProperties>(Q, 2, 4, AlignElemProps); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the usage of 'AlignElemProps' is user's error here (because this code is targeted to Gen12 too). The test works because the actual alignment is 16: sizeof(T) >=4 and N=4 gives 16 or more bytes.
If user cannot guarantee the byte offsets to be 16-byte aligned, then the only option for us is to generate LSC call, but this part of the test is supposed to pass on Gen12.
Same for the next call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had in mind to make these tests more strict by doing something like:
If user says the alignment is 4-bytes, then intentionally manage the offset to be 4-bytes aligned and not 16-bytes aligned.
So, for these tests it would mean allocating a bit bigger buffers and add start reading with extra offset=4 (+4 to some 16-byte aligned boundary).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed the Gen12 ones to just use 16-byte alignment to not complicate the test even more
This change implements the new compile time properties API for block_store with accessors. I also updated the old E2E test to fix alignment issues, added compile-time tests and updated the naming of the old functions. Signed-off-by: Sarnie, Nick <[email protected]>
Signed-off-by: Sarnie, Nick <[email protected]>
e5389fd
to
87b895a
Compare
Sorry for force push, I had to rebase |
CI failures not related |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had only 2 comments after the 2nd review pass.
Signed-off-by: Sarnie, Nick <[email protected]>
This change implements the new compile time properties API for block_store with accessors.
I also updated the old E2E test to fix alignment issues, added compile-time tests and updated the naming of the old functions.