-
Notifications
You must be signed in to change notification settings - Fork 788
[SYCL][ESIMD][EMU] LSC support for ESIMD_EMULATOR backend #6099
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
36 out of 40 LSC tests are passing
static_assert(ImmOffset == 0); | ||
static_assert(DS != __ESIMD_ENS::lsc_data_size::u16u32h); | ||
|
||
constexpr uint MASK = loadstoreAlignMask<Ty, VS, DS, N>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please move this down to the place of use. Same is true for Output, ElemCount
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
constexpr uint MASK = loadstoreAlignMask<Ty, VS, DS, N>(); | ||
__ESIMD_DNS::vector_type_t<Ty, N * __ESIMD_EDNS::to_int<VS>()> Output = 0; | ||
|
||
constexpr int ElemCount = __ESIMD_EDNS::to_int<VS>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think N's definition /// @tparam N is the number of channels (platform dependent).
is wrong here and in many other places in this file. This is actually the simd size - the number of chunks of VS*sizeof(T) size. If we take rgba load/strore as reference, then VS would be the number of channels, not N. So ElemCount
is actually the number of channels. It would be great if you could change N
's definition to something like "the SIMD size of operation (the number of addresses to access)".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment updated.
ByteDistance += rawAddressIncrement<Ty, DS>(), | ||
VecIdx += vectorIndexIncrement<N, _Transposed>()) { | ||
|
||
Output[VecIdx] = *((Ty *)(addrs[AddrIdx] + ByteDistance)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please hoist addrs[AddrIdx]
out of the loop
BaseAddr = addrs[AddrIdx];
assert((BaseAddr & MASK) == 0 && "Address Alignment Error!!");
for (...) {
Output[VecIdx] = *((Ty *)(BaseAddr + ByteDistance));
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
- Updating comments for definition of 'N' : SIMD operationsize - addrs[] to BaseAddr - Relocating variables close to the place of its use - Elem* to Chanl*
@@ -260,7 +657,8 @@ __ESIMD_INTRIN void __esimd_raw_send_nbarrier_signal( | |||
/// @tparam DS is the data size. | |||
/// @tparam VS is the number of elements to load per address. | |||
/// @tparam Transposed indicates if the data is transposed during the transfer. | |||
/// @tparam N is the number of channels (platform dependent). | |||
/// @tparam N is the SIMD size of operation (the number of addresses to access, | |||
/// platform dependent). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't quite understand what "platform-dependent" means for N
, as this is what user passes in the code. Restrictions maybe platform-dependent, but that's normal situation for many other ESIMD API parameters, so I suggest to remove it here and on all other similar places
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed 'platform dependent' comment.
Unrelated failures from HIP AMDGPU LLVM Test Suite
|
@dongkyunahn-intel, @kbobrovs, please check post-commit fails on Windows: https://github.com/intel/llvm/actions/runs/2416580740 |
Posted PR for fix - #6223 |
36 out of 40 LSC tests are passing