-
Notifications
You must be signed in to change notification settings - Fork 787
[ESIMD] Implement unified memory API - part2 - block_load(acc, ...) #11545
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
fc989de
to
1ec774a
Compare
1ec774a
to
21f7f62
Compare
21f7f62
to
004697d
Compare
7dc87a3
to
d3da039
Compare
d3da039
to
f9750d1
Compare
f9750d1
to
60cf644
Compare
Signed-off-by: Vyacheslav N Klochkov <[email protected]>
60cf644
to
91e8d45
Compare
…thru Signed-off-by: Vyacheslav N Klochkov <[email protected]>
Signed-off-by: Vyacheslav N Klochkov <[email protected]>
__ESIMD_DNS::lsc_data_size DS = __ESIMD_DNS::lsc_data_size::default_size, | ||
cache_hint L1H = cache_hint::none, cache_hint L2H = cache_hint::none, | ||
typename FlagsT> | ||
template <typename T, int NElts, cache_hint L1H = cache_hint::none, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this block_load_impl() function is actually the copy of the old experimental::lsc_block_load().
Please pay attention that this function here doesn't have the 'lsc_data_size' template parameter that was used in the old experimental::lsc_block_load().
The reason for that is that the data-size is always deduced from the loaded element type.
template <typename Tx, int N, | ||
typename Flags = overaligned_tag<detail::OperandSize::OWORD>> | ||
__ESIMD_API std::enable_if_t<is_simd_flag_type_v<Flags>, simd<Tx, N>> | ||
block_load(const Tx *addr, Flags) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function and the next 2 are not removed - they are just moved, grouped with other user-visible block_load() functions (and block_store() functions).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Only minor comments!
@@ -1475,6 +1348,7 @@ lsc_block_load(AccessorTy acc, uint32_t offset, __ESIMD_NS::simd_mask<1> pred, | |||
return lsc_slm_block_load<T, NElts, DS>( | |||
offset + __ESIMD_DNS::localAccessorToOffset(acc), pred, old_values); | |||
} | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Remove newline if no other changes
__ESIMD_API std::enable_if_t< | ||
ext::oneapi::experimental::is_property_list_v<PropertyListT> && | ||
sycl::detail::acc_properties::is_accessor_v<AccessorT> && | ||
!sycl::detail::acc_properties::is_local_accessor_v<AccessorT>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe a dumb question but we don't allow local accessors for a couple of the functions here, why is that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is such function in experimental/memory.hpp. It is just not ported by this patch yet:
template <typename T, int NElts, lsc_data_size DS = lsc_data_size::default_size,
cache_hint L1H = cache_hint::none, cache_hint L3H = cache_hint::none,
typename AccessorTy,
typename FlagsT = __ESIMD_DNS::dqword_element_aligned_tag>
__ESIMD_API std::enable_if_t<
sycl::detail::acc_properties::is_local_accessor_v<AccessorTy> &&
__ESIMD_NS::is_simd_flag_type_v<FlagsT>,
__ESIMD_NS::simd<T, NElts>>
lsc_block_load(AccessorTy acc, uint32_t offset,
__ESIMD_NS::simd_mask<1> pred = 1, FlagsT flags = FlagsT{}) {
return lsc_slm_block_load<T, NElts, DS>(
offset + __ESIMD_DNS::localAccessorToOffset(acc), pred);
}
@sarnex - Thank you for quick code-review. Because the comments required only few NFC changes - I will merge the current change and upload a separate NFC fix now. |
Signed-off-by: Vyacheslav N Klochkov <[email protected]>
/// Allowed \c NElts values for 8 bit data are 4, 8, 12, 16, 32, 64, 128, 256, | ||
/// Instruction can load max: DG2(64xD32 or 32xD64), PVC(64xD32 or 64xD64). | ||
/// | ||
/// Accesses contiguous block of memory of `NElts * sizeof(T)` bytes starting |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// Accesses contiguous block of memory of `NElts * sizeof(T)` bytes starting | |
/// Accesses contiguous block of memory of `NElts * sizeof(T)` bytes starting |
…11594) Signed-off-by: Vyacheslav N Klochkov <[email protected]>
Add new ESIMD/unified_memory_api/block_load_acc.cpp test (added by intel#11545) to XFAIL list. Signed-off-by: Victor Perez <[email protected]>
No description provided.