Skip to content

[SYCL][ESIMD] Update ESIMD tests and add raw send support. #2482

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Oct 1, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
304 changes: 304 additions & 0 deletions sycl/include/CL/sycl/INTEL/esimd/detail/esimd_memory_intrin.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -227,6 +227,151 @@ __esimd_media_block_store(unsigned modififer, TACC handle, unsigned plane,
unsigned width, unsigned x, unsigned y,
sycl::INTEL::gpu::vector_type_t<Ty, M * N> vals);

/// \brief esimd_get_value
///
/// @param sid the SYCL accessor.
///
/// Returns the binding table index value.
///
template <typename SurfIndAliasTy>
SYCL_EXTERNAL uint32_t __esimd_get_value(SurfIndAliasTy sid);

/// \brief Raw sends load.
///
/// @param modifier the send message flags (Bit-0: isSendc, Bit-1: isEOT).
///
/// @param execSize the execution size, which must be a compile time constant.
///
/// @param pred the predicate to specify enabled channels.
///
/// @param numSrc0 the number of GRFs for source-0, which must be a compile time
/// constant.
///
/// @param numSrc1 the number of GRFs for source-1, which must be a compile time
/// constant.
///
/// @param numDst the number of GRFs for destination, which must be a compile
/// time constant.
///
/// @param sfid the shared function ID, which must be a compile time constant.
///
/// @param exDesc the extended message descriptor.
///
/// @param msgDesc the message descriptor.
///
/// @param msgSrc0 the first source operand of send message.
///
/// @param msgSrc1 the second source operand of send message.
///
/// @param msgDst the destination operand of send message.
///
/// Returns a simd vector of type Ty1 and size N1.
///
template <typename Ty1, int N1, typename Ty2, int N2, typename Ty3, int N3,
int N = 16>
SYCL_EXTERNAL sycl::INTEL::gpu::vector_type_t<Ty1, N1>
__esimd_raw_sends_load(uint8_t modifier, uint8_t execSize,
sycl::INTEL::gpu::vector_type_t<uint16_t, N> pred,
uint8_t numSrc0, uint8_t numSrc1, uint8_t numDst,
uint8_t sfid, uint32_t exDesc, uint32_t msgDesc,
sycl::INTEL::gpu::vector_type_t<Ty2, N2> msgSrc0,
sycl::INTEL::gpu::vector_type_t<Ty3, N3> msgSrc1,
sycl::INTEL::gpu::vector_type_t<Ty1, N1> msgDst);

/// \brief Raw send load.
///
/// @param modifier the send message flags (Bit-0: isSendc, Bit-1: isEOT).
///
/// @param execSize the execution size, which must be a compile time constant.
///
/// @param pred the predicate to specify enabled channels.
///
/// @param numSrc0 the number of GRFs for source-0, which must be a compile time
/// constant.
///
/// @param numDst the number of GRFs for destination, which must be a compile
/// time constant.
///
/// @param sfid the shared function ID, which must be a compile time constant.
///
/// @param exDesc the extended message descriptor.
///
/// @param msgDesc the message descriptor.
///
/// @param msgSrc0 the first source operand of send message.
///
/// @param msgDst the destination operand of send message.
///
/// Returns a simd vector of type Ty1 and size N1.
///
template <typename Ty1, int N1, typename Ty2, int N2, int N = 16>
SYCL_EXTERNAL sycl::INTEL::gpu::vector_type_t<Ty1, N1>
__esimd_raw_send_load(uint8_t modifier, uint8_t execSize,
sycl::INTEL::gpu::vector_type_t<uint16_t, N> pred,
uint8_t numSrc0, uint8_t numDst, uint8_t sfid,
uint32_t exDesc, uint32_t msgDesc,
sycl::INTEL::gpu::vector_type_t<Ty2, N2> msgSrc0,
sycl::INTEL::gpu::vector_type_t<Ty1, N1> msgDst);

/// \brief Raw sends store.
///
/// @param modifier the send message flags (Bit-0: isSendc, Bit-1: isEOT).
///
/// @param execSize the execution size, which must be a compile time constant.
///
/// @param pred the predicate to specify enabled channels.
///
/// @param numSrc0 the number of GRFs for source-0, which must be a compile time
/// constant.
///
/// @param numSrc1 the number of GRFs for source-1, which must be a compile time
/// constant.
///
/// @param sfid the shared function ID, which must be a compile time constant.
///
/// @param exDesc the extended message descriptor.
///
/// @param msgDesc the message descriptor.
///
/// @param msgSrc0 the first source operand of send message.
///
/// @param msgSrc1 the second source operand of send message.
///
template <typename Ty1, int N1, typename Ty2, int N2, int N = 16>
SYCL_EXTERNAL void
__esimd_raw_sends_store(uint8_t modifier, uint8_t execSize,
sycl::INTEL::gpu::vector_type_t<uint16_t, N> pred,
uint8_t numSrc0, uint8_t numSrc1, uint8_t sfid,
uint32_t exDesc, uint32_t msgDesc,
sycl::INTEL::gpu::vector_type_t<Ty1, N1> msgSrc0,
sycl::INTEL::gpu::vector_type_t<Ty2, N2> msgSrc1);

/// \brief Raw send store.
///
/// @param modifier the send message flags (Bit-0: isSendc, Bit-1: isEOT).
///
/// @param execSize the execution size, which must be a compile time constant.
///
/// @param pred the predicate to specify enabled channels.
///
/// @param numSrc0 the number of GRFs for source-0, which must be a compile time
/// constant.
///
/// @param sfid the shared function ID, which must be a compile time constant.
///
/// @param exDesc the extended message descriptor.
///
/// @param msgDesc the message descriptor.
///
/// @param msgSrc0 the first source operand of send message.
///
template <typename Ty1, int N1, int N = 16>
SYCL_EXTERNAL void
__esimd_raw_send_store(uint8_t modifier, uint8_t execSize,
sycl::INTEL::gpu::vector_type_t<uint16_t, N> pred,
uint8_t numSrc0, uint8_t sfid, uint32_t exDesc,
uint32_t msgDesc,
sycl::INTEL::gpu::vector_type_t<Ty1, N1> msgSrc0);
#ifndef __SYCL_DEVICE_ONLY__

template <typename Ty, int N, int NumBlk, sycl::INTEL::gpu::CacheHint L1H,
Expand Down Expand Up @@ -660,4 +805,163 @@ __esimd_block_write(SurfIndAliasTy surf_ind, uint32_t offset,
throw cl::sycl::feature_not_supported();
}

/// \brief esimd_get_value
///
/// @param acc the SYCL accessor.
///
/// Returns the binding table index value.
///
template <typename AccessorTy>
SYCL_EXTERNAL uint32_t __esimd_get_value(AccessorTy acc) {
throw cl::sycl::feature_not_supported();
return 0;
}

/// \brief Raw sends load.
///
/// @param modifier the send message flags (Bit-0: isSendc, Bit-1: isEOT).
///
/// @param execSize the execution size, which must be a compile time constant.
///
/// @param pred the predicate to specify enabled channels.
///
/// @param numSrc0 the number of GRFs for source-0, which must be a compile time
/// constant.
///
/// @param numSrc1 the number of GRFs for source-1, which must be a compile time
/// constant.
///
/// @param numDst the number of GRFs for destination, which must be a compile
/// time constant.
///
/// @param sfid the shared function ID, which must be a compile time constant.
///
/// @param exDesc the extended message descriptor.
///
/// @param msgDesc the message descriptor.
///
/// @param msgSrc0 the first source operand of send message.
///
/// @param msgSrc1 the second source operand of send message.
///
/// @param msgDst the destination operand of send message.
///
/// Returns a simd vector of type Ty1 and size N1.
///
template <typename Ty1, int N1, typename Ty2, int N2, typename Ty3, int N3,
int N>
SYCL_EXTERNAL sycl::INTEL::gpu::vector_type_t<Ty1, N1>
__esimd_raw_sends_load(uint8_t modifier, uint8_t execSize,
sycl::INTEL::gpu::vector_type_t<uint16_t, N> pred,
uint8_t numSrc0, uint8_t numSrc1, uint8_t numDst,
uint8_t sfid, uint32_t exDesc, uint32_t msgDesc,
sycl::INTEL::gpu::vector_type_t<Ty2, N2> msgSrc0,
sycl::INTEL::gpu::vector_type_t<Ty3, N3> msgSrc1,
sycl::INTEL::gpu::vector_type_t<Ty1, N1> msgDst) {
throw cl::sycl::feature_not_supported();
return 0;
}

/// \brief Raw send load.
///
/// @param modifier the send message flags (Bit-0: isSendc, Bit-1: isEOT).
///
/// @param execSize the execution size, which must be a compile time constant.
///
/// @param pred the predicate to specify enabled channels.
///
/// @param numSrc0 the number of GRFs for source-0, which must be a compile time
/// constant.
///
/// @param numDst the number of GRFs for destination, which must be a compile
/// time constant.
///
/// @param sfid the shared function ID, which must be a compile time constant.
///
/// @param exDesc the extended message descriptor.
///
/// @param msgDesc the message descriptor.
///
/// @param msgSrc0 the first source operand of send message.
///
/// @param msgDst the destination operand of send message.
///
/// Returns a simd vector of type Ty1 and size N1.
///
template <typename Ty1, int N1, typename Ty2, int N2, int N>
SYCL_EXTERNAL sycl::INTEL::gpu::vector_type_t<Ty1, N1>
__esimd_raw_send_load(uint8_t modifier, uint8_t execSize,
sycl::INTEL::gpu::vector_type_t<uint16_t, N> pred,
uint8_t numSrc0, uint8_t numDst, uint8_t sfid,
uint32_t exDesc, uint32_t msgDesc,
sycl::INTEL::gpu::vector_type_t<Ty2, N2> msgSrc0,
sycl::INTEL::gpu::vector_type_t<Ty1, N1> msgDst) {
throw cl::sycl::feature_not_supported();
return 0;
}

/// \brief Raw sends store.
///
/// @param modifier the send message flags (Bit-0: isSendc, Bit-1: isEOT).
///
/// @param execSize the execution size, which must be a compile time constant.
///
/// @param pred the predicate to specify enabled channels.
///
/// @param numSrc0 the number of GRFs for source-0, which must be a compile time
/// constant.
///
/// @param numSrc1 the number of GRFs for source-1, which must be a compile time
/// constant.
///
/// @param sfid the shared function ID, which must be a compile time constant.
///
/// @param exDesc the extended message descriptor.
///
/// @param msgDesc the message descriptor.
///
/// @param msgSrc0 the first source operand of send message.
///
/// @param msgSrc1 the second source operand of send message.
///
template <typename Ty1, int N1, typename Ty2, int N2, int N>
SYCL_EXTERNAL void
__esimd_raw_sends_store(uint8_t modifier, uint8_t execSize,
sycl::INTEL::gpu::vector_type_t<uint16_t, N> pred,
uint8_t numSrc0, uint8_t numSrc1, uint8_t sfid,
uint32_t exDesc, uint32_t msgDesc,
sycl::INTEL::gpu::vector_type_t<Ty1, N1> msgSrc0,
sycl::INTEL::gpu::vector_type_t<Ty2, N2> msgSrc1) {
throw cl::sycl::feature_not_supported();
}

/// \brief Raw send store.
///
/// @param modifier the send message flags (Bit-0: isSendc, Bit-1: isEOT).
///
/// @param execSize the execution size, which must be a compile time constant.
///
/// @param pred the predicate to specify enabled channels.
///
/// @param numSrc0 the number of GRFs for source-0, which must be a compile time
/// constant.
///
/// @param sfid the shared function ID, which must be a compile time constant.
///
/// @param exDesc the extended message descriptor.
///
/// @param msgDesc the message descriptor.
///
/// @param msgSrc0 the first source operand of send message.
///
template <typename Ty1, int N1, int N>
SYCL_EXTERNAL void
__esimd_raw_send_store(uint8_t modifier, uint8_t execSize,
sycl::INTEL::gpu::vector_type_t<uint16_t, N> pred,
uint8_t numSrc0, uint8_t sfid, uint32_t exDesc,
uint32_t msgDesc,
sycl::INTEL::gpu::vector_type_t<Ty1, N1> msgSrc0) {
throw cl::sycl::feature_not_supported();
}

#endif // __SYCL_DEVICE_ONLY__
6 changes: 3 additions & 3 deletions sycl/include/CL/sycl/INTEL/esimd/esimd_math.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -1838,7 +1838,7 @@ template <typename T0, typename T1, int SZ> struct esimd_apply_reduced_min {

template <typename T0, typename T1, int SZ,
template <typename RT, typename T, int N> class OpType>
T1 esimd_reduce_single(simd<T1, SZ> v) {
T0 esimd_reduce_single(simd<T1, SZ> v) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change does not seem to be related to raw_send.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, as you can see from the change list, this patch includes other local changes including ESIMD tests update and a reduction test fix.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT: it's good to have proper and meaningful names for template types instead of indexed ones.

if constexpr (SZ == 1) {
return v[0];
} else {
Expand All @@ -1854,7 +1854,7 @@ T1 esimd_reduce_single(simd<T1, SZ> v) {

template <typename T0, typename T1, int N1, int N2,
template <typename RT, typename T, int N> class OpType>
T1 esimd_reduce_pair(simd<T1, N1> v1, simd<T1, N2> v2) {
T0 esimd_reduce_pair(simd<T1, N1> v1, simd<T1, N2> v2) {
if constexpr (N1 == N2) {
simd<T0, N1> tmp = OpType<T0, T1, N1>()(v1, v2);
return esimd_reduce_single<T0, T0, N1, OpType>(tmp);
Expand All @@ -1879,7 +1879,7 @@ T1 esimd_reduce_pair(simd<T1, N1> v1, simd<T1, N2> v2) {

template <typename T0, typename T1, int SZ,
template <typename RT, typename T, int N> class OpType>
T1 esimd_reduce(simd<T1, SZ> v) {
T0 esimd_reduce(simd<T1, SZ> v) {
constexpr bool isPowerOf2 = __esimd::isPowerOf2(SZ);
if constexpr (isPowerOf2) {
return esimd_reduce_single<T0, T1, SZ, OpType>(v);
Expand Down
Loading