Skip to content

[ESIMD] Add ext::intel::experimental::esimd::wait() #8434

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Feb 23, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions llvm/lib/SYCLLowerIR/ESIMD/LowerESIMD.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -505,6 +505,7 @@ class ESIMDIntrinDescTable {
{"raw_send2_noresult",
{"raw.send2.noresult",
{a(0), a(1), ai1(2), a(3), a(4), a(5), a(6), a(7)}}},
{"wait", {"dummy.mov", {a(0)}}},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw the llvm-test-suite test but should we also add a lit test for the lowering?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test intel/llvm-test-suite#1615 does verify the lowering.
Please see the "// CHECK: " lines there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah there's some tradeoff because that's technically out of tree and doesn't get run with check-llvm, but I don't think it's a big deal

{"dpas2",
{"dpas2", {a(0), a(1), a(2), t(0), t(1), t(2), t(3), t(11), t(12)}}},
{"dpas_nosrc0", {"dpas.nosrc0", {a(0), a(1), t(0)}}},
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,11 @@ __ESIMD_INTRIN void __esimd_sbarrier(__ESIMD_ENS::split_barrier_action flag)
}
#endif // __SYCL_DEVICE_ONLY__

#ifdef __SYCL_DEVICE_ONLY__
// Create an explicit data and GPU scoreboard dependency.
__ESIMD_INTRIN void __esimd_wait(uint16_t value);
#endif // __SYCL_DEVICE_ONLY__

// \brief Raw sends load.
//
// @param modifier the send message flags (Bit-0: isSendc, Bit-1: isEOT).
Expand Down
25 changes: 25 additions & 0 deletions sycl/include/sycl/ext/intel/experimental/esimd/memory.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -253,6 +253,31 @@ __ESIMD_API void named_barrier_signal(uint8_t barrier_id,
0 /*sendc*/, gateway, descriptor, payload, 1 /*pred*/);
}

/// Create explicit scoreboard dependency to avoid device code motion
/// across this call and preserve the \p value computation even
/// if it is unused.
template <typename T, int N>
__ESIMD_API std::enable_if_t<(sizeof(T) * N >= 2)>
wait(__ESIMD_NS::simd<T, N> value) {
#ifdef __SYCL_DEVICE_ONLY__
uint16_t Word = value.template bit_cast_view<uint16_t>()[0];
__esimd_wait(Word);
#endif // __SYCL_DEVICE_ONLY__
}

/// Create explicit scoreboard dependency to avoid device code motion
/// across this call and preserve the \p value computation even
/// if it is unused.
template <typename T, typename RegionT>
__ESIMD_API std::enable_if_t<
(RegionT::length * sizeof(typename RegionT::element_type) >= 2)>
wait(__ESIMD_NS::simd_view<T, RegionT> value) {
#ifdef __SYCL_DEVICE_ONLY__
uint16_t Word = value.template bit_cast_view<uint16_t>()[0];
__esimd_wait(Word);
#endif // __SYCL_DEVICE_ONLY__
}

/// @} sycl_esimd_memory_nbarrier

/// @defgroup sycl_esimd_memory_lsc LSC memory access APIs.
Expand Down