-
Notifications
You must be signed in to change notification settings - Fork 787
[SYCL][FPGA] Add compute units library. #2395
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Add a header library that allows the user to duplicate a single-task kernel k times, for any k>=1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is it expected to work with pipes? Will a programmer read/writer from different pipes within these kernels (aka by using templates in pipes naming) or we expect some sort of serialization in the runtime?
We have a working example that uses pipes. Yes, the intended use is for each compute unit to read/write to particular pipes, depending on the compute unit's ID. |
#include <CL/sycl.hpp> | ||
#include <utility> | ||
|
||
namespace intelfpga { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it a proper namespace? I mean, it's only ::intelfpga
, not smth like cl::sycl::intelfpga
. Is it OK?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it's not OK. It shall be at leastcl::sycl::INTEL
. And I'm not sure about intelfpga
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I think it should be cl::sycl::INTEL.
ONEAPI namespace serves for DPCPP language extensions (in theory anyone outside of Intel can add an extension here), while INTEL namespace is for Intel hardware extensions, each extension is for the specific hardware (f. e. pipes are for FPGA and not used anywhere else and ESIMD is for GEN and not used anywhere else). So I'd say this extension belongs to INTEL namespace as it is FPGA H/W extension. Some comments for the patch. Functional changes are LGTM, but please add a test. Please also consider to add the extension specification. I think, it can be done separately and after the initial commit (yet it's just INTEL extension, not ONEAPI, where the rules are more strict). |
Add a test for the compute units header. Also add the pipe array header, since both headers can be easily tested at the same time.
@@ -0,0 +1,78 @@ | |||
//==------------ compute_units.cpp - SYCL FPGA compute units test ----------==// |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to add compile and run string for the test. You can take one from pipes.cpp
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
|
||
#pragma once | ||
|
||
#include <CL/sycl/intel/pipes.hpp> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#include <CL/sycl/intel/pipes.hpp> | |
#include <CL/sycl/INTEL/pipes.hpp> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for not doing this with #pragma
, devconstexpr
or [[attribute]]
! :-)
SubmitOneComputeUnit(Func &&f, sycl::queue &q) { | ||
|
||
q.submit([&](sycl::handler &h) { | ||
h.single_task<Name<index>>([=]() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
h.single_task<Name<index>>([=]() { | |
h.single_task<Name<index>>([=] { |
void SourceKernel(queue &q, float data) { | ||
|
||
q.submit([&](handler &h) { | ||
h.single_task<source_kernel>([=]() { Pipes::PipeAt<0>::write(data); }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
h.single_task<source_kernel>([=]() { Pipes::PipeAt<0>::write(data); }); | |
h.single_task<source_kernel>([=] { Pipes::PipeAt<0>::write(data); }); |
Otherwise, I was thinking to this alternative API:
h.single_task<source_kernel>([=] { Pipes::write<0>(data); });
} // namespace | ||
|
||
template <class Id, typename BaseTy, size_t depth, size_t... dims> | ||
struct PipeArray { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not really an array... But I cannot find a better name.
pipe_array
looks more similar to SYCL coding style, otherwise.
}; | ||
|
||
template <size_t... idxs> | ||
using PipeAt = typename VerifyIndices<idxs...>::VerifiedPipe; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps get
to be closer to std::tuple
and even std::array
through the tuple interface?
Or just at
?
I am not convinced you need to repeat pipe
in the name.
|
||
#include <CL/sycl/INTEL/fpga_extensions.hpp> | ||
|
||
using namespace sycl; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bad coding style for a tutorial but probably OK in a unit test...
q.submit([&](handler &h) { | ||
auto out_accessor = out_buf.get_access<access::mode::write>(h); | ||
h.single_task<sink_kernel>( | ||
[=]() { out_accessor[0] = Pipes::PipeAt<kComputeUnits>::read(); }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[=]() { out_accessor[0] = Pipes::PipeAt<kComputeUnits>::read(); }); | |
[=] { out_accessor[0] = Pipes::PipeAt<kComputeUnits>::read(); }); |
Otherwise, I was thinking to this alternative API:
[=] { out_accessor[0] = Pipes::read<kComputeUnits>(); });
In translation from __spirv_AtomicCompareExchange to OpenCL builtin atomic_compare_exchange_strong_explicit, a new alloca `expected` is created and read/written in the OpenCL builtin. The OpenCL builtin call can't have tail marker since the marker requires that callee doesn't access alloca from the caller. Otherwise llvm alias analysis deduces that the alloca isn't accessed by the call, and instcombine pass replaces the load from the alloca after the call with the value stored to the alloca before the call. Original commit: KhronosGroup/SPIRV-LLVM-Translator@1ff4a764cd0f97c
Add a header library that allows the user to duplicate a single-task kernel k times, for any k>=1.
For spatial architectures such as FPGA, users often wish to duplicate their kernels several times in order to fill the available area on the chip. Code duplication is easy to achieve using C++ template metaprogramming. We designed this API to improve the usability.
Usage: