Skip to content

[SYCL][FPGA] Add compute units library. #2395

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 7 commits into from

Conversation

jessicadavies-intel
Copy link
Contributor

Add a header library that allows the user to duplicate a single-task kernel k times, for any k>=1.

For spatial architectures such as FPGA, users often wish to duplicate their kernels several times in order to fill the available area on the chip. Code duplication is easy to achieve using C++ template metaprogramming. We designed this API to improve the usability.

Usage:

// The name of the kernel is templated on the compute unit ID,
// so each compute unit receives a unique name, e.g., foo_kernel<0>
template <std::size_t ID> class foo_kernel;

// Create 5 compute units whose functionality is determined by the generic lambda
intelfpga::submit_compute_units<5, foo_kernel>(q, [=](auto ID) {
   // kernel function body
  // compute unit functionality can be specialized on ID
});

Add a header library that allows the user to duplicate a single-task kernel k times, for any k>=1.
@jessicadavies-intel jessicadavies-intel requested a review from a team as a code owner August 31, 2020 21:17
Copy link
Contributor

@MrSidims MrSidims left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is it expected to work with pipes? Will a programmer read/writer from different pipes within these kernels (aka by using templates in pipes naming) or we expect some sort of serialization in the runtime?

@jessicadavies-intel
Copy link
Contributor Author

How is it expected to work with pipes? Will a programmer read/writer from different pipes within these kernels (aka by using templates in pipes naming) or we expect some sort of serialization in the runtime?

We have a working example that uses pipes. Yes, the intended use is for each compute unit to read/write to particular pipes, depending on the compute unit's ID.

#include <CL/sycl.hpp>
#include <utility>

namespace intelfpga {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it a proper namespace? I mean, it's only ::intelfpga, not smth like cl::sycl::intelfpga. Is it OK?

Copy link
Contributor

@MrSidims MrSidims Sep 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it's not OK. It shall be at leastcl::sycl::INTEL. And I'm not sure about intelfpga.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I think it should be cl::sycl::INTEL.

@jessicadavies-intel
Copy link
Contributor Author

@MrSidims @s-kanaev do you think this should go in ONEAPI instead of INTEL? It isn't really specific to Intel hardware, so maybe it belongs in ONEAPI instead?

@MrSidims
Copy link
Contributor

MrSidims commented Sep 1, 2020

@MrSidims @s-kanaev do you think this should go in ONEAPI instead of INTEL? It isn't really specific to Intel hardware, so maybe it belongs in ONEAPI instead?

ONEAPI namespace serves for DPCPP language extensions (in theory anyone outside of Intel can add an extension here), while INTEL namespace is for Intel hardware extensions, each extension is for the specific hardware (f. e. pipes are for FPGA and not used anywhere else and ESIMD is for GEN and not used anywhere else). So I'd say this extension belongs to INTEL namespace as it is FPGA H/W extension.

Some comments for the patch. Functional changes are LGTM, but please add a test. Please also consider to add the extension specification. I think, it can be done separately and after the initial commit (yet it's just INTEL extension, not ONEAPI, where the rules are more strict).

Add a test for the compute units header. Also add the pipe array header, since both headers can be easily tested at the same time.
@@ -0,0 +1,78 @@
//==------------ compute_units.cpp - SYCL FPGA compute units test ----------==//
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to add compile and run string for the test. You can take one from pipes.cpp

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!


#pragma once

#include <CL/sycl/intel/pipes.hpp>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#include <CL/sycl/intel/pipes.hpp>
#include <CL/sycl/INTEL/pipes.hpp>

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Copy link
Contributor

@keryell keryell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for not doing this with #pragma, devconstexpr or [[attribute]]! :-)

SubmitOneComputeUnit(Func &&f, sycl::queue &q) {

q.submit([&](sycl::handler &h) {
h.single_task<Name<index>>([=]() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
h.single_task<Name<index>>([=]() {
h.single_task<Name<index>>([=] {

void SourceKernel(queue &q, float data) {

q.submit([&](handler &h) {
h.single_task<source_kernel>([=]() { Pipes::PipeAt<0>::write(data); });
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
h.single_task<source_kernel>([=]() { Pipes::PipeAt<0>::write(data); });
h.single_task<source_kernel>([=] { Pipes::PipeAt<0>::write(data); });

Otherwise, I was thinking to this alternative API:

    h.single_task<source_kernel>([=] { Pipes::write<0>(data); });

} // namespace

template <class Id, typename BaseTy, size_t depth, size_t... dims>
struct PipeArray {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not really an array... But I cannot find a better name.
pipe_array looks more similar to SYCL coding style, otherwise.

};

template <size_t... idxs>
using PipeAt = typename VerifyIndices<idxs...>::VerifiedPipe;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps get to be closer to std::tuple and even std::array through the tuple interface?
Or just at?
I am not convinced you need to repeat pipe in the name.


#include <CL/sycl/INTEL/fpga_extensions.hpp>

using namespace sycl;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bad coding style for a tutorial but probably OK in a unit test...

q.submit([&](handler &h) {
auto out_accessor = out_buf.get_access<access::mode::write>(h);
h.single_task<sink_kernel>(
[=]() { out_accessor[0] = Pipes::PipeAt<kComputeUnits>::read(); });
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
[=]() { out_accessor[0] = Pipes::PipeAt<kComputeUnits>::read(); });
[=] { out_accessor[0] = Pipes::PipeAt<kComputeUnits>::read(); });

Otherwise, I was thinking to this alternative API:

        [=] { out_accessor[0] = Pipes::read<kComputeUnits>(); });

@jessicadavies-intel
Copy link
Contributor Author

@keryell thank you for the review. I will use your suggestions to simplify the read/write API.

After talking to @jbrodman we are planning to move these headers to a separate repo. Therefore, I'm closing this PR.

jsji pushed a commit that referenced this pull request Mar 7, 2024
In translation from  __spirv_AtomicCompareExchange to OpenCL builtin
atomic_compare_exchange_strong_explicit, a new alloca `expected` is
created and read/written in the OpenCL builtin.
The OpenCL builtin call can't have tail marker since the marker requires
that callee doesn't access alloca from the caller.
Otherwise llvm alias analysis deduces that the alloca isn't accessed by
the call, and instcombine pass replaces the load from the alloca after
the call with the value stored to the alloca before the call.

Original commit:
KhronosGroup/SPIRV-LLVM-Translator@1ff4a764cd0f97c
martygrant added a commit to martygrant/llvm that referenced this pull request Dec 2, 2024
Chenyang-L pushed a commit that referenced this pull request Feb 18, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants