Skip to content

[SYCL] Avoid using intrinsics to get global id in stream #2318

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Aug 17, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
13 changes: 6 additions & 7 deletions sycl/include/CL/sycl/stream.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -811,13 +811,12 @@ class __SYCL_EXPORT stream {

#ifdef __SYCL_DEVICE_ONLY__
void __init() {
// Calculate work item's global id, this should be done once, that
// is why this is done in _init method, call to __init method is generated
// by frontend. As a result each work item will write to its own section
// in the flush buffer

id<1> GlobalId = __spirv::initGlobalInvocationId<1, id<1>>();
WIOffset = (unsigned int)GlobalId[0] * FlushBufferSize;
// Calculate offset in the flush buffer for each work item in the global
// work space. We need to avoid calling intrinsics to get global id because
// when stream is used in a single_task kernel this could cause some
// overhead on FPGA target. That is why use global atomic variable to
// calculate offsets.
WIOffset = GlobalOffset[1].fetch_add(FlushBufferSize);
}

void __finalize() {
Expand Down
11 changes: 5 additions & 6 deletions sycl/source/detail/stream_impl.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -38,12 +38,12 @@ class __SYCL_EXPORT stream_impl {
}

// Method to provide an atomic access to the offset in the global stream
// buffer
// buffer and offset in the flush buffer
GlobalOffsetAccessorT accessGlobalOffset(handler &CGH) {
auto OffsetSubBuf = buffer<char, 1>(Buf, id<1>(0), range<1>(OffsetSize));
auto ReinterpretedBuf = OffsetSubBuf.reinterpret<unsigned, 1>(range<1>(1));
auto ReinterpretedBuf = OffsetSubBuf.reinterpret<unsigned, 1>(range<1>(2));
return ReinterpretedBuf.get_access<cl::sycl::access::mode::atomic>(
CGH, range<1>(1), id<1>(0));
CGH, range<1>(2), id<1>(0));
}

// Copy stream buffer to the host and print the contents
Expand All @@ -61,10 +61,9 @@ class __SYCL_EXPORT stream_impl {
// statement till the semicolon
unsigned MaxStatementSize_;

// Size of the variable which is used as an offset in the stream buffer.
// Additinonal memory is allocated in the beginning of the stream buffer for
// this variable.
static const size_t OffsetSize = sizeof(unsigned);
// 2 variables: offset in the stream buffer and offset in the flush buffer.
static const size_t OffsetSize = 2 * sizeof(unsigned);

// Vector on the host side which is used to initialize the stream buffer
std::vector<char> Data;
Expand Down
32 changes: 32 additions & 0 deletions sycl/test/basic_tests/stream/no_intrinsic.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,32 @@
// RUN: %clangxx -S -emit-llvm -fsycl-device-only %s -o - -Xclang -disable-llvm-passes | FileCheck %s

//==------------------ no_intrinsics.cpp - SYCL stream test ----------------==//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

// Test to check that intrinsics to get a global id are not generated for the
// stream.

// CHECK-NOT: call spir_func void @{{.*}}__spirvL22initGlobalInvocationId{{.*}}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit. I think you should check that there's no call to spirv initGlobalInvocationId inside the kernel. That means, that the kernel should still be there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't get this comment. This check covers all call instructions i.e. inside the kernel as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I mean is that if the frontend doesn't generate this instruction for some reason we have to be sure that the kernel is still generated. I believe that there are proper checks for that in frontend part of tests.


#include <CL/sycl.hpp>

using namespace cl::sycl;

int main() {
{
queue Queue;

Queue.submit([&](handler &CGH) {
stream Out(1024, 80, CGH);
CGH.single_task<class integral>([=]() { Out << "Hello, World!\n"; });
});
Queue.wait();
}

return 0;
}