-
Notifications
You must be signed in to change notification settings - Fork 789
[SYCL][L0] Implementation of ext_oneapi_prod API #13555
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
0ed4727
64b05b3
82b0f3b
f7a8b81
564710b
0bd1486
5b893f4
5824a3d
7ba9aef
de11c48
4945716
752d0da
7a273eb
04f51e9
50231c3
39b8b5e
8b9be33
0a98e8e
9c79a77
97c1968
2cdc9b3
80cfbf3
4cda8a8
f89fde0
e09659d
ede6359
06ec39d
7b3f037
3b8f838
fbfe848
bdcd475
8d8bbb3
d0c7860
94b535f
4c55051
4377c8c
28e031a
d200b7c
d2e286b
24eb52a
c274a1c
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -388,6 +388,20 @@ class queue_impl { | |
template <typename Param> | ||
typename Param::return_type get_backend_info() const; | ||
|
||
/// Provides a hint to the backend to execute previously issued commands on | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If this is not a guarantee then it is useless users like OpenMP will be assuming the job is submitted and inserting barriers to add more dependent tasks. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm not sure I follow. The spec of the extension simply says that eventual execution is not guaranteed by calling ext_oneapi_prod(). Whether an app uses or not this API should not affect the correctness of the program. Barriers or higher level calls like wait should still be used to sync with commands. This is merely an encouragement to increase performance. Perhaps @gmlueck can help interpret the spec and its intended usage if I'm misunderstanding! There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @lbushi25's interpretation of the spec is correct. @RaviNarayanaswamy, I don't understand your concern with barriers and dependent tasks. How does that relate to whether the commands submitted prior to the "prod" call are guaranteed to start executing? The thing that "prod" does not guarantee is that the commands are necessarily running at the same time as the host thread. For example, it would be invalid for the host thread to spin reading host memory waiting until the kernel wrote some value there. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @gmlueck when we call fflush we want it to submit the previous commands before returning. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @RaviNarayanaswamy But I don't think there is a need for that. The queue class itself is an abstraction that allows you to submit commands and get an event back for synchronization. From your end you can think of the kernel as submitted. The flush functionality in this PR is concerned with encouraging execution of the kernel on the device itself, a lower level than what SYCL exposes.. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @lbushi25 If so OpenMP which calls MKL which submits the kernel and returns to OpenMP. Now OpenMP needs to make sure the command is submitted, we cannot use SYCL events to order the scheduling in OpenMP. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
|
||
/// this queue. Overrides normal batching behaviour. Note that this is merely | ||
/// a hint and not a guarantee. | ||
void flush() { | ||
if (MGraph.lock()) { | ||
throw sycl::exception(make_error_code(errc::invalid), | ||
"flush cannot be called for a queue which is " | ||
"recording to a command graph."); | ||
} | ||
for (const auto &queue : MQueues) { | ||
getPlugin()->call<PiApiKind::piQueueFlush>(queue); | ||
} | ||
} | ||
|
||
using SubmitPostProcessF = std::function<void(bool, bool, event &)>; | ||
|
||
/// Submits a command group function object to the queue, in order to be | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -106,6 +106,7 @@ inline namespace _V1 { | |
#define SYCL_EXT_ONEAPI_PRIVATE_ALLOCA 1 | ||
#define SYCL_EXT_ONEAPI_FORWARD_PROGRESS 1 | ||
#define SYCL_EXT_ONEAPI_FREE_FUNCTION_KERNELS 1 | ||
#define SYCL_EXT_ONEAPI_PROD 1 | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. As far as I understand, defining this macro indicates to the user that feature is supported. But it is not supported yet because you have to bring UR changes. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'll wait until the UR changes are merged and then merge this one. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This feature is a hint and implementation may decide to ignore it completely. Therefore, I think that it is fine to say that this is supported, even if we have some bugs on UR side |
||
|
||
#ifndef __has_include | ||
#define __has_include(x) 0 | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,85 @@ | ||
// RUN: %{build} -o %t.out | ||
// RUN: %t.out | ||
|
||
#include <stdlib.h> | ||
#include <sycl/sycl.hpp> | ||
#include <vector> | ||
|
||
using namespace sycl; | ||
using namespace std; | ||
|
||
// This test checks the ext_oneapi_prod extension. This extension introduces | ||
// only one new function, queue::ext_oneapi_prod(), which serves as a hint to | ||
// the compiler to flush the queue. Since it is simply a hint, we cannot really | ||
// test what the backend is doing but we can at least make sure that code | ||
// involving this function compiles and runs successfully in various contexts. | ||
int main() { | ||
sycl::queue q; | ||
|
||
// Test on an empty queue multiple times. | ||
q.ext_oneapi_prod(); | ||
q.ext_oneapi_prod(); | ||
|
||
// Test on a queue after we've submitted a kernel in various contexts. | ||
q.single_task([]() {}); | ||
q.ext_oneapi_prod(); | ||
|
||
q.parallel_for(range<1>{}, [=](auto &idx) {}); | ||
q.ext_oneapi_prod(); | ||
q.wait(); | ||
|
||
// Test that the result of an in-progress addition kernel is not affected by | ||
// calling ext_oneapi_prod. | ||
srand(time(0)); | ||
constexpr int N = 16; | ||
int A[N]; | ||
int B[N]; | ||
int add[N]; | ||
int mult[N]; | ||
for (int i = 0; i < N; ++i) { | ||
A[i] = rand(); | ||
B[i] = rand(); | ||
} | ||
{ | ||
buffer<int> bufA{A, N}; | ||
buffer<int> bufB{B, N}; | ||
buffer<int> bufadd{add, N}; | ||
|
||
q.submit([&](handler &cgh) { | ||
accessor accA{bufA, cgh}; | ||
accessor accB{bufB, cgh}; | ||
accessor accadd{bufadd, cgh}; | ||
cgh.parallel_for(N, [=](id<1> i) { accadd[i] = accA[i] + accB[i]; }); | ||
}); | ||
q.ext_oneapi_prod(); | ||
} | ||
for (int i = 0; i < N; ++i) { | ||
assert(add[i] == (A[i] + B[i])); | ||
} | ||
|
||
// Test that the result of an in-progress addition and multiplication kernel | ||
// is not affected by calling ext_oneapi_prod | ||
{ | ||
buffer<int> bufA{A, N}; | ||
buffer<int> bufB{B, N}; | ||
buffer<int> bufadd{add, N}; | ||
buffer<int> bufmult{mult, N}; | ||
q.submit([&](handler &cgh) { | ||
accessor accA{bufA, cgh}; | ||
accessor accB{bufB, cgh}; | ||
accessor accadd{bufadd, cgh}; | ||
cgh.parallel_for(N, [=](id<1> i) { accadd[i] = accA[i] + accB[i]; }); | ||
}); | ||
|
||
q.submit([&](handler &cgh) { | ||
accessor accA{bufA, cgh}; | ||
accessor accB{bufB, cgh}; | ||
accessor accmult{bufmult, cgh}; | ||
cgh.parallel_for(N, [=](id<1> i) { accmult[i] = accA[i] * accB[i]; }); | ||
}); | ||
q.ext_oneapi_prod(); | ||
} | ||
for (int i = 0; i < N; ++i) { | ||
assert(add[i] == (A[i] + B[i]) && mult[i] == (A[i] * B[i])); | ||
} | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
//==-------- OneAPIProd.cpp --- sycl_ext_oneapi_prod unit tests ------------==// | ||
// | ||
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. | ||
// See https://llvm.org/LICENSE.txt for license information. | ||
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception | ||
// | ||
//===----------------------------------------------------------------------===// | ||
|
||
#include <sycl/sycl.hpp> | ||
|
||
#include <helpers/PiMock.hpp> | ||
|
||
#include <gtest/gtest.h> | ||
|
||
using namespace sycl; | ||
|
||
static bool QueueFlushed = false; | ||
|
||
static pi_result redefinedQueueFlush(pi_queue Queue) { | ||
QueueFlushed = true; | ||
return PI_SUCCESS; | ||
} | ||
|
||
TEST(OneAPIProdTest, PiQueueFlush) { | ||
sycl::unittest::PiMock Mock(backend::ext_oneapi_level_zero); | ||
sycl::platform Plt = Mock.getPlatform(); | ||
Mock.redefine<detail::PiApiKind::piQueueFlush>(redefinedQueueFlush); | ||
context Ctx{Plt}; | ||
queue Queue{Ctx, default_selector_v}; | ||
Queue.ext_oneapi_prod(); | ||
EXPECT_TRUE(QueueFlushed); | ||
sycl::ext::oneapi::experimental::command_graph Graph(Ctx, Queue.get_device()); | ||
Graph.begin_recording(Queue); | ||
try { | ||
Queue.ext_oneapi_prod(); // flushing while graph is recording is not allowed | ||
FAIL() << "Expected exception when calling ext_oneapi_prod() during graph " | ||
"recording not seen."; | ||
} catch (exception &ex) { | ||
EXPECT_EQ(ex.code(), make_error_code(errc::invalid)); | ||
} | ||
lbushi25 marked this conversation as resolved.
Show resolved
Hide resolved
|
||
Graph.end_recording(Queue); | ||
} |
Uh oh!
There was an error while loading. Please reload this page.