
Commit 3244eb2

EwanC authored and martygrant committed
Update Documentation with OpenCL limitations
Describe user-facing limitations of the OpenCL graphs backend and document the failing tests this directly affects.
1 parent 7189586 commit 3244eb2

18 files changed, 104 additions and 20 deletions

sycl/doc/design/CommandGraph.md

Lines changed: 52 additions & 18 deletions
Original file line number | Diff line number | Diff line change
@@ -149,8 +149,8 @@ yet been implemented.
149149
Implementation of UR command-buffers
150150
for each of the supported SYCL 2020 backends.
151151

152-
Backends which are implemented currently are: Level Zero, CUDA and OpenCL.
153-
More sub-sections will be added here as other backends are supported.
152+
Backends which are implemented currently are: [Level Zero](#level-zero),
153+
[CUDA](#cuda), and partial support for [OpenCL](#opencl).
154154

155155
### Level Zero
156156

@@ -249,9 +249,41 @@ graph resubmission.
249249

250250
### OpenCL
251251

252-
Command Buffers are defined in the OpenCL spec in the [cl_khr_command_buffer](https://registry.khronos.org/OpenCL/specs/3.0-unified/html/OpenCL_Ext.html#cl_khr_command_buffer) extension.
253-
254-
There are some gaps in both the OpenCL and UR specifications for Command
252+
SYCL-Graphs is only enabled for an OpenCL backend when the
253+
[cl_khr_command_buffer](https://registry.khronos.org/OpenCL/specs/3.0-unified/html/OpenCL_Ext.html#cl_khr_command_buffer)
254+
extension is available. However, this information isn't available until runtime
255+
due to OpenCL implementations being loaded through an ICD.
256+
257+
The `ur_exp_command_buffer` string is conditionally returned from the OpenCL
258+
command-buffer UR backend at runtime based on `cl_khr_command_buffer` support
259+
to indicate that the graphs extension should be enabled. This information
260+
is propagated to the SYCL user via the
261+
`device.get_info<info::device::graph_support>()` query for graphs extension
262+
support.
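
The conditional reporting described above can be sketched in plain C++. This is a minimal model, not the adapter's code: `hasExtension` and `reportedUrExtensions` are hypothetical names, and the input is assumed to be the space-separated string an OpenCL device returns for `CL_DEVICE_EXTENSIONS`. Matching whole tokens matters because other extension names, such as `cl_khr_command_buffer_mutable_dispatch`, begin with the same prefix.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Returns true if `name` appears as a whole token in a space-separated
// OpenCL extension string.
bool hasExtension(const std::string &extensions, const std::string &name) {
  assert(!name.empty());
  std::istringstream stream(extensions);
  std::string token;
  while (stream >> token) {
    if (token == name) {
      return true;
    }
  }
  return false;
}

// Hypothetical helper: builds the UR extension string that would be reported
// for a device, given that device's OpenCL extension string.
std::string reportedUrExtensions(const std::string &clExtensions) {
  std::string result;
  if (hasExtension(clExtensions, "cl_khr_command_buffer")) {
    result += "ur_exp_command_buffer";
  }
  return result;
}
```

With this model, a device that only lists `cl_khr_command_buffer_mutable_dispatch` would not be reported as supporting graphs, while one that lists `cl_khr_command_buffer` anywhere in its extension string would.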
263+
264+
#### Limitations
265+
266+
Due to the API mapping gaps documented in the following section, OpenCL as a
267+
SYCL backend cannot fully support the graphs API. Instead there are
268+
limitations in the types of nodes which a user can add to a graph. Using
269+
an unsupported node type will cause an abort in graph finalization with the
270+
message
271+
`ur_die: Experimental Command-buffer entry point is not implemented for OpenCL adapter.`.
272+
273+
The types of commands which are unsupported, and lead to this abort, are:
274+
* `handler::copy(src, dest)` - Where `src` is an accessor and `dest` is a pointer.
275+
This corresponds to a memory buffer read command.
276+
* `handler::copy(src, dest)` - Where `src` is a pointer and `dest` is an accessor.
277+
This corresponds to a memory buffer write command.
278+
* `handler::copy(src, dest)` or `handler::memcpy(dest, src)` - Where both `src` and
279+
`dest` are USM pointers. This corresponds to a USM copy command.
280+
281+
Note that `handler::copy(src, dest)` where both `src` and `dest` are accessors
282+
is supported, as a memory buffer copy command exists in the OpenCL extension.
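
The rule in the list above reduces to a single predicate over the two operand kinds. The sketch below is a stand-alone model that does not use the SYCL headers; `CopyOperand` and `copySupportedInOpenCLGraph` are hypothetical names introduced only for illustration.

```cpp
#include <cassert>

// Kind of operand passed to handler::copy / handler::memcpy in this model.
enum class CopyOperand { Accessor, Pointer };

// Returns true only for the accessor-to-accessor case, which corresponds to
// the memory buffer copy command that exists in the OpenCL extension.
constexpr bool copySupportedInOpenCLGraph(CopyOperand src, CopyOperand dest) {
  return src == CopyOperand::Accessor && dest == CopyOperand::Accessor;
}

// Walks through the four copy shapes described in the text.
inline void checkCopyShapes() {
  assert(copySupportedInOpenCLGraph(CopyOperand::Accessor, CopyOperand::Accessor));
  assert(!copySupportedInOpenCLGraph(CopyOperand::Accessor, CopyOperand::Pointer)); // buffer read
  assert(!copySupportedInOpenCLGraph(CopyOperand::Pointer, CopyOperand::Accessor)); // buffer write
  assert(!copySupportedInOpenCLGraph(CopyOperand::Pointer, CopyOperand::Pointer));  // USM copy
}
```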
283+
284+
#### UR API Mapping
285+
286+
There are some gaps in both the OpenCL and UR specifications for Command
255287
Buffers shown in the list below. There are implementations in the UR OpenCL
256288
adapter where there is matching support for each function in the list.
257289

@@ -280,27 +312,29 @@ adapter where there is matching support for each function in the list.
280312
| | clCommandSVMMemcpyKHR | No |
281313
| | clCommandSVMMemFillKHR | No |
282314

315+
#### UR Command-Buffer Implementation
316+
283317
Many of the OpenCL functions take a `cl_command_queue` parameter which is not
284318
present in most of the UR functions. Instead, when a new command buffer is
285-
created in `urCommandBufferCreateExp` we also create and maintain a new
286-
internal `ur_queue_handle_t` with a reference stored inside of the
287-
`ur_exp_command_buffer_handle_t_` struct. This internal queue is then used with
288-
the various append functions. The internal queue is retained and released
289-
whenever the owning command buffer is retained or released.
319+
created in `urCommandBufferCreateExp` we also create and maintain a new
320+
internal `ur_queue_handle_t` with a reference stored inside of the
321+
`ur_exp_command_buffer_handle_t_` struct. The internal queue is retained and
322+
released whenever the owning command buffer is retained or released.
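
A minimal model of this paired reference counting is shown below. The `MockQueue` and `MockCommandBuffer` structs are simplified stand-ins invented for this sketch, not the adapter's real `ur_queue_handle_t` or `ur_exp_command_buffer_handle_t_` types, and the sketch leaves out object destruction.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-in for the internal queue.
struct MockQueue {
  std::uint32_t refCount = 1; // one reference held by its creator
};

// Simplified stand-in for a command buffer that stores a reference to the
// internal queue created alongside it.
struct MockCommandBuffer {
  explicit MockCommandBuffer(MockQueue *queue) : internalQueue(queue) {}
  MockQueue *internalQueue;
  std::uint32_t refCount = 1;
};

// Retaining the command buffer also retains its internal queue.
void retainCommandBuffer(MockCommandBuffer &buffer) {
  ++buffer.refCount;
  ++buffer.internalQueue->refCount;
}

// Releasing the command buffer also releases its internal queue.
void releaseCommandBuffer(MockCommandBuffer &buffer) {
  assert(buffer.refCount > 0 && buffer.internalQueue->refCount > 0);
  --buffer.refCount;
  --buffer.internalQueue->refCount;
}
```

Keeping the two counts in step means the internal queue cannot outlive, or be destroyed before, the command buffer that owns it.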
290323

291324
With command buffers being an OpenCL extension, each function is accessed by
292325
loading a function pointer to its implementation. These are defined in a common
293326
header file in the UR OpenCL adapter. The symbols for the functions are however
294-
defined in [OpenCL-Headers](https://github.com/KhronosGroup/OpenCL-Headers/blob/main/CL/cl_ext.h) but it is not known at this time what version of the headers will be used in the UR GitHub CI configuration, so loading the function
295-
pointers will be used until this can be verified. A future piece of work would
296-
be replacing the custom defined symbols with the ones from OpenCL-Headers.
327+
defined in [OpenCL-Headers](https://github.com/KhronosGroup/OpenCL-Headers/blob/main/CL/cl_ext.h)
328+
but it is not known at this time what version of the headers will be used in
329+
the UR GitHub CI configuration, so loading the function pointers will be used
330+
until this can be verified. A future piece of work would be replacing the
331+
custom defined symbols with the ones from OpenCL-Headers.
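
The load-by-name pattern can be sketched without the OpenCL headers. In the block below, `fakeGetExtensionAddress` is a stand-in for a loader such as `clGetExtensionFunctionAddressForPlatform`, `FinalizeFn` is a deliberately simplified signature rather than the real `clFinalizeCommandBufferKHR` prototype, and `loadExtensionFunction` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstring>

// Simplified, hypothetical signature for an extension entry point.
using FinalizeFn = int (*)(void *commandBuffer);

// Stand-in implementation that the fake loader hands out.
int fakeFinalize(void *) { return 0; }

// Stand-in for the platform's extension loader: returns the address of a
// known entry point, or nullptr when the name is not recognised.
void *fakeGetExtensionAddress(const char *name) {
  if (std::strcmp(name, "clFinalizeCommandBufferKHR") == 0) {
    return reinterpret_cast<void *>(&fakeFinalize);
  }
  return nullptr;
}

// Loads `name` into `cache` on first use and reuses the pointer afterwards.
// Returns false when the entry point is unavailable.
template <typename FnPtr>
bool loadExtensionFunction(const char *name, FnPtr &cache) {
  assert(name != nullptr);
  if (cache != nullptr) {
    return true; // already loaded
  }
  void *address = fakeGetExtensionAddress(name);
  if (address == nullptr) {
    return false;
  }
  cache = reinterpret_cast<FnPtr>(address);
  return true;
}
```

A caller would check the return value before invoking the cached pointer, since an OpenCL implementation without the extension returns no address for these names.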
332+
333+
#### Available OpenCL Command-Buffer Implementations
297334

298-
The `UR_DEVICE_INFO_EXTENSIONS` enum can be used with `urDeviceGetInfo` to
299-
query if a specified device supports OpenCL command buffers. This will append
300-
`ur_exp_command_buffer` to a string pointer passed to the function if the
301-
extension is supported.
335+
Publicly available implementations of `cl_khr_command_buffer` that can be used
336+
to enable the graphs extension in OpenCL:
302337

303-
Known implementations of cl_khr_command_buffer:
304338
- [OneAPI-Construction-Kit](https://github.com/codeplaysoftware/oneapi-construction-kit) (must enable `OCL_EXTENSION_cl_khr_command_buffer` when building)
305339
- [PoCL](http://portablecl.org/)
306340
- [Command-Buffer Emulation Layer](https://github.com/bashbaug/SimpleOpenCLSamples/tree/main/layers/10_cmdbufemu)

sycl/doc/design/images/SYCL-Graph-Architecture.svg

Lines changed: 1 addition & 1 deletion

sycl/test-e2e/Graph/Explicit/buffer_copy_host2target.cpp

Lines changed: 3 additions & 0 deletions
@@ -7,6 +7,9 @@
77
//
88
// TODO enable cuda once buffer issue investigated and fixed
99
// UNSUPPORTED: cuda
10+
//
11+
// Host to device copy command not supported for OpenCL
12+
// UNSUPPORTED: opencl
1013

1114
#define GRAPH_E2E_EXPLICIT
1215

sycl/test-e2e/Graph/Explicit/buffer_copy_host2target_2d.cpp

Lines changed: 3 additions & 0 deletions
@@ -7,6 +7,9 @@
77
//
88
// TODO enable cuda once buffer issue investigated and fixed
99
// UNSUPPORTED: cuda
10+
//
11+
// Host to device copy command not supported for OpenCL
12+
// UNSUPPORTED: opencl
1013

1114
#define GRAPH_E2E_EXPLICIT
1215

sycl/test-e2e/Graph/Explicit/buffer_copy_host2target_offset.cpp

Lines changed: 3 additions & 0 deletions
@@ -7,6 +7,9 @@
77
//
88
// TODO enable cuda once buffer issue investigated and fixed
99
// UNSUPPORTED: cuda
10+
//
11+
// Host to device copy command not supported for OpenCL
12+
// UNSUPPORTED: opencl
1013

1114
#define GRAPH_E2E_EXPLICIT
1215

sycl/test-e2e/Graph/Explicit/buffer_copy_target2host.cpp

Lines changed: 3 additions & 0 deletions
@@ -7,6 +7,9 @@
77
//
88
// TODO enable cuda once buffer issue investigated and fixed
99
// UNSUPPORTED: cuda
10+
//
11+
// Device to host copy command not supported for OpenCL
12+
// UNSUPPORTED: opencl
1013

1114
#define GRAPH_E2E_EXPLICIT
1215

sycl/test-e2e/Graph/Explicit/buffer_copy_target2host_2d.cpp

Lines changed: 3 additions & 0 deletions
@@ -7,6 +7,9 @@
77
//
88
// TODO enable cuda once buffer issue investigated and fixed
99
// UNSUPPORTED: cuda
10+
//
11+
// Device to host copy command not supported for OpenCL
12+
// UNSUPPORTED: opencl
1013

1114
#define GRAPH_E2E_EXPLICIT
1215

sycl/test-e2e/Graph/Explicit/buffer_copy_target2host_offset.cpp

Lines changed: 3 additions & 0 deletions
@@ -7,6 +7,9 @@
77
//
88
// TODO enable cuda once buffer issue investigated and fixed
99
// UNSUPPORTED: cuda
10+
//
11+
// Device to host copy command not supported for OpenCL
12+
// UNSUPPORTED: opencl
1013

1114
#define GRAPH_E2E_EXPLICIT
1215

sycl/test-e2e/Graph/Explicit/usm_copy.cpp

Lines changed: 3 additions & 0 deletions
@@ -4,6 +4,9 @@
44
// RUN: %if ext_oneapi_level_zero %{env UR_L0_LEAKS_DEBUG=1 %{run} %t.out 2>&1 | FileCheck %s %}
55
//
66
// CHECK-NOT: LEAK
7+
//
8+
// USM copy command not supported for OpenCL
9+
// UNSUPPORTED: opencl
710

811
#define GRAPH_E2E_EXPLICIT
912

sycl/test-e2e/Graph/RecordReplay/buffer_copy_host2target.cpp

Lines changed: 3 additions & 0 deletions
@@ -7,6 +7,9 @@
77
//
88
// TODO enable cuda once buffer issue investigated and fixed
99
// UNSUPPORTED: cuda
10+
//
11+
// Host to device copy command not supported for OpenCL
12+
// UNSUPPORTED: opencl
1013

1114
#define GRAPH_E2E_RECORD_REPLAY
1215

sycl/test-e2e/Graph/RecordReplay/buffer_copy_host2target_2d.cpp

Lines changed: 3 additions & 0 deletions
@@ -7,6 +7,9 @@
77
//
88
// TODO enable cuda once buffer issue investigated and fixed
99
// UNSUPPORTED: cuda
10+
//
11+
// Host to device copy command not supported for OpenCL
12+
// UNSUPPORTED: opencl
1013

1114
#define GRAPH_E2E_RECORD_REPLAY
1215

sycl/test-e2e/Graph/RecordReplay/buffer_copy_host2target_offset.cpp

Lines changed: 3 additions & 0 deletions
@@ -7,6 +7,9 @@
77
//
88
// TODO enable cuda once buffer issue investigated and fixed
99
// UNSUPPORTED: cuda
10+
//
11+
// Host to device copy command not supported for OpenCL
12+
// UNSUPPORTED: opencl
1013

1114
#define GRAPH_E2E_RECORD_REPLAY
1215

sycl/test-e2e/Graph/RecordReplay/buffer_copy_target2host.cpp

Lines changed: 3 additions & 0 deletions
@@ -7,6 +7,9 @@
77
//
88
// TODO enable cuda once buffer issue investigated and fixed
99
// UNSUPPORTED: cuda
10+
//
11+
// Device to host copy command not supported for OpenCL
12+
// UNSUPPORTED: opencl
1013

1114
#define GRAPH_E2E_RECORD_REPLAY
1215

sycl/test-e2e/Graph/RecordReplay/buffer_copy_target2host_2d.cpp

Lines changed: 3 additions & 0 deletions
@@ -7,6 +7,9 @@
77
//
88
// TODO enable cuda once buffer issue investigated and fixed
99
// UNSUPPORTED: cuda
10+
//
11+
// Device to host copy command not supported for OpenCL
12+
// UNSUPPORTED: opencl
1013

1114
#define GRAPH_E2E_RECORD_REPLAY
1215

sycl/test-e2e/Graph/RecordReplay/buffer_copy_target2host_offset.cpp

Lines changed: 3 additions & 0 deletions
@@ -7,6 +7,9 @@
77
//
88
// TODO enable cuda once buffer issue investigated and fixed
99
// UNSUPPORTED: cuda
10+
//
11+
// Device to host copy command not supported for OpenCL
12+
// UNSUPPORTED: opencl
1013

1114
#define GRAPH_E2E_RECORD_REPLAY
1215

sycl/test-e2e/Graph/RecordReplay/usm_copy.cpp

Lines changed: 3 additions & 0 deletions
@@ -4,6 +4,9 @@
44
// RUN: %if ext_oneapi_level_zero %{env UR_L0_LEAKS_DEBUG=1 %{run} %t.out 2>&1 | FileCheck %s %}
55
//
66
// CHECK-NOT: LEAK
7+
//
8+
// USM copy command not supported for OpenCL
9+
// UNSUPPORTED: opencl
710

811
#define GRAPH_E2E_RECORD_REPLAY
912

sycl/test-e2e/Graph/RecordReplay/usm_copy_in_order.cpp

Lines changed: 3 additions & 0 deletions
@@ -4,6 +4,9 @@
44
// RUN: %if ext_oneapi_level_zero %{env UR_L0_LEAKS_DEBUG=1 %{run} %t.out 2>&1 | FileCheck %s %}
55
//
66
// CHECK-NOT: LEAK
7+
//
8+
// USM copy command not supported for OpenCL
9+
// UNSUPPORTED: opencl
710

811
// Tests memcpy operation using device USM and an in-order queue.
912

sycl/test-e2e/Graph/device_query.cpp

Lines changed: 6 additions & 1 deletion
@@ -21,8 +21,13 @@ int main() {
2121
auto Backend = Device.get_backend();
2222

2323
if ((Backend == backend::ext_oneapi_level_zero) ||
24-
(Backend == backend::ext_oneapi_cuda) || (Backend == backend::opencl)) {
24+
(Backend == backend::ext_oneapi_cuda)) {
2525
assert(SupportsGraphs == exp_ext::graph_support_level::native);
26+
} else if (Backend == backend::opencl) {
27+
// OpenCL backend support is conditional on the cl_khr_command_buffer
28+
// extension being available
29+
assert(SupportsGraphs == exp_ext::graph_support_level::native ||
30+
SupportsGraphs == exp_ext::graph_support_level::unsupported);
2631
} else {
2732
assert(SupportsGraphs == exp_ext::graph_support_level::unsupported);
2833
}
