
Commit d57994c

[SYCL][Doc] Use secnum option
Signed-off-by: Lukas Sommer <[email protected]>
1 parent 5576f40 commit d57994c
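
The change replaces hand-written section numbers with Asciidoctor's automatic numbering. A minimal sketch of the effect follows, assuming Asciidoctor's default ID settings (`idprefix` and `idseparator` both `_`). The section titles are taken from the diff; the surrounding document is illustrative only.

```asciidoc
// Document header: turn on automatic section numbering.
:sectnums:
// Number sections down to level 4 (the `=====` headings); the default is 3.
:sectnumlevels: 4

== Specification

=== Limitations

// The title no longer contains "6.5.", so the auto-generated ID changes
// from _6_5_limitations to _limitations, and cross-references must follow.
See <<_limitations>>.
```

Because the number is no longer part of the title text, any cross-reference that used the old auto-generated ID has to be updated, which is why the diff also touches the `<<_6_5_limitations>>` reference.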

1 file changed: 33 additions, 31 deletions

sycl/doc/extensions/experimental/sycl_ext_oneapi_graph_fusion.asciidoc

Lines changed: 33 additions & 31 deletions
@@ -10,15 +10,16 @@
 :encoding: utf-8
 :lang: en
 :dpcpp: pass:[DPC++]
-:stem: asciimath
+:sectnums:
+:sectnumlevels: 4

 // Set the default source code type in this document to C++,
 // for syntax highlighting purposes. This is needed because
 // docbook uses c++ and html5 uses cpp.
 :language: {basebackend@docbook:c++:cpp}


-== 1. Notice
+== Notice

 [%hardbreaks]
 Copyright (C) Codeplay Software Limited. All rights reserved.
@@ -28,14 +29,14 @@ of The Khronos Group Inc. OpenCL(TM) is a trademark of Apple Inc. used by
 permission by Khronos.


-== 2. Contact
+== Contact

 To report problems with this extension, please open a new issue at:

 https://github.com/intel/llvm/issues


-== 3. Dependencies
+== Dependencies

 This extension is written against the SYCL 2020 revision 6 specification. All
 references below to the "core SYCL specification" or to section numbers in the
@@ -45,15 +46,15 @@ This extension builds on top of the proposed SYCL graphs
 https://github.com/reble/llvm/blob/sycl-graph-update/sycl/doc/extensions/proposed/sycl_ext_oneapi_graph.asciidoc[extension
 proposal]. All references to the "graphs proposal" refer to this proposal.

-== 4. Status
+== Status

 This is a proposed extension specification, intended to gather community
 feedback. Interfaces defined in this specification may not be implemented yet
 or may be in a preliminary state. The specification itself may also change in
 incompatible ways before it is finalized. *Shipping software products should
 not rely on APIs defined in this specification.*

-== 5. Overview
+== Overview

 The SYCL graph
 https://github.com/reble/llvm/blob/sycl-graph-update/sycl/doc/extensions/proposed/sycl_ext_oneapi_graph.asciidoc[extension
@@ -87,9 +88,9 @@ fusion of two or more kernels in a SYCL graph into a single kernel **at
 runtime**. This requires the extension of the runtime with some sort of JIT
 compiler to allow for the fusion of kernel functions at runtime.

-== 6. Specification
+== Specification

-=== 6.1. Feature test macro
+=== Feature test macro

 This extension provides a feature-test macro as described in the core SYCL
 specification. An implementation supporting this extension must predefine the
@@ -108,11 +109,11 @@ supports.
 |Initial version of this extension.
 |===

-=== 6.2. API modifications
+=== API modifications

-==== 6.2.1. Properties
+==== Properties

-===== 6.2.1.1. Graph Fusion Property
+===== Graph Fusion Property

 The API for `command_graph<graph_state::modifiable>::finalize()` includes a
 `property_list` parameter. The following property, defined by this extension,
@@ -124,7 +125,7 @@ sycl::ext::oneapi::experimental::property::command_graph::perform_fusion
 ```

 The property is not prescriptive. Implementations are free to not perform fusion
-if it is not possible (see below section <<_6_5_limitations>>), fusion is not
+if it is not possible (see below section <<_limitations>>), fusion is not
 supported by the implementation, or the implementation decides not to perform
 fusion for other reasons. It is not an error if an implementation does not
 perform fusion even though the property is passed.
@@ -133,7 +134,7 @@ Implementations can provide a diagnostic message in case fusion was not
 performed through an implementation-specified mechanism, but are not required to
 do so.

-===== 6.2.1.2. Barrier property
+===== Barrier property

 The following property can be added to the `property_list` of the
 `command_graph<graph_state::modifiable>::finalize()` API.
@@ -150,7 +151,7 @@ The property only takes effect if the
 property is also part of the `property_list` of the same invocation of
 `command_graph<...>::finalize()`.

-===== 6.2.1.3. Local internalization property
+===== Local internalization property

 The following property can be passed to three different APIs, namely:

@@ -187,7 +188,7 @@ no error if they do not perform internalization. Implementations can provide a
 diagnostic message in case internalization was not performed through an
 implementation-specified mechanism, but are not required to do so.

-===== 6.2.1.4. Private internalization property
+===== Private internalization property

 The following property can be passed to three different APIs, namely:

@@ -224,7 +225,7 @@ no error if they do not perform internalization. Implementations can provide a
 diagnostic message in case internalization was not performed through an
 implementation-specified mechanism, but are not required to do so.

-==== 6.2.2. Device information descriptors
+==== Device information descriptors

 To support querying whether a SYCL device and the underlying platform support
 kernel fusion for graphs, the following device information descriptor is added
@@ -238,7 +239,7 @@ When passed to `device::get_info<...>()`, the function returns `true` if the
 SYCL `device` and the underlying `platform` support kernel fusion for graphs.


-=== 6.3. Linearization
+=== Linearization

 In order to be able to perform kernel fusion, the commands in a graph must be
 arranged in a valid sequential order.
@@ -252,7 +253,7 @@ partial order) is implementation defined. The linearization should be
 deterministic, i.e. it should yield the same sequence when presented with the
 same DAG.

-=== 6.4. Synchronization in kernels
+=== Synchronization in kernels

 Group barriers semantics do not change in the fused kernel and barriers already
 in the unfused kernels are preserved in the fused kernel. Despite this, it is
@@ -261,7 +262,7 @@ same work-group executing a fused kernel, a barrier is added between each of the
 kernels being fused. This automatic insertion of additional barriers can be
 deactivated through the property defined above.

-=== 6.5. Limitations
+=== Limitations

 Some scenarios might require fusion to be cancelled if some undesired scenarios
 arise.
@@ -278,29 +279,29 @@ The following sections describe a number of scenarios that might require to
 cancel fusion. Note that some implementations might be more capable/permissive
 and might not abort fusion in all of these cases.

-==== 6.5.1. Hierarchical Parallelism
+==== Hierarchical Parallelism

 The extension does not support kernels using hierarchical parallelism. Although
 some implementations might want to add support for this kind of kernels.

-==== 6.5.2. Incompatible ND-ranges of the kernels to fuse
+==== Incompatible ND-ranges of the kernels to fuse

 Incompatibility of ND-ranges will be determined by the kernel fusion
 implementation. All implementations should support fusing kernels with the exact
 same ND-ranges, but implementations might cancel fusion as soon as a kernel with
 a different ND-range is submitted.

-==== 6.5.3. Kernels with different dimensions
+==== Kernels with different dimensions

 Similar to the previous one, it is implementation-defined whether or not to
 support fusing kernels with different dimensionality.

-==== 6.5.4. No intermediate representation
+==== No intermediate representation

 In case any of the kernels to be fused does not come with an accessible
 suitable intermediate representation, kernel fusion is canceled.

-==== 6.5.5. Explicit memory operations and host tasks
+==== Explicit memory operations and host tasks

 The graph proposal allows graphs to contain, next to device kernels, explicit
 memory operations and host tasks. As both of these types of commands cannot be
@@ -311,13 +312,13 @@ It is valid to execute some memory operations and host tasks before all device
 kernels and some after all device kernels, as long as that sequence is a valid
 linearization.

-==== 6.5.6. Multi-device graph
+==== Multi-device graph

 Attempting to fuse a graph containing device kernels for more than one device
 may lead to fusion being cancelled, as kernel fusion across multiple devices
 and/or backends is generally not possible.

-=== 6.6. Internalization
+=== Internalization

 While avoiding repeated kernel launch overheads will most likely already improve
 application performance, kernel fusion can deliver even higher performance gains
@@ -368,7 +369,7 @@ no internalization were to happen.
 In sum this allows users to trigger internalization of a buffer or allocated
 device memory by just specifying a single property.

-==== 6.6.1. Buffer internalization
+==== Buffer internalization

 In some cases, the user will specify different internalization targets for a
 buffer and accessors to such buffer. When incompatible combinations are used, an
@@ -440,9 +441,9 @@ performed. If there is a mismatch between the two accessors (access
 range, access offset, number of dimensions, data type), no
 internalization is performed.

-== 7. Examples
+== Examples

-=== 7.1. Buffer-based example
+=== Buffer-based example

 ```c++
 #include <sycl/sycl.hpp>
@@ -530,7 +531,7 @@ int main() {
 }
 ```

-=== 7.2. USM-based example
+=== USM-based example

 ```c++
 #include <sycl/sycl.hpp>
@@ -617,11 +618,12 @@ int main() {
 }
 ```

-== 8. Contributors
+== Contributors

 Lukas Sommer, Codeplay +
 Victor Lomüller, Codeplay +
 Victor Perez, Codeplay +
+Ewan Crawford, Codeplay +

 == Revision History
