Commit edccb9b

[SYCL][DOC] Fix warnings after upgrading sphinx
Newer sphinx/myst versions emit more warnings about bad cross-reference targets, for example: `'myst' cross-reference target not found: 'prog-scope-var-decl' [myst.xref_missing]`
1 parent e7c0b89 commit edccb9b

17 files changed, +47 -37 lines changed

sycl/doc/GetStartedGuide.md

Lines changed: 3 additions & 3 deletions
Original file line number | Diff line number | Diff line change
@@ -45,7 +45,7 @@ and a wide range of compute accelerators such as GPU and FPGA.
4545
* `ninja` -
4646
[Download](https://github.com/ninja-build/ninja/wiki/Pre-built-Ninja-packages)
4747
* C++ compiler
48-
* See LLVM's [host compiler toolchain requirements](../../llvm/docs/GettingStarted.rst#host-c-toolchain-both-compiler-and-standard-library)
48+
* See LLVM's [host compiler toolchain requirements](https://github.com/intel/llvm/blob/sycl/llvm/docs/GettingStarted.rst#host-c-toolchain-both-compiler-and-standard-library)
4949

5050
Alternatively, you can use a Docker image that has everything you need for
5151
building pre-installed:
@@ -543,7 +543,7 @@ AOT compiler for each device type:
543543
#### CPU
544544
545545
* CPU AOT compiler `opencl-aot` is enabled by default. For more, see
546-
[opencl-aot documentation](../../opencl/opencl-aot/README.md).
546+
[opencl-aot documentation](https://github.com/intel/llvm/blob/sycl/opencl/opencl-aot/README.md).
547547
548548
#### Accelerator
549549
@@ -709,7 +709,7 @@ ONEAPI_DEVICE_SELECTOR=cuda:* ./simple-sycl-app-cuda.exe
709709
710710
**NOTE**: oneAPI DPC++/SYCL developers can specify SYCL device for execution
711711
using device selectors (e.g. `sycl::cpu_selector_v`, `sycl::gpu_selector_v`,
712-
[Intel FPGA selector(s)](extensions/supported/sycl_ext_intel_fpga_device_selector.md))
712+
[Intel FPGA selector(s)](extensions/supported/sycl_ext_intel_fpga_device_selector.asciidoc))
713713
as explained in following section
714714
[Code the program for a specific GPU](#code-the-program-for-a-specific-gpu).
715715

sycl/doc/conf.py

Lines changed: 1 addition & 1 deletion
@@ -37,7 +37,7 @@
3737
]
3838

3939
# Implicit targets for cross reference
40-
myst_heading_anchors = 4
40+
myst_heading_anchors = 5
4141

4242
# The name of the Pygments (syntax highlighting) style to use.
4343
pygments_style = 'friendly'
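
For context, `myst_heading_anchors = N` asks MyST to generate implicit anchors for headings down to level N, so raising the value from 4 to 5 makes `#####` headings valid link targets. The sketch below is not MyST's implementation; `headings_with_anchors` is a hypothetical helper and the sample text is illustrative. It only approximates which ATX headings a given depth would cover:

```python
import re

# Matches ATX headings such as "### Title"; setext headings are ignored in this sketch.
ATX_HEADING = re.compile(r"^(#{1,6})\s+(.*?)\s*#*\s*$")


def headings_with_anchors(markdown_text, max_depth):
    """Return (level, title) pairs for headings of level 1..max_depth.

    Hypothetical helper: it only approximates which headings a depth
    setting such as myst_heading_anchors would cover.
    """
    covered = []
    in_fence = False
    for line in markdown_text.splitlines():
        if line.lstrip().startswith("```"):
            in_fence = not in_fence  # ignore '#' lines inside fenced code
            continue
        if in_fence:
            continue
        match = ATX_HEADING.match(line)
        if match and len(match.group(1)) <= max_depth:
            covered.append((len(match.group(1)), match.group(2)))
    return covered


sample = "# Top\n#### Level four\n##### Level five\n"
print(headings_with_anchors(sample, 4))  # level-5 heading is left out
print(headings_with_anchors(sample, 5))  # level-5 heading is included
```

With a depth of 4, a link to a level-5 heading has no generated target, which is one way a `myst.xref_missing` warning can arise.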

sycl/doc/cuda/opencl-subgroup-vs-cuda-crosslane-op.md

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
11
# CUDA crosslane vs OpenCL sub-groups
22

33
## Sub-group function mapping
4-
This document describes the mapping of the SYCL subgroup operations (based on the proposal [SYCL subgroup proposal](../extensions/sub_group_ndrange/sub_group_ndrange.md)) to CUDA (queries responses and PTX instruction mapping)
4+
This document describes the mapping of the SYCL subgroup operations (based on the proposal SYCL subgroup proposal) to CUDA (queries responses and PTX instruction mapping)
55

66
### Sub-group device Queries
77

sycl/doc/design/Assert.md

Lines changed: 3 additions & 2 deletions
@@ -41,7 +41,7 @@ int main() {
4141
In this use-case every work-item with even index along 0 dimension will trigger
4242
assertion failure. Assertion failure should trigger a call to `std::abort()` at
4343
host as described in
44-
[extension](../extensions/supported/SYCL_EXT_ONEAPI_ASSERT.asciidoc).
44+
[extension](../extensions/supported/sycl_ext_oneapi_assert.asciidoc).
4545
Even though multiple failures of the same or different assertions can happen in
4646
multiple work-items, implementation is required to deliver at least one
4747
assertion. The assertion failure message is printed to `stderr` by DPCPP
@@ -81,7 +81,7 @@ practical cases.
8181
## How it works?
8282
8383
`assert(expr)` macro ends up in call to `__devicelib_assert_fail`. This function
84-
is part of [Device library extension](DeviceLibExtensions.rst#cl_intel_devicelib_cassert).
84+
is part of [Device library extension](https://github.com/intel/llvm/blob/sycl/doc/design/DeviceLibExtensions.rst#cl_intel_devicelib_cassert).
8585
8686
The format of the assert message is unspecified, but it will always include the
8787
text of the failing expression, the values of the standard macros `__FILE__` and
@@ -168,6 +168,7 @@ image. All of them should have `extern` declaration of program scope variable
168168
available. Definition of the variable is only available within devicelib in the
169169
same binary image where fallback `__devicelib_assert_fail` resides.
170170
171+
(prog-scope-var-decl)=
171172
<a name="prog-scope-var-decl">The variable has the following structure and
172173
declaration:</a>
173174

sycl/doc/design/CommandGraph.md

Lines changed: 3 additions & 3 deletions
@@ -1,7 +1,7 @@
11
# Command-Graph Extension
22

33
This document describes the implementation design of the
4-
[SYCL Graph Extension](../extensions/proposed/sycl_ext_oneapi_graph.asciidoc).
4+
[SYCL Graph Extension](../extensions/experimental/sycl_ext_oneapi_graph.asciidoc).
55

66
A related presentation can be found
77
[here](https://www.youtube.com/watch?v=aOTAmyr04rM).
@@ -121,14 +121,14 @@ proposal. Memory operations will be supported subsequently by the current
121121
implementation starting with `memcpy`.
122122

123123
Buffers and accessors are supported in a command-graph. There are
124-
[spec restrictions](../extensions/proposed/sycl_ext_oneapi_graph.asciidoc#storage-lifetimes)
124+
[spec restrictions](../extensions/experimental/sycl_ext_oneapi_graph.asciidoc#storage-lifetimes)
125125
on buffer usage in a graph so that their lifetime semantics are compatible with
126126
a lazy work execution model. However these changes to storage lifetimes have not
127127
yet been implemented.
128128

129129
## Backend Implementation
130130

131-
Implementation of [UR command-buffers](#UR-command-buffer-experimental-feature)
131+
Implementation of UR command-buffers
132132
for each of the supported SYCL 2020 backends.
133133

134134
This is currently only Level Zero but more sub-sections will be added here as

sycl/doc/design/CompileTimeProperties.md

Lines changed: 2 additions & 2 deletions
@@ -40,7 +40,7 @@ One use for compile-time properties is with types that are used exclusively
4040
for declaring global variables. One such example is the
4141
[sycl\_ext\_oneapi\_device\_global][2] extension:
4242

43-
[2]: <../extensions/proposed/sycl_ext_oneapi_device_global.asciidoc>
43+
[2]: <../extensions/experimental/sycl_ext_oneapi_device_global.asciidoc>
4444

4545
```
4646
namespace sycl::ext::oneapi {
@@ -271,7 +271,7 @@ proposed in the [sycl\_ext\_oneapi\_kernel\_properties][8] extension. There
271271
are two ways the application can specify these properties. The first is by
272272
passing a `properties` parameter to the function that submits the kernel:
273273

274-
[8]: <../extensions/proposed/sycl_ext_oneapi_kernel_properties.asciidoc>
274+
[8]: <../extensions/experimental/sycl_ext_oneapi_kernel_properties.asciidoc>
275275

276276
```
277277
namespace sycl {

sycl/doc/design/CompilerAndRuntimeDesign.md

Lines changed: 3 additions & 3 deletions
@@ -484,7 +484,7 @@ list coming either from `llvm-spirv` or from the AOT backend.
484484
Targeting PTX currently only accepts a single input file for processing, so
485485
`file-table-tform` is used to extract the code file from the file table, which
486486
is then processed by the
487-
["PTX target processing" step](#device-code-post-link-step-for-CUDA).
487+
["PTX target processing" step](#device-code-post-link-step-for-cuda).
488488
The resulting device binary is inserted back into the file table in place of the
489489
extracted code file using `file-table-tform`. If `-fno-sycl-rdc` is specified,
490490
all shown tools are invoked multiple times, once per translation unit rather than
@@ -556,7 +556,7 @@ TBD
556556

557557
##### Specialization constants lowering
558558

559-
See [corresponding documentation](SpecializationConstants.md)
559+
See corresponding documentation
560560

561561
#### CUDA support
562562

@@ -1011,4 +1011,4 @@ with any other address space (including default).
10111011
10121012
## DPC++ Language extensions to SYCL
10131013
1014-
List of language extensions can be found at [extensions](../extensions)
1014+
List of language extensions can be found at [extensions](https://github.com/intel/llvm/blob/sycl/doc/extensions/)

sycl/doc/design/DeviceAspectTraitDesign.md

Lines changed: 1 addition & 1 deletion
@@ -125,6 +125,6 @@ This relies on the fact that unspecialized variants of `any_device_has` and
125125

126126
[1]: <https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html#sec:device-aspects>
127127
[2]: <../extensions/proposed/sycl_ext_oneapi_device_if.asciidoc>
128-
[3]: <../extensions/proposed/sycl_ext_oneapi_device_architecture.asciidoc>
128+
[3]: <../extensions/experimental/sycl_ext_oneapi_device_architecture.asciidoc>
129129
[4]: <DeviceIf.md>
130130
[5]: <OptionalDeviceFeatures.md>

sycl/doc/design/DeviceConfigFile.md

Lines changed: 2 additions & 2 deletions
@@ -274,7 +274,7 @@ in more detail.
274274

275275
### Changes to Build Infrastructure
276276
We need the information about the targets in multiple tools and compiler
277-
modules listed in [Requirements](#Requirements). Thus, we need to make sure
277+
modules listed in [Requirements](#requirements). Thus, we need to make sure
278278
that the generation of the `.inc` file out of the `.td` file is done in time
279279
for all the consumers. The command we need to run for TableGen is `llvm-tblgen
280280
-gen-dynamic-tables -I /llvm-root/llvm/include/ input.td -o output.inc`.
@@ -302,7 +302,7 @@ the Device Configuration File (e.g. `sycl-post-link`) so that each of the
302302
tools can modify the map according to the user extensions described in the
303303
`.yaml` file.
304304

305-
As mentioned in [Requirements](#Requirements), there is an auto-detection
305+
As mentioned in [Requirements](#requirements), there is an auto-detection
306306
mechanism for `aot-toolchain` and `aot-toolchain-options` that is able to
307307
infer these from the target name. In the `.yaml` example shown above the target
308308
name is `intel_gpu_skl`. From that name, we can infer that `aot-toolchain` is

sycl/doc/design/DeviceGlobal.md

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@ This document describes the implementation design for the DPC++ extension
44
[sycl\_ext\_oneapi\_device\_global][1], which allows applications to declare
55
global variables in device code.
66

7-
[1]: <../extensions/proposed/sycl_ext_oneapi_device_global.asciidoc>
7+
[1]: <../extensions/experimental/sycl_ext_oneapi_device_global.asciidoc>
88

99

1010
## Requirements

sycl/doc/design/DeviceIf.md

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ This document describes the design for the DPC++ implementation of the
55
[sycl\_ext\_oneapi\_device\_architecture][2] extensions.
66

77
[1]: <../extensions/proposed/sycl_ext_oneapi_device_if.asciidoc>
8-
[2]: <../extensions/proposed/sycl_ext_oneapi_device_architecture.asciidoc>
8+
[2]: <../extensions/experimental/sycl_ext_oneapi_device_architecture.asciidoc>
99

1010

1111
## Phased implementation

sycl/doc/design/KernelProgramCache.md

Lines changed: 9 additions & 0 deletions
@@ -81,6 +81,7 @@ predefined HW configuration(s). As a general solution it is reasonable to have
8181
program persistent cache which works between application restarts (e.g. cache
8282
on disk for device code built for specific HW/SW configuration).
8383

84+
(what-is-program)=
8485
<a name="what-is-program">1</a>: Here "program" means an internal SYCL runtime
8586
object corresponding to a device code module or native binary defining a set of
8687
SYCL kernels and/or device functions.
@@ -112,9 +113,11 @@ The kernels map's key consists of two components:
112113
- the program the kernel belongs to,
113114
- kernel name<sup>[3](#what-is-kname)</sup>.
114115

116+
(what-is-ksid)=
115117
<a name="what-is-ksid">1</a>: Kernel set id is an ordinal number of the device
116118
binary image the kernel is contained in.
117119

120+
(what-is-bopts)=
118121
<a name="what-is-bopts">2</a>: The concatenation of build options (both compile
119122
and link options) set in application or environment variables. There are three
120123
sources of build options that the cache is aware of:
@@ -131,6 +134,7 @@ values (e.g. IGC has
131134
which affect JIT process). Changing such configuration will invalidate cache and
132135
manual cache cleanup should be done.
133136

137+
(what-is-kname)=
134138
<a name="what-is-kname">3</a>: Kernel name is a kernel ID mangled class' name
135139
which is provided to methods of `sycl::handler` (e.g. `parallel_for` or
136140
`single_task`).
@@ -162,9 +166,11 @@ stored on disk (in every <n>.src file located in the cache item directory):
162166
containing 2 files: <max_n+1>.src for key values and <max_n+1>.bin for
163167
built image.
164168

169+
(what-is-diid)=
165170
<a name="what-is-diid">1</a>: Hash out of the device code image used as input
166171
for the build.
167172

173+
(what-is-did)=
168174
<a name="what-is-did">2</a>: Hash out of the string which is concatenation of
169175
values for `info::platform::name`, `info::device::name`,
170176
`info::device::version`, `info::device::driver_version` parameters to
@@ -321,9 +327,11 @@ condition variable. We employ them to signal waiting threads that the build
321327
process for this kernel/program is finished (either successfully or with a
322328
failure).
323329

330+
(remove-pointer)=
324331
<a name="remove-pointer">1</a>: The use of `std::remove_pointer` was omitted for
325332
the sake of simplicity here.
326333

334+
(exception-data)=
327335
<a name="exception-data">2</a>: Actually, we store contents of the exception:
328336
its message and error code.
329337

@@ -387,6 +395,7 @@ in a directory, the directory should be locked until file creation is done.
387395
Advisory locking <sup>[1](#advisory-lock)</sup> is used to ensure that the
388396
user/OS tools are able to manage files.
389397

398+
(advisory-lock)=
390399
<a name="advisory-lock">1.</a> Advisory locks work only when a process
391400
explicitly acquires and releases locks, and are ignored if a process is not
392401
aware of locks.

sycl/doc/design/OffloadDesign.md

Lines changed: 1 addition & 1 deletion
@@ -7,7 +7,7 @@ the DPC++ Compiler. This leverages the existing community Offloading
77
design [OffloadingDesign][1] which covers the Clang driver and code generation
88
steps for creating offloading applications.
99

10-
[1]: <../../../clang/docs/OffloadingDesign.rst>
10+
[1]: <https://github.com/intel/llvm/blob/clang/docs/OffloadingDesign.rst>
1111

1212
The current offloading model is completely encapsulated within the Clang
1313
Compiler Driver requiring the driver to perform all of the additional steps

sycl/doc/design/OptionalDeviceFeatures.md

Lines changed: 3 additions & 3 deletions
@@ -266,7 +266,7 @@ non-FPGA users may want to use the `device_global` property
266266
[`device_image_scope`][5], which requires even non-FPGA users to have precise
267267
control over the way kernels are bundled into device images.
268268

269-
[5]: <../extensions/proposed/sycl_ext_oneapi_device_global.asciidoc#properties-for-device-global-variables>
269+
[5]: <../extensions/experimental/sycl_ext_oneapi_device_global.asciidoc#properties-for-device-global-variables>
270270

271271
The new definition of `-fsycl-device-code-split` is as follows:
272272

@@ -1091,10 +1091,10 @@ The "name" column in this table lists the possible target names. Since not all
10911091
targets have a corresponding enumerator in the `architecture` enumeration, the
10921092
second column tells when there is such an enumerator. The last row in this
10931093
table corresponds to all of the architecture names listed in the
1094-
[sycl\_ext\_intel\_device\_architecture][8] extension whose name starts with
1094+
[sycl\_ext\_oneapi\_device\_architecture][8] extension whose name starts with
10951095
`intel_gpu_`.
10961096

1097-
[8]: <../extensions/proposed/sycl_ext_intel_device_architecture.asciidoc>
1097+
[8]: <../extensions/experimental/sycl_ext_oneapi_device_architecture.asciidoc>
10981098

10991099
TODO: This table needs to be filled out for the CPU variants supported by the
11001100
`opencl-aot` tool (avx512, avx2, avx, sse4.2) and for the FPGA targets. We

sycl/doc/design/SYCLNativeCPU.md

Lines changed: 4 additions & 4 deletions
@@ -31,7 +31,7 @@ In order to execute kernels compiled for `native-cpu`, we provide a PI Plugin. T
3131

3232
# Supported features and current limitations
3333

34-
The SYCL Native CPU flow is still WIP, not optimized and several core SYCL features are currently unsupported. Currently `barrier` and several math builtins are not supported, and attempting to use those will most likely fail with an `undefined reference` error at link time. Examples of supported applications can be found in the [runtime tests](sycl/test/native_cpu).
34+
The SYCL Native CPU flow is still WIP, not optimized and several core SYCL features are currently unsupported. Currently `barrier` and several math builtins are not supported, and attempting to use those will most likely fail with an `undefined reference` error at link time. Examples of supported applications can be found in the [runtime tests](https://github.com/intel/llvm/blob/sycl/sycl/test/native_cpu).
3535

3636

3737
To execute the `e2e` tests on the Native CPU, configure the test suite with:
@@ -93,13 +93,13 @@ entry:
9393
}
9494
```
9595

96-
For the Native CPU target, the device compiler is in charge of materializing the SPIRV builtins (such as `@__spirv_BuiltInGlobalInvocationId`), so that they can be correctly updated by the runtime when executing the kernel. This is performed by the [PrepareSYCLNativeCPU pass](llvm/lib/SYCLLowerIR/PrepareSYCLNativeCPU.cpp).
96+
For the Native CPU target, the device compiler is in charge of materializing the SPIRV builtins (such as `@__spirv_BuiltInGlobalInvocationId`), so that they can be correctly updated by the runtime when executing the kernel. This is performed by the [PrepareSYCLNativeCPU pass](https://github.com/intel/llvm/blob/sycl/llvm/lib/SYCLLowerIR/PrepareSYCLNativeCPU.cpp).
9797
The PrepareSYCLNativeCPUPass also emits a `subhandler` function, which receives the kernel arguments from the SYCL runtime (packed in a vector), unpacks them, and forwards only the used ones to the actual kernel.
9898

9999

100100
## PrepareSYCLNativeCPU Pass
101101

102-
This pass will add a pointer to a `nativecpu_state` struct as kernel argument to all the kernel functions, and it will replace all the uses of SPIRV builtins with the return value of appropriately defined functions, which will read the requested information from the `__nativecpu_state` struct. The `__nativecpu_state` struct and the builtin functions are defined in [native_cpu.hpp](sycl/include/sycl/detail/native_cpu.hpp).
102+
This pass will add a pointer to a `nativecpu_state` struct as kernel argument to all the kernel functions, and it will replace all the uses of SPIRV builtins with the return value of appropriately defined functions, which will read the requested information from the `__nativecpu_state` struct. The `__nativecpu_state` struct and the builtin functions are defined in [native_cpu.hpp](https://github.com/intel/llvm/blob/sycl/sycl/include/sycl/detail/native_cpu.hpp).
103103

104104

105105
The resulting IR is:
@@ -160,7 +160,7 @@ Each entry in the array contains the kernel name as a string, and a pointer to t
160160

161161
## Kernel lowering and execution
162162

163-
The information produced by the device compiler is then employed to correctly lower the kernel LLVM-IR module to the target ISA (this is performed by the driver when `-fsycl-targets=native_cpu` is set). The object file containing the kernel code is linked with the host object file (and libsycl and any other needed library) and the final executable is ran using the Native CPU PI Plug-in, defined in [pi_native_cpu.cpp](sycl/plugins/native_cpu/pi_native_cpu.cpp).
163+
The information produced by the device compiler is then employed to correctly lower the kernel LLVM-IR module to the target ISA (this is performed by the driver when `-fsycl-targets=native_cpu` is set). The object file containing the kernel code is linked with the host object file (and libsycl and any other needed library) and the final executable is ran using the Native CPU PI Plug-in, defined in [pi_native_cpu.cpp](https://github.com/intel/llvm/blob/sycl/sycl/plugins/native_cpu/pi_native_cpu.cpp).
164164

165165
## Ongoing work
166166

sycl/doc/design/SharedLibraries.md

Lines changed: 1 addition & 1 deletion
@@ -351,7 +351,7 @@ of defined symbols. If this assumption is not correct, there can be two cases:
351351
device image is taken to use duplicated symbol
352352
- Same symbols have different definitions. In this case ODR violation takes
353353
place, such situation leads to undefined behaviour. For more details refer
354-
to [ODR violations](#ODR-violations) section.
354+
to [ODR violations](#odr-violations) section.
355355
- The situation when two device images of different formats define the same
356356
symbols with two different definitions is not considered as ODR violation.
357357
In this case the suitable device image will be picked.
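
Several hunks in this commit change only the case of a link fragment (`#Requirements` to `#requirements`, `#ODR-violations` to `#odr-violations`, `#device-code-post-link-step-for-CUDA` to `#device-code-post-link-step-for-cuda`). Generated heading anchors are lowercase, so mixed-case fragments do not resolve. The following is a simplified approximation of GitHub-style slug generation, not the exact function MyST uses; `github_style_slug` is a hypothetical helper and the heading titles are assumed from the fragments above:

```python
import re


def github_style_slug(heading):
    """Approximate a GitHub-style heading anchor (simplified sketch)."""
    slug = heading.strip().lower()
    # Drop punctuation; keep word characters, whitespace and hyphens.
    slug = re.sub(r"[^\w\s-]", "", slug)
    # Each whitespace character becomes one hyphen.
    return re.sub(r"\s", "-", slug)


# Heading titles assumed from the link fragments changed in this commit.
for title in ("Requirements", "ODR violations", "Device code post-link step for CUDA"):
    print(f"#{github_style_slug(title)}")
```

The sketch ignores duplicate headings, which real slug generators usually disambiguate with a numeric suffix.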
