Commit 08163a4

Ruyk and Alexander Johnston authored
[SYCL][CUDA] Improve CUDA backend documentation (#1293)
Co-Authored-By: Alexander Johnston <[email protected]>
Signed-off-by: Ruyman Reyes <[email protected]>
1 parent 1698931 commit 08163a4

2 files changed, +44 -5 lines changed

sycl/CMakeLists.txt

Lines changed: 1 addition & 2 deletions
@@ -142,8 +142,7 @@ install(DIRECTORY ${OPENCL_INCLUDE}/CL
 )

 option(SYCL_BUILD_PI_CUDA
-"Selects the PI API backend. When set to ON, the CUDA backend is selected. \
-When set to OFF, the OpenCL backend is selected." OFF)
+"Enables the CUDA backend for the Plugin Interface" OFF)

 # Configure SYCL version macro
 set(sycl_inc_dir ${CMAKE_CURRENT_SOURCE_DIR}/include)
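
The reworded option describes `SYCL_BUILD_PI_CUDA` as a switch that enables the CUDA backend rather than one that selects between backends. As a rough illustration, the sketch below reads the option back from an already configured build tree. It assumes the usual `NAME:TYPE=VALUE` layout of `CMakeCache.txt`; the helper name `pi_cuda_setting` is hypothetical and not part of this commit.

```bash
# Hypothetical helper: print the cached value of SYCL_BUILD_PI_CUDA
# for the build directory given as the first argument.
pi_cuda_setting() {
  cache="$1/CMakeCache.txt"
  if [ ! -f "$cache" ]; then
    echo "no CMakeCache.txt in $1" >&2
    return 1
  fi
  # Cache entries have the form NAME:TYPE=VALUE; keep only VALUE.
  value=$(sed -n 's/^SYCL_BUILD_PI_CUDA:[A-Z]*=//p' "$cache")
  # Print "unset" when the option does not appear in the cache at all.
  echo "${value:-unset}"
}
```

Calling `pi_cuda_setting path/to/build` on a tree configured with `-DSYCL_BUILD_PI_CUDA=ON` should print `ON`.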

sycl/doc/GetStartedGuide.md

Lines changed: 43 additions & 3 deletions
@@ -123,10 +123,15 @@ should be used.

 There is experimental support for DPC++ for CUDA devices.

-To enable support for CUDA devices, the following arguments need to be added to
-the CMake command when building the DPC++ compiler.
+To enable support for CUDA devices, follow the instructions for the Linux
+DPC++ toolchain, but replace the cmake command with the following one:
+

 ```
+cmake -DCMAKE_BUILD_TYPE=Release \
+-DLLVM_EXTERNAL_PROJECTS="llvm-spirv;sycl" \
+-DLLVM_EXTERNAL_SYCL_SOURCE_DIR=$DPCPP_HOME/llvm/sycl \
+-DLLVM_EXTERNAL_LLVM_SPIRV_SOURCE_DIR=$DPCPP_HOME/llvm/llvm-spirv \
 -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda/ \
 -DLLVM_ENABLE_PROJECTS="clang;llvm-spirv;sycl;libclc" \
 -DSYCL_BUILD_PI_CUDA=ON \
@@ -145,6 +150,24 @@ above.

 # Use DPC++ toolchain

+## Using the DPC++ toolchain on CUDA platforms
+
+The DPC++ toolchain support on CUDA platforms is still in an experimental phase.
+Currently, the DPC++ toolchain relies on having a recent OpenCL implementation
+on the system in order to link applications to the DPC++ runtime.
+The OpenCL implementation is not used at runtime if only the CUDA backend is
+used in the application, but must be installed.
+
+The OpenCL implementation provided by the CUDA SDK is OpenCL 1.2, which is
+too old to link with the DPC++ runtime and lacks some symbols.
+
+We recommend installing the low level CPU runtime, following the instructions
+in the next section.
+
+Instead of installing the low level CPU runtime, it is possible to build and
+install the [Khronos ICD loader](https://github.com/KhronosGroup/OpenCL-ICD-Loader),
+which contains all the symbols required.
+
 ## Install low level runtime

 To run DPC++ applications on OpenCL devices, OpenCL implementation(s) must be
@@ -262,6 +285,9 @@ ninja check-all
 If no OpenCL GPU/CPU runtimes are available, the corresponding tests are
 skipped.

+If CUDA support has been built, it is tested only if there are CUDA devices
+available.
+
 ### Run Khronos\* SYCL\* conformance test suite (optional)

 Khronos\* SYCL\* conformance test suite (CTS) is intended to validate
@@ -394,6 +420,19 @@ clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda-sycldevice \
 This `simple-sycl-app.exe` application doesn't specify SYCL device for
 execution, so SYCL runtime will use `default_selector` logic to select one
 of accelerators available in the system or SYCL host device.
+In this case, the behaviour of the `default_selector` can be altered
+using the `SYCL_BE` environment variable, setting `PI_CUDA` forces
+the usage of the CUDA backend (if available), `PI_OPENCL` will
+force the usage of the OpenCL backend.
+
+```bash
+SYCL_BE=PI_CUDA ./simple-sycl-app-cuda.exe
+```
+
+The default is the OpenCL backend if available.
+If there are no OpenCL or CUDA devices available, the SYCL host device is used.
+The SYCL host device executes the SYCL application directly in the host,
+without using any low-level API.

 Note: `nvptx64-nvidia-cuda-sycldevice` is usable with `-fsycl-targets`
 if clang was built with the cmake option `SYCL_BUILD_PI_CUDA=ON`.
@@ -403,6 +442,7 @@ if clang was built with the cmake option `SYCL_BUILD_PI_CUDA=ON`.
 ./simple-sycl-app.exe
 The results are correct!
 ```
+
 **Note**:
 Currently, when the application has been built with the CUDA target, the CUDA
 backend must be selected at runtime using the `SYCL_BE` environment variable.
@@ -411,7 +451,7 @@ backend must be selected at runtime using the `SYCL_BE` environment variable.
 SYCL_BE=PI_CUDA ./simple-sycl-app-cuda.exe
 ```

-NOTE: DPC++/SYCL developer can specify SYCL device for execution using device
+NOTE: DPC++/SYCL developers can specify SYCL device for execution using device
 selectors (e.g. `cl::sycl::cpu_selector`, `cl::sycl::gpu_selector`,
 [Intel FPGA selector(s)](extensions/IntelFPGA/FPGASelector.md)) as
 explained in following section [Code the program for a specific
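
The backend selection described in the `GetStartedGuide.md` changes can be sketched as a small shell wrapper. This is only a minimal sketch: `run_with_backend` is a hypothetical helper, and it relies solely on the two `SYCL_BE` values named in the diff (`PI_CUDA` and `PI_OPENCL`). Leaving `SYCL_BE` unset falls back to whatever the runtime's `default_selector` picks.

```bash
# Hypothetical helper: run a command with the SYCL backend chosen via SYCL_BE.
# Usage: run_with_backend <cuda|opencl|default> <command> [args...]
run_with_backend() {
  backend="$1"
  shift
  case "$backend" in
    # Force the CUDA backend for this one command.
    cuda)    SYCL_BE=PI_CUDA "$@" ;;
    # Force the OpenCL backend for this one command.
    opencl)  SYCL_BE=PI_OPENCL "$@" ;;
    # Clear SYCL_BE in a subshell so the runtime default applies.
    default) ( unset SYCL_BE; "$@" ) ;;
    *)
      echo "unknown backend: $backend" >&2
      return 2
      ;;
  esac
}
```

For example, `run_with_backend cuda ./simple-sycl-app-cuda.exe` is equivalent to the `SYCL_BE=PI_CUDA ./simple-sycl-app-cuda.exe` invocation shown in the diff.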
