@@ -10,14 +10,17 @@ and a wide range of compute accelerators such as GPU and FPGA.
* [Build DPC++ toolchain](#build-dpc-toolchain)
* [Build DPC++ toolchain with libc++ library](#build-dpc-toolchain-with-libc-library)
* [Build DPC++ toolchain with support for NVIDIA CUDA](#build-dpc-toolchain-with-support-for-nvidia-cuda)
+ * [Build Doxygen documentation](#build-doxygen-documentation)
* [Use DPC++ toolchain](#use-dpc-toolchain)
* [Install low level runtime](#install-low-level-runtime)
* [Obtain prerequisites for ahead of time (AOT) compilation](#obtain-prerequisites-for-ahead-of-time-aot-compilation)
* [Test DPC++ toolchain](#test-dpc-toolchain)
* [Run simple DPC++ application](#run-simple-dpc-application)
+ * [Code the program for a specific GPU](#code-the-program-for-a-specific-gpu)
+ * [Using the DPC++ toolchain on CUDA platforms](#using-the-dpc-toolchain-on-cuda-platforms)
* [C++ standard](#c-standard)
* [Known Issues and Limitations](#known-issues-and-limitations)
- * [CUDA backend limitations](#cuda-backend-limitations)
+ * [CUDA back-end limitations](#cuda-back-end-limitations)
* [Find More](#find-more)

## Prerequisites
@@ -145,30 +148,30 @@ a Titan RTX GPU (SM 71), but it should work on any GPU compatible with SM 50 or
above. The default SM for the NVIDIA CUDA backend is 5.0. Users can specify
lower values, but some features may not be supported.

- ### Deployment
+ ### Build Doxygen documentation

- TODO: add instructions how to deploy built DPC++ toolchain.
+ Building Doxygen documentation is similar to building the product itself. First,
+ the following tools need to be installed:

- ## Use DPC++ toolchain
+ * doxygen
+ * graphviz

- ### Using the DPC++ toolchain on CUDA platforms
+ Then you'll need to add the following options to your CMake configuration
+ command:

- The DPC++ toolchain support on CUDA platforms is still in an experimental phase.
- Currently, the DPC++ toolchain relies on having a recent OpenCL implementation
- on the system in order to link applications to the DPC++ runtime.
- The OpenCL implementation is not used at runtime if only the CUDA backend is
- used in the application, but must be installed.
+ ```
+ -DLLVM_ENABLE_DOXYGEN=ON
+ ```

- The OpenCL implementation provided by the CUDA SDK is OpenCL 1.2, which is
- too old to link with the DPC++ runtime and lacks some symbols.
+ After CMake cache is generated, build the documentation with `doxygen-sycl`
+ target. It will be put to `$DPCPP_HOME/llvm/build/tools/sycl/doc/html`
+ directory.

- We recommend installing the low level CPU runtime, following the instructions
- in the next section.
+ ### Deployment

- Instead of installing the low level CPU runtime, it is possible to build and
- install the
- [Khronos ICD loader](https://github.com/KhronosGroup/OpenCL-ICD-Loader),
- which contains all the symbols required.
+ TODO: add instructions how to deploy built DPC++ toolchain.
+
+ ## Use DPC++ toolchain

### Install low level runtime

@@ -394,25 +397,6 @@ cmake -DIntel_SYCL_ROOT=$DPCPP_HOME/deploy -DSYCL_IMPLEMENTATION=Intel_SYCL ...
cmake -DIntel_SYCL_ROOT=%DPCPP_HOME%\deploy -DSYCL_IMPLEMENTATION=Intel_SYCL ...
```

- ### Build Doxygen documentation
-
- Building Doxygen documentation is similar to building the product itself. First,
- the following tools need to be installed:
-
- * doxygen
- * graphviz
-
- Then you'll need to add the following options to your CMake configuration
- command:
-
- ```
- -DLLVM_ENABLE_DOXYGEN=ON
- ```
-
- After CMake cache is generated, build the documentation with `doxygen-sycl`
- target. It will be put to `$DPCPP_HOME/llvm/build/tools/sycl/doc/html`
- directory.
-
### Run simple DPC++ application

A simple DPC++ or SYCL\* program consists of following parts:
@@ -634,6 +618,25 @@ class CUDASelector : public cl::sycl::device_selector {
};
```

+ ### Using the DPC++ toolchain on CUDA platforms
+
+ The DPC++ toolchain support on CUDA platforms is still in an experimental phase.
+ Currently, the DPC++ toolchain relies on having a recent OpenCL implementation
+ on the system in order to link applications to the DPC++ runtime.
+ The OpenCL implementation is not used at runtime if only the CUDA backend is
+ used in the application, but must be installed.
+
+ The OpenCL implementation provided by the CUDA SDK is OpenCL 1.2, which is
+ too old to link with the DPC++ runtime and lacks some symbols.
+
+ We recommend installing the low level CPU runtime, following the instructions
+ in the next section.
+
+ Instead of installing the low level CPU runtime, it is possible to build and
+ install the
+ [Khronos ICD loader](https://github.com/KhronosGroup/OpenCL-ICD-Loader),
+ which contains all the symbols required.
+
## C++ standard

* DPC++ runtime and headers require C++14 at least.