@@ -168,25 +168,31 @@ There is experimental support for DPC++ for CUDA devices.
To enable support for CUDA devices, follow the instructions for the Linux or
Windows DPC++ toolchain, but add the `--cuda` flag to `configure.py`. Note,
- the CUDA backend has experimental Windows support, windows subsystem for
+ the CUDA backend has Windows support; windows subsystem for
linux (WSL) is not needed to build and run the CUDA backend.

- Enabling this flag requires an installation of
+ Enabling this flag requires an installation of at least
[CUDA 10.2](https://developer.nvidia.com/cuda-10.2-download-archive) on
the system, refer to
[NVIDIA CUDA Installation Guide for Linux](https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html)
or
[NVIDIA CUDA Installation Guide for Windows](https://docs.nvidia.com/cuda/cuda-installation-guide-microsoft-windows/index.html)
+
+ **_NOTE:_** An installation of at least
+ [CUDA 11.6](https://developer.nvidia.com/cuda-downloads) is recommended because
+ there is a known issue with some math builtins when using -O1/O2/O3
+ Optimization options for CUDA toolkits prior to 11.6 (This is due to a bug in
+ earlier versions of the CUDA toolkit: see
+ [this issue](https://forums.developer.nvidia.com/t/libdevice-functions-causing-ptxas-segfault/193352)).
+
An installation of at least
[CUDA 11.0](https://developer.nvidia.com/cuda-11.0-download-archive)
- is required for fully utilize Turing (SM 75) devices.
-
- Currently, the only combination tested is Ubuntu 18.04 with CUDA 10.2 using
- a Titan RTX GPU (SM 71). The CUDA backend should work on Windows or Linux
- operating systems with any GPU compatible with SM 50 or above. The default
- SM for the NVIDIA CUDA backend is 5.0. Users can specify lower values,
- but some features may not be supported. Windows CUDA support is experimental
- as it is not currently tested on the CI.
+ is required to fully utilize Turing (SM 75) devices and to enable Ampere (SM 80)
+ core features.
+
+ The CUDA backend should work on Windows or Linux operating systems with any
+ GPU compatible with SM 50 or above. The default SM for the NVIDIA CUDA backend
+ is 5.0. Users can specify lower values, but some features may not be supported.
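
The CUDA toolkit version thresholds named in this hunk (10.2, 11.0, 11.6) can be summarized in a short check. The following is only a minimal sketch and is not part of the patch: the helper names `parse_release` and `describe_toolkit` are hypothetical, and it assumes the version is read from text of the form `release X.Y`, as printed by `nvcc --version`.

```python
import re

# Thresholds taken from the text of this hunk.
MINIMUM = (10, 2)        # needed to enable the --cuda flag
TURING_AMPERE = (11, 0)  # full Turing (SM 75) use, Ampere (SM 80) core features
RECOMMENDED = (11, 6)    # avoids the known math-builtin issue at -O1/-O2/-O3


def parse_release(text):
    """Extract (major, minor) from text containing 'release X.Y'."""
    match = re.search(r"release (\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("no 'release X.Y' found in input")
    return int(match.group(1)), int(match.group(2))


def describe_toolkit(version):
    """Return a list of notes for a (major, minor) CUDA toolkit version."""
    if version < MINIMUM:
        return ["below CUDA 10.2: too old for the --cuda flag"]
    notes = []
    if version < TURING_AMPERE:
        notes.append("below CUDA 11.0: Turing/Ampere support is limited")
    if version < RECOMMENDED:
        notes.append("below CUDA 11.6: math builtins may fail at -O1/-O2/-O3")
    if not notes:
        notes.append("meets all version recommendations")
    return notes


if __name__ == "__main__":
    # Hypothetical sample line in the style of `nvcc --version` output.
    sample = "Cuda compilation tools, release 11.0, V11.0.194"
    for note in describe_toolkit(parse_release(sample)):
        print(note)
```

Comparing versions as `(major, minor)` tuples keeps the ordering correct for cases such as 10.2 versus 11.0, which a plain string or float comparison could get wrong.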

**Non-standard CUDA location**