@@ -644,11 +644,15 @@ clang++ -fsycl -fsycl-targets=amdgcn-amd-amdhsa \
 The target architecture may also be specified for the CUDA backend, with
 `-Xsycl-target-backend --cuda-gpu-arch=<arch>`. Specifying the architecture is
 necessary if an application aims to use newer hardware features, such as
-native atomic operations or tensor core operations.
+native atomic operations or tensor core operations.
+Moreover, it is possible to pass specific options to CUDA `ptxas` (such as
+`--maxrregcount=<n>` for limiting the register usage or `--verbose` for
+printing generation statistics) using the `-Xcuda-ptxas` flag.
 
 ```bash
 clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda \
   simple-sycl-app.cpp -o simple-sycl-app-cuda.exe \
+  -Xcuda-ptxas --maxrregcount=128 -Xcuda-ptxas --verbose \
   -Xsycl-target-backend --cuda-gpu-arch=sm_80
 ```
 
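Note on the added command: each `ptxas` option is preceded by its own `-Xcuda-ptxas`. A minimal sketch of that pattern follows, assuming a bash shell. The helper names `ptxas_flags` and `sycl_cuda_cmd` are hypothetical and not part of the guide or of the compiler. The script only prints the command so it can be inspected before running.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the guide): prefix every ptxas option
# with -Xcuda-ptxas, mirroring how the command in the diff repeats the flag.
ptxas_flags() {
  local out=""
  local opt
  for opt in "$@"; do
    # Add a separating space only when something was already appended.
    out+="${out:+ }-Xcuda-ptxas ${opt}"
  done
  printf '%s\n' "$out"
}

# Hypothetical helper: print (do not run) the full compile command.
# Arguments: source file, output name, GPU arch, then any ptxas options.
sycl_cuda_cmd() {
  local src="$1" exe="$2" arch="$3"
  shift 3
  printf 'clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda %s -o %s %s -Xsycl-target-backend --cuda-gpu-arch=%s\n' \
    "$src" "$exe" "$(ptxas_flags "$@")" "$arch"
}

# Same options as in the documented example.
sycl_cuda_cmd simple-sycl-app.cpp simple-sycl-app-cuda.exe sm_80 \
  --maxrregcount=128 --verbose
```

The register limit of 128 and the `sm_80` architecture are taken from the documented example; suitable values depend on the kernel and the target GPU.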