[SYCL][NATIVECPU] Enable source-based code coverage in Native CPU (#15073)
Supports [clang's source-based code
coverage](https://clang.llvm.org/docs/SourceBasedCodeCoverage.html) to
enable code coverage testing of SYCL applications via the Native CPU
SYCL target.
Clang's `-fprofile-instr-generate -fcoverage-mapping` options can now be
used with the `native_cpu` SYCL target to compile and instrument host and
device code, so that `llvm-cov` can render a coverage report after the
SYCL application has run (see also the updated documentation in this
PR).
Subsequent PRs will enable more of the currently unsupported
device-compilation options in NativeCPU, including options for
performance profiling.
**Details and explanations for the changes in this PR:**
This PR tests coverage options on the existing NativeCPU vector-add test
by adding an additional invocation with previously disabled options
`-fprofile-instr-generate -fcoverage-mapping -mllvm
-system-headers-coverage`. Enabling these options on device code caused
an [assert in the upstream clang profiling code generation
tools](https://github.com/intel/llvm/blob/b023d407862bd853ba5881c34985f99d039d856c/clang/lib/CodeGen/CoverageMappingGen.cpp#L960)
due to the invalid source location on the AST for the implicitly
generated kernel body, specifically the compound statement containing
the kernel body. This PR honors this upstream clang assert by replacing
the invalid source location in the compound statement with the source
location of the kernel body. The now-valid source location preserves
the location (of the kernel caller function) currently tested by the
[non-upstream lit test
`CodeGenSYCL/debug-info-srcpos-kernel.cpp`](https://github.com/intel/llvm/blob/sycl/clang/test/CodeGenSYCL/debug-info-srcpos-kernel.cpp),
but it changes the behavior of the
[non-upstream lit test
`SemaSYCL/kernel-arg-opt-report.cpp`](https://github.com/intel/llvm/blob/sycl/clang/test/SemaSYCL/kernel-arg-opt-report.cpp):
the previously invalid source location had caused the compiler to skip
the code that sets the current location. To restore the
original behavior of this test (checking for the location of the kernel
functor, as opposed to the kernel caller function) this PR temporarily
(and only for the purpose of generating the report) sets the current
location to the location of the kernel argument using the upstream clang
utility
[clang::CodeGen::ApplyDebugLocation](https://github.com/intel/llvm/blob/b023d407862bd853ba5881c34985f99d039d856c/clang/lib/CodeGen/CGDebugInfo.h#L860).
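
The "temporarily set the current location" step described above follows the usual RAII scoped-override pattern. The sketch below illustrates only that pattern in plain C++; `Emitter`, `ScopedLocation` and `emit_report` are hypothetical names invented for this example and are not clang's API or the code in this PR.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for a code generator that tracks a "current location".
struct Emitter {
  std::string current_loc = "kernel-caller";
};

// RAII guard: overrides the current location on construction and restores
// the previous one on destruction (conceptually similar to a scoped
// debug-location override; not clang's actual class).
class ScopedLocation {
public:
  ScopedLocation(Emitter &e, std::string temp)
      : emitter_(e), saved_(std::move(e.current_loc)) {
    assert(!temp.empty() && "temporary location must be valid");
    emitter_.current_loc = std::move(temp);
  }
  ~ScopedLocation() { emitter_.current_loc = std::move(saved_); }
  ScopedLocation(const ScopedLocation &) = delete;
  ScopedLocation &operator=(const ScopedLocation &) = delete;

private:
  Emitter &emitter_;
  std::string saved_;
};

// Emits a mock report entry tagged with the kernel argument's location.
// The override is active only while this function runs.
std::string emit_report(Emitter &e, const std::string &arg_loc) {
  ScopedLocation guard(e, arg_loc);
  return "remark at " + e.current_loc;
}
```

Because the guard restores the saved location in its destructor, code emitted after the report still sees the original (kernel caller) location.
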
---------
Co-authored-by: Michael Toguchi <[email protected]>
sycl/doc/design/SYCLNativeCPU.md (28 additions & 9 deletions)
@@ -1,10 +1,10 @@
 # SYCL Native CPU

-The SYCL Native CPU flow aims at treating the host CPU as a "first class citizen", providing a SYCL implementation that targets CPUs of various different architectures, with no other dependencies than DPC++ itself, while bringing performances comparable to state-of-the-art CPU backends.
+The SYCL Native CPU flow aims at treating the host CPU as a "first class citizen", providing a SYCL implementation that targets CPUs of various different architectures, with no other dependencies than DPC++ itself, while bringing performances comparable to state-of-the-art CPU backends. SYCL Native CPU also provides some initial/experimental support for LLVM's [source-based code coverage tools](https://clang.llvm.org/docs/SourceBasedCodeCoverage.html) (see also section [Code coverage](#code-coverage)).

 # Compiler and runtime options

-The SYCL Native CPU flow is enabled by setting `native_cpu` as a `sycl-target` (please note that currently doing so overrides any other SYCL target specified in the compiler invocation):
+The SYCL Native CPU flow is enabled by setting `native_cpu` as a `sycl-target`:
+Note that SYCL Native CPU co-exists alongside the other SYCL targets. For example, the following command line builds SYCL code simultaneously for SYCL Native CPU and for OpenCL.
+The application can then run on either SYCL target by setting the DPC++ `ONEAPI_DEVICE_SELECTOR` environment variable accordingly.
+
 ## Configuring DPC++ with SYCL Native CPU

-SYCL Native CPU needs to be enabled explictly when configuring DPC++, using `--native_cpu`, e.g.
+SYCL Native CPU needs to be enabled explicitly when configuring DPC++, using `--native_cpu`, e.g.

 ```
 python buildbot/configure.py \
@@ -86,7 +93,19 @@ Whole Function Vectorization is enabled by default, and can be controlled throug
 * `-mllvm -sycl-native-cpu-no-vecz`: disable Whole Function Vectorization.
 * `-mllvm -sycl-native-cpu-vecz-width`: sets the vector width to the specified value, defaults to 8.

-For more details on how the Whole Function Vectorizer is integrated for SYCL Native CPU, refer to the [Technical details[(#technical-details) section.
+For more details on how the Whole Function Vectorizer is integrated for SYCL Native CPU, refer to the [Technical details](#technical-details) section.
+
+# Code coverage
+
+SYCL Native CPU has experimental support for LLVM's source-based [code coverage](https://clang.llvm.org/docs/SourceBasedCodeCoverage.html). This enables coverage testing across device and host code.
+llvm-cov show .\vector-add.exe -instr-profile=foo.profdata
+```

 ## Ongoing work

@@ -95,7 +114,7 @@ For more details on how the Whole Function Vectorizer is integrated for SYCL Nat
 * Subgroup support
 * Performance optimizations

-### Please note that Windows support is temporarily disabled due to some implementation details, it will be reinstantiated soon.
+### Please note that Windows is partially supported but temporarily disabled due to some implementation details, it will be re-enabled soon.

 # Technical details

@@ -140,13 +159,13 @@ entry:
 }
 ```

-For the SYCL Native CPU target, the device compiler is in charge of materializing the SPIRV builtins (such as `@__spirv_BuiltInGlobalInvocationId`), so that they can be correctly updated by the runtime when executing the kernel. This is performed by the [PrepareSYCLNativeCPU pass](https://github.com/intel/llvm/blob/sycl/llvm/lib/SYCLLowerIR/PrepareSYCLNativeCPU.cpp).
+For the SYCL Native CPU target, the device compiler is in charge of materializing the SPIRV builtins (such as `@__spirv_BuiltInGlobalInvocationId`), so that they can be correctly updated by the runtime when executing the kernel. This is performed by the [PrepareSYCLNativeCPU pass](https://github.com/intel/llvm/blob/sycl/llvm/lib/SYCLNativeCPUUtils/PrepareSYCLNativeCPU.cpp).
 The PrepareSYCLNativeCPUPass also emits a `subhandler` function, which receives the kernel arguments from the SYCL runtime (packed in a vector), unpacks them, and forwards only the used ones to the actual kernel.


 ## PrepareSYCLNativeCPU Pass

-This pass will add a pointer to a `nativecpu_state` struct as kernel argument to all the kernel functions, and it will replace all the uses of SPIRV builtins with the return value of appropriately defined functions, which will read the requested information from the `__nativecpu_state` struct. The `__nativecpu_state` struct and the builtin functions are defined in [native_cpu.hpp](https://github.com/intel/llvm/blob/sycl/sycl/include/sycl/detail/native_cpu.hpp).
+This pass will add a pointer to a `native_cpu::state` struct as kernel argument to all the kernel functions, and it will replace all the uses of SPIRV builtins with the return value of appropriately defined functions, which will read the requested information from the `native_cpu::state` struct. The `native_cpu::state` struct is defined in the [native_cpu UR adapter](https://github.com/oneapi-src/unified-runtime/blob/main/source/adapters/native_cpu/nativecpu_state.hpp) and the builtin functions are defined in the [native_cpu device library](https://github.com/intel/llvm/blob/sycl/libdevice/nativecpu_utils.cpp).


 The resulting IR is:
@@ -188,11 +207,11 @@ entry:
 }
 ```

-As you can see, the `subhandler` steals the kernel's function name, and receives two pointer arguments: the first one points to the kernel arguments from the SYCL runtime, and the second one to the `__nativecpu_state` struct.
+As you can see, the `subhandler` steals the kernel's function name, and receives two pointer arguments: the first one points to the kernel arguments from the SYCL runtime, and the second one to the `nativecpu::state` struct.
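
As a rough C++ analogue of the subhandler and state-pointer mechanism described in the documentation above, the sketch below shows a subhandler that unpacks runtime arguments and forwards only the used ones to the kernel, together with a builtin-style accessor that reads from the state. The real transformation happens on LLVM IR. The `state` layout, `get_global_id`, and the assumption that argument slot 2 is unused are all hypothetical simplifications made for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, heavily simplified stand-in for the state struct;
// the real definition lives in the native_cpu UR adapter.
struct state {
  std::size_t global_id[3];
};

// Stand-in for a builtin that reads the requested value from the state,
// replacing a use of a SPIRV builtin such as the global invocation id.
inline std::size_t get_global_id(const state *s, unsigned dim) {
  assert(dim < 3);
  return s->global_id[dim];
}

// The "actual" kernel after the pass: its used arguments plus a state pointer.
void vector_add_kernel(const int *a, const int *b, int *c, const state *s) {
  const std::size_t i = get_global_id(s, 0);
  c[i] = a[i] + b[i];
}

// The subhandler: receives the packed arguments from the runtime and the
// state pointer, unpacks them, and forwards only the used ones.
// Assumption for this sketch: slot 2 is unused by the kernel and is dropped.
void vector_add_subhandler(void *const *packed_args, const state *s) {
  const int *a = static_cast<const int *>(packed_args[0]);
  const int *b = static_cast<const int *>(packed_args[1]);
  int *c = static_cast<int *>(packed_args[3]);
  vector_add_kernel(a, b, c, s);
}
```

A runtime would call the subhandler once per work item, updating the state between calls so that the accessor returns the right index.
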

 ## Handling barriers

-On SYCL Native CPU, calls to `__spirv_ControlBarrier` are handled using the `WorkItemLoopsPass` from the oneAPI Construction Kit. This pass handles barriers by splitting the kernel between calls calls to `__spirv_ControlBarrier`, and creating a wrapper that runs the subkernels over the local range. In order to correctly interface to the oneAPI Construction Kit pass pipeline, SPIRV builtins are converted to `mux` builtins (used by the OCK) by the `ConvertToMuxBuiltinsSYCLNativeCPUPass`.
+On SYCL Native CPU, calls to `__spirv_ControlBarrier` are handled using the `WorkItemLoopsPass` from the oneAPI Construction Kit. This pass handles barriers by splitting the kernel between calls to `__spirv_ControlBarrier`, and creating a wrapper that runs the subkernels over the local range. In order to correctly interface to the oneAPI Construction Kit pass pipeline, SPIRV builtins are defined in the device library to call the corresponding `mux` builtins (used by the OCK).
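
The barrier-splitting idea in the documentation above can be illustrated with a small plain C++ sketch. It is only a conceptual model: the real `WorkItemLoopsPass` operates on LLVM IR and also has to preserve values that are live across the barrier. The kernel, the names `subkernel_0`, `subkernel_1` and `run_work_group`, and the one-dimensional local range are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Conceptual original kernel, executed by every work item `lid`:
//   scratch[lid] = in[lid] * 2;
//   barrier();
//   out[lid] = scratch[(lid + 1) % local_size];

// Region of the kernel before the barrier.
void subkernel_0(const int *in, int *scratch, std::size_t lid) {
  scratch[lid] = in[lid] * 2;
}

// Region of the kernel after the barrier: reads a neighbour's value, so it
// is only correct once every work item has finished subkernel_0.
void subkernel_1(const int *scratch, int *out, std::size_t lid,
                 std::size_t local_size) {
  assert(lid < local_size);
  out[lid] = scratch[(lid + 1) % local_size];
}

// Wrapper: runs each subkernel over the whole local range in turn. Finishing
// the first loop before starting the second provides the barrier semantics.
std::vector<int> run_work_group(const std::vector<int> &in) {
  const std::size_t local_size = in.size();
  std::vector<int> scratch(local_size), out(local_size);
  for (std::size_t lid = 0; lid < local_size; ++lid)
    subkernel_0(in.data(), scratch.data(), lid);
  for (std::size_t lid = 0; lid < local_size; ++lid)
    subkernel_1(scratch.data(), out.data(), lid, local_size);
  return out;
}
```

Running the work items of a group sequentially on one thread is valid here because no work item starts the post-barrier region until all of them have completed the pre-barrier region.
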