Skip to content

Commit d2f13ff

Browse files
Sergey Kanaevkbobrovs
andcommitted
Address review comments
Signed-off-by: Sergey Kanaev <[email protected]> Co-authored-by: kbobrovs <[email protected]>
1 parent b57ac48 commit d2f13ff

File tree

1 file changed

+44
-51
lines changed

1 file changed

+44
-51
lines changed

sycl/doc/Assert.md

Lines changed: 44 additions & 51 deletions
Original file line numberDiff line numberDiff line change
@@ -39,13 +39,13 @@ int main() {
3939
```
4040
4141
In this use-case every work-item with even X dimension will trigger assertion
42-
failure. Assertion failure should be trigger a call to `std::abort()` at host as
42+
failure. Assertion failure should trigger a call to `std::abort()` at host as
4343
described in
4444
[extension](extensions/Assert/SYCL_INTEL_ASSERT.asciidoc).
4545
Even though multiple failures of the same or different assertions can happen in
4646
multiple workitems, implementation is required to deliver at least one
4747
assertion. The assertion failure message is printed to `stderr` by DPCPP
48-
Runtime.
48+
Runtime or underlying backend.
4949
5050
When multiple kernels are enqueued and more than one fail at assertion, at least
5151
single assertion should be reported.
@@ -93,15 +93,17 @@ Implementation of this function is supplied by Native Device Compiler for
9393
safe approach or by DPCPP Compiler for fallback one.
9494
9595
In order to distinguish which implementation to use, DPCPP Runtime checks for
96-
`cl_intel_devicelib_cassert` extension. If the extension isn't available, then
96+
`PI_INTEL_DEVICELIB_CASSERT` extension. If the extension isn't available, then
9797
fallback implementation is used.
9898
9999
100100
## Safe approach
101101
102102
This is the preferred approach and implementations should use it when possible.
103103
It guarantees assertion failure notification delivery to the host regardless of
104-
kernel behavior which hit the assertion.
104+
kernel behavior which hit the assertion. If backend suports the safe approach,
105+
it must report this capability to DPCPP Runtime via the
106+
`PI_INTEL_DEVICELIB_CASSERT` extension query.
105107
106108
The Native Device Compiler is responsible for providing implementation of
107109
`__devicelib_assert_fail` which completely hides details of communication
@@ -125,10 +127,10 @@ The following sequence of events describes how user code gets notified:
125127
126128
## Fallback approach
127129
128-
If Device-side Runtime doesn't support `__devicelib_assert_fail` then a fallback
129-
approach comes in place. The approach doesn't require any support from
130-
Device-side Runtime and Native Device Compiler. Neither it does from Low-level
131-
Runtime.
130+
If Device-side Runtime doesn't support `__devicelib_assert_fail` (as reported
131+
via `PI_INTEL_DEVICELIB_CASSERT` extension query) then a fallback approach comes
132+
in place. The approach doesn't require any support from Device-side Runtime and
133+
Native Device Compiler. Neither it does from Low-level Runtime.
132134
133135
Within this approach, a mutable program scope variable is introduced. This
134136
variable stores a flag which says if an assert failure was encountered. Fallback
@@ -147,10 +149,15 @@ The following sequence of events describes how user code gets notified:
147149
2. A host-task is enqueued to check value of assert failure flag.
148150
3. The host task calls abort whenever assert failure flag is set.
149151
152+
DPCPP Runtime will automatically check if assertions are enabled in the kernel
153+
being run, and won't enqueue the auxiliary kernels if assertions are not
154+
enabled. So there is no host-side runtime overhead when assertion are not
155+
enabled.
156+
150157
Illustrating this with an example, lets assume the user enqueues three kernels:
151-
- `Kernel #1`
152-
- `Kernel #2`
153-
- `Kernel #3`, which depends on `Kernel #1`
158+
- `Kernel #1`, uses assert
159+
- `Kernel #2`, uses assert
160+
- `Kernel #3`, uses assert and depends on `Kernel #1`
154161
155162
The resulting graph will look like this: ![graph](images/assert-fallback-graph.svg)
156163
@@ -165,9 +172,15 @@ same binary image where fallback `__devicelib_assert_fail` resides.
165172
declaration:</a>
166173
167174
```c++
175+
namespace cl {
176+
namespace sycl {
177+
namespace detail {
168178
struct AssertHappened {
169179
int Flag = 0;
170180
};
181+
}
182+
}
183+
}
171184
172185
#ifdef __SYCL_DEVICE_ONLY__
173186
extern SYCL_GLOBAL_VAR AssertHappened AssertHappenedMem;
@@ -189,49 +202,29 @@ implementation of `__devicelib_assert_fail`.
189202
In DPCPP headers one can see if assert is enabled with status of `NDEBUG` macro
190203
with `#ifdef`'s. This allows to enqueue a copy kernel and host task. The copy
191204
kernel will copy `AssertHappenedMem` to host and host-task will check the `Flag`
192-
value and `abort()` as needed.
205+
value and `abort()` as needed. The kernel and host task are enqueued when
206+
`NDEBUG` macro isn't defined.
193207

194208
When in DPCPP Runtime Library this knowledge is obtained from device binary
195209
image descriptor's property sets.
196210

197-
Each device image is supplied with an array of property sets:
198-
```c++
199-
struct pi_device_binary_struct {
200-
//...
201-
// Array of property sets
202-
pi_device_binary_property_set PropertySetsBegin;
203-
pi_device_binary_property_set PropertySetsEnd;
204-
};
205-
```
206-
Each property set is represented by the following struct:
207-
```c++
208-
// Named array of properties.
209-
struct _pi_device_binary_property_set_struct {
210-
char *Name; // the name
211-
pi_device_binary_property PropertiesBegin; // array start
212-
pi_device_binary_property PropertiesEnd; // array end
213-
};
214-
```
215-
It contains name of property set and array of properties. Each property is
216-
represented by the following struct:
217-
```c++
218-
struct _pi_device_binary_property_struct {
219-
char *Name; // null-terminated property name
220-
void *ValAddr; // address of property value
221-
uint32_t Type; // _pi_property_type
222-
uint64_t ValSize; // size of property value in bytes
223-
};
224-
```
211+
Each device image is supplied with an array of property sets. For description
212+
of property sets see `struct pi_device_binary_struct` in
213+
[`pi.h`](https://github.com/intel/llvm/blob/sycl/sycl/include/CL/sycl/detail/pi.h#L692)
225214

226215
A distinct property set `SYCL/assert used` is added. In this set a property
227-
with the name of the kernel is added whenever the kernel uses assert. Use of
228-
assert is detected through call to `__devicelib_assert_fail` function after
229-
linking device binary image with wrapper device library (the `libsycl-crt`
230-
library).
231-
232-
The property set and the underlying properties are added by `sycl-post-link`
233-
tool with help of building callgraph for each and every kernel in device binary
234-
image.
216+
with the name of the kernel is added whenever the kernel uses assert. The use of
217+
assert is detected by a specific LLVM IR pass invoked by the `sycl-post-link`
218+
tool which runs on linked device code, i.e. after linking with the `libsycl-crt`
219+
library which defines the assert function. The pass builds complete call graph
220+
for a kernel, and sees if there's a call to `__devicelib_assert_fail` anywhere
221+
in the graph. If found, `sycl-post-link` adds the property for the kernel.
222+
223+
The same is done for all indirect callable functions (marked with specific
224+
attribute) found in the linked device code. Those are functions whose pointers
225+
can be taken and passed around in device code. If a callgraph for any such
226+
function has a call to `__devicelib_assert_fail`, then all kernels in the module
227+
are conservatively marked as using asserts.
235228

236229
The added property is used for:
237230
- deciding if online-linking against fallback devicelib is required;
@@ -340,7 +333,7 @@ void workload() {
340333
```
341334

342335
These two files are compiled into a single binary application. There are four
343-
states of definedness of `NDEBUG` macro available:
336+
states of definition of `NDEBUG` macro available:
344337

345338
| # | `impl.cpp` | `main.cpp` |
346339
| - | ---------- | ---------- |
@@ -349,12 +342,12 @@ states of definedness of `NDEBUG` macro available:
349342
| 3 | undefined | defined |
350343
| 4 | undefined | undefined |
351344

352-
States of definedness of `NDEBUG` macro defines the set of assertions which can
345+
States of definition of `NDEBUG` macro defines the set of assertions which can
353346
fail.
354347

355348
### Raising assert failure flag and reading it on host
356349

357-
Each and every translation unit provided by user should have declaration of
350+
All translation units provided by the user should have a declaration of the
358351
assert flag read function available:
359352
```c++
360353
int __devicelib_assert_read(void);

0 commit comments

Comments
 (0)