1
- < h2 > SYCL Kernel Parameter Handling and Array Support</ h2 >
1
+ # SYCL Kernel Parameter Handling and Array Support
2
2
3
- < h3 > Introduction</ h3 >
3
+ ## Introduction
4
4
5
5
This document describes how parameters of SYCL kernels are passed
6
6
from host to device. Support for arrays as kernel parameters was added
@@ -28,7 +28,8 @@ The first few sections describe the overall design.
28
28
The last three sections provide additional details of array support.
29
29
The implementation of this design is confined to four classes in the
30
30
file ` SemaSYCL.cpp ` .
31
- <h3 >A SYCL Kernel</h3 >
31
+
32
+ ## A SYCL Kernel
32
33
33
34
The SYCL constructs ` single_task ` , ` parallel_for ` , and
34
35
` parallel_for_work_group ` each take a function object or a lambda function
@@ -37,7 +38,7 @@ lambda function is executed on the device.
37
38
To enable execution of the kernel on OpenCL devices, the lambda/function object
38
39
is converted into the format of an OpenCL kernel.
39
40
40
- < h3 > SYCL Kernel Code Generation</ h3 >
41
+ ## SYCL Kernel Code Generation
41
42
42
43
Consider a source code example that captures an int, a struct and an accessor
43
44
by value:
@@ -112,7 +113,7 @@ spir_kernel void caller(
112
113
// Reassemble capture object from parts
113
114
local.i = i;
114
115
local.s = s;
115
- // Call accessor’ s init function
116
+ // Call accessor' s init function
116
117
sycl::accessor::init(&local.outAcc, AccData, AccR1, AccR2, I);
117
118
118
119
// Call the kernel body
@@ -140,7 +141,7 @@ host to device separately. The values received on the device
140
141
are passed to the ` init ` functions executed on the device,
141
142
which results in the reassembly of the SYCL object in a form usable on the device.
142
143
143
- There is one other aspect of code generation. An “ integration header”
144
+ There is one other aspect of code generation. An " integration header"
144
145
is generated for use during host compilation.
145
146
This header file contains entries for each kernel.
146
147
Among the items it defines is a table of sizes and offsets of the
@@ -169,7 +170,7 @@ object which contains three values:
169
170
The previous sections described how kernel arguments are handled today.
170
171
The next three sections describe support for arrays.
171
172
172
- <h3> Fix 1: Kernel Arguments that are Standard-Layout Arrays</ h3 >
173
+ ## Fix 1: Kernel Arguments that are Standard-Layout Arrays
173
174
174
175
As described earlier, each variable captured by a lambda that comprises a
175
176
SYCL kernel becomes a parameter of the kernel caller function.
@@ -180,7 +181,7 @@ the purposes of passing to the device. Each array element is passed as a
180
181
separate parameter. The array elements received on the device
181
182
are copied into the array within the local capture object.
182
183
183
- <h4> Source code fragment:</ h4 >
184
+ ** Source code fragment:**
184
185
185
186
``` C++
186
187
constexpr int num_items = 2 ;
@@ -197,7 +198,7 @@ are copied into the array within the local capture object.
197
198
});
198
199
```
199
200
200
- <h4> Integration header produced:</h4>
201
+ ** Integration header produced:**
201
202
202
203
```C++
203
204
static constexpr
@@ -211,7 +212,7 @@ const kernel_param_desc_t kernel_signatures[] = {
211
212
212
213
```
213
214
214
- < h4 > The changes to device code made to support this extension, in pseudo-code:</ h4 >
215
+ ** The changes to device code made to support this extension, in pseudo-code:**
215
216
216
217
``` C++
217
218
struct Capture {
@@ -238,22 +239,22 @@ spir_kernel void caller(
238
239
// Initialize array using existing clang Initialization mechanisms
239
240
local.array[0] = p_array_0;
240
241
local.array[1] = p_array_1;
241
- // Call accessor’ s init function
242
+ // Call accessor' s init function
242
243
sycl::accessor::init (&local.outAcc, AccData, AccR1, AccR2, I);
243
244
244
245
callee (&local, id<1> wi);
245
246
}
246
247
```
247
248
248
- < h3 > Fix 2: Kernel Arguments that are Arrays of Accessors</ h3 >
249
+ ## Fix 2: Kernel Arguments that are Arrays of Accessors
249
250
250
251
Arrays of accessors are supported in a manner similar to that of a plain
251
252
accessor. For each accessor array element, the four values required to
252
253
call its init function are passed as separate arguments to the kernel.
253
254
Reassembly within the kernel caller is done by calling the ` init ` functions
254
255
of each accessor array element in ascending index value.
255
256
256
- < h4 > Source code fragment:</ h4 >
257
+ ** Source code fragment:**
257
258
258
259
``` C++
259
260
myQueue.submit([&](handler &cgh) {
@@ -269,7 +270,7 @@ of each accessor array element in ascending index value.
269
270
});
270
271
```
271
272
272
- <h4> Integration header:</h4>
273
+ ** Integration header:**
273
274
274
275
```C++
275
276
static constexpr
@@ -281,7 +282,7 @@ const kernel_param_desc_t kernel_signatures[] = {
281
282
};
282
283
```
283
284
284
- < h4 > Device code generated in pseudo-code form:</ h4 >
285
+ ** Device code generated in pseudo-code form:**
285
286
286
287
``` C++
287
288
struct Capture {
@@ -311,20 +312,20 @@ spir_kernel void caller(
311
312
struct Capture local;
312
313
313
314
// Reassemble capture object from parts
314
- // Call outAcc accessor’ s init function
315
+ // Call outAcc accessor' s init function
315
316
sycl::accessor::init (&local.outAcc, outAccData, outAccR1, outAccR2, outI);
316
317
317
- // Call inAcc[0] accessor’ s init function
318
+ // Call inAcc[0] accessor' s init function
318
319
sycl::accessor::init (&local.inAcc[ 0] , inAccData_0, inAccR1_0, inAccR2_0, inI_0);
319
320
320
- // Call inAcc[1] accessor’ s init function
321
+ // Call inAcc[1] accessor' s init function
321
322
sycl::accessor::init (&local.inAcc[ 1] , inAccData_1, inAccR1_1, inAccR2_1, inI_1);
322
323
323
324
callee (&local, id<1> wi);
324
325
}
325
326
```
326
327
327
- < h3 > Fix 3: Accessor Arrays within Structs</ h3 >
328
+ ## Fix 3: Accessor Arrays within Structs
328
329
329
330
Kernel parameters that are structs are traversed member
330
331
by member, recursively, to enumerate member structs that are one of
@@ -340,7 +341,7 @@ Within the kernel caller function, the lambda object is reassembled
340
341
in a manner similar to other instances of accessor arrays.
341
342
342
343
343
- < h4 > Source code fragment:</ h4 >
344
+ ** Source code fragment:**
344
345
345
346
``` C++
346
347
myQueue.submit([&](handler &cgh) {
@@ -361,7 +362,7 @@ in a manner similar to other instances of accessor arrays.
361
362
});
362
363
```
363
364
364
- < h4 > Integration header:</ h4 >
365
+ ** Integration header:**
365
366
366
367
``` C++
367
368
static constexpr
@@ -375,7 +376,7 @@ const kernel_param_desc_t kernel_signatures[] = {
375
376
};
376
377
```
377
378
378
- < h4 > Device code generated in pseudo-code form:</ h4 >
379
+ ** Device code generated in pseudo-code form:**
379
380
380
381
``` C++
381
382
struct Capture {
@@ -411,15 +412,15 @@ spir_kernel void caller(
411
412
local.s = s;
412
413
413
414
// 2. Initialize accessors by calling init functions
414
- // 2a. Call outAcc accessor’ s init function
415
+ // 2a. Call outAcc accessor' s init function
415
416
sycl::accessor::init (
416
417
&local.outAcc, outAccData, outAccR1, outAccR2, outI);
417
418
418
- // 2b. Call s.inAcc[0] accessor’ s init function
419
+ // 2b. Call s.inAcc[0] accessor' s init function
419
420
sycl::accessor::init (
420
421
&local.s.inAcc[0], inAccData_0, inAccR1_0, inAccR2_0, inI_0);
421
422
422
- // 2c. Call s.inAcc[1] accessor’ s init function
423
+ // 2c. Call s.inAcc[1] accessor' s init function
423
424
sycl::accessor::init (
424
425
&local.s.inAcc[1], inAccData_1, inAccR1_1, inAccR2_1, inI_1);
425
426
0 commit comments