@@ -56,7 +56,7 @@ runtime. As we will see later, this has a ramification on the integration
56
56
headers and on the mechanism that connects instances of device global variables
57
57
in host code with their corresponding instances in device code.
58
58
59
- Another issue relates to the ` device_image_life ` property which can be applied
59
+ Another issue relates to the ` device_image_scope ` property which can be applied
60
60
to a device global variable declaration. The intent of this property is to
61
61
allow a device global variable to be implemented directly on top of a SPIR-V
62
62
module scope global variable. When this property is ** not** present, an
@@ -66,7 +66,7 @@ about the scope of a variable because the user need not understand which device
66
66
image contains each kernel. However, this semantic makes the implementation
67
67
less efficient, especially on FPGA targets.
68
68
69
- By contrast, the ` device_image_life ` property changes the semantic of a device
69
+ By contrast, the ` device_image_scope ` property changes the semantic of a device
70
70
global variable such that the user must understand which device image contains
71
71
each kernel, which is difficult to reason about. For example, changing the
72
72
value of a specialization constant may cause a kernel to be recompiled into a
@@ -75,11 +75,11 @@ referenced in a kernel may actually have several disjoint instances if the
75
75
kernel uses specialization constants. This problem is more tractable on FPGA
76
76
targets because specialization constants are not implemented via separate
77
77
device images on those targets, however, there are other factors that FPGA
78
- users need to be aware of when using the ` device_image_life ` property. These
78
+ users need to be aware of when using the ` device_image_scope ` property. These
79
79
are documented more throughly in the extension specification.
80
80
81
81
The important impact on the design, though, is that device global variables
82
- declared with the ` device_image_life ` property have an implementation that is
82
+ declared with the ` device_image_scope ` property have an implementation that is
83
83
quite different from device global variables that are not declared with this
84
84
property. The sections below describe both implementations.
85
85
@@ -91,7 +91,7 @@ property. The sections below describe both implementations.
91
91
The headers, of course, include the declaration of the new ` device_global `
92
92
class, which is described in the [ extension specification] [ 1 ] . The declaration
93
93
of this class uses partial specialization to define the class differently
94
- depending on whether is has the ` device_image_life ` property. When the
94
+ depending on whether is has the ` device_image_scope ` property. When the
95
95
property is not present, the class has a member variable which is a pointer to
96
96
the underlying type. Member functions which return a reference to the value
97
97
(e.g. ` get ` ) return the value of this pointer:
@@ -127,7 +127,7 @@ of this member variable.
127
127
The headers are also updated to add the new ` copy() ` and ` memcpy() ` member
128
128
functions to ` handler ` and ` queue ` which copy data to or from a device global
129
129
variable. These declarations use SFINAE such that they are conditionally
130
- available depending on the ` copy_access ` property.
130
+ available depending on the ` host_access ` property.
131
131
132
132
### New LLVM IR attributes
133
133
@@ -144,7 +144,7 @@ This is not possible, though, for variables with internal linkage because the
144
144
mangled name is not unique in this case. For these variables, we use the
145
145
mangled name and append a unique suffix.
146
146
147
- Each device global variable that has the ` device_image_life ` property is also
147
+ Each device global variable that has the ` device_image_scope ` property is also
148
148
decorated with the ` sycl-device-global-image-life ` attribute.
149
149
150
150
Note that language rules ensure that ` device_global ` variables are always
@@ -209,7 +209,7 @@ global variable that is defined in the translation unit:
209
209
* The variable's string from the ` sycl-unique-id ` attribute.
210
210
* The size (in bytes) of the underlying ` T ` type for the variable.
211
211
* A boolean telling whether the variable is decorated with the
212
- ` device_image_life ` property.
212
+ ` device_image_scope ` property.
213
213
214
214
```
215
215
namespace sycl::detail {
@@ -219,11 +219,11 @@ __sycl_device_global_registration::__sycl_device_global_registration() noexcept
219
219
device_global_map::add(&::Foo,
220
220
/* mangled name of '::Foo' with unique suffix appended */,
221
221
/* size of underlying 'T' type */,
222
- /* bool telling whether variable has 'device_image_life ` property */);
222
+ /* bool telling whether variable has 'device_image_scope ` property */);
223
223
device_global_map::add(&::inner::Bar,
224
224
/* mangled name of '::inner::Bar' */,
225
225
/* size of underlying 'T' type */,
226
- /* bool telling whether variable has 'device_image_life ` property */);
226
+ /* bool telling whether variable has 'device_image_scope ` property */);
227
227
}
228
228
229
229
} // namepsace (unnamed)
@@ -287,11 +287,11 @@ __sycl_device_global_registration::__sycl_device_global_registration() noexcept
287
287
device_global_map::add(&::FuBar,
288
288
/* mangled name of '::FuBar' */,
289
289
/* size of underlying 'T' type */,
290
- /* bool telling whether variable has 'device_image_life ` property */);
290
+ /* bool telling whether variable has 'device_image_scope ` property */);
291
291
device_global_map::add(::__sycl_UNIQUE_STRING,
292
292
/* mangled name of '::(unnamed)::FuBar' with unique suffix appended */,
293
293
/* size of underlying 'T' type */,
294
- /* bool telling whether variable has 'device_image_life ` property */);
294
+ /* bool telling whether variable has 'device_image_scope ` property */);
295
295
}
296
296
297
297
} // namepsace (unnamed)
@@ -319,7 +319,7 @@ global variable decorated with `sycl-device-global-image-life` appears in more
319
319
than one module, the ` sycl-post-link ` tool issues an error diagnostic:
320
320
321
321
```
322
- error: device_global variable <name> with property "device_image_life "
322
+ error: device_global variable <name> with property "device_image_scope "
323
323
is contained in more than one device image.
324
324
```
325
325
@@ -361,12 +361,13 @@ strings, where each string ends with a null character (`\0`).
361
361
Several changes are needed to the DPC++ runtime
362
362
363
363
* As noted in the requirements section, an instance of a device global variable
364
- that does not have the ` device_image_life ` property is shared by all device
364
+ that does not have the ` device_image_scope ` property is shared by all device
365
365
images on a device. To satisfy this requirement, the device global variable
366
366
contains a pointer to a buffer allocated from USM device memory, and the
367
- content of the variable is stored in this buffer. All device images point to
368
- the same buffer, so the variable's state is shared. The runtime, therefore,
369
- must allocate this USM buffer for each such device global variable.
367
+ content of the variable is stored in this buffer. All device images on a
368
+ particular device point to the same buffer, so the variable's state is
369
+ shared. The runtime, therefore, must allocate this USM buffer for each such
370
+ device global variable.
370
371
371
372
* As we noted above, the front-end generates new content in the integration
372
373
footer which calls the function ` sycl::detail::device_global_map::add() ` .
@@ -377,9 +378,9 @@ Several changes are needed to the DPC++ runtime
377
378
- The string which uniquely identifies the variable.
378
379
- The size (in bytes) of the underlying ` T ` type for the variable.
379
380
- A boolean telling whether the variable is decorated with the
380
- ` device_image_life ` property.
381
+ ` device_image_scope ` property.
381
382
- The associated per-device USM buffer pointer, if this variable does not
382
- have the ` device_image_life ` property.
383
+ have the ` device_image_scope ` property.
383
384
384
385
We refer to this information as the "device global database" below.
385
386
@@ -404,13 +405,14 @@ runtime does the following:
404
405
the ` pi_program ` to get the unique string associated with each device global
405
406
variable that is used by the ` pi_program ` . For each of these strings, the
406
407
runtime uses the device global database to see if the variable was decorated
407
- with ` device_image_life ` . If it was not so decorated and if a USM buffer has
408
- not already been created for the variable on this target device, the runtime
409
- allocates the buffer from USM device memory using the size from the database.
410
- The pointer to this buffer is saved in the database for future reuse.
408
+ with ` device_image_scope ` . If it was not so decorated and if a USM buffer
409
+ has not already been created for the variable on this target device, the
410
+ runtime allocates the buffer from USM device memory using the size from the
411
+ database and zero-initializes the content of the buffer. The pointer to this
412
+ buffer is saved in the database for future reuse.
411
413
412
414
* For each device global variable that is not decorated with
413
- ` device_image_life ` , the runtime initializes the ` usmptr ` member in the
415
+ ` device_image_scope ` , the runtime initializes the ` usmptr ` member in the
414
416
* device instance* of the variable by using a backend-specific function which
415
417
copies data from the host to a device variable. It is a simple matter to use
416
418
this function to overwrite the ` usmptr ` member with the address of the USM
@@ -421,22 +423,23 @@ runtime does the following:
421
423
Each of these functions accepts a (host) pointer to a device global variable as
422
424
one of its parameters, and the runtime uses this pointer to find the associated
423
425
information for this variable in the device global database. The remaining
424
- behavior depends on whether the variable is decorated with ` device_image_life ` .
426
+ behavior depends on whether the variable is decorated with
427
+ ` device_image_scope ` .
425
428
426
429
If the variable is not decorated with this property, the runtime uses the
427
430
database to determine if a USM buffer has been allocated yet for this variable
428
431
on this device. If not, the runtime allocates the buffer using the size from
429
- the database. Regardless, the runtime implements the ` copy ` / ` memcpy ` by
430
- copying to or from this USM buffer, using the normal mechanism for copying
431
- to / from a USM pointer.
432
+ the database and zero-initializes the buffer. Regardless, the runtime
433
+ implements the ` copy ` / ` memcpy ` by copying to or from this USM buffer, using
434
+ the normal mechanism for copying to / from a USM pointer.
432
435
433
436
The runtime avoids the future cost of looking up the variable in the database
434
437
by caching the USM pointer in the host instance of the variable's ` usmptr `
435
438
member.
436
439
437
- If the variable is decorated with the ` device_image_life ` property, the runtime
438
- gets the unique string identifier for the variable from the database and uses
439
- a backend-specific function to copy to or from the variable with that
440
+ If the variable is decorated with the ` device_image_scope ` property, the
441
+ runtime gets the unique string identifier for the variable from the database
442
+ and uses a backend-specific function to copy to or from the variable with that
440
443
identifier. Again, the details of this function are described below.
441
444
442
445
In all cases, the runtime diagnoses invalid calls that write beyond the device
@@ -578,8 +581,8 @@ SPIR-V decorations (defined in the
578
581
579
582
[ 12 ] : < extensions/DeviceGlobal/SPV_INTEL_global_variable_decorations.asciidoc >
580
583
581
- * ` copy_access `
582
- * ` init_via `
584
+ * ` host_access `
585
+ * ` init_mode `
583
586
* ` implement_in_csr `
584
587
585
588
It's not clear how this should work. One of the goals of the new property
@@ -613,11 +616,11 @@ Currently, we use the variable's mangled name, but this could be changed.
613
616
An alternative solution would be to augment the SPIR-V with some new decoration
614
617
that gives a unique name to each ` OpVariable ` that needs to be accessed from
615
618
the host. We could then use that name with the backend functions, and avoid
616
- renaming variables with internal linkage. This would be more effort, though,
617
- because we would need a new SPIR-V extension, and we would need to change the
618
- implementation of the Level Zero and OpenCL backends.
619
+ renaming variables with internal linkage. This would be only a minor change to
620
+ the [ SPV \_ INTEL \_ global \_ variable \_ decorations ] [ 12 ] extension, but it would
621
+ also require changes in the Level Zero and OpenCL backends.
619
622
620
- ### Does compiler need to be deterministic?
623
+ ### Does the compiler need to be deterministic?
621
624
622
625
The compiler is normally deterministic. If you compile the exact same source
623
626
file twice specifying the same command line options each time, you get exactly
0 commit comments