Skip to content

Commit e54e376

Browse files
committed
Add a few more tweaks
Add some final clarifications to the design doc before creating PR. Rename properties to align with latest review comments against API spec.
1 parent 6c07b1a commit e54e376

File tree

1 file changed

+40
-37
lines changed

1 file changed

+40
-37
lines changed

sycl/doc/DeviceGlobal.md

Lines changed: 40 additions & 37 deletions
Original file line numberDiff line numberDiff line change
@@ -56,7 +56,7 @@ runtime. As we will see later, this has a ramification on the integration
5656
headers and on the mechanism that connects instances of device global variables
5757
in host code with their corresponding instances in device code.
5858

59-
Another issue relates to the `device_image_life` property which can be applied
59+
Another issue relates to the `device_image_scope` property which can be applied
6060
to a device global variable declaration. The intent of this property is to
6161
allow a device global variable to be implemented directly on top of a SPIR-V
6262
module scope global variable. When this property is **not** present, an
@@ -66,7 +66,7 @@ about the scope of a variable because the user need not understand which device
6666
image contains each kernel. However, this semantic makes the implementation
6767
less efficient, especially on FPGA targets.
6868

69-
By contrast, the `device_image_life` property changes the semantic of a device
69+
By contrast, the `device_image_scope` property changes the semantic of a device
7070
global variable such that the user must understand which device image contains
7171
each kernel, which is difficult to reason about. For example, changing the
7272
value of a specialization constant may cause a kernel to be recompiled into a
@@ -75,11 +75,11 @@ referenced in a kernel may actually have several disjoint instances if the
7575
kernel uses specialization constants. This problem is more tractable on FPGA
7676
targets because specialization constants are not implemented via separate
7777
device images on those targets, however, there are other factors that FPGA
78-
users need to be aware of when using the `device_image_life` property. These
78+
users need to be aware of when using the `device_image_scope` property. These
7979
are documented more throughly in the extension specification.
8080

8181
The important impact on the design, though, is that device global variables
82-
declared with the `device_image_life` property have an implementation that is
82+
declared with the `device_image_scope` property have an implementation that is
8383
quite different from device global variables that are not declared with this
8484
property. The sections below describe both implementations.
8585

@@ -91,7 +91,7 @@ property. The sections below describe both implementations.
9191
The headers, of course, include the declaration of the new `device_global`
9292
class, which is described in the [extension specification][1]. The declaration
9393
of this class uses partial specialization to define the class differently
94-
depending on whether is has the `device_image_life` property. When the
94+
depending on whether is has the `device_image_scope` property. When the
9595
property is not present, the class has a member variable which is a pointer to
9696
the underlying type. Member functions which return a reference to the value
9797
(e.g. `get`) return the value of this pointer:
@@ -127,7 +127,7 @@ of this member variable.
127127
The headers are also updated to add the new `copy()` and `memcpy()` member
128128
functions to `handler` and `queue` which copy data to or from a device global
129129
variable. These declarations use SFINAE such that they are conditionally
130-
available depending on the `copy_access` property.
130+
available depending on the `host_access` property.
131131

132132
### New LLVM IR attributes
133133

@@ -144,7 +144,7 @@ This is not possible, though, for variables with internal linkage because the
144144
mangled name is not unique in this case. For these variables, we use the
145145
mangled name and append a unique suffix.
146146

147-
Each device global variable that has the `device_image_life` property is also
147+
Each device global variable that has the `device_image_scope` property is also
148148
decorated with the `sycl-device-global-image-life` attribute.
149149

150150
Note that language rules ensure that `device_global` variables are always
@@ -209,7 +209,7 @@ global variable that is defined in the translation unit:
209209
* The variable's string from the `sycl-unique-id` attribute.
210210
* The size (in bytes) of the underlying `T` type for the variable.
211211
* A boolean telling whether the variable is decorated with the
212-
`device_image_life` property.
212+
`device_image_scope` property.
213213

214214
```
215215
namespace sycl::detail {
@@ -219,11 +219,11 @@ __sycl_device_global_registration::__sycl_device_global_registration() noexcept
219219
device_global_map::add(&::Foo,
220220
/* mangled name of '::Foo' with unique suffix appended */,
221221
/* size of underlying 'T' type */,
222-
/* bool telling whether variable has 'device_image_life` property */);
222+
/* bool telling whether variable has 'device_image_scope` property */);
223223
device_global_map::add(&::inner::Bar,
224224
/* mangled name of '::inner::Bar' */,
225225
/* size of underlying 'T' type */,
226-
/* bool telling whether variable has 'device_image_life` property */);
226+
/* bool telling whether variable has 'device_image_scope` property */);
227227
}
228228
229229
} // namepsace (unnamed)
@@ -287,11 +287,11 @@ __sycl_device_global_registration::__sycl_device_global_registration() noexcept
287287
device_global_map::add(&::FuBar,
288288
/* mangled name of '::FuBar' */,
289289
/* size of underlying 'T' type */,
290-
/* bool telling whether variable has 'device_image_life` property */);
290+
/* bool telling whether variable has 'device_image_scope` property */);
291291
device_global_map::add(::__sycl_UNIQUE_STRING,
292292
/* mangled name of '::(unnamed)::FuBar' with unique suffix appended */,
293293
/* size of underlying 'T' type */,
294-
/* bool telling whether variable has 'device_image_life` property */);
294+
/* bool telling whether variable has 'device_image_scope` property */);
295295
}
296296
297297
} // namepsace (unnamed)
@@ -319,7 +319,7 @@ global variable decorated with `sycl-device-global-image-life` appears in more
319319
than one module, the `sycl-post-link` tool issues an error diagnostic:
320320

321321
```
322-
error: device_global variable <name> with property "device_image_life"
322+
error: device_global variable <name> with property "device_image_scope"
323323
is contained in more than one device image.
324324
```
325325

@@ -361,12 +361,13 @@ strings, where each string ends with a null character (`\0`).
361361
Several changes are needed to the DPC++ runtime
362362

363363
* As noted in the requirements section, an instance of a device global variable
364-
that does not have the `device_image_life` property is shared by all device
364+
that does not have the `device_image_scope` property is shared by all device
365365
images on a device. To satisfy this requirement, the device global variable
366366
contains a pointer to a buffer allocated from USM device memory, and the
367-
content of the variable is stored in this buffer. All device images point to
368-
the same buffer, so the variable's state is shared. The runtime, therefore,
369-
must allocate this USM buffer for each such device global variable.
367+
content of the variable is stored in this buffer. All device images on a
368+
particular device point to the same buffer, so the variable's state is
369+
shared. The runtime, therefore, must allocate this USM buffer for each such
370+
device global variable.
370371

371372
* As we noted above, the front-end generates new content in the integration
372373
footer which calls the function `sycl::detail::device_global_map::add()`.
@@ -377,9 +378,9 @@ Several changes are needed to the DPC++ runtime
377378
- The string which uniquely identifies the variable.
378379
- The size (in bytes) of the underlying `T` type for the variable.
379380
- A boolean telling whether the variable is decorated with the
380-
`device_image_life` property.
381+
`device_image_scope` property.
381382
- The associated per-device USM buffer pointer, if this variable does not
382-
have the `device_image_life` property.
383+
have the `device_image_scope` property.
383384

384385
We refer to this information as the "device global database" below.
385386

@@ -404,13 +405,14 @@ runtime does the following:
404405
the `pi_program` to get the unique string associated with each device global
405406
variable that is used by the `pi_program`. For each of these strings, the
406407
runtime uses the device global database to see if the variable was decorated
407-
with `device_image_life`. If it was not so decorated and if a USM buffer has
408-
not already been created for the variable on this target device, the runtime
409-
allocates the buffer from USM device memory using the size from the database.
410-
The pointer to this buffer is saved in the database for future reuse.
408+
with `device_image_scope`. If it was not so decorated and if a USM buffer
409+
has not already been created for the variable on this target device, the
410+
runtime allocates the buffer from USM device memory using the size from the
411+
database and zero-initializes the content of the buffer. The pointer to this
412+
buffer is saved in the database for future reuse.
411413

412414
* For each device global variable that is not decorated with
413-
`device_image_life`, the runtime initializes the `usmptr` member in the
415+
`device_image_scope`, the runtime initializes the `usmptr` member in the
414416
*device instance* of the variable by using a backend-specific function which
415417
copies data from the host to a device variable. It is a simple matter to use
416418
this function to overwrite the `usmptr` member with the address of the USM
@@ -421,22 +423,23 @@ runtime does the following:
421423
Each of these functions accepts a (host) pointer to a device global variable as
422424
one of its parameters, and the runtime uses this pointer to find the associated
423425
information for this variable in the device global database. The remaining
424-
behavior depends on whether the variable is decorated with `device_image_life`.
426+
behavior depends on whether the variable is decorated with
427+
`device_image_scope`.
425428

426429
If the variable is not decorated with this property, the runtime uses the
427430
database to determine if a USM buffer has been allocated yet for this variable
428431
on this device. If not, the runtime allocates the buffer using the size from
429-
the database. Regardless, the runtime implements the `copy` / `memcpy` by
430-
copying to or from this USM buffer, using the normal mechanism for copying
431-
to / from a USM pointer.
432+
the database and zero-initializes the buffer. Regardless, the runtime
433+
implements the `copy` / `memcpy` by copying to or from this USM buffer, using
434+
the normal mechanism for copying to / from a USM pointer.
432435

433436
The runtime avoids the future cost of looking up the variable in the database
434437
by caching the USM pointer in the host instance of the variable's `usmptr`
435438
member.
436439

437-
If the variable is decorated with the `device_image_life` property, the runtime
438-
gets the unique string identifier for the variable from the database and uses
439-
a backend-specific function to copy to or from the variable with that
440+
If the variable is decorated with the `device_image_scope` property, the
441+
runtime gets the unique string identifier for the variable from the database
442+
and uses a backend-specific function to copy to or from the variable with that
440443
identifier. Again, the details of this function are described below.
441444

442445
In all cases, the runtime diagnoses invalid calls that write beyond the device
@@ -578,8 +581,8 @@ SPIR-V decorations (defined in the
578581

579582
[12]: <extensions/DeviceGlobal/SPV_INTEL_global_variable_decorations.asciidoc>
580583

581-
* `copy_access`
582-
* `init_via`
584+
* `host_access`
585+
* `init_mode`
583586
* `implement_in_csr`
584587

585588
It's not clear how this should work. One of the goals of the new property
@@ -613,11 +616,11 @@ Currently, we use the variable's mangled name, but this could be changed.
613616
An alternative solution would be to augment the SPIR-V with some new decoration
614617
that gives a unique name to each `OpVariable` that needs to be accessed from
615618
the host. We could then use that name with the backend functions, and avoid
616-
renaming variables with internal linkage. This would be more effort, though,
617-
because we would need a new SPIR-V extension, and we would need to change the
618-
implementation of the Level Zero and OpenCL backends.
619+
renaming variables with internal linkage. This would be only a minor change to
620+
the [SPV\_INTEL\_global\_variable\_decorations][12] extension, but it would
621+
also require changes in the Level Zero and OpenCL backends.
619622

620-
### Does compiler need to be deterministic?
623+
### Does the compiler need to be deterministic?
621624

622625
The compiler is normally deterministic. If you compile the exact same source
623626
file twice specifying the same command line options each time, you get exactly

0 commit comments

Comments
 (0)