Skip to content

Commit 3c3b485

Browse files
authored
[SYCL][DOC] Clarification to device global design (#5432)
Clarify which SPIR-V decorations the `sycl-post-link` tool generates for each device global variable. Also update the link to the compile-time property design, which is now merged.
1 parent eb350ee commit 3c3b485

File tree

1 file changed

+26
-20
lines changed

1 file changed

+26
-20
lines changed

sycl/doc/DeviceGlobal.md

Lines changed: 26 additions & 20 deletions
Original file line numberDiff line numberDiff line change
@@ -156,7 +156,7 @@ and it is described more fully by the [compile-time properties][2] design
156156
document. This attribute is also used for other classes that have properties,
157157
so it is not specific to the `device_global` class.
158158

159-
[2]: <https://github.com/intel/llvm/pull/4937>
159+
[2]: <CompileTimeProperties.md>
160160

161161
Note that the parameter list to
162162
`[[__sycl_detail__::add_ir_global_variable_attributes()]]` contains one
@@ -379,18 +379,24 @@ reference that variable.
379379

380380
As described in the design for [compile-time properties][2], the
381381
`sycl-post-link` tool is responsible for generating idiomatic LLVM IR for any
382-
compile-time properties that need to be generated in SPIR-V. The
383-
`sycl-post-link` tool does this for the following properties on each device
384-
global variable:
382+
compile-time properties that need to be generated in SPIR-V.
385383

386-
* `host_access`
387-
* `init_mode`
388-
* `implement_in_csr`
384+
The **HostAccessINTEL** decoration is required for all device global variables
385+
because it provides the name that the DPC++ runtime uses to access the
386+
variable. Therefore, the `sycl-post-link` tool always generates idiomatic LLVM
387+
IR for this decoration. The first SPIR-V operand is set according to the
388+
`host_access` property (or set to **Read/Write** if the device global doesn't
389+
have that property). The second SPIR-V operation is set to the value of the
390+
device global's `sycl-unique-id`.
389391

390-
The `host_access` property is handled specially because the SPIR-V decoration
391-
requires two "extra operands", but the SYCL property has only one operand. The
392-
second SPIR-V operand is the "name" of the variable, and the `sycl-post-link`
393-
tool passes the value from `sycl-unique-id` for this name.
392+
The `sycl-post-link` tool also generates idiomatic LLVM IR for the
393+
**InitModeINTEL** decoration (if the device global has the `init_mode`
394+
property) and for the **ImplementInCSRINTEL** decoration (if the device global
395+
has the `implement_in_csr` property). See the
396+
[SPV\_INTEL\_global\_variable\_decorations][6] SPIR-V extension for details
397+
about all of these decorations.
398+
399+
[6]: <extensions/DeviceGlobal/SPV_INTEL_global_variable_decorations.asciidoc>
394400

395401
The `sycl-post-link` tool also create a "SYCL/device globals" property set for
396402
each device code module that contains at least one device global variable.
@@ -479,12 +485,12 @@ runtime does the following:
479485

480486
- Regardless of whether the USM buffer has already been created for the
481487
variable, the runtime initializes the `usmptr` member in the *device
482-
instance* of the variable by using a new [PI interface][6] which copies
488+
instance* of the variable by using a new [PI interface][7] which copies
483489
data from the host to a global variable in a `pi_program`. It is a
484490
simple matter to use this interface to overwrite the `usmptr` member with
485491
the address of the USM buffer.
486492

487-
[6]: <#new-pi-interface-to-copy-to-or-from-a-module-scope-variable>
493+
[7]: <#new-pi-interface-to-copy-to-or-from-a-module-scope-variable>
488494

489495
Note that the runtime does not need to initialize the `val` member variable of
490496
device global variables that are decorated with `device_image_scope` because
@@ -530,7 +536,7 @@ contains this variable and uses its normal mechanism for creating a
530536
`pi_program` from this device code module. (The algorithm for creating device
531537
code modules in the `sycl-post-link` tool ensures that there will be no more
532538
than one module that contains the variable.) Finally, the runtime uses the
533-
new [PI interface][6] to copy to or from the contents of the variable in this
539+
new [PI interface][7] to copy to or from the contents of the variable in this
534540
`pi_program`.
535541

536542
It is possible that a device global variable with `device_image_scope` is not
@@ -561,9 +567,9 @@ In both cases the `name` parameter is the same as the `sycl-unique-id` string
561567
that is associated with the device global variable.
562568

563569
The Level Zero backend has existing APIs that can implement these PI
564-
interfaces. The plugin first calls [`zeModuleGetGlobalPointer()`][7] to get a
570+
interfaces. The plugin first calls [`zeModuleGetGlobalPointer()`][8] to get a
565571
device pointer for the variable and then calls
566-
[`zeCommandListAppendMemoryCopy()`][8] to copy to or from that pointer.
572+
[`zeCommandListAppendMemoryCopy()`][9] to copy to or from that pointer.
567573
However, the documentation (and implementation) of `zeModuleGetGlobalPointer()`
568574
needs to be extended slightly. The description currently says:
569575

@@ -592,16 +598,16 @@ This must be changed to say something along these lines:
592598
> * If `pGlobalName` identifies an imported SPIR-V variable, the module must be
593599
> dynamically linked before the variable's pointer may be queried.
594600
595-
[7]: <https://spec.oneapi.io/level-zero/latest/core/api.html#zemodulegetglobalpointer>
596-
[8]: <https://spec.oneapi.io/level-zero/latest/core/api.html#zecommandlistappendmemorycopy>
601+
[8]: <https://spec.oneapi.io/level-zero/latest/core/api.html#zemodulegetglobalpointer>
602+
[9]: <https://spec.oneapi.io/level-zero/latest/core/api.html#zecommandlistappendmemorycopy>
597603

598604
The OpenCL backend has a proposed extension
599-
[`cl_intel_global_variable_access`][9] that defines functions
605+
[`cl_intel_global_variable_access`][10] that defines functions
600606
`clEnqueueReadGlobalVariableINTEL()` and `clEnqueueWriteGlobalVariableINTEL()`
601607
which can be easily used to implement these PI interfaces. This DPC++ design
602608
depends upon implementation of that OpenCL extension.
603609

604-
[9]: <extensions/DeviceGlobal/cl_intel_global_variable_access.asciidoc>
610+
[10]: <extensions/DeviceGlobal/cl_intel_global_variable_access.asciidoc>
605611

606612
The CUDA backend has existing APIs `cudaMemcpyToSymbol()` and
607613
`cudaMemcpyFromSymbol()` which can be used to implement these PI interfaces.

0 commit comments

Comments
 (0)