@@ -156,7 +156,7 @@ and it is described more fully by the [compile-time properties][2] design
156
156
document. This attribute is also used for other classes that have properties,
157
157
so it is not specific to the ` device_global ` class.
158
158
159
- [ 2 ] : < https://github.com/intel/llvm/pull/4937 >
159
+ [ 2 ] : < CompileTimeProperties.md >
160
160
161
161
Note that the parameter list to
162
162
` [[__sycl_detail__::add_ir_global_variable_attributes()]] ` contains one
@@ -379,18 +379,24 @@ reference that variable.
379
379
380
380
As described in the design for [ compile-time properties] [ 2 ] , the
381
381
` sycl-post-link ` tool is responsible for generating idiomatic LLVM IR for any
382
- compile-time properties that need to be generated in SPIR-V. The
383
- ` sycl-post-link ` tool does this for the following properties on each device
384
- global variable:
382
+ compile-time properties that need to be generated in SPIR-V.
385
383
386
- * ` host_access `
387
- * ` init_mode `
388
- * ` implement_in_csr `
384
+ The ** HostAccessINTEL** decoration is required for all device global variables
385
+ because it provides the name that the DPC++ runtime uses to access the
386
+ variable. Therefore, the ` sycl-post-link ` tool always generates idiomatic LLVM
387
+ IR for this decoration. The first SPIR-V operand is set according to the
388
+ ` host_access ` property (or set to ** Read/Write** if the device global doesn't
389
+ have that property). The second SPIR-V operation is set to the value of the
390
+ device global's ` sycl-unique-id ` .
389
391
390
- The ` host_access ` property is handled specially because the SPIR-V decoration
391
- requires two "extra operands", but the SYCL property has only one operand. The
392
- second SPIR-V operand is the "name" of the variable, and the ` sycl-post-link `
393
- tool passes the value from ` sycl-unique-id ` for this name.
392
+ The ` sycl-post-link ` tool also generates idiomatic LLVM IR for the
393
+ ** InitModeINTEL** decoration (if the device global has the ` init_mode `
394
+ property) and for the ** ImplementInCSRINTEL** decoration (if the device global
395
+ has the ` implement_in_csr ` property). See the
396
+ [ SPV\_ INTEL\_ global\_ variable\_ decorations] [ 6 ] SPIR-V extension for details
397
+ about all of these decorations.
398
+
399
+ [ 6 ] : < extensions/DeviceGlobal/SPV_INTEL_global_variable_decorations.asciidoc >
394
400
395
401
The ` sycl-post-link ` tool also create a "SYCL/device globals" property set for
396
402
each device code module that contains at least one device global variable.
@@ -479,12 +485,12 @@ runtime does the following:
479
485
480
486
- Regardless of whether the USM buffer has already been created for the
481
487
variable, the runtime initializes the ` usmptr ` member in the * device
482
- instance* of the variable by using a new [ PI interface] [ 6 ] which copies
488
+ instance* of the variable by using a new [ PI interface] [ 7 ] which copies
483
489
data from the host to a global variable in a ` pi_program ` . It is a
484
490
simple matter to use this interface to overwrite the ` usmptr ` member with
485
491
the address of the USM buffer.
486
492
487
- [ 6 ] : < #new-pi-interface-to-copy-to-or-from-a-module-scope-variable >
493
+ [ 7 ] : < #new-pi-interface-to-copy-to-or-from-a-module-scope-variable >
488
494
489
495
Note that the runtime does not need to initialize the ` val ` member variable of
490
496
device global variables that are decorated with ` device_image_scope ` because
@@ -530,7 +536,7 @@ contains this variable and uses its normal mechanism for creating a
530
536
` pi_program ` from this device code module. (The algorithm for creating device
531
537
code modules in the ` sycl-post-link ` tool ensures that there will be no more
532
538
than one module that contains the variable.) Finally, the runtime uses the
533
- new [ PI interface] [ 6 ] to copy to or from the contents of the variable in this
539
+ new [ PI interface] [ 7 ] to copy to or from the contents of the variable in this
534
540
` pi_program ` .
535
541
536
542
It is possible that a device global variable with ` device_image_scope ` is not
@@ -561,9 +567,9 @@ In both cases the `name` parameter is the same as the `sycl-unique-id` string
561
567
that is associated with the device global variable.
562
568
563
569
The Level Zero backend has existing APIs that can implement these PI
564
- interfaces. The plugin first calls [ ` zeModuleGetGlobalPointer() ` ] [ 7 ] to get a
570
+ interfaces. The plugin first calls [ ` zeModuleGetGlobalPointer() ` ] [ 8 ] to get a
565
571
device pointer for the variable and then calls
566
- [ ` zeCommandListAppendMemoryCopy() ` ] [ 8 ] to copy to or from that pointer.
572
+ [ ` zeCommandListAppendMemoryCopy() ` ] [ 9 ] to copy to or from that pointer.
567
573
However, the documentation (and implementation) of ` zeModuleGetGlobalPointer() `
568
574
needs to be extended slightly. The description currently says:
569
575
@@ -592,16 +598,16 @@ This must be changed to say something along these lines:
592
598
> * If ` pGlobalName ` identifies an imported SPIR-V variable, the module must be
593
599
> dynamically linked before the variable's pointer may be queried.
594
600
595
- [ 7 ] : < https://spec.oneapi.io/level-zero/latest/core/api.html#zemodulegetglobalpointer >
596
- [ 8 ] : < https://spec.oneapi.io/level-zero/latest/core/api.html#zecommandlistappendmemorycopy >
601
+ [ 8 ] : < https://spec.oneapi.io/level-zero/latest/core/api.html#zemodulegetglobalpointer >
602
+ [ 9 ] : < https://spec.oneapi.io/level-zero/latest/core/api.html#zecommandlistappendmemorycopy >
597
603
598
604
The OpenCL backend has a proposed extension
599
- [ ` cl_intel_global_variable_access ` ] [ 9 ] that defines functions
605
+ [ ` cl_intel_global_variable_access ` ] [ 10 ] that defines functions
600
606
` clEnqueueReadGlobalVariableINTEL() ` and ` clEnqueueWriteGlobalVariableINTEL() `
601
607
which can be easily used to implement these PI interfaces. This DPC++ design
602
608
depends upon implementation of that OpenCL extension.
603
609
604
- [ 9 ] : < extensions/DeviceGlobal/cl_intel_global_variable_access.asciidoc >
610
+ [ 10 ] : < extensions/DeviceGlobal/cl_intel_global_variable_access.asciidoc >
605
611
606
612
The CUDA backend has existing APIs ` cudaMemcpyToSymbol() ` and
607
613
` cudaMemcpyFromSymbol() ` which can be used to implement these PI interfaces.
0 commit comments