Commit be8e031

[SYCL][Doc] device_global: device_image_scope Update (#11212)
Two issues have been identified with the wording of the `device_image_scope` property.

1. The spec doesn't currently state that all accesses to the device_global from device code must be from the same device.
   * In the JIT flow you can build a device_image containing a device_global and then run kernels from that device_image on multiple targets. By the current wording of the spec, we were worried that, even with device_image_scope, a user could expect that device_global to retain its value across devices.
2. The spec doesn't constrain which queues can be used to access a device_global from host code.
   * A queue is associated with a specific device. We were worried that it’s legal to copy to/from a device that doesn’t use a device_image_scope device_global, which would require every device image to have a copy of every device global.
   * How should copies behave when no device_image accesses the device global?
1 parent 3c327c7 commit be8e031

1 file changed: +36 -13 lines changed
sycl/doc/extensions/experimental/sycl_ext_oneapi_device_global.asciidoc

Lines changed: 36 additions & 13 deletions
@@ -610,12 +610,13 @@ device_image_scope
 ----
 a|
 This property is most useful for kernels that are submitted to an FPGA device,
-but it may be used with any kernel. Normally, a single instance of a device
+but it may be used with any kernel. Normally, a single instance of a device
 global variable is allocated for each device, and that instance is shared by
-all kernels that belong to the same context and are submitted to the same device,
-regardless of which _device image_ contains the kernel. When this property is
-specified, it is an assertion by the user that the device global is referenced
-only from kernels that are contained by the same _device image_. An
+all kernels that belong to the same context and are submitted to the same
+device, regardless of which _device image_ contains the kernel.
+When this property is specified, it is an assertion by the user that on a given
+device a given device_global decorated with this property is only ever accessed
+in a single _device_image_. An
 implementation may be able to optimize accesses to the device global when this
 property is specified (especially on an FPGA device), but the user must be aware
 of which _device image_ contains the kernels that use the variable.
@@ -628,14 +629,14 @@ that directly access a variable do not all reside in the same _device image_,
 however no diagnostic is required for an indirect access from another _device
 image_.
 
-When a device global is decorated with this property, the implementation
-re-initializes it whenever the _device image_ is loaded onto the device. As a
-result, the application can only be guaranteed that a device global retains its
-value between kernel invocations if it understands when the _device image_ is
-loaded onto the device. For an FPGA, this happens whenever the device is
-reprogrammed. Other devices typically load the _device image_ once before the
-first invocation of any kernel in that _device image_, and then it remains
-loaded onto the device until the program terminates.
+A device global variable is guaranteed to be initialized for a device prior to
+the first time it is accessed (whether from a kernel or a copy operation).
+Device globals may also be re-initialized at implementation-defined times if
+multiple _device images_ are used on the same device. To avoid unexpected
+re-initializations, applications should ensure that all kernels that are
+enqueued to a device D come from the same _device image_. In addition,
+applications should ensure that all device global copy operation enqueued to
+device D correspond to that same _device image_.
 
 The application may copy to or from a device global even before any kernel in
 the _device image_ is submitted to the device. Doing so causes the device
@@ -1039,6 +1040,12 @@ the variable _dest_, the implementation throws an `exception` with the
 If `PropertyListT` contains the `device_image_scope` property and the _dest_
 variable exists in more than one _device image_ for this queue's device, the
 implementation throws an `exception` with the `errc::invalid` error code.
+
+If `PropertyListT` contains the `device_image_scope` property, at least one
+kernel in the _device image_ containing the _dest_ variable must access the
+_dest_ variable. If this is not the case, the implementation throws an
+`exception` with the `errc::kernel_not_supported` error code.
+
 a|
 [source, c++]
 ----
@@ -1063,6 +1070,11 @@ If `PropertyListT` contains the `device_image_scope` property and the _src_
 variable exists in more than one _device image_ for this queue's device, the
 implementation throws an `exception` with the `errc::invalid` error code.
 
+If `PropertyListT` contains the `device_image_scope` property, at least one
+kernel in the _device image_ containing the _dest_ variable must access the
+_dest_ variable. If this is not the case, the implementation throws an
+`exception` with the `errc::kernel_not_supported` error code.
+
 a|
 [source, c++]
 ----
@@ -1085,6 +1097,11 @@ If `PropertyListT` contains the `device_image_scope` property and the _dest_
 variable exists in more than one _device image_ for this queue's device, the
 implementation throws an `exception` with the `errc::invalid` error code.
 
+If `PropertyListT` contains the `device_image_scope` property, at least one
+kernel in the _device image_ containing the _dest_ variable must access the
+_dest_ variable. If this is not the case, the implementation throws an
+`exception` with the `errc::kernel_not_supported` error code.
+
 a|
 [source, c++]
 ----
@@ -1107,6 +1124,12 @@ the variable _src_, the implementation throws an `exception` with the
 If `PropertyListT` contains the `device_image_scope` property and the _src_
 variable exists in more than one _device image_ for this queue's device, the
 implementation throws an `exception` with the `errc::invalid` error code.
+
+If `PropertyListT` contains the `device_image_scope` property, at least one
+kernel in the _device image_ containing the _dest_ variable must access the
+_dest_ variable. If this is not the case, the implementation throws an
+`exception` with the `errc::kernel_not_supported` error code.
+
 |====
 --
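
The two error conditions that the changed text attaches to copy operations on a `device_image_scope` device global can be pictured with a small host-only model. This is a rough sketch in plain C++, not SYCL code and not how any implementation performs the check: `CopyError`, `DeviceImageInfo`, and `check_device_image_scope_copy` are hypothetical names invented for illustration. The order of the two checks, and the treatment of the case where no device image contains the variable at all, are assumptions the diff does not spell out.

```cpp
#include <optional>
#include <vector>

// Hypothetical stand-ins for the two sycl::errc values named in the diff.
enum class CopyError { invalid, kernel_not_supported };

// Hypothetical summary of one device image built for the queue's device.
struct DeviceImageInfo {
  bool contains_variable;         // image holds a copy of the device global
  bool kernel_accesses_variable;  // some kernel in this image accesses it
};

// Returns the error a copy would report under device_image_scope in this
// model, or std::nullopt if the copy is allowed.
std::optional<CopyError> check_device_image_scope_copy(
    const std::vector<DeviceImageInfo>& images) {
  int holders = 0;
  bool accessed = false;
  for (const DeviceImageInfo& img : images) {
    if (img.contains_variable) {
      ++holders;
      accessed = accessed || img.kernel_accesses_variable;
    }
  }
  // Variable exists in more than one device image for this device.
  if (holders > 1) return CopyError::invalid;
  // No kernel in the containing image accesses the variable.
  // Assumption: the same error is used when no image contains it at all.
  if (!accessed) return CopyError::kernel_not_supported;
  return std::nullopt;
}
```

In this model, a copy is accepted only when exactly one device image for the device contains the variable and at least one kernel in that image accesses it.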
