@@ -610,12 +610,13 @@ device_image_scope
610
610
----
611
611
a|
612
612
This property is most useful for kernels that are submitted to an FPGA device,
613
- but it may be used with any kernel. Normally, a single instance of a device
613
+ but it may be used with any kernel. Normally, a single instance of a device
614
614
global variable is allocated for each device, and that instance is shared by
615
- all kernels that belong to the same context and are submitted to the same device,
616
- regardless of which _device image_ contains the kernel. When this property is
617
- specified, it is an assertion by the user that the device global is referenced
618
- only from kernels that are contained by the same _device image_. An
615
+ all kernels that belong to the same context and are submitted to the same
616
+ device, regardless of which _device image_ contains the kernel.
617
+ When this property is specified, it is an assertion by the user that on a given
618
+ device a given device_global decorated with this property is only ever accessed
619
+ in a single _device image_. An
619
620
implementation may be able to optimize accesses to the device global when this
620
621
property is specified (especially on an FPGA device), but the user must be aware
621
622
of which _device image_ contains the kernels that use the variable.
@@ -628,14 +629,14 @@ that directly access a variable do not all reside in the same _device image_,
628
629
however no diagnostic is required for an indirect access from another _device
629
630
image_.
630
631
631
- When a device global is decorated with this property, the implementation
632
- re-initializes it whenever the _device image_ is loaded onto the device. As a
633
- result, the application can only be guaranteed that a device global retains its
634
- value between kernel invocations if it understands when the _device image_ is
635
- loaded onto the device. For an FPGA, this happens whenever the device is
636
- reprogrammed. Other devices typically load the _device image_ once before the
637
- first invocation of any kernel in that _device image_, and then it remains
638
- loaded onto the device until the program terminates .
632
+ A device global variable is guaranteed to be initialized for a device prior to
633
+ the first time it is accessed (whether from a kernel or a copy operation).
634
+ Device globals may also be re-initialized at implementation-defined times if
635
+ multiple _device images_ are used on the same device. To avoid unexpected
636
+ re-initializations, applications should ensure that all kernels that are
637
+ enqueued to a device D come from the same _device image_. In addition,
638
+ applications should ensure that all device global copy operations enqueued to
639
+ device D correspond to that same _device image_.
639
640
640
641
The application may copy to or from a device global even before any kernel in
641
642
the _device image_ is submitted to the device. Doing so causes the device
@@ -1039,6 +1040,12 @@ the variable _dest_, the implementation throws an `exception` with the
1039
1040
If `PropertyListT` contains the `device_image_scope` property and the _dest_
1040
1041
variable exists in more than one _device image_ for this queue's device, the
1041
1042
implementation throws an `exception` with the `errc::invalid` error code.
1043
+
1044
+ If `PropertyListT` contains the `device_image_scope` property, at least one
1045
+ kernel in the _device image_ containing the _dest_ variable must access the
1046
+ _dest_ variable. If this is not the case, the implementation throws an
1047
+ `exception` with the `errc::kernel_not_supported` error code.
1048
+
1042
1049
a|
1043
1050
[source, c++]
1044
1051
----
@@ -1063,6 +1070,11 @@ If `PropertyListT` contains the `device_image_scope` property and the _src_
1063
1070
variable exists in more than one _device image_ for this queue's device, the
1064
1071
implementation throws an `exception` with the `errc::invalid` error code.
1065
1072
1073
+ If `PropertyListT` contains the `device_image_scope` property, at least one
1074
+ kernel in the _device image_ containing the _src_ variable must access the
1075
+ _src_ variable. If this is not the case, the implementation throws an
1076
+ `exception` with the `errc::kernel_not_supported` error code.
1077
+
1066
1078
a|
1067
1079
[source, c++]
1068
1080
----
@@ -1085,6 +1097,11 @@ If `PropertyListT` contains the `device_image_scope` property and the _dest_
1085
1097
variable exists in more than one _device image_ for this queue's device, the
1086
1098
implementation throws an `exception` with the `errc::invalid` error code.
1087
1099
1100
+ If `PropertyListT` contains the `device_image_scope` property, at least one
1101
+ kernel in the _device image_ containing the _dest_ variable must access the
1102
+ _dest_ variable. If this is not the case, the implementation throws an
1103
+ `exception` with the `errc::kernel_not_supported` error code.
1104
+
1088
1105
a|
1089
1106
[source, c++]
1090
1107
----
@@ -1107,6 +1124,12 @@ the variable _src_, the implementation throws an `exception` with the
1107
1124
If `PropertyListT` contains the `device_image_scope` property and the _src_
1108
1125
variable exists in more than one _device image_ for this queue's device, the
1109
1126
implementation throws an `exception` with the `errc::invalid` error code.
1127
+
1128
+ If `PropertyListT` contains the `device_image_scope` property, at least one
1129
+ kernel in the _device image_ containing the _src_ variable must access the
1130
+ _src_ variable. If this is not the case, the implementation throws an
1131
+ `exception` with the `errc::kernel_not_supported` error code.
1132
+
1110
1133
|====
1111
1134
--
1112
1135
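The two copy-time checks described in the hunks above can be sketched as a small, self-contained model. This is only an illustration of the rule as worded in the diff, not SYCL code: `DeviceImage`, `CopyCheck`, and `check_copy` are hypothetical names invented for this sketch, and the model assumes the variable is declared with `device_image_scope`. The case where no image contains the variable is not covered by the text, so the model simply reports no error there.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy model only: every name here is hypothetical and none of it is SYCL API.
enum class CopyCheck { ok, invalid, kernel_not_supported };

struct DeviceImage {
    // Device globals contained in this image.
    std::set<std::string> globals;
    // Kernel name -> device globals that the kernel accesses.
    std::map<std::string, std::set<std::string>> kernel_accesses;
};

// Models the checks for a copy to or from the device global `var`, given the
// device images that exist for the queue's device.
CopyCheck check_copy(const std::vector<DeviceImage>& images,
                     const std::string& var) {
    const DeviceImage* owner = nullptr;
    int containing = 0;
    for (const DeviceImage& image : images) {
        if (image.globals.count(var) != 0) {
            owner = &image;
            ++containing;
        }
    }
    // The variable exists in more than one device image: errc::invalid.
    if (containing > 1) return CopyCheck::invalid;
    // Assumption: the diff text does not cover a variable that is in no
    // image, so this model reports no error for it.
    if (owner == nullptr) return CopyCheck::ok;
    // At least one kernel in the containing image must access the variable.
    for (const auto& entry : owner->kernel_accesses) {
        if (entry.second.count(var) != 0) return CopyCheck::ok;
    }
    // No kernel in the image accesses it: errc::kernel_not_supported.
    return CopyCheck::kernel_not_supported;
}
```

In this model, a variable that appears in two images yields `CopyCheck::invalid`, while a variable that sits in exactly one image whose kernels never touch it yields `CopyCheck::kernel_not_supported`.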