You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Currently the device_global extension specifies that the same allocation
used by a device_global is accessible by other kernels on the same
device, however memory allocations for devices are not guaranteed to be
accessible across contexts. As such, this commit clarifies that the
consistency between kernels is only true for kernels running on the same
device and context.
---------
Signed-off-by: Larsen, Steffen <[email protected]>
Co-authored-by: Greg Lueck <[email protected]>
For both `dm1` and `dm2`, the `MyClass` and `int[4]` allocations on each device are zero-initialized before any non-initialization accesses occur.
146
+
For both `dm1` and `dm2`, the `MyClass` and `int[4]` allocations on each device
147
+
in the context associated with `Q` are zero-initialized before any
148
+
non-initialization accesses occur.
144
149
145
150
== Proposal
146
151
@@ -163,11 +168,30 @@ that the implementation supports.
163
168
164
169
=== Representation of device globals
165
170
166
-
`device_global` provides a mechanism to allocate device scope memory - memory which has unique underlying storage (of type _T_) for each `sycl::device` object. If multiple device objects are present then each device object receives its own unique underlying allocation. All kernels that reference the same `device_global` entity (either directly or via a pointer to its underlying object of type _T_) share the same allocation of that object when those kernels run on the same device.
167
-
168
-
`device_global` allocations are in the global address space, as are any underlying allocations of type `T` which are implicitly allocated on each device as a result of a `device_global` object. It is undefined behavior if the host program directly accesses a `device_global` or any address obtained from a `device_global` member function, and similarly it is undefined behavior if a `device_global` or address obtained on one device from a `device_global` member function is accessed on a different device. There is no mechanism to obtain addresses of or directly access a device's `device_global` allocation within the host program.
169
-
170
-
A `device_global` on a given device maintains its state (address of the allocation and data within the allocation) even after the application changes the value of a specialization constant via `handler::set_specialization_constant()`. Additionally, a `device_global` maintains its state even when it is referenced from a kernel in a different `kernel_bundle`.
171
+
`device_global` provides a mechanism to allocate device scope memory - memory
172
+
which has unique underlying storage (of type _T_) for each `sycl::device` and
173
+
`sycl::context` combination. If multiple valid device and context combinations
174
+
are present then each receives its own unique underlying allocation. All kernels
175
+
that reference the same `device_global` entity (either directly or via a pointer
176
+
to its underlying object of type _T_) share the same allocation of that object
177
+
when those kernels run on the same device and context.
178
+
179
+
`device_global` allocations are in the global address space, as are any
180
+
underlying allocations of type `T` which are implicitly allocated on each device
181
+
as a result of a `device_global` object. It is undefined behavior if the host
182
+
program directly accesses a `device_global` or any address obtained from a
183
+
`device_global` member function, and similarly it is undefined behavior if a
184
+
`device_global` or address obtained on one device from a `device_global` member
185
+
function is accessed on a different device or context. There is no mechanism to
186
+
obtain addresses of or directly access a device's `device_global` allocation
187
+
within the host program.
188
+
189
+
A `device_global` on a given device and context maintains its state (address of
190
+
the allocation and data within the allocation) even after the application
191
+
changes the value of a specialization constant via
192
+
`handler::set_specialization_constant()`. Additionally, a `device_global`
193
+
maintains its state even when it is referenced from a kernel in a different
194
+
`kernel_bundle`.
171
195
172
196
[source,c++]
173
197
----
@@ -596,13 +620,13 @@ a|
596
620
This property is most useful for kernels that are submitted to an FPGA device,
597
621
but it may be used with any kernel. Normally, a single instance of a device
598
622
global variable is allocated for each device, and that instance is shared by
599
-
all kernels that are submitted to the device, regardless of which _device
600
-
image_ contains the kernel. When this property is specified, it is an
601
-
assertion by the user that the device global is referenced only from kernels
602
-
that are contained by the same _device image_. An implementation may be able
603
-
to optimize accesses to the device global when this property is specified
604
-
(especially on an FPGA device), but the user must be aware of which _device
605
-
image_ contains the kernels that use the variable.
623
+
all kernels that belong to the same context and are submitted to the same device,
624
+
regardless of which _device image_ contains the kernel. When this property is
625
+
specified, it is an assertion by the user that the device global is referenced
626
+
only from kernels that are contained by the same _device image_. An
627
+
implementation may be able to optimize accesses to the device global when this
628
+
property is specified (especially on an FPGA device), but the user must be aware
629
+
of which _device image_ contains the kernels that use the variable.
606
630
607
631
A device global that is decorated with this property may not be accessed from
608
632
kernels that reside in different _device images_, either by direct reference
0 commit comments