@@ -77,14 +77,14 @@ Initialization and Discovery
Device handle lifetime
----------------------

- The device objects are reference-counted, and there are ${x}DeviceRetain and ${x}DeviceRelease.
- The ref-count of a device is automatically incremented when device is obtained by ${x}DeviceGet.
- After device is no longer needed to the application it must call to ${x}DeviceRelease.
- When ref-count of the underlying device handle becomes zero then that device object is deleted.
- Note, that besides the application itself, the Unified Runtime may increment and decrement ref-count on its own.
- So, after the call to ${x}DeviceRelease below, the device may stay alive until other
- objects attached to it, like command-queues, are deleted. But application may not use the device
- after it released its own reference.
+ Device objects are reference-counted, using ${x}DeviceRetain and ${x}DeviceRelease.
+ The ref-count of a device is automatically incremented when a device is obtained by ${x}DeviceGet.
+ After a device is no longer needed by the application it must call ${x}DeviceRelease.
+ When the ref-count of the underlying device handle becomes zero then that device object is deleted.
+ Note that a Unified Runtime adapter may internally increment and decrement a device's ref-count.
+ So after the call to ${x}DeviceRelease below, the device may stay active until other
+ objects using it, such as a command-queue, are deleted. However, an application
+ may not use the device after it releases its last reference.

.. parsed-literal ::

@@ -120,7 +120,7 @@ In case where the info size is only known at runtime then two calls are needed,
Device partitioning into sub-devices
------------------------------------

- The ${x}DevicePartition could partition a device into sub-device. The exact representation and
+ ${x}DevicePartition partitions a device into sub-devices. The exact representation and
characteristics of the sub-devices are device specific, but normally they each represent a
fixed part of the parent device, which can explicitly be programmed individually.

@@ -161,9 +161,10 @@ An implementation will return "0" in the count if no further partitioning is sup
Contexts
========

- Contexts are serving the purpose of resources sharing (between devices in the same context),
- and resources isolation (resources do not cross context boundaries). Resources such as memory allocations,
- events, and programs are explicitly created against a context. A trivial work with context looks like this:
+ Contexts serve the purpose of resource sharing (between devices in the same context),
+ and resource isolation (ensuring that resources do not cross context
+ boundaries). Resources such as memory allocations, events, and programs are
+ explicitly created against a context.

.. parsed-literal ::

@@ -235,18 +236,20 @@ explicit and implicit kernel arguments along with data needed for launch.
Queue and Enqueue
=================

- A queue object represents a logic input stream to a device. Kernels
- and commands are submitted to queue for execution using Equeue commands:
+ Queue objects are used to submit work to a given device. Kernels
+ and commands are submitted to a queue for execution using Enqueue commands
such as ${x}EnqueueKernelLaunch, ${x}EnqueueMemBufferWrite. Enqueued kernels
and commands can be executed in order or out of order depending on the
queue's property ${X}_QUEUE_FLAG_OUT_OF_ORDER_EXEC_MODE_ENABLE when the
- queue is created.
+ queue is created. If a queue is out of order, the queue may internally do some
+ scheduling of work to achieve concurrency on the device, while honouring the
+ event dependencies that are passed to each Enqueue command.

.. parsed-literal ::

    // Create an out of order queue for hDevice in hContext
    ${x}_queue_handle_t hQueue;
-     ${x}QueueCreate(hContext, hDevice,
+     ${x}QueueCreate(hContext, hDevice,
    ${X}_QUEUE_FLAG_OUT_OF_ORDER_EXEC_MODE_ENABLE, &hQueue);

    // Launch a kernel with 3D workspace partitioning
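The reference-counting rules described in the "Device handle lifetime" hunk can be illustrated with a small stand-alone sketch. This is not the Unified Runtime API: the `mock_device_t` type and the `mock_device_get`, `mock_device_retain`, and `mock_device_release` functions are hypothetical names invented for illustration, and the sketch assumes a plain (non-atomic) counter. It only shows the general pattern in which an object stays alive while any holder, such as the application or an internal user like a queue, still owns a reference.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted device object.
 * This is NOT the Unified Runtime device type. */
typedef struct mock_device {
    int ref_count;     /* number of outstanding references */
    int *deleted_flag; /* set to 1 when the object is freed (demo only) */
} mock_device_t;

/* Obtaining the device hands the caller one reference. */
mock_device_t *mock_device_get(int *deleted_flag) {
    mock_device_t *d = malloc(sizeof *d);
    if (d == NULL) {
        return NULL;
    }
    d->ref_count = 1;
    d->deleted_flag = deleted_flag;
    *deleted_flag = 0;
    return d;
}

/* An additional holder (for example an internal queue) takes a reference. */
void mock_device_retain(mock_device_t *d) {
    d->ref_count++;
}

/* Dropping the last reference deletes the object. */
void mock_device_release(mock_device_t *d) {
    if (--d->ref_count == 0) {
        *d->deleted_flag = 1;
        free(d);
    }
}
```

In this sketch, if an internal holder calls `mock_device_retain` and the application then calls `mock_device_release`, the object survives until the internal holder also releases it. As the text above states, the application should still treat the handle as unusable once it has released its own last reference.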