Skip to content

Commit b01d820

Browse files
[SYCL][Doc] Expand device_global related PI API (#5906)
This commit changes the PI API introduced by the device_global design document, aligning it with the corresponding OpenCL API and the rest of PI API. Additionally, it changes the references to relevant CUDA API functions from referring to the CUDA runtime API to refer to the CUDA driver API, which is the API used by the CUDA backend. Signed-off-by: Steffen Larsen <[email protected]>
1 parent ed877c2 commit b01d820

File tree

1 file changed

+31
-9
lines changed

1 file changed

+31
-9
lines changed

sycl/doc/design/DeviceGlobal.md

Lines changed: 31 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -563,15 +563,37 @@ instance of a device global variable in a `pi_program`. This functionality is
563563
exposed as two new PI interfaces:
564564

565565
```
566-
pi_result piextCopyToDeviceVariable(pi_device Device, pi_program Program,
567-
const char *name, const void *src, size_t count, size_t offset);
568-
569-
pi_result piextCopyFromDeviceVariable(pi_device Device, pi_program Program,
570-
const char *name, void *dst, size_t count, size_t offset);
566+
pi_result piextEnqueueDeviceVariableRead(pi_queue Queue, pi_program Program,
567+
const char *Name, pi_bool BlockingRead,
568+
size_t Count, size_t Offset, void *Dst,
569+
pi_uint32 NumEventsInWaitList,
570+
const pi_event *EventsWaitList,
571+
pi_event *Event);
572+
573+
pi_result piextEnqueueDeviceVariableWrite(pi_queue Queue, pi_program Program,
574+
const char *Name,
575+
pi_bool BlockingWrite, size_t Count,
576+
size_t Offset, const void *Src,
577+
pi_uint32 NumEventsInWaitList,
578+
const pi_event *EventsWaitList,
579+
pi_event *Event);
571580
```
572581

573-
In both cases the `name` parameter is the same as the `sycl-unique-id` string
574-
that is associated with the device global variable.
582+
The `piextEnqueueDeviceVariableRead` function reads `Count` bytes at byte-offset
583+
`Offset` from a device global variable in `Program` identified by the name
584+
`Name`. The read data is stored in `Dst`. Likewise, the
585+
`piextEnqueueDeviceVariableWrite` function reads `Count` bytes from `Dst` and
586+
stores them at byte-offset `Offset` in the device global variable in `Program`
587+
identified by the name `Name`.
588+
589+
Both functions will enqueue the associated memory command on `Queue` where it
590+
will first wait for `NumEventsInWaitList` events in `EventsWaitList` to finish.
591+
`Event` will be populated with the event associated with resulting enqueued
592+
command. If either `BlockingRead` or `BlockingWrite` is `true` the call will
593+
block on the host until the enqueued command finishes execution.
594+
595+
For `device_global` variables the `Name` parameter in calls to these functions
596+
is the same as the associated `sycl-unique-id` string.
575597

576598
The Level Zero backend has existing APIs that can implement these PI
577599
interfaces. The plugin first calls [`zeModuleGetGlobalPointer()`][8] to get a
@@ -616,8 +638,8 @@ depends upon implementation of that OpenCL extension.
616638

617639
[10]: <opencl-extensions/cl_intel_global_variable_access.asciidoc>
618640

619-
The CUDA backend has existing APIs `cudaMemcpyToSymbol()` and
620-
`cudaMemcpyFromSymbol()` which can be used to implement these PI interfaces.
641+
The CUDA backend has existing APIs `cuModuleGetGlobal()` and `cuMemcpyAsync()`
642+
which can be used to implement these PI interfaces.
621643

622644

623645
## Design choices

0 commit comments

Comments
 (0)