WIP: [Offload] Add testing for Offload program and kernel related entry points #127803

Closed
wants to merge 22 commits into from
27 changes: 23 additions & 4 deletions offload/liboffload/API/Common.td
@@ -62,19 +62,38 @@ def : Handle {
let desc = "Handle of context object";
}

def : Handle {
let name = "ol_queue_handle_t";
let desc = "Handle of queue object";
}

def : Handle {
let name = "ol_event_handle_t";
let desc = "Handle of event object";
}

def : Handle {
let name = "ol_program_handle_t";
let desc = "Handle of program object";
}

def : Handle {
let name = "ol_kernel_handle_t";
let desc = "Handle of kernel object";
}

def : Enum {
let name = "ol_errc_t";
let desc = "Defines Return/Error codes";
let etors =[
Etor<"SUCCESS", "Success">,
Etor<"INVALID_VALUE", "Invalid Value">,
Etor<"INVALID_PLATFORM", "Invalid platform">,
Etor<"DEVICE_NOT_FOUND", "Device not found">,
Etor<"INVALID_DEVICE", "Invalid device">,
Etor<"DEVICE_LOST", "Device hung, reset, was removed, or driver update occurred">,
Etor<"UNINITIALIZED", "plugin is not initialized or specific entry-point is not implemented">,
Etor<"INVALID_QUEUE", "Invalid queue">,
Etor<"INVALID_EVENT", "Invalid event">,
Etor<"INVALID_KERNEL_NAME", "Named kernel not found in the program binary">,
Etor<"OUT_OF_RESOURCES", "Out of resources">,
Etor<"UNSUPPORTED_VERSION", "generic error code for unsupported versions">,
Etor<"UNSUPPORTED_FEATURE", "generic error code for unsupported features">,
Etor<"INVALID_ARGUMENT", "generic error code for invalid arguments">,
Etor<"INVALID_NULL_HANDLE", "handle argument is not valid">,
12 changes: 12 additions & 0 deletions offload/liboffload/API/Device.td
@@ -104,3 +104,15 @@ def : Function {
Return<"OL_ERRC_INVALID_DEVICE">
];
}

def : Function {
let name = "olGetHostDevice";
Contributor

The way it works in HSA is that you iterate through all of the devices, and one of them has the special 'type' of host. So this should use the same interface as the GPU devices but have a different 'platform' as you call it.

Contributor Author

I'm a little wary of having a device discovered the regular way that a user can't actually enqueue work on (hopefully it will be usable that way, but as you've suggested in other comments the host plugin needs a bit of work).

But having the user check the device type to find the host device isn't too onerous so I can make this change.

let desc = "Return the special host device used to represent the host in memory transfer operations";
let details = [
"The host device does not support queues"
];
let params = [
Param<"ol_device_handle_t*", "Device", "Output pointer for the device">
]; // TODO: Take a platform?
let returns = [];
}
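The alternative raised in the review thread, finding the host by checking each device's type instead of calling a dedicated entry point, could look roughly like the sketch below. `DeviceType`, `Device`, and `findHostDevice` are hypothetical stand-ins for illustration and are not part of liboffload.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration; not the liboffload types.
enum class DeviceType { Host, GPU };

struct Device {
  DeviceType Type;
  int Id;
};

// Walk the device list once and return the first host-typed device,
// or nullptr if the list does not contain one.
const Device *findHostDevice(const std::vector<Device> &Devices) {
  for (const Device &D : Devices)
    if (D.Type == DeviceType::Host)
      return &D;
  return nullptr;
}
```

The caller pays one linear scan, which matches the "isn't too onerous" assessment in the thread, and the host device is discovered through the same path as every other device.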
61 changes: 61 additions & 0 deletions offload/liboffload/API/Enqueue.td
@@ -0,0 +1,61 @@
//===-- Enqueue.td - Enqueue definitions for Offload -------*- tablegen -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
// This file contains Offload API definitions related to enqueueable operations
//
//===----------------------------------------------------------------------===//

def : Function {
let name = "olEnqueueMemcpy";
Contributor

Do we need to be specific about enqueue? The way things work in the plugins at least is that everything takes a queue pointer, and if it's null we do it synchronously. We could also just make olMemcpy and olMemcpyAsync if we want to omit the argument since we do only give out handles, not pointers.

Contributor Author

Off the top of my head I can't think of a reason why we couldn't make the queue handles optional.

In UR we have optional handles, and don't hide the fact that they're pointers so they can be set to null.

But if we want to avoid that then I can make the change to olMemcpy and olMemcpyAsync

Contributor

I prefer olMemcpy if we're not making the distinction. Forcing the user to create a queue is fine since this is supposed to be lower level.

let desc = "Enqueue a memcpy operation.";
let details = [
"For host pointers, use the device returned by olGetHostDevice",
"At least one device must be a non-host device"
];
let params = [
Param<"ol_queue_handle_t", "Queue", "handle of the queue", PARAM_IN>,
Contributor

Can't decide if I like the stream at the beginning or the end, but whatever we do it should be a consistent convention.

Contributor Author

The queue is the first param in every function except olCreateQueue where it's the last, because it's an output parameter. Generally I'd prefer to keep all output pointers as the final argument in every function, but that's just personal preference so could be changed.

Param<"void*", "DstPtr", "pointer to copy to", PARAM_IN>,
Param<"ol_device_handle_t", "DstDevice", "device that DstPtr belongs to", PARAM_IN>,
Param<"void*", "SrcPtr", "pointer to copy from", PARAM_IN>,
Param<"ol_device_handle_t", "SrcDevice", "device that SrcPtr belongs to", PARAM_IN>,
Param<"size_t", "Size", "size in bytes of data to copy", PARAM_IN>,
Param<"ol_event_handle_t*", "EventOut", "optional recorded event for the enqueued operation", PARAM_OUT_OPTIONAL>
];
let returns = [
Return<"OL_ERRC_INVALID_SIZE", ["`Size == 0`"]>
];
}
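The "null queue means synchronous" behaviour described in the review thread on `olEnqueueMemcpy` can be sketched with a mock queue. This is a host-only illustration under stated assumptions: `Status`, `Queue`, `copyMaybeAsync`, and `drain` are hypothetical names, and the real plugins do not store work as `std::function` objects.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <functional>
#include <vector>

// Hypothetical mock types; the real plugin and liboffload queues differ.
enum class Status { Success, InvalidSize };

struct Queue {
  std::vector<std::function<void()>> Pending; // work not yet executed
};

// Copy Size bytes. With a null queue the copy happens before returning;
// otherwise it is deferred until the queue is drained.
Status copyMaybeAsync(Queue *Q, void *Dst, const void *Src, std::size_t Size) {
  if (Size == 0)
    return Status::InvalidSize; // mirrors the `Size == 0` return above
  if (!Q) {
    std::memcpy(Dst, Src, Size);
    return Status::Success;
  }
  Q->Pending.push_back([=] { std::memcpy(Dst, Src, Size); });
  return Status::Success;
}

// Run and discard everything that was enqueued.
void drain(Queue &Q) {
  for (auto &Work : Q.Pending)
    Work();
  Q.Pending.clear();
}
```

Splitting this into two entry points (the `olMemcpy` / `olMemcpyAsync` option) would remove the null check while keeping the handle opaque, which is the trade-off the thread is weighing.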

def : Struct {
let name = "ol_kernel_launch_size_args_t";
Contributor

Missing dynamic memory and other stuff, but since this is a struct we can always widen it.

let desc = "Size-related arguments for a kernel launch.";
let members = [
StructMember<"size_t", "Dimensions", "Number of work dimensions">,
StructMember<"size_t", "NumGroupsX", "Number of work groups on the X dimension">,
StructMember<"size_t", "NumGroupsY", "Number of work groups on the Y dimension">,
StructMember<"size_t", "NumGroupsZ", "Number of work groups on the Z dimension">,
StructMember<"size_t", "GroupSizeX", "Size of a work group on the X dimension.">,
StructMember<"size_t", "GroupSizeY", "Size of a work group on the Y dimension.">,
StructMember<"size_t", "GroupSizeZ", "Size of a work group on the Z dimension.">
];
}
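As a worked example of filling in these members, the sketch below builds a one-dimensional launch that covers a given number of work items. `LaunchSizeArgs` is a local mirror of the member list above and `make1DLaunch` is a hypothetical helper; setting the unused Y and Z members to 1 is an assumption, since the definition does not say what unused dimensions should hold.

```cpp
#include <cassert>
#include <cstddef>

// Local mirror of the members listed above, for illustration only; the real
// type is generated from the TableGen definition.
struct LaunchSizeArgs {
  std::size_t Dimensions;
  std::size_t NumGroupsX, NumGroupsY, NumGroupsZ;
  std::size_t GroupSizeX, GroupSizeY, GroupSizeZ;
};

// Build a 1D launch that covers at least NumItems work items.
// GroupSize must be non-zero. Unused dimensions are set to 1 (an assumption).
LaunchSizeArgs make1DLaunch(std::size_t NumItems, std::size_t GroupSize) {
  // Round up so a partial final group is still launched.
  std::size_t NumGroups = (NumItems + GroupSize - 1) / GroupSize;
  return LaunchSizeArgs{1, NumGroups, 1, 1, GroupSize, 1, 1};
}
```

For 1000 items and a group size of 256 this yields 4 groups, i.e. 1024 work items, so the kernel itself has to ignore indices at or beyond 1000.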

def : Function {
let name = "olEnqueueKernelLaunch";
let desc = "Enqueue a kernel launch with the specified size and parameters";
let details = [];
let params = [
Param<"ol_queue_handle_t", "Queue", "handle of the queue", PARAM_IN>,
Param<"ol_kernel_handle_t", "Kernel", "handle of the kernel", PARAM_IN>,
Param<"const void*", "ArgumentsData", "pointer to the kernel argument struct", PARAM_IN>,
Param<"size_t", "ArgumentsSize", "size of the kernel argument struct", PARAM_IN>,
Param<"const ol_kernel_launch_size_args_t*", "LaunchSizeArgs", "pointer to the struct containing launch size parameters", PARAM_IN>,
Param<"ol_event_handle_t*", "EventOut", "optional recorded event for the enqueued operation", PARAM_OUT_OPTIONAL>
];
let returns = [];
}
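The `ArgumentsData` / `ArgumentsSize` pair passes all kernel arguments as one opaque struct. The sketch below shows the idea on the host only: `SaxpyArgs`, `snapshotArgs`, and `runSaxpy` are hypothetical, and whether the runtime copies the bytes at enqueue time (as the mock does) is an assumption that the definition above does not settle.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical argument layout for a SAXPY-style kernel.
struct SaxpyArgs {
  float A;
  const float *X;
  float *Y;
  std::size_t N;
};

// Mock of what a runtime might do with ArgumentsData/ArgumentsSize: snapshot
// the bytes at enqueue time so the caller's struct need not outlive the call.
std::vector<unsigned char> snapshotArgs(const void *ArgumentsData,
                                        std::size_t ArgumentsSize) {
  const auto *Bytes = static_cast<const unsigned char *>(ArgumentsData);
  return std::vector<unsigned char>(Bytes, Bytes + ArgumentsSize);
}

// Mock "kernel" that decodes the snapshot and runs on the host.
void runSaxpy(const std::vector<unsigned char> &Snapshot) {
  SaxpyArgs Args;
  assert(Snapshot.size() == sizeof(Args));
  std::memcpy(&Args, Snapshot.data(), sizeof(Args));
  for (std::size_t I = 0; I < Args.N; ++I)
    Args.Y[I] = Args.A * Args.X[I] + Args.Y[I];
}
```

The struct layout has to match what the compiled kernel expects, which is why the size is passed alongside the pointer rather than inferred.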
41 changes: 41 additions & 0 deletions offload/liboffload/API/Event.td
@@ -0,0 +1,41 @@
//===-- Event.td - Event definitions for Offload -----------*- tablegen -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
// This file contains Offload API definitions related to the event handle
//
//===----------------------------------------------------------------------===//

def : Function {
let name = "olRetainEvent";
let desc = "Increment the event's reference count";
let details = [];
let params = [
Param<"ol_event_handle_t", "Event", "handle of the event", PARAM_IN>
];
let returns = [];
}

def : Function {
let name = "olReleaseEvent";
let desc = "Decrement the event's reference count, and free it if the reference count reaches 0";
let details = [];
let params = [
Param<"ol_event_handle_t", "Event", "handle of the event", PARAM_IN>
];
let returns = [];
}

def : Function {
let name = "olWaitEvent";
let desc = "Wait for the event to be complete";
let details = [];
let params = [
Param<"ol_event_handle_t", "Event", "handle of the event", PARAM_IN>
];
let returns = [];
}
45 changes: 45 additions & 0 deletions offload/liboffload/API/Kernel.td
@@ -0,0 +1,45 @@
//===-- Kernel.td - Kernel definitions for Offload ---------*- tablegen -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
// This file contains Offload API definitions related to the kernel handle
//
//===----------------------------------------------------------------------===//

def : Function {
let name = "olCreateKernel";
let desc = "Create a kernel from the function identified by `KernelName` in the given program";
let details = [
"The created kernel has an initial reference count of 1."
];
let params = [
Param<"ol_program_handle_t", "Program", "handle of the program", PARAM_IN>,
Contributor

We likely need to make a distinction between an ELF image in host memory and a loaded image with an address on the GPU, HSA does that with some sort of hsa_load_executable and hsa_executable_freeze where the latter actually loads the image.

Contributor Author

We have that distinction in UR too, loading it on creation was just slightly easier with how the plugins work now. I can see if I can pull some stuff out into a olLoadProgram (or similar name) entry point. I think with that design it makes sense for the ol_program_handle_t to wrap both the unloaded ELF and the loaded device image, with a flag representing whether it is loaded or not to prevent it being used for kernel execution before being loaded.

Contributor

It might be nice to expose a read / write on the image before it's loaded, since that's what the image access lets us do. It's a very fast way to initialize globals.

I'll need to fix up some of the image handling in the plugins, right now it's really, really bad about managing lifetimes, just copying a pointer and assuming that the data lives forever (Even after global destructors have run).

Param<"const char*", "KernelName", "name of the kernel entry point in the program", PARAM_IN>,
Param<"ol_kernel_handle_t*", "Kernel", "output pointer for the created kernel", PARAM_OUT>
];
let returns = [];
}
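The design floated in the thread on `olCreateKernel`, where one program handle wraps both the unloaded binary and the loaded image and a flag guards against early use, might be sketched as follows. All types here are hypothetical; `loadProgram` stands in for the `olLoadProgram` name that the thread itself only proposes, and `Status::ProgramNotLoaded` is not an existing error code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical types; olLoadProgram is only a proposed name in the discussion.
enum class Status { Success, ProgramNotLoaded, InvalidKernelName };

struct Program {
  std::vector<unsigned char> Image;     // unloaded binary in host memory
  std::vector<std::string> KernelNames; // stand-in for the image's symbols
  bool Loaded = false;                  // set once the image is on the device
};

struct Kernel {
  const Program *Parent;
  std::string Name;
};

// Stand-in for the proposed load step.
void loadProgram(Program &P) { P.Loaded = true; }

// Refuse to hand out kernels until the program has been loaded.
Status createKernel(const Program &P, const std::string &Name, Kernel &Out) {
  if (!P.Loaded)
    return Status::ProgramNotLoaded;
  for (const std::string &K : P.KernelNames) {
    if (K == Name) {
      Out = Kernel{&P, Name};
      return Status::Success;
    }
  }
  return Status::InvalidKernelName; // cf. OL_ERRC_INVALID_KERNEL_NAME
}
```

Keeping `Image` writable until `loadProgram` runs is what would allow the pre-load read/write access mentioned in the thread for initializing globals.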

def : Function {
let name = "olRetainKernel";
Contributor

Why do we need a reference count on the kernel? It's just a pointer to some global in the image the user gave. I'd just prefer an API that lets users get the address of a global and then get returned some information about the global like its size, whether it's a kernel, etc.

Contributor Author

This is a consequence of the ability to set individual (or all) kernel arguments on a kernel object. With that, the kernel becomes the global address + whatever argument state the user has set, which necessitates tracking this state.

If we go the direction of passing arguments at the time of a kernel launch then we can drop all this.

let desc = "Increment the kernel's reference count";
let details = [];
let params = [
Param<"ol_kernel_handle_t", "Kernel", "handle of the kernel", PARAM_IN>
];
let returns = [];
}

def : Function {
let name = "olReleaseKernel";
let desc = "Decrement the kernel's reference count, and free it if the reference count reaches 0";
let details = [];
let params = [
Param<"ol_kernel_handle_t", "Kernel", "handle of the kernel", PARAM_IN>
];
let returns = [];
}
48 changes: 48 additions & 0 deletions offload/liboffload/API/Memory.td
@@ -0,0 +1,48 @@
//===-- Memory.td - Memory definitions for Offload ---------*- tablegen -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
// This file contains Offload API definitions related to memory allocations
//
//===----------------------------------------------------------------------===//

def : Enum {
let name = "ol_alloc_type_t";
let desc = "Represents the type of allocation made with olMemAlloc";
let etors = [
Etor<"HOST", "Host allocation">,
Contributor

These names honestly suck (I know they came from the plugin), we should think up some better ones but that's an issue for another day.

Etor<"DEVICE", "Device allocation">,
Etor<"SHARED", "Shared allocation">
];
}

def : Function {
let name = "olMemAlloc";
let desc = "Creates a memory allocation on the specified device";
let params = [
Param<"ol_device_handle_t", "Device", "handle of the device to allocate on", PARAM_IN>,
Param<"ol_alloc_type_t", "Type", "type of the allocation", PARAM_IN>,
Param<"size_t", "Size", "size of the allocation in bytes", PARAM_IN>,
Param<"void**", "AllocationOut", "output for the allocated pointer", PARAM_OUT>
];
let returns = [
Return<"OL_ERRC_INVALID_SIZE", [
"`Size == 0`"
]>
];
}

def : Function {
let name = "olMemFree";
let desc = "Frees a memory allocation previously made by olMemAlloc";
let params = [
Param<"ol_device_handle_t", "Device", "handle of the device the allocation was made on", PARAM_IN>,
Param<"ol_alloc_type_t", "Type", "type of the allocation", PARAM_IN>,
Contributor

Undecided on forcing the user to track the allocation kind. I know CUDA does it with host/mem stuff. But I wonder if it's restrictively expensive to just put that in a hashmap somewhere in the API.

Contributor Author

Fully agree with this, and actually I don't know if a hashmap is really necessary. On CUDA we can use cuPointerGetAttributes with CU_POINTER_ATTRIBUTE_IS_MANAGED and CU_POINTER_ATTRIBUTE_MEMORY_TYPE to know whether to use cuMemFree or cuMemFreeHost. If it's possible with HSA as well we could drop the kind from the plugin interface. I didn't tackle that in this PR since it seemed like a fairly invasive plugin change.

Param<"void*", "Address", "address of the allocation to free", PARAM_IN>,
];
let returns = [];
}
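The hashmap option mentioned in the thread on `olMemFree`, where the library remembers each allocation's kind so the caller does not have to pass `Type` back, could be sketched like this. `AllocRegistry` is hypothetical, uses `std::malloc` as a stand-in for the per-kind allocators, and is not thread-safe.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <unordered_map>

enum class AllocType { Host, Device, Shared }; // mirrors ol_alloc_type_t

// Hypothetical registry that records the kind of every live allocation.
class AllocRegistry {
  std::unordered_map<void *, AllocType> Kinds;

public:
  // Returns nullptr for a zero size or on allocation failure.
  void *alloc(AllocType Type, std::size_t Size) {
    if (Size == 0)
      return nullptr;
    void *Ptr = std::malloc(Size); // stand-in for a per-kind allocator
    if (Ptr)
      Kinds.emplace(Ptr, Type);
    return Ptr;
  }

  // Frees Ptr and optionally reports the kind it was allocated with.
  // Returns false if Ptr was not allocated through this registry.
  bool dealloc(void *Ptr, AllocType *KindOut = nullptr) {
    auto It = Kinds.find(Ptr);
    if (It == Kinds.end())
      return false;
    if (KindOut)
      *KindOut = It->second;
    Kinds.erase(It);
    std::free(Ptr);
    return true;
  }
};
```

The cost is one map insert per allocation and one lookup per free; querying the driver for pointer attributes, as suggested for CUDA in the reply, would avoid even that.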
6 changes: 6 additions & 0 deletions offload/liboffload/API/OffloadAPI.td
@@ -13,3 +13,9 @@ include "APIDefs.td"
include "Common.td"
include "Platform.td"
include "Device.td"
include "Memory.td"
include "Queue.td"
include "Event.td"
include "Enqueue.td"
include "Program.td"
include "Kernel.td"
46 changes: 46 additions & 0 deletions offload/liboffload/API/Program.td
@@ -0,0 +1,46 @@
//===-- Program.td - Program definitions for Offload -------*- tablegen -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
// This file contains Offload API definitions related to the program handle
//
//===----------------------------------------------------------------------===//

def : Function {
let name = "olCreateProgram";
let desc = "Create a program for the device from the binary image pointed to by `ProgData`";
let details = [
"The created program has an initial reference count of 1."
];
let params = [
Param<"ol_device_handle_t", "Device", "handle of the device", PARAM_IN>,
Param<"void*", "ProgData", "pointer to the program binary data", PARAM_IN>,
Param<"size_t", "ProgDataSize", "size of the program binary in bytes", PARAM_IN>,
Param<"ol_program_handle_t*", "Program", "output pointer for the created program", PARAM_OUT>
];
let returns = [];
}

def : Function {
let name = "olRetainProgram";
let desc = "Increment the program's reference count";
let details = [];
let params = [
Param<"ol_program_handle_t", "Program", "handle of the program", PARAM_IN>
];
let returns = [];
}

def : Function {
let name = "olReleaseProgram";
let desc = "Decrement the program's reference count, and free it if the reference count reaches 0";
let details = [];
let params = [
Param<"ol_program_handle_t", "Program", "handle of the program", PARAM_IN>
];
let returns = [];
}
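The create/retain/release contract described for programs (initial reference count of 1, freed when the count reaches 0) is the usual intrusive reference-counting pattern. A minimal sketch follows, assuming an atomic counter; `Program`, `createProgram`, `retainProgram`, `releaseProgram`, and the `LivePrograms` counter are all hypothetical and exist only for this illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical handle implementation.
struct Program {
  std::atomic<std::uint32_t> RefCount{1}; // created with one reference
};

int LivePrograms = 0; // only here so the example can observe destruction

Program *createProgram() {
  ++LivePrograms;
  return new Program;
}

void retainProgram(Program *P) {
  P->RefCount.fetch_add(1, std::memory_order_relaxed);
}

void releaseProgram(Program *P) {
  // fetch_sub returns the previous value; 1 means this was the last reference.
  if (P->RefCount.fetch_sub(1, std::memory_order_acq_rel) == 1) {
    delete P;
    --LivePrograms;
  }
}
```

Every `retainProgram` must be balanced by one `releaseProgram`, plus one more release for the reference handed out at creation.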
54 changes: 54 additions & 0 deletions offload/liboffload/API/Queue.td
@@ -0,0 +1,54 @@
//===-- Queue.td - Queue definitions for Offload -----------*- tablegen -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
// This file contains Offload API definitions related to the queue handle
//
//===----------------------------------------------------------------------===//

def : Function {
let name = "olCreateQueue";
let desc = "Create a queue for the given device";
let details = [
"The created queue has an initial reference count of 1."
];
let params = [
Param<"ol_device_handle_t", "Device", "handle of the device", PARAM_IN>,
Param<"ol_queue_handle_t*", "Queue", "output pointer for the created queue", PARAM_OUT>
];
let returns = [];
}

def : Function {
let name = "olRetainQueue";
let desc = "Increment the queue's reference count.";
let details = [];
let params = [
Param<"ol_queue_handle_t", "Queue", "handle of the queue", PARAM_IN>
];
let returns = [];
}

def : Function {
let name = "olReleaseQueue";
let desc = "Decrement the queue's reference count, and free it if the reference count reaches 0";
let details = [];
let params = [
Param<"ol_queue_handle_t", "Queue", "handle of the queue", PARAM_IN>
];
let returns = [];
}

def : Function {
let name = "olWaitQueue";
let desc = "Wait for the enqueued work on a queue to complete";
let details = [];
let params = [
Param<"ol_queue_handle_t", "Queue", "handle of the queue", PARAM_IN>
];
let returns = [];
}
6 changes: 3 additions & 3 deletions offload/liboffload/API/README.md
@@ -138,13 +138,13 @@ allow more backends to be easily added in future.

A new object can be added to the API by adding to one of the existing `.td`
files. It is also possible to add a new tablegen file to the API by adding it
-to the includes in `OffloadAPI.td`. When the offload target is rebuilt, the
-new definition will be included in the generated files.
+to the includes in `OffloadAPI.td`. When the `OffloadGenerate` target is
+rebuilt, the new definition will be included in the generated files.

### Adding a new entry point

When a new entry point is added (e.g. `offloadDeviceFoo`), the actual entry
point is automatically generated, which contains validation and tracing code.
It expects an implementation function (`offloadDeviceFoo_impl`) to be defined,
which it will call into. The definition of this implementation function should
-be added to `src/offload_impl.cpp`
+be added to `src/OffloadImpl.cpp`
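The split between a generated entry point and a hand-written `_impl` function, as the README describes it, has roughly the shape below. This is a sketch under assumptions: the parameter list of `offloadDeviceFoo`, the `Result` codes, and the `DeviceImpl` type are invented for illustration, and the tracing code that the README mentions is left out.

```cpp
#include <cassert>

// Hypothetical result codes and handle type for illustration.
enum Result {
  RESULT_SUCCESS,
  RESULT_INVALID_NULL_HANDLE,
  RESULT_INVALID_NULL_POINTER
};

struct DeviceImpl {
  int Value;
};
using DeviceHandle = DeviceImpl *;

// Hand-written part: assumes its arguments were already validated.
Result offloadDeviceFoo_impl(DeviceHandle Device, int *Out) {
  *Out = Device->Value;
  return RESULT_SUCCESS;
}

// Shape of the generated part: validate the arguments, then forward.
Result offloadDeviceFoo(DeviceHandle Device, int *Out) {
  if (!Device)
    return RESULT_INVALID_NULL_HANDLE;
  if (!Out)
    return RESULT_INVALID_NULL_POINTER;
  return offloadDeviceFoo_impl(Device, Out);
}
```

Because the checks live in the generated wrapper, the implementation function can stay free of argument validation.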