-
Notifications
You must be signed in to change notification settings - Fork 788
[SYCL][L0] Adds device member to L0 make_queue input type #6148
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCL][L0] Adds device member to L0 make_queue input type #6148
Conversation
This commit adds a new device member to the make_queue input type for the L0 backend. This allows the L0 backend to correctly associate the right device with the command queue to create the PI queue from, preventing unintented and illegal behavior when the PI queue is later used for otherwise valid operations. It is currently still valid to create a SYCL queue from a native L0 command queue without associating the correct device, which will lead to an unspecified device being selected, similar to the old behavior. This behavior is however marked as deprecated. Signed-off-by: Larsen, Steffen <[email protected]>
@@ -115,6 +115,7 @@ struct { | |||
``` C++ | |||
struct { | |||
ze_command_queue_handle_t NativeHandle; | |||
device Device; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to mention that the old variant is still usable, although deprecated? Maybe it is covered by:
NOTE: This extension is following SYCL 2020 backend specification. Prior API for interoperability with Level-Zero is marked as deprecated and will be removed in the next release.
We could include the constructors, but they are really only there to allow Device
to be before Ownership
as it is more in line with other input types, like the one for context
. I fear including the constructors here will only make it less readable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think we should document the deprecated API. Since this PR changes the API, the feature-test macro should also be incremented to 3
, and the new form of make_queue
should note that it was added in version 3
of this extension.
I don't understand your comment about the constructor. Are you referring to the make_queue
function as the "constructor"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand your comment about the constructor. Are you referring to the make_queue function as the "constructor"?
By constructor, I mean the constructor of the struct the new Device
member. It is unnamed in this document, but in the implementation we have
template <> struct BackendInput<backend::ext_oneapi_level_zero, queue> {
struct type {
interop<backend::ext_oneapi_level_zero, queue>::type NativeHandle;
device Device;
ext::oneapi::level_zero::ownership Ownership{
ext::oneapi::level_zero::ownership::transfer};
type(interop<backend::ext_oneapi_level_zero, queue>::type nativeHandle,
ext::oneapi::level_zero::ownership ownership) ...
type(interop<backend::ext_oneapi_level_zero, queue>::type nativeHandle,
device dev, ext::oneapi::level_zero::ownership ownership) ...
};
This allows both make_queue(... {ZeCommandQueueHandle, MyOwnership} ...)
and make_queue(... {ZeCommandQueueHandle, MyDevice, MyOwnership} ...)
, where the former is deprecated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if someone has existing code like this:
using namespace sycl;
using l0_queue = backend_input_t<backend::ext_oneapi_level_zero, queue>;
void foo(context ctxt, ze_command_queue_handle_t native) {
l0_queue var;
var.NativeHandle = native;
queue q = make_queue<backend::ext_oneapi_level_zero>(var, ctxt);
}
Signed-off-by: Larsen, Steffen <[email protected]>
@@ -115,6 +115,7 @@ struct { | |||
``` C++ | |||
struct { | |||
ze_command_queue_handle_t NativeHandle; | |||
device Device; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The commit message indicates that prior to this PR the device associated with the constructed queue is unspecified. However, the Level Zero interop spec does specify which device is associated with the queue:
The queue is attached to the first device in the passed SYCL context.
Was the code implemented to do that before? Maybe we don't need this PR? Though, I do agree that passing a specific device is more friendly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My bad. Yes, it seems to be in line with the old behavior and I will need to make a small adjustment to recover this.
That said, this seems to have been a mistake in the design. ze_command_queue_handle_t
is created with a context and a device, so assuming that the right device is also the same device as the first device in the SYCL context is both restrictive and prone to error. We are seeing this with SYCL/Plugin/interop-level-zero.cpp in the test suite, as it attempts to recreate a SYCL queue from a native handle, which on systems with multiple L0 devices will cause it to pick the wrong device, causing hard-to-debug errors.
The user (and the aforementioned test) could be required to make sure the context only has the right device, but this seems too restrictive and it could potentially cause problems if the native context was actually created with multiple devices.
@@ -115,6 +115,7 @@ struct { | |||
``` C++ | |||
struct { | |||
ze_command_queue_handle_t NativeHandle; | |||
device Device; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think we should document the deprecated API. Since this PR changes the API, the feature-test macro should also be incremented to 3
, and the new form of make_queue
should note that it was added in version 3
of this extension.
I don't understand your comment about the constructor. Are you referring to the make_queue
function as the "constructor"?
sycl/include/CL/sycl/detail/pi.h
Outdated
pi_native_handle nativeHandle, pi_context context, pi_queue *queue, | ||
bool pluginOwnsNativeHandle); | ||
pi_native_handle nativeHandle, pi_context context, pi_device *device, | ||
pi_queue *queue, bool pluginOwnsNativeHandle); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's make the output queue argument the last one to follow other PI API
Please also bump up the PI version at the top of this file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea! I have moved the output parameter to the end of the arguments and incremented the API version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that reaching minor version 10 for the API requires us to increase the buffer size for the version string. This is by itself an ABI break as it changes the layout of _pi_plugin
, so accessing the version from the runtime library can give weird results and as the function table may be shifted there is no guarantee the pointers are correct between versions.
I don't think this is necessarily a concern right now, but I think we should consider reworking this soon to prevent having to change _pi_plugin
when the version is incremented. Especially since we will have it happen again in 2 major versions.
Signed-off-by: Larsen, Steffen <[email protected]>
Signed-off-by: Larsen, Steffen <[email protected]>
Signed-off-by: Larsen, Steffen <[email protected]>
Signed-off-by: Larsen, Steffen <[email protected]>
Signed-off-by: Larsen, Steffen <[email protected]>
Signed-off-by: Larsen, Steffen <[email protected]>
|
||
// TODO: Change this to be device when the deprecated constructor is | ||
// removed. | ||
std::shared_ptr<device_impl> Device; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't match the API specification, which says this is a device
. Someone following the API spec could write code like this:
using namespace sycl;
using l0_queue = backend_input_t<backend::ext_oneapi_level_zero, queue>;
void foo(device dev) {
l0_queue var;
var.Device = dev;
/* ... */
}
I understand your problem here. There's no way to initialize a device
objects as "not a device". I'm not sure how to solve this. I've never liked the way the interop APIs require a specific backend_input_t
type, and this seems like another reason not to like this structure. If the core SYCL spec allowed each backend to define its own parameters to the make_<object>
functions, we could simply define two overloads of make_queue
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see what you mean. Alternatively we could have this be a wrapper class around the pointer, allowing implicit conversion to and from sycl::device
. Can you think of any problems that could cause?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would the following be possible?
- Add a private constructor to
device
that constructs a device as "not a device". - Make the
BackendInput
struct a friend of thedevice
class, so it can use that private constructor. - Add a default member initializer for
Device
that calls this newdevice
constructor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am fairly confident that is possible, but I worry about such a device being accessible by the user. The user could create the type without a device and read the device from it, then pass that around. It may be an edge-case but I don't think it is worth figuring out all the places we would need to make special checks to make sure we're not given a "fake" device.
I was thinking of something along the lines of:
struct OptionalDevice {
OptionalDevice() : DeviceImpl(nullptr) {}
OptionalDevice(device dev) : DeviceImpl(getSyclObjImpl(dev)) {}
operator device() const {
if (!hasDevice())
throw runtime_error("No device has been set.", PI_INVALID_DEVICE);
return createSyclObjFromImpl<device>(DeviceImpl);
}
bool hasDevice() const { return DeviceImpl != nullptr; }
private:
std::shared_ptr<device_impl> DeviceImpl;
};
That should allow most of the setting and getting cases, resulting in an exception if the device was not set before it was converted. I do however wonder how many drawbacks there potentially are to this approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added a commit with the OptionalDevice
wrapper, just to illustrate how it would look. If we agree that a new device constructor is the better option I will make the switch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My concern with this approach is that it may be difficult to remove later. Let's say we add OptionalDevice
now. Can we remove it later once the deprecation period is over for the old form of make_queue
? Once we remove the old form of make_queue
, I'd like the struct to look like this:
struct {
ze_command_queue_handle_t NativeHandle;
device Device;
ext::oneapi::level_zero::ownership Ownership{
ext::oneapi::level_zero::ownership::transfer};
}
The approach I outline above with the private device
constructor will let us get here eventually. Will we be able to get here with the OptionalDevice
approach?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that having it as device
is the end goal once we are rid of the deprecated API, but since the input type is unpacked in sycl/include/sycl/ext/oneapi/backend/level_zero.hpp where the relevant device is also finally found, the input type lives solely in the headers. As such we should not have any problems changing the member type when we remove the deprecated API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this approach is OK, but see my specific comments in the code.
I still think the API spec needs to document the deprecated API and the feature-test macro version should be incremented, as I noted in my other comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The extension document has been updated to document the deprecated API and the feature macro has been incremented.
@@ -115,6 +115,7 @@ struct { | |||
``` C++ | |||
struct { | |||
ze_command_queue_handle_t NativeHandle; | |||
device Device; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if someone has existing code like this:
using namespace sycl;
using l0_queue = backend_input_t<backend::ext_oneapi_level_zero, queue>;
void foo(context ctxt, ze_command_queue_handle_t native) {
l0_queue var;
var.NativeHandle = native;
queue q = make_queue<backend::ext_oneapi_level_zero>(var, ctxt);
}
Signed-off-by: Larsen, Steffen <[email protected]>
Signed-off-by: Larsen, Steffen <[email protected]>
Signed-off-by: Larsen, Steffen <[email protected]>
Signed-off-by: Larsen, Steffen <[email protected]>
struct OptionalDevice { | ||
OptionalDevice() : DeviceImpl(nullptr) {} | ||
OptionalDevice(device dev) : DeviceImpl(getSyclObjImpl(dev)) {} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need an assignment operator here too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Assignment operators have been added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I meant that we need an assignment operator to device
. We want user code like this to work:
using namespace sycl;
using l0_queue = backend_input_t<backend::ext_oneapi_level_zero, queue>;
void foo(sycl::device dev) {
l0_queue var;
var.Device = dev;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, that was my bad. It has been amended.
Signed-off-by: Larsen, Steffen <[email protected]>
Signed-off-by: Larsen, Steffen <[email protected]>
…with old behavior Signed-off-by: Larsen, Steffen <[email protected]>
Signed-off-by: Larsen, Steffen <[email protected]>
struct OptionalDevice { | ||
OptionalDevice() : DeviceImpl(nullptr) {} | ||
OptionalDevice(device dev) : DeviceImpl(getSyclObjImpl(dev)) {} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I meant that we need an assignment operator to device
. We want user code like this to work:
using namespace sycl;
using l0_queue = backend_input_t<backend::ext_oneapi_level_zero, queue>;
void foo(sycl::device dev) {
l0_queue var;
var.Device = dev;
}
sycl/doc/extensions/supported/sycl_ext_oneapi_backend_level_zero.md
Outdated
Show resolved
Hide resolved
Signed-off-by: Larsen, Steffen <[email protected]>
Signed-off-by: Larsen, Steffen <[email protected]>
Signed-off-by: Larsen, Steffen <[email protected]>
Signed-off-by: Larsen, Steffen <[email protected]>
Signed-off-by: Larsen, Steffen <[email protected]>
@@ -1823,9 +1823,9 @@ struct _pi_plugin { | |||
// Some choices are: | |||
// - Use of integers to keep major and minor version. | |||
// - Keeping char* Versions. | |||
char PiVersion[5]; | |||
char PiVersion[10]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's make this size into a "const int" parameter, and stop needing to sync all users.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some backends simply use sizeof(PluginVersion)
. I have propagated this to backends without it.
Signed-off-by: Larsen, Steffen <[email protected]>
Signed-off-by: Larsen, Steffen <[email protected]>
* [SYCL][L0] Adds device member to L0 make_queue input type This commit adds a new `device` member to the `make_queue` input type for the L0 backend. This allows the L0 backend to correctly associate the right device with the command queue to create the PI queue from, preventing unintended and illegal behavior when the PI queue is later used for otherwise valid operations. It is currently still valid to create a SYCL queue from a native L0 command queue without associating the correct device, which will lead to the queue being associated with the first device in the supplied SYCL context. This behavior is however marked as deprecated. Note: The ABI checker test changes were generated with the checker tool, so the ones that are marked as removed are simply just moved by the tool. This is a non-breaking ABI change for the runtime library. Signed-off-by: Larsen, Steffen <[email protected]>
This commit adds a new
device
member to themake_queue
input type for the L0 backend. This allows the L0 backend to correctly associate the right device with the command queue to create the PI queue from, preventing unintended and illegal behavior when the PI queue is later used for otherwise valid operations.It is currently still valid to create a SYCL queue from a native L0 command queue without associating the correct device, which will lead to the queue being associated with the first device in the supplied SYCL context. This behavior is however marked as deprecated.
Note: The ABI checker test changes were generated with the checker tool, so the ones that are marked as removed are simply just moved by the tool. This is a non-breaking ABI change for the runtime library.