Skip to content

[SYCL][L0] Adds device member to L0 make_queue input type #6148

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

steffenlarsen
Copy link
Contributor

@steffenlarsen steffenlarsen commented May 12, 2022

This commit adds a new device member to the make_queue input type for the L0 backend. This allows the L0 backend to correctly associate the right device with the command queue to create the PI queue from, preventing unintended and illegal behavior when the PI queue is later used for otherwise valid operations.

It is currently still valid to create a SYCL queue from a native L0 command queue without associating the correct device, which will lead to the queue being associated with the first device in the supplied SYCL context. This behavior is however marked as deprecated.

Note: The ABI checker test changes were generated with the checker tool, so the ones that are marked as removed are simply just moved by the tool. This is a non-breaking ABI change for the runtime library.

This commit adds a new device member to the make_queue input type for
the L0 backend. This allows the L0 backend to correctly associate the
right device with the command queue to create the PI queue from,
preventing unintented and illegal behavior when the PI queue is later
used for otherwise valid operations.

It is currently still valid to create a SYCL queue from a native L0
command queue without associating the correct device, which will lead
to an unspecified device being selected, similar to the old behavior.
This behavior is however marked as deprecated.

Signed-off-by: Larsen, Steffen <[email protected]>
@@ -115,6 +115,7 @@ struct {
``` C++
struct {
ze_command_queue_handle_t NativeHandle;
device Device;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to mention that the old variant is still usable, although deprecated? Maybe it is covered by:

NOTE: This extension is following SYCL 2020 backend specification. Prior API for interoperability with Level-Zero is marked as deprecated and will be removed in the next release.

We could include the constructors, but they are really only there to allow Device to be before Ownership as it is more in line with other input types, like the one for context. I fear including the constructors here will only make it less readable.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think we should document the deprecated API. Since this PR changes the API, the feature-test macro should also be incremented to 3, and the new form of make_queue should note that it was added in version 3 of this extension.

I don't understand your comment about the constructor. Are you referring to the make_queue function as the "constructor"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand your comment about the constructor. Are you referring to the make_queue function as the "constructor"?

By constructor, I mean the constructor of the struct the new Device member. It is unnamed in this document, but in the implementation we have

template <> struct BackendInput<backend::ext_oneapi_level_zero, queue> {
  struct type {
    interop<backend::ext_oneapi_level_zero, queue>::type NativeHandle;
    device Device;
    ext::oneapi::level_zero::ownership Ownership{
        ext::oneapi::level_zero::ownership::transfer};

    type(interop<backend::ext_oneapi_level_zero, queue>::type nativeHandle,
         ext::oneapi::level_zero::ownership ownership) ...

    type(interop<backend::ext_oneapi_level_zero, queue>::type nativeHandle,
         device dev, ext::oneapi::level_zero::ownership ownership) ...
  };

This allows both make_queue(... {ZeCommandQueueHandle, MyOwnership} ...) and make_queue(... {ZeCommandQueueHandle, MyDevice, MyOwnership} ...), where the former is deprecated.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if someone has existing code like this:

using namespace sycl;
using l0_queue = backend_input_t<backend::ext_oneapi_level_zero, queue>;

void  foo(context ctxt, ze_command_queue_handle_t native) {
  l0_queue var;
  var.NativeHandle = native;
  queue q = make_queue<backend::ext_oneapi_level_zero>(var, ctxt);
}

Signed-off-by: Larsen, Steffen <[email protected]>
@@ -115,6 +115,7 @@ struct {
``` C++
struct {
ze_command_queue_handle_t NativeHandle;
device Device;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The commit message indicates that prior to this PR the device associated with the constructed queue is unspecified. However, the Level Zero interop spec does specify which device is associated with the queue:

The queue is attached to the first device in the passed SYCL context.

Was the code implemented to do that before? Maybe we don't need this PR? Though, I do agree that passing a specific device is more friendly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My bad. Yes, it seems to be in line with the old behavior and I will need to make a small adjustment to recover this.

That said, this seems to have been a mistake in the design. ze_command_queue_handle_t is created with a context and a device, so assuming that the right device is also the same device as the first device in the SYCL context is both restrictive and prone to error. We are seeing this with SYCL/Plugin/interop-level-zero.cpp in the test suite, as it attempts to recreate a SYCL queue from a native handle, which on systems with multiple L0 devices will cause it to pick the wrong device, causing hard-to-debug errors.

The user (and the aforementioned test) could be required to make sure the context only has the right device, but this seems too restrictive and it could potentially cause problems if the native context was actually created with multiple devices.

@@ -115,6 +115,7 @@ struct {
``` C++
struct {
ze_command_queue_handle_t NativeHandle;
device Device;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think we should document the deprecated API. Since this PR changes the API, the feature-test macro should also be incremented to 3, and the new form of make_queue should note that it was added in version 3 of this extension.

I don't understand your comment about the constructor. Are you referring to the make_queue function as the "constructor"?

pi_native_handle nativeHandle, pi_context context, pi_queue *queue,
bool pluginOwnsNativeHandle);
pi_native_handle nativeHandle, pi_context context, pi_device *device,
pi_queue *queue, bool pluginOwnsNativeHandle);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's make the output queue argument the last one to follow other PI API
Please also bump up the PI version at the top of this file

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea! I have moved the output parameter to the end of the arguments and incremented the API version.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that reaching minor version 10 for the API requires us to increase the buffer size for the version string. This is by itself an ABI break as it changes the layout of _pi_plugin, so accessing the version from the runtime library can give weird results and as the function table may be shifted there is no guarantee the pointers are correct between versions.

I don't think this is necessarily a concern right now, but I think we should consider reworking this soon to prevent having to change _pi_plugin when the version is incremented. Especially since we will have it happen again in 2 major versions.


// TODO: Change this to be device when the deprecated constructor is
// removed.
std::shared_ptr<device_impl> Device;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't match the API specification, which says this is a device. Someone following the API spec could write code like this:

using namespace sycl;
using l0_queue = backend_input_t<backend::ext_oneapi_level_zero, queue>;

void  foo(device dev) {
  l0_queue var;
  var.Device = dev;
  /* ... */
}

I understand your problem here. There's no way to initialize a device objects as "not a device". I'm not sure how to solve this. I've never liked the way the interop APIs require a specific backend_input_t type, and this seems like another reason not to like this structure. If the core SYCL spec allowed each backend to define its own parameters to the make_<object> functions, we could simply define two overloads of make_queue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what you mean. Alternatively we could have this be a wrapper class around the pointer, allowing implicit conversion to and from sycl::device. Can you think of any problems that could cause?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would the following be possible?

  • Add a private constructor to device that constructs a device as "not a device".
  • Make the BackendInput struct a friend of the device class, so it can use that private constructor.
  • Add a default member initializer for Device that calls this new device constructor.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am fairly confident that is possible, but I worry about such a device being accessible by the user. The user could create the type without a device and read the device from it, then pass that around. It may be an edge-case but I don't think it is worth figuring out all the places we would need to make special checks to make sure we're not given a "fake" device.

I was thinking of something along the lines of:

struct OptionalDevice {

  OptionalDevice() : DeviceImpl(nullptr) {}
  OptionalDevice(device dev) : DeviceImpl(getSyclObjImpl(dev)) {}

  operator device() const {
    if (!hasDevice())
      throw runtime_error("No device has been set.", PI_INVALID_DEVICE);
    return createSyclObjFromImpl<device>(DeviceImpl);
  }

  bool hasDevice() const { return DeviceImpl != nullptr; }
private:
  std::shared_ptr<device_impl> DeviceImpl;
};

That should allow most of the setting and getting cases, resulting in an exception if the device was not set before it was converted. I do however wonder how many drawbacks there potentially are to this approach.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added a commit with the OptionalDevice wrapper, just to illustrate how it would look. If we agree that a new device constructor is the better option I will make the switch.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concern with this approach is that it may be difficult to remove later. Let's say we add OptionalDevice now. Can we remove it later once the deprecation period is over for the old form of make_queue? Once we remove the old form of make_queue, I'd like the struct to look like this:

struct {
  ze_command_queue_handle_t NativeHandle;
  device Device;
  ext::oneapi::level_zero::ownership Ownership{
      ext::oneapi::level_zero::ownership::transfer};
}

The approach I outline above with the private device constructor will let us get here eventually. Will we be able to get here with the OptionalDevice approach?

Copy link
Contributor Author

@steffenlarsen steffenlarsen May 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that having it as device is the end goal once we are rid of the deprecated API, but since the input type is unpacked in sycl/include/sycl/ext/oneapi/backend/level_zero.hpp where the relevant device is also finally found, the input type lives solely in the headers. As such we should not have any problems changing the member type when we remove the deprecated API.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this approach is OK, but see my specific comments in the code.

I still think the API spec needs to document the deprecated API and the feature-test macro version should be incremented, as I noted in my other comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The extension document has been updated to document the deprecated API and the feature macro has been incremented.

@@ -115,6 +115,7 @@ struct {
``` C++
struct {
ze_command_queue_handle_t NativeHandle;
device Device;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if someone has existing code like this:

using namespace sycl;
using l0_queue = backend_input_t<backend::ext_oneapi_level_zero, queue>;

void  foo(context ctxt, ze_command_queue_handle_t native) {
  l0_queue var;
  var.NativeHandle = native;
  queue q = make_queue<backend::ext_oneapi_level_zero>(var, ctxt);
}

Signed-off-by: Larsen, Steffen <[email protected]>
Signed-off-by: Larsen, Steffen <[email protected]>
Signed-off-by: Larsen, Steffen <[email protected]>
Signed-off-by: Larsen, Steffen <[email protected]>
struct OptionalDevice {
OptionalDevice() : DeviceImpl(nullptr) {}
OptionalDevice(device dev) : DeviceImpl(getSyclObjImpl(dev)) {}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need an assignment operator here too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Assignment operators have been added.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I meant that we need an assignment operator to device. We want user code like this to work:

using namespace sycl;
using l0_queue = backend_input_t<backend::ext_oneapi_level_zero, queue>;

void  foo(sycl::device dev) {
  l0_queue var;
  var.Device = dev;
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, that was my bad. It has been amended.

struct OptionalDevice {
OptionalDevice() : DeviceImpl(nullptr) {}
OptionalDevice(device dev) : DeviceImpl(getSyclObjImpl(dev)) {}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I meant that we need an assignment operator to device. We want user code like this to work:

using namespace sycl;
using l0_queue = backend_input_t<backend::ext_oneapi_level_zero, queue>;

void  foo(sycl::device dev) {
  l0_queue var;
  var.Device = dev;
}

gmlueck
gmlueck previously approved these changes May 16, 2022
Signed-off-by: Larsen, Steffen <[email protected]>
@@ -1823,9 +1823,9 @@ struct _pi_plugin {
// Some choices are:
// - Use of integers to keep major and minor version.
// - Keeping char* Versions.
char PiVersion[5];
char PiVersion[10];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's make this size into a "const int" parameter, and stop needing to sync all users.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some backends simply use sizeof(PluginVersion). I have propagated this to backends without it.

Signed-off-by: Larsen, Steffen <[email protected]>
@steffenlarsen steffenlarsen merged commit 29a5369 into intel:sycl May 19, 2022
yinyangsx pushed a commit to yinyangsx/llvm that referenced this pull request May 25, 2022
* [SYCL][L0] Adds device member to L0 make_queue input type

This commit adds a new `device` member to the `make_queue` input type for the L0 backend. This allows the L0 backend to correctly associate the right device with the command queue to create the PI queue from, preventing unintended and illegal behavior when the PI queue is later used for otherwise valid operations.

It is currently still valid to create a SYCL queue from a native L0 command queue without associating the correct device, which will lead to the queue being associated with the first device in the supplied SYCL context. This behavior is however marked as deprecated.

Note: The ABI checker test changes were generated with the checker tool, so the ones that are marked as removed are simply just moved by the tool. This is a non-breaking ABI change for the runtime library.

Signed-off-by: Larsen, Steffen <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants