-
Notifications
You must be signed in to change notification settings - Fork 788
[SYCL][L0] Implementation of ext_oneapi_prod API #13555
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR mentions only L0. Is this going to be supported for other backends?
@@ -104,6 +104,7 @@ inline namespace _V1 { | |||
#define SYCL_EXT_INTEL_MATRIX 1 | |||
#define SYCL_EXT_INTEL_FPGA_TASK_SEQUENCE 1 | |||
#define SYCL_EXT_ONEAPI_PRIVATE_ALLOCA 1 | |||
#define SYCL_EXT_ONEAPI_PROD 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I understand, defining this macro indicates to the user that feature is supported. But it is not supported yet because you have to bring UR changes.
Also once feature is supported we should move the extension document to the supported
directory.
So may be it makes sense to merge changes to UR first and then combine these changes with UR tag update + move the document to supported. Or probably we shouldn't define the macro for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll wait until the UR changes are merged and then merge this one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feature is a hint and implementation may decide to ignore it completely. Therefore, I think that it is fine to say that this is supported, even if we have some bugs on UR side
sycl/source/detail/queue_impl.hpp
Outdated
@@ -388,6 +388,10 @@ class queue_impl { | |||
template <typename Param> | |||
typename Param::return_type get_backend_info() const; | |||
|
|||
/// Begin to execute previously issued commands on this queue | |||
/// if they have not been executed yet. Overrides normal batching behaviour. | |||
void flush() { getPlugin()->call<PiApiKind::piQueueFlush>(MQueues[0]); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is possible for MQueues to be empty. At least internally we most likely use host queues which don't have any PI handles.
So, it will be nice to have some tests for this feature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I'll add a check to verify that MQueues is not empty. I'm also working on adding some tests on this PR.
I don't know whether it will be supported for other backends. From what I can see it is already implemented for OpenCL but not HIP and CUDA. |
sycl/source/detail/queue_impl.hpp
Outdated
@@ -388,6 +388,10 @@ class queue_impl { | |||
template <typename Param> | |||
typename Param::return_type get_backend_info() const; | |||
|
|||
/// Begin to execute previously issued commands on this queue | |||
/// if they have not been executed yet. Overrides normal batching behaviour. | |||
void flush() { getPlugin()->call<PiApiKind::piQueueFlush>(MQueues[0]); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I think special handling is needed for the case when graph recording is in progress. (for example, see queue_impl::wait). It will be nice to have test for this case as well.
@againull I added a check for when the MQueues member is empty and added some code to throw an exception when the queue is in graph recording mode since I guess it does not make sense to call flush while commands are being recorded. I'm working on adding several tests to cover these cases. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
UR LGTM
UR tag was updated which resolves my comment
This PR includes a preliminary implementation of the ext_oneapi_prod extension defined here:
https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/proposed/sycl_ext_oneapi_prod.asciidoc
It is preliminary because the UR-Level Zero queue flushing method which I've used to implement this extension is a no-op at the moment but I have a PR in review to change that: oneapi-src/unified-runtime#1545.
Afterwards, my guess is that I will have to update the UR version we use in this repo to include that new commit once the Unified Runtime PR is merged.
Nevertheless, I think I can continue with the review and merge of this PR independently of that since compilation will succeed because the UR flush method is defined, it just does nothing at the moment.