-
Notifications
You must be signed in to change notification settings - Fork 787
[SYCL] Improve error reporting for kernel enqueue #860
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCL] Improve error reporting for kernel enqueue #860
Conversation
23093ab
to
a1659eb
Compare
Is it even OK to put two different functionalities within single pull-request? |
No. It's not okay. It's recommended to keep pull requests simple and self-contained. |
Could you please point me to where I introduce several new functionalities?
This is done by removing part of existing error-checking: I don't see any value to check for the same error twice. Moreover, it will break new test if I leave the old code |
If support of non-uniform work-groups is required by specification then no support of it is a bug. This bug may obviously be reported and should have a distinct fix. Otherwise, if there must be no such support according to the specification I don't see any obvious reasons to let this support pass to source code. |
@AlexeySachkov I have a more general question. This WG size logic is valid for OpenCL. But PI does not require you to have an OpenCL backend. And these checks may become invalid. Should we delegate such logic to plugins, so that every plugin implements its very own rules? |
Fair enough, but at this point I don't see a clear way how this can be achieved:
Tagging @smaslov-intel here |
This is not true, non-uniform work-groups are prohibited by the SYCL spec for any |
a1659eb
to
044e3da
Compare
Why is the claim that some checks are invalid in other back-ends? Is it because 1) you don't know it is possible to check through L0 (fair concern), or because 2) you know it is impossible? In any case, from now on when adding new PI calls (even new type of arguments to existing PI API) we have to ask ourselves if that's going to work with major other back-ends (L0 and CUDA). The plan is to:
|
This patch checks OpenCL version. Another backend will have another versioning and may not support non-uniform WG size at all. An easy solution would be to check for OpenCL 2.0+ first, and emit error about non-uniform WG size in all other cases. This way we will still be unable to handle situation when user uses custom plugin. The error message may become invalid (or should I say unexpected?), but at least it will conform to the spec. https://github.com/intel/llvm/pull/860/files#diff-70a2b4f7b47aee28e73b8411d6d7186cR59-R94 |
SYCL 1.2.1 spec explicitly built on top of OpenCL, so, any PI plugin for SYCL 1.2.1 have to emulate/mimic OpenCL (probably even better to say OpenCL 1.2).
I guess this is the only solution for now. Am I missing other ways?
In any case we don't have any API to detect which PI plugin is being used and which features it supports/which limitations it has. (or at least I'm unaware of such API).
BTW, this one will be good with any PI plugin, I think. But other messages explicitly mentioning OpenCL apparently might look strange if other PI plugins are used. |
You're probably right. Until we have a way to report errors in a user-friendly manner from PI plugins, this is the only way. |
044e3da
to
49b074e
Compare
Applied most of the comments from @romanovvlad |
49b074e
to
6478a0b
Compare
Applied rest of the comments from @romanovvlad, fixed some things in test and error handling code |
6478a0b
to
8a36626
Compare
Improved test to fail if anything unexpected has been caught. Fixed fail on accelerator devices |
8a36626
to
81ccea5
Compare
Looks like I used incorrect runtime for accelerator device. Adjusted the test to account all known issues in underlying runtimes used in CI. |
6777882
to
f7ae7f6
Compare
f7ae7f6
to
22740f7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems good to me
aad421e
22740f7
to
aad421e
Compare
Rebased to incorporate the latest PI changes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me.
aad421e
to
9eec123
Compare
The idea is to handle error code returned from PI and based on input arguments, emit meaningful exception with error explanation. One of the side-effects of this is that this patch effectively allows to use non-uniform work-groups if underlying OpenCL supports this functionality. Signed-off-by: Alexey Sachkov <[email protected]>
9eec123
to
ff116fc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AlexeySachkov, please, avoid force-pushing. Use merge/push instead. It's easier to track changes this way.
Fixed inconsistency introduced by intel#860 In particular: - `PI_DEVICE_NAME` -> `PI_DEVICE_INFO_NAME` - `PI_DEVICE_MAX_WORK_GROUP_SIZE` -> `PI_DEVICE_INFO_MAX_WORK_GROUP_SIZE` Signed-off-by: Alexey Sachkov <[email protected]>
Fixed inconsistency introduced by #860 In particular: - `PI_DEVICE_NAME` -> `PI_DEVICE_INFO_NAME` - `PI_DEVICE_MAX_WORK_GROUP_SIZE` -> `PI_DEVICE_INFO_MAX_WORK_GROUP_SIZE` Signed-off-by: Alexey Sachkov <[email protected]>
The idea is to handle error code returned from PI and based on input
arguments, emit meaningful exception with error explanation.
One of the side-effects of this is that this patch effectively allows to
use non-uniform work-groups if underlying OpenCL supports this
functionality.