-
Notifications
You must be signed in to change notification settings - Fork 788
[SYCL][HIP][L0][NATIVECPU] urAdapterGetLastError
fix so messages aren't lost.
#10953
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
It was being called in l0 and hip plugins but just returning success, which means the messages were lost. Signed-off-by: JackAKirk <[email protected]>
This means SYCL runtime correctly retrieves the error messages. Signed-off-by: JackAKirk <[email protected]>
Signed-off-by: JackAKirk <[email protected]>
Add back `std::ignore adapter;` Signed-off-by: JackAKirk <[email protected]>
std::ignore = Error; | ||
urPrint("[UR][L0] %s function not implemented!\n", __FUNCTION__); | ||
return UR_RESULT_ERROR_UNSUPPORTED_FEATURE; | ||
*Message = ErrorMessage; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @JackAKirk . L0 implementation for getLastError is being done here #10784. There's a discussion here on how it should behave. i'm planning on following on that, so feel free to address the hip and cpu adapters only on this patch, or to address also L0 based on those comments. Either way works for me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, L0 impl here is already consistent with those comments (all backends have the same impl).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jandres742 do you want me to remove the l0 implementation from this PR, or are you happy with it?
Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JackAKirk : implementation here is good, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change looks good for Native CPU since it aligns our implementation with the other adaptors, thank you 👍
@sergey-semenov Would you be able to review this please? Thanks |
Adapters have been moved to UR repo. I'm adding the L0 changes here oneapi-src/unified-runtime#1033 |
urAdapterGetLastError
is being called in l0 and hip plugins but just returning success, which means the messages were lost. I've implementedurAdapterGetLastError
in these backends in the same way as in the CUDA backend (#10626) to fix this. I also made the same change for nativeCPU: this backend does not yet callurAdapterGetLastError
but it could in the future.Fixed hip message:
adapters/hip/enqueue.cpp: setErrorMessage(LocalMemSzPtrUR ? "Invalid value specified for "
Fixed l0 messages:
adapters/level_zero/kernel.cpp: setErrorMessage("Write device global variable is out of range.",
adapters/level_zero/kernel.cpp: setErrorMessage("Read from device global variable is out of range.",
adapters/level_zero/device.cpp: setErrorMessage("Set ZES_ENABLE_SYSMAN=1 to obtain free memory",
The associated errors returned by UR for the adapters/level_zero/kernel.cpp have also been fixed: SYCL runtime will only retrieve the errors if
UR_RESULT_ERROR_ADAPTER_SPECIFIC
is returned.