Skip to content

[SYCL][HIP][L0][NATIVECPU] urAdapterGetLastError fix so messages aren't lost. #10953

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 4 commits into from

Conversation

JackAKirk
Copy link
Contributor

urAdapterGetLastError is being called in l0 and hip plugins but just returning success, which means the messages were lost. I've implemented urAdapterGetLastError in these backends in the same way as in the CUDA backend (#10626) to fix this. I also made the same change for nativeCPU: this backend does not yet call urAdapterGetLastError but it could in the future.

Fixed hip message:

adapters/hip/enqueue.cpp: setErrorMessage(LocalMemSzPtrUR ? "Invalid value specified for "

Fixed l0 messages:

adapters/level_zero/kernel.cpp: setErrorMessage("Write device global variable is out of range.",
adapters/level_zero/kernel.cpp: setErrorMessage("Read from device global variable is out of range.",
adapters/level_zero/device.cpp: setErrorMessage("Set ZES_ENABLE_SYSMAN=1 to obtain free memory",

The associated errors returned by UR for the adapters/level_zero/kernel.cpp have also been fixed: SYCL runtime will only retrieve the errors if UR_RESULT_ERROR_ADAPTER_SPECIFIC is returned.

JackAKirk added 4 commits August 24, 2023 09:50
It was being called in l0 and hip plugins but just returning success, which means the messages were lost.

Signed-off-by: JackAKirk <[email protected]>
This means SYCL runtime correctly retrieves the error messages.

Signed-off-by: JackAKirk <[email protected]>
Add back `std::ignore adapter;`

Signed-off-by: JackAKirk <[email protected]>
@JackAKirk JackAKirk requested review from a team as code owners August 24, 2023 09:19
std::ignore = Error;
urPrint("[UR][L0] %s function not implemented!\n", __FUNCTION__);
return UR_RESULT_ERROR_UNSUPPORTED_FEATURE;
*Message = ErrorMessage;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @JackAKirk . L0 implementation for getLastError is being done here #10784. There's a discussion here on how it should behave. i'm planning on following on that, so feel free to address the hip and cpu adapters only on this patch, or to address also L0 based on those comments. Either way works for me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, L0 impl here is already consistent with those comments (all backends have the same impl).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jandres742 do you want me to remove the l0 implementation from this PR, or are you happy with it?

Thanks

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JackAKirk : implementation here is good, thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK thanks!

Copy link
Contributor

@PietroGhg PietroGhg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change looks good for Native CPU since it aligns our implementation with the other adaptors, thank you 👍

@JackAKirk
Copy link
Contributor Author

@sergey-semenov Would you be able to review this please?

Thanks

@jandres742
Copy link
Contributor

Adapters have been moved to UR repo. I'm adding the L0 changes here oneapi-src/unified-runtime#1033

@JackAKirk JackAKirk closed this Nov 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants