Skip to content

[SYCL][UR][L0] Implement urAdapterGetLastError #10784

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -179,11 +179,9 @@ UR_APIEXPORT ur_result_t UR_APICALL urAdapterGetLastError(
int32_t *Error ///< [out] pointer to an integer where the adapter specific
///< error code will be stored.
) {
std::ignore = Adapter;
std::ignore = Message;
std::ignore = Error;
urPrint("[UR][L0] %s function not implemented!\n", __FUNCTION__);
return UR_RESULT_ERROR_UNSUPPORTED_FEATURE;
*Message = ErrorMessage;
*Error = ErrorMessageCode;
return UR_RESULT_SUCCESS;
Comment on lines +183 to +184
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this how urAdapterGetLastError is defined to always return success?
In a matching PI API we need to return error if SYCL RT needs to abort, or success if this is rather a warning. Is this happening?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

urAdapterGetLastError doesnt return an error depending on the message. It may return an error if the call to urAdapterGetLastError fails, but not because of the error it is interpreting.

See here https://oneapi-src.github.io/unified-runtime/core/api.html?highlight=uradaptergetlasterror#uradaptergetlasterror

Example usage:

if (::urQueueCreate(hContext, hDevice, nullptr, &hQueue) ==
        ::UR_RESULT_ERROR_ADAPTER_SPECIFIC) {
    const char* pMessage;
    int32_t error;
    ::urAdapterGetLastError(hAdapter, &pMessage, &error);
}

Another call may have returned UR_RESULT_ERROR_ADAPTER_SPECIFIC. Then application (in this case SYCL RT) would call urAdapterGetLastError to get the description of the message for the call that previously failed (*Message) and the error code (*Error).

This is the same implementation we have in other adapters, for instance CU:

UR_APIEXPORT ur_result_t UR_APICALL urAdapterGetLastError(

It is up to the application to parse this info and determining the level of severity.

@callumfare , @kbenzie : is my understanding correct?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original idea was that adapters that raise the error decide on the severity (not the SYCL RT), see here

/// \return PI_SUCCESS if plugin is indicating non-fatal warning. Any other

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@smaslov-intel : so something like this?

UR_APIEXPORT ur_result_t UR_APICALL urAdapterGetLastError(
    ur_adapter_handle_t Adapter, ///< [in] handle of the platform instance
    const char **Message, ///< [out] pointer to a C string where the adapter
                          ///< specific error message will be stored.
    int32_t *Error ///< [out] pointer to an integer where the adapter specific
                   ///< error code will be stored.
) {
  *Message = ErrorMessage;
  *Error = ErrorMessageCode;
  if(static_cast<ur_result_t>(ErrorMessageCode) != UR_RESULT_SUCCESS) {
    return UR_RESULT_ERROR_ADAPTER_SPECIFIC;
  }
  return UR_RESULT_SUCCESS;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps just return ErrorMessageCode?
Then I don't see why we need the Error argument.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Error argument is intended for returning the adapter specific error code, in this case a ze_result_t.

The ur_result_t returned from urAdapterGetLastError should be UR_RESULT_SUCCESS if the error is non-fatal/warning, and UR_RESULT_ERROR_ADAPTER_SPECIFIC if it is an unrecoverable/hard error.

Currently Error isn't hooked up to piPluginGetLastError because it doesn't have an equivelent argument. This should be resolved as part of the work to replace PI with UR in the SYCL RT.

As for what to do with ErrorMessageCode, if depends how its being set when an adapter specific error occurs. If its being set to either UR_RESULT_SUCCESS or UR_RESULT_ERROR_ADAPTER_SPECIFIC it should be returned from urAdapterGetLastError. If it contains a ze_result_t then it should be used to set *Error.

Here's a recent change to the CUDA adapter that's addressing a similar issue https://github.com/intel/llvm/pull/10626/files#diff-1efcd510b7db74f7166a4a06ccd048867c04911e8809ef7c39b9821aa66583e5

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We discovered a while ago that the implementation of piPluginGetLastError in pi2ur was incorrect, and the cuda adapter was doing the wrong thing - it was fixed in #10626

To match PI the return code is used to signify whether the result is treated as a warning (UR_RESULT_SUCCESS) or an error (any other return code).

The purpose of the Error argument is to return native error codes (e.g. CUresult) from the adapter, but the cuda adapter doesn't implement this yet.

}

UR_APIEXPORT ur_result_t UR_APICALL urAdapterGetInfo(ur_adapter_handle_t,
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -286,8 +286,3 @@ thread_local char ErrorMessage[MaxMessageSize];
strcpy(ErrorMessage, message);
ErrorMessageCode = error_code;
}

ur_result_t zerPluginGetLastError(char **message) {
*message = &ErrorMessage[0];
return ErrorMessageCode;
}