-
Notifications
You must be signed in to change notification settings - Fork 790
[SYCL][CUDA][HIP] Add support for querying device architecture #10573
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…prove performance
Can you also update the extension specification to remove the following limitation:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just a few stylistic comments should be fixed.
sycl/source/detail/device_info.hpp
Outdated
@@ -568,64 +568,108 @@ struct get_device_info_impl<range<Dimensions>, | |||
} | |||
}; | |||
|
|||
// This macro is only for AMD, NVIDIA and Intel GPU architectures | |||
#define ARCHES(X) \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be done in a separate PR later during some refactoring but we can have two std::unordered_map
s: one for string keys and another for uint32_t keys: this way in case of Intel platforms we will not convert number to string and compare strings - it will be faster. For AMD and NVIDIA archs with std::unordered_map
the access time will be O(1) in a good case and O(N) in a bad case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It turns out that the way the uint32_t
keys were converted was wrong anyways. The converted strings were not in hexadecimal form anymore and that's also why the CI failed.
I've added another macro for now to avoid the conversion and leave the DeviceIP as an integer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
@aelovikov-intel re-triggering of SYCL Pre Commit on Linux / test (AMD/HIP and Intel) didn't help, how it can be fixed? Possibly, some problem with CI machines. |
Looks like a common problem at least with Intel devices: #9845 (comment) |
Restarted the checks, it was reported that issue with Intel devices should be fixed now. |
Unrelated failure on HIP:
|
Implements
sycl_ext_oneapi_device_architecture
for AMD and NVIDIA.The device architecture can be queried through
dev.get_info<sycl::ext::oneapi::experimental::info::device::architecture>()
(
dev
is asycl::device
)