[SYCL][matrix] Update the query interface with the latest joint matrix approved syntax #10847
template <typename Ta, typename Tb, typename Tc, typename Td>
struct matrix_params<
    architecture::intel_gpu_dg1, Ta, Tb, Tc, Td, 0, 0, 0,
Question: should we use intel_gpu_dg2_g10 here, or does intel_gpu_dg1 also support Intel XMX with SIMD8 capability?
Also, what about the other arch values, intel_gpu_dg2_g11 and intel_gpu_dg2_g12: if they support Intel XMX with SIMD8, should the query interface support them as well?
That's a good point. I will find out.
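As a rough illustration of the question in this thread, several architecture enumerators could share one set of query parameters through a single constrained specialization. Everything in this sketch is hypothetical: the `sketch` namespace, the local copy of the enum, the `shares_simd8_params` helper, the trimmed-down `matrix_params` signature, and the `supported` and `simd_width` members are stand-ins, not the extension's real interface. Which enumerators belong in the group is exactly what is still open here.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

namespace sketch {

// Local stand-in for the extension's architecture enum (NOT the real header).
enum class architecture {
  x86_64,
  intel_cpu_spr,
  intel_gpu_dg1,
  intel_gpu_dg2_g10,
  intel_gpu_dg2_g11,
  intel_gpu_dg2_g12
};

// ASSUMPTION: placeholder grouping. Whether dg1 or the other dg2 variants
// belong here is the unresolved question in this review thread.
constexpr bool shares_simd8_params(architecture a) {
  return a == architecture::intel_gpu_dg2_g10 ||
         a == architecture::intel_gpu_dg2_g11 ||
         a == architecture::intel_gpu_dg2_g12;
}

// Primary template: no query parameters known for this architecture.
template <architecture Arch, typename Ta, typename Tb, typename Tc,
          typename Td, typename Enable = void>
struct matrix_params {
  static constexpr bool supported = false;
};

// One partial specialization covers every architecture in the group,
// so adding an enumerator only requires touching shares_simd8_params.
template <architecture Arch, typename Ta, typename Tb, typename Tc,
          typename Td>
struct matrix_params<Arch, Ta, Tb, Tc, Td,
                     std::enable_if_t<shares_simd8_params(Arch)>> {
  static constexpr bool supported = true;
  // Illustrative value mirroring the "SIMD8" wording in the comment above.
  static constexpr std::size_t simd_width = 8;
};

} // namespace sketch
```

With this shape, extending the query interface to another enumerator is a one-line change in the helper rather than a new specialization per device.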
@@ -99,6 +99,7 @@ namespace sycl::ext::oneapi::experimental {

enum class architecture : /* unspecified */ {
  x86_64,
  intel_cpu_spr,
Please also update the table below.
@@ -14,6 +14,7 @@ namespace ext::oneapi::experimental {

enum class architecture {
  x86_64,
  intel_cpu_spr,
It's more complicated than adding a new item here for the CPU device: it won't be usable if the user tries to use it in device AOT or host scenarios.
(1) One option is to add an explicit limitation to the spec now and add the actual functionality later.
(2) Another option is to introduce all intel_cpu_* items in a separate PR and make this PR dependent on that one (e.g., for the host scenario we already have the updated OpenCL extension in the OpenCL CPU RT, which allows us to query the unique IDs of Intel CPU architectures).
@gmlueck what do you think: (1), (2), or some other option?
I'd like to have a plan for how this new SPR enumerator fits in with all of the other uses of the `architecture` enum. Currently, all of the following are related:
- The enumerators in `architecture`.
- The legal values for the `-fsycl-targets` command line switch.
- The `if_architecture_is` function that can be called from device code.
- The `device::ext_oneapi_architecture_is` function that can be called from host code.
It seems like it would be nice to support syntax like `-fsycl-targets=intel_cpu_spr`, which would AOT compile device code for SPR. This would provide a consistent way to AOT compile for either GPU or CPU devices. I think this would also enable support for `if_architecture_is` as a side-effect.
For `device::ext_oneapi_architecture_is`, can we use the CPUID instruction to determine if the CPU is SPR?
It might be useful to have a meeting to discuss this further.
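To make the CPUID idea concrete, here is a minimal sketch of how a host-side check might decode the processor signature that CPUID leaf 1 returns in EAX. It is not taken from this PR or from the runtime: the `sketch` namespace, `cpu_signature`, `decode_signature`, and `is_sapphire_rapids` are hypothetical names, and the family 6 / model 0x8F pair used to identify SPR is an assumption that should be checked against Intel's documentation before relying on it. The raw EAX value is passed in as a parameter so the decoding logic stays portable; reading it from the hardware is left out.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {

// Decoded "display" family/model/stepping of an x86 processor signature.
struct cpu_signature {
  std::uint32_t family;
  std::uint32_t model;
  std::uint32_t stepping;
};

// Decode the EAX value returned by CPUID leaf 1.
// Bit layout: stepping [3:0], model [7:4], family [11:8],
// extended model [19:16], extended family [27:20].
constexpr cpu_signature decode_signature(std::uint32_t eax) {
  const std::uint32_t stepping = eax & 0xFu;
  const std::uint32_t base_model = (eax >> 4) & 0xFu;
  const std::uint32_t base_family = (eax >> 8) & 0xFu;
  const std::uint32_t ext_model = (eax >> 16) & 0xFu;
  const std::uint32_t ext_family = (eax >> 20) & 0xFFu;

  // The extended family is added only when the base family is 0xF.
  const std::uint32_t family =
      (base_family == 0xFu) ? base_family + ext_family : base_family;
  // The extended model is used only for base family 6 or 0xF.
  const std::uint32_t model = (base_family == 0x6u || base_family == 0xFu)
                                  ? ((ext_model << 4) | base_model)
                                  : base_model;
  return cpu_signature{family, model, stepping};
}

// ASSUMPTION: SPR is identified as family 6, model 0x8F.
constexpr bool is_sapphire_rapids(std::uint32_t eax) {
  return decode_signature(eax).family == 6u &&
         decode_signature(eax).model == 0x8Fu;
}

// Compile-time sanity check on a signature with family 6, model 0x8F.
static_assert(is_sapphire_rapids(0x000806F8u), "expected family 6, model 0x8F");

} // namespace sketch
```

A check like this only covers the host-side query; it does not by itself address the AOT or `if_architecture_is` cases discussed above.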
Replaced by #11004.
No description provided.