[SYCL][matrix] Update the query interface with the latest joint matrix approved syntax #10847

Closed
wants to merge 7 commits into from

Conversation

dkhaldi
Contributor

@dkhaldi dkhaldi commented Aug 16, 2023

No description provided.

@dkhaldi dkhaldi requested a review from a team as a code owner August 16, 2023 20:30

template <typename Ta, typename Tb, typename Tc, typename Td>
struct matrix_params<
architecture::intel_gpu_dg1, Ta, Tb, Tc, Td, 0, 0, 0,
Contributor

Question: should we use intel_gpu_dg2_g10 here, or does intel_gpu_dg1 also support Intel XMX with SIMD8 capability?
Also, what about the other arch values, intel_gpu_dg2_g11 and intel_gpu_dg2_g12? If they support Intel XMX with SIMD8, should they be supported in the query interface as well?
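
For illustration, one way a single partial specialization could cover several architecture enumerators is to key it on a predicate instead of on one enumerator. The sketch below is a simplified stand-in, not the extension's code: the `sketch` namespace, the `in_simd8_group` predicate, and the `supported` member are all hypothetical, and the set of enumerators in the predicate is only the set named in the question, not a statement about which hardware supports Intel XMX.

```cpp
#include <cstddef>
#include <type_traits>

namespace sketch {  // hypothetical namespace, not sycl::ext::oneapi::experimental

// Simplified copy of the enumerators mentioned in this thread.
enum class architecture {
  x86_64,
  intel_cpu_spr,
  intel_gpu_dg1,
  intel_gpu_dg2_g10,
  intel_gpu_dg2_g11,
  intel_gpu_dg2_g12
};

// ASSUMPTION (illustration only): the group of enumerators that would share
// one specialization. Which devices really belong here is the open question.
constexpr bool in_simd8_group(architecture a) {
  return a == architecture::intel_gpu_dg2_g10 ||
         a == architecture::intel_gpu_dg2_g11 ||
         a == architecture::intel_gpu_dg2_g12;
}

// Primary template: anything not matched by a specialization is unsupported.
template <architecture Arch, typename Ta, typename Tb, typename Tc,
          typename Td, std::size_t M, std::size_t N, std::size_t K,
          typename Enabled = void>
struct matrix_params {
  static constexpr bool supported = false;  // hypothetical member
};

// One partial specialization for the "default sizes" query (0, 0, 0),
// enabled for every enumerator accepted by the predicate.
template <architecture Arch, typename Ta, typename Tb, typename Tc,
          typename Td>
struct matrix_params<Arch, Ta, Tb, Tc, Td, 0, 0, 0,
                     std::enable_if_t<in_simd8_group(Arch)>> {
  static constexpr bool supported = true;
};

}  // namespace sketch

// Usage: the same specialization answers the query for each grouped value.
static_assert(sketch::matrix_params<sketch::architecture::intel_gpu_dg2_g10,
                                    float, float, float, float, 0, 0,
                                    0>::supported,
              "grouped enumerator should match the specialization");
```

With this shape, adding or removing an enumerator only touches the predicate, so the specialization itself would not need to be duplicated per device.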

Contributor Author

That's a good point. I will find out.

@@ -99,6 +99,7 @@ namespace sycl::ext::oneapi::experimental {

enum class architecture : /* unspecified */ {
x86_64,
intel_cpu_spr,
Contributor

Please also update the table below.

@@ -14,6 +14,7 @@ namespace ext::oneapi::experimental {

enum class architecture {
x86_64,
intel_cpu_spr,
Contributor

It's more complicated than adding a new enumerator here for the CPU device: it won't be usable if the user tries to use it in device AOT or host scenarios.
(1) What we can do right now is add an explicit limitation to the spec and add the actual functionality later.
(2) Another option is to introduce all intel_cpu_* enumerators in a separate PR and make this PR depend on it (e.g., for the host scenario, we already have the updated OpenCL extension in the OpenCL CPU RT, which allows us to query the unique IDs of Intel CPU architectures).
@gmlueck, which do you prefer, (1) or (2), or is there another option?

Contributor

I'd like to have a plan for how this new SPR enumerator fits in with all of the other uses of the architecture enum. Currently, all of the following are related:

  • The enumerators in architecture.
  • The legal values for the -fsycl-targets command line switch.
  • The if_architecture_is function that can be called from device code.
  • The device::ext_oneapi_architecture_is function that can be called from host code.

It seems like it would be nice to support syntax like -fsycl-targets=intel_cpu_spr, which would AOT compile device code for SPR. This would provide a consistent way to AOT compile for either GPU or CPU devices. I think this would also enable support for if_architecture_is as a side-effect.

For device::ext_oneapi_architecture_is, can we use the CPUID instruction to determine if the CPU is SPR?
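
To make the CPUID idea concrete, here is a rough sketch of the decoding step only. It takes the raw EAX value returned by CPUID leaf 1 and extracts the family and model using the standard field layout. The helper names (`cpu_signature`, `decode_cpuid_eax`, `looks_like_spr`) are hypothetical, and the assumption that SPR reports family 6, model 0x8F does not come from this thread and would need to be confirmed. A real check would also need to execute CPUID and verify the vendor string.

```cpp
#include <cstdint>

// Family/model pair decoded from CPUID leaf 1, register EAX.
struct cpu_signature {
  std::uint32_t family;
  std::uint32_t model;
};

// Decode the displayed family and model from a raw EAX value.
// The extended family is added only when the base family is 0xF; the
// extended model is folded in only when the base family is 0x6 or 0xF.
constexpr cpu_signature decode_cpuid_eax(std::uint32_t eax) {
  const std::uint32_t base_model = (eax >> 4) & 0xF;
  const std::uint32_t base_family = (eax >> 8) & 0xF;
  const std::uint32_t ext_model = (eax >> 16) & 0xF;
  const std::uint32_t ext_family = (eax >> 20) & 0xFF;

  const std::uint32_t family =
      base_family == 0xF ? base_family + ext_family : base_family;
  const std::uint32_t model = (base_family == 0x6 || base_family == 0xF)
                                  ? ((ext_model << 4) | base_model)
                                  : base_model;
  return {family, model};
}

// ASSUMPTION: SPR is identified by family 6, model 0x8F. This value is not
// stated in the discussion above and should be verified before use.
constexpr bool looks_like_spr(std::uint32_t eax) {
  const cpu_signature sig = decode_cpuid_eax(eax);
  return sig.family == 6 && sig.model == 0x8F;
}

// Usage with sample EAX constants (no CPUID instruction is executed here).
static_assert(looks_like_spr(0x000806F8), "family 6, model 0x8F");
static_assert(!looks_like_spr(0x000606A6), "family 6, model 0x6A");
```

Keeping the decoding separate from the instruction itself means the logic can be checked on any host, and the runtime would only need to supply the EAX value.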

It might be useful to have a meeting to discuss this further.

@dkhaldi
Contributor Author

dkhaldi commented Aug 29, 2023

Replaced by #11004.
The outstanding intel_cpu_spr issue will be discussed there.

@dkhaldi dkhaldi closed this Aug 29, 2023
4 participants