[SYCL] Prevent q_xxs using mul_mat_q #7459

Merged
merged 1 commit into from
May 27, 2024

Conversation

AidanBeltonS
Contributor

There is currently a test failure on the A100 GPU, which attempts to use mul_mat_q for q_xxs quantization. The mul_mat_q path does not support this quantization type, so the attempt triggers an assert.

With this change, all except one MUL_MAT test pass on the A100 GPU. It does not affect the ARC and PVC paths; the A100 is affected because its min_compute_capability is high enough to enable the use_mul_mat_q path.

@github-actions github-actions bot added the SYCL https://en.wikipedia.org/wiki/SYCL - GPU programming language label May 22, 2024
@mofosyne mofosyne added the Review Complexity : Medium Generally require more time to grok but manageable by beginner to medium expertise level label May 22, 2024
@NeoZhangJianyu
Collaborator

I think the code will affect the ARC and PVC paths.
min_compute_capability is a fake value and is not actually related to the hardware type.

What is the min_compute_capability value for the A100 and for Intel GPUs? Maybe I'm wrong.

@AidanBeltonS
Contributor Author

I have found that it doesn't change the PVC path in my case. But even if it does, you do not want to run mul_mat_q with the Q_XXS quantization type, because it is unsupported and will assert. So this is general behaviour that any hardware needs in order to be routed to the correct mul_mat.
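The dispatch rule described here can be sketched as follows. This is a minimal illustration, not the code from the PR: the `QuantType` enum, the `kMinCcForMmq` threshold, and the function names `supports_mul_mat_q` and `use_mul_mat_q` are all hypothetical stand-ins. The point it shows is that the quantization-type check applies regardless of hardware, so a high compute capability alone never selects mul_mat_q for an unsupported type.

```cpp
#include <cassert>

// Hypothetical stand-ins for the quantization types discussed in this PR;
// this is NOT the real ggml type enum.
enum class QuantType { Q4_0, Q8_0, IQ2_XXS, IQ3_XXS };

// Assumed threshold for illustration only; the real value lives in the backend.
constexpr int kMinCcForMmq = 610;

// Returns false for the q_xxs-style types that the mul_mat_q path
// is assumed not to support.
constexpr bool supports_mul_mat_q(QuantType t) {
    switch (t) {
        case QuantType::IQ2_XXS:
        case QuantType::IQ3_XXS:
            return false;
        default:
            return true;
    }
}

// mul_mat_q is chosen only when BOTH conditions hold: the device's
// compute capability is high enough AND the type is supported.
constexpr bool use_mul_mat_q(QuantType t, int min_compute_capability) {
    return min_compute_capability >= kMinCcForMmq && supports_mul_mat_q(t);
}

// Small self-check of the rule with made-up capability values.
inline void check_dispatch() {
    assert(use_mul_mat_q(QuantType::Q4_0, 800));      // supported type, capable device
    assert(!use_mul_mat_q(QuantType::IQ2_XXS, 800));  // unsupported type is never routed to mul_mat_q
    assert(!use_mul_mat_q(QuantType::Q4_0, 500));     // capability below the assumed threshold
}
```

Because the type check is independent of the capability value, the guard behaves the same whatever min_compute_capability a given device reports.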

Collaborator

@abhilash1910 abhilash1910 left a comment

LGTM for the time being until q_xxs variants are properly supported.
Maybe rebasing would help with the CI?

@abhilash1910 abhilash1910 merged commit 95f84d5 into master May 27, 2024
68 checks passed
4 participants