Skip to content

Update ggml_sycl_op_mul_mat_vec_q #5502

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Feb 20, 2024

Conversation

AidanBeltonS
Copy link
Contributor

This PR updates the unsupported quantized data types and refactors the code for ggml_sycl_op_mul_mat_vec_q.
SYCL does not currently have the intrinsics to support some quantized data types, this adds one missing quantized data type to the unsupported check, so tests won't be run.
This also refactors the code so there is a single templated mul_mat_vec_q_sycl_submitter rather than multiple duplicate functions which submit a different instantiated kernel. This makes the code less verbose and much smaller.

@AidanBeltonS
Copy link
Contributor Author

@NeoZhangJianyu, @abhilash1910, @Alcpz, feedback would be appreciated

Copy link
Collaborator

@Alcpz Alcpz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comment on the refactor. Looks great.

@abhilash1910
Copy link
Collaborator

Thanks @AidanBeltonS , could you please rebase to latest master for CI build?
Also tagging @ggerganov & @airMeng for a look when available.

@abhilash1910
Copy link
Collaborator

@ggerganov @0cc4m I think the vulkan build CI is exiting abruptly - maybe issue is common for other requests. Could you help take a look ? Thanks

@ggerganov
Copy link
Member

ggerganov commented Feb 16, 2024

It's because we enabled the gguf example which links only the ggml library and does not link ggml-vulkan or ggml-rocm:

#5216 (comment)

We can easily disable the gguf example from the build, but I'm wondering if we need separate libs in the first place. Can we not link everything in ggml, similar to what CUDA, Metal, SYCL, etc. do

Copy link
Collaborator

@airMeng airMeng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@abhilash1910
Copy link
Collaborator

@AidanBeltonS could you please rebase to latest master - should solve some build issues with vulkan ci.

@NeoZhangJianyu
Copy link
Collaborator

  1. "unsupported quantized data types"
    What's the data types to be supported by this PR?

  2. How to test with the supported data type?

@AidanBeltonS
Copy link
Contributor Author

AidanBeltonS commented Feb 19, 2024

  1. "unsupported quantized data types"
    What's the data types to be supported by this PR?

    1. How to test with the supported data type?
  1. This PR does not change the supported and unsupported data types. If you look at the switch case, all the types that are there are supported. It simply explicitly sets an assert for the quantization types we do not support, and sets tests which use that data type to unsupported.

  2. All the regular tests for this functionality are run. So any existing testing for the other quantization types will still be run as normal to check functionality.

@abhilash1910 abhilash1910 merged commit b9111bd into ggml-org:master Feb 20, 2024
jordankanter pushed a commit to jordankanter/llama.cpp that referenced this pull request Mar 13, 2024
* Update ggml_sycl_op_mul_mat_vec_q

* Apply suggestions from code review

Co-authored-by: Abhilash Majumder <[email protected]>

* revert suggestion on macro

* fix bug

* Add quant type GGML_TYPE_IQ1_S to unsupported

* fix format

---------

Co-authored-by: Abhilash Majumder <[email protected]>
hodlen pushed a commit to hodlen/llama.cpp that referenced this pull request Apr 1, 2024
* Update ggml_sycl_op_mul_mat_vec_q

* Apply suggestions from code review

Co-authored-by: Abhilash Majumder <[email protected]>

* revert suggestion on macro

* fix bug

* Add quant type GGML_TYPE_IQ1_S to unsupported

* fix format

---------

Co-authored-by: Abhilash Majumder <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants