Skip to content

ggml-quants: Provide ggml_vqtbl1q_u8 for 64bit compatibility #5711

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Feb 25, 2024

Conversation

rgryta
Copy link
Contributor

@rgryta rgryta commented Feb 25, 2024

Additionally unblocks Android example and workflow build for Android build with armeabi-v7a target.

Function vqtbl1q_u8 is not available under neon ARM-V7.

@rgryta rgryta changed the title [ggml-quants] Add preprocessor check for __ARM_ARCH 8 sepcific neon optimizations ggml-quants: Add preprocessor check for __ARM_ARCH 8 sepcific neon optimizations Feb 25, 2024
@rgryta rgryta changed the title ggml-quants: Add preprocessor check for __ARM_ARCH 8 sepcific neon optimizations ggml-quants: Add preprocessor check for __ARM_ARCH 8 specific neon optimizations Feb 25, 2024
@rgryta
Copy link
Contributor Author

rgryta commented Feb 25, 2024

android-build workflow will fail as it's fetching the current llama.cpp dependency from master (which of course doesn't have the neon v7a fix for now)

@rgryta
Copy link
Contributor Author

rgryta commented Feb 25, 2024

@ggerganov, let me know if you'd rather I split this PR into two separate ones instead (one that adds the ARM-V check and then another one that updates the build workflow).
I've tested the example workflow locally by changing the CMakeLists.txt dependency to my fork instead and everything seems to be correct.

ggml-quants.c Outdated
@@ -9452,7 +9452,7 @@ void ggml_vec_dot_iq3_s_q8_K (int n, float * GGML_RESTRICT s, size_t bs, const v

const int nb = n / QK_K;

#if defined(__ARM_NEON)
#if defined(__ARM_NEON) && (__ARM_ARCH >= 8)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of this, we should add ggml_vqtbl1q_u8 similar to how we have ggml_vqtbl1q_s8 and replace all usages of vqtbl1q_u8 with ggml_vqtbl1q_u8

Copy link
Contributor Author

@rgryta rgryta Feb 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amended and compiled locally in and Android project

@rgryta rgryta changed the title ggml-quants: Add preprocessor check for __ARM_ARCH 8 specific neon optimizations ggml-quants: Provide ggml_vqtbl1q_u8 for 64bit compatibility Feb 25, 2024
@ggerganov ggerganov merged commit abbabc5 into ggml-org:master Feb 25, 2024
jordankanter pushed a commit to jordankanter/llama.cpp that referenced this pull request Mar 13, 2024
…rg#5711)

* [ggml-quants] Provide ggml_vqtbl1q_u8 for 64bit compatibility

vqtbl1q_u8 is not part of arm v7 neon library

* [android-example] Remove abi filter after arm v7a fix

* [github-workflows] Do not skip Android armeabi-v7a build
hodlen pushed a commit to hodlen/llama.cpp that referenced this pull request Apr 1, 2024
…rg#5711)

* [ggml-quants] Provide ggml_vqtbl1q_u8 for 64bit compatibility

vqtbl1q_u8 is not part of arm v7 neon library

* [android-example] Remove abi filter after arm v7a fix

* [github-workflows] Do not skip Android armeabi-v7a build
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants