vulkan: optimize iq1 coopmat2 dequant functions #12427

Merged: 1 commit, Mar 19, 2025

Conversation

jeffbolznv
Collaborator

Perf on RTX 4070:

before:
  MUL_MAT(type_a=iq1_s,type_b=f32,m=4096,n=512,k=14336,bs=[1,1],nr=[1,1],per=[0,1,2,3]):                 460 runs -  2174.26 us/run -  60.13 GFLOP/run -  27.66 TFLOPS
  MUL_MAT(type_a=iq1_m,type_b=f32,m=4096,n=512,k=14336,bs=[1,1],nr=[1,1],per=[0,1,2,3]):                 288 runs -  3482.95 us/run -  60.13 GFLOP/run -  17.26 TFLOPS
  
after:
  MUL_MAT(type_a=iq1_s,type_b=f32,m=4096,n=512,k=14336,bs=[1,1],nr=[1,1],per=[0,1,2,3]):                 726 runs -  1379.33 us/run -  60.13 GFLOP/run -  43.59 TFLOPS
  MUL_MAT(type_a=iq1_m,type_b=f32,m=4096,n=512,k=14336,bs=[1,1],nr=[1,1],per=[0,1,2,3]):                 412 runs -  2428.05 us/run -  60.13 GFLOP/run -  24.76 TFLOPS
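
For readers checking the numbers, the derived columns above can be recomputed from the shapes and timings. This is a minimal sketch, assuming the benchmark counts 2·m·n·k floating-point operations per matrix multiply (one multiply and one add per accumulate), which matches the reported 60.13 GFLOP/run; the variable and function names are illustrative only.

```python
# Shapes and timings copied from the benchmark lines above.
m, n, k = 4096, 512, 14336

# Assumption: 2 FLOPs (multiply + add) per multiply-accumulate.
flop_per_run = 2 * m * n * k

# (quant type, us/run before, us/run after)
timings = [
    ("iq1_s", 2174.26, 1379.33),
    ("iq1_m", 3482.95, 2428.05),
]

def tflops(us_per_run):
    """Convert a per-run time in microseconds to TFLOPS."""
    seconds = us_per_run * 1e-6
    return flop_per_run / seconds / 1e12

for name, before, after in timings:
    speedup = before / after
    print(f"{name}: {tflops(before):.2f} -> {tflops(after):.2f} TFLOPS "
          f"({speedup:.2f}x faster)")
```

Under that assumption the per-op speedup works out to roughly 1.58x for iq1_s and 1.43x for iq1_m.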

@jeffbolznv jeffbolznv requested a review from 0cc4m March 17, 2025 13:22
@github-actions github-actions bot added the labels Vulkan (Issues specific to the Vulkan backend) and ggml (changes relating to the ggml tensor library for machine learning) Mar 17, 2025
@0cc4m
Collaborator

0cc4m commented Mar 19, 2025

Master:

| model | size | params | backend | ngl | test | t/s |
| --- | --- | --- | --- | --- | --- | --- |
| llama 8B IQ1_S - 1.5625 bpw | 1.87 GiB | 8.03 B | Vulkan | 99 | pp512 | 2609.84 ± 16.94 |
| llama 8B IQ1_S - 1.5625 bpw | 1.87 GiB | 8.03 B | Vulkan | 99 | tg128 | 65.52 ± 0.36 |

PR:

| model | size | params | backend | ngl | test | t/s |
| --- | --- | --- | --- | --- | --- | --- |
| llama 8B IQ1_S - 1.5625 bpw | 1.87 GiB | 8.03 B | Vulkan | 99 | pp512 | 3461.18 ± 47.70 |
| llama 8B IQ1_S - 1.5625 bpw | 1.87 GiB | 8.03 B | Vulkan | 99 | tg128 | 66.59 ± 1.57 |
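
A quick way to read the two result sets side by side is to compute the throughput ratio and compare the gain against the reported spread. The sketch below is only a rough reading aid: it treats the ± values as a spread around the mean and uses a crude "gain exceeds the sum of both spreads" check, which is an assumption made here and not a proper significance test.

```python
# (test, master t/s, master ±, PR t/s, PR ±), copied from the results above.
results = [
    ("pp512", 2609.84, 16.94, 3461.18, 47.70),
    ("tg128", 65.52, 0.36, 66.59, 1.57),
]

def compare(master, master_err, pr, pr_err):
    """Return (speedup ratio, whether the gain exceeds both spreads combined)."""
    speedup = pr / master
    # Crude heuristic (assumption): call the gain "clear" only if it is
    # larger than the two reported ± values added together.
    clear = (pr - master) > (master_err + pr_err)
    return speedup, clear

for test, master, master_err, pr, pr_err in results:
    speedup, clear = compare(master, master_err, pr, pr_err)
    verdict = "clear gain" if clear else "within noise"
    print(f"{test}: {speedup:.2f}x ({verdict})")
```

Read this way, prompt processing (pp512) improves by about 1.33x, while the token generation (tg128) difference is smaller than the reported spread.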

@0cc4m 0cc4m merged commit a9b5928 into ggml-org:master Mar 19, 2025
43 checks passed