vulkan: optimize coopmat2 q4_k/q5_k dequant functions. #11206

jeffbolznv · 2025-01-12T20:00:48Z

Do masking on whole dwords, fetch all scales at once.
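
As a rough illustration of the "masking on whole dwords" idea (not the shader code from this PR), the C sketch below extracts the 4-bit values from four packed bytes with a single 32-bit AND instead of four per-byte ANDs. The function names are hypothetical, and the block-of-four layout is an assumption made only for the example; the real change lives in the Vulkan coopmat2 dequant shaders.

```c
#include <stdint.h>
#include <string.h>

/* Reference: mask the low nibble of each packed byte, one byte at a time. */
static void low_nibbles_bytewise(const uint8_t q[4], uint8_t out[4]) {
    for (int i = 0; i < 4; ++i) {
        out[i] = q[i] & 0x0F;
    }
}

/* Hypothetical dword variant: load 4 packed bytes as one 32-bit word and
 * mask all four lanes with a single AND. With high != 0 the word is
 * shifted right by 4 first, so each lane keeps its own high nibble. */
static uint32_t nibbles_dword(const uint8_t q[4], int high) {
    uint32_t w;
    memcpy(&w, q, sizeof w);               /* one 32-bit load */
    return (high ? (w >> 4) : w) & 0x0F0F0F0Fu;
}
```

Copying the returned word back into a 4-byte array gives the same values as the per-byte loop. The same principle applies to fetching all scales at once: read the packed scale bytes with a few wide loads, then unpack them with shifts and masks.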

pp512 results on RTX 4070 (tokens/s):

model                                before     after
Phi-3-mini-4k-instruct-q4.gguf       4998.55    5322.34
llama-3.2-3b-instruct-q5_k_m.gguf    5573.48    6082.11

0cc4m

Looks good and I also see a decent improvement on RTX 3090.

vulkan: optimize coopmat2 q4_k/q5_k dequant functions.

61e5d8d

jeffbolznv requested a review from 0cc4m January 12, 2025 20:00

github-actions bot added the Vulkan and ggml labels Jan 12, 2025

0cc4m approved these changes Jan 16, 2025

0cc4m merged commit 466300f into ggml-org:master Jan 16, 2025
2 checks passed

tinglou pushed a commit to tinglou/llama.cpp that referenced this pull request Feb 13, 2025

vulkan: optimize coopmat2 q4_k/q5_k dequant functions. (ggml-org#11206)

2f08dc0

arthw pushed a commit to arthw/llama.cpp that referenced this pull request Feb 26, 2025

vulkan: optimize coopmat2 q4_k/q5_k dequant functions. (ggml-org#11206)

28267b3

mglambda pushed a commit to mglambda/llama.cpp that referenced this pull request Mar 8, 2025

vulkan: optimize coopmat2 q4_k/q5_k dequant functions. (ggml-org#11206)

0326138
