Gibberish output of Qwen2.5-3B-Instruct with Q2_K quantization #12378
Unanswered — asked by simmonssong in Q&A
I tried converting Qwen2.5-3B-Instruct to Q2_K quantization on two different machines. The output of the quantized model is always nonsense, for example:

“||9"363的76...5 31367244一246“),).请-264“3))-64))5761595431636843467435565846"):4843)"),\n5353"34“ ), 3\"6)) the"24\n\n964
But this seems to happen only with Q2_K.

Platforms:
Windows 10 with llama.cpp build b4846.
Windows 11 with llama.cpp build b4520.
Original model:
https://huggingface.co/Qwen/Qwen2.5-3B-Instruct
Conversion script:
python convert_hf_to_gguf.py ***\Qwen2.5-3B-Instruct --outfile ***\Qwen2.5-3B-Instruct-FP16.gguf
Quantization script:
llama-quantize.exe ***\Qwen2.5-3B-Instruct-FP16.gguf ***\Qwen2.5-3B-Instruct-Q2_K.gguf Q2_K
Model testing script:
llama-cli.exe -m ***\Qwen2.5-3B-Instruct-Q2_K.gguf
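Before suspecting the Q2_K kernels themselves, one quick check is to confirm that the quantized output file at least has a valid GGUF header. The sketch below assumes only the documented GGUF header layout (magic `GGUF`, little-endian uint32 version, uint64 tensor count, uint64 metadata key/value count); `check_gguf_header` is a hypothetical helper for illustration, not a llama.cpp API, and the file path is a placeholder.

```python
import struct

def check_gguf_header(data: bytes) -> dict:
    """Parse the fixed-size GGUF header prefix and return its fields.

    GGUF files begin with: magic b"GGUF", uint32 version,
    uint64 tensor count, uint64 metadata k/v count (all little-endian).
    """
    if len(data) < 24 or data[:4] != b"GGUF":
        raise ValueError("not a GGUF file")
    version, n_tensors, n_kv = struct.unpack_from("<IQQ", data, 4)
    return {"version": version, "tensors": n_tensors, "metadata_kv": n_kv}

# Usage against a real file (path is a placeholder):
# with open(r"***\Qwen2.5-3B-Instruct-Q2_K.gguf", "rb") as f:
#     print(check_gguf_header(f.read(24)))
```

If the header parses but generation is still garbage only at Q2_K, the file itself is probably intact and the problem lies in the quantization rather than a truncated or corrupted write.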