Gibberish output of Qwen2.5-3B-Instruct with Q2_K quantization #12378
Unanswered — asked by simmonssong in Q&A
I tried converting Qwen2.5-3B-Instruct to Q2_K quantization on two different machines. The output of the quantized model is always nonsense, for example:

“||9"363的76...5 31367244一246“),).请-264“3))-64))5761595431636843467435565846"):4843)"),\n5353"34“ ), 3\"6)) the"24\n\n964
But this seems to happen only with Q2_K.

Platforms:
Windows 10 with llama.cpp build b4846.
Windows 11 with llama.cpp build b4520.
Original model:
https://huggingface.co/Qwen/Qwen2.5-3B-Instruct
Conversion script:
python convert_hf_to_gguf.py ***\Qwen2.5-3B-Instruct --outfile ***\Qwen2.5-3B-Instruct-FP16.gguf
Quantization script:
llama-quantize.exe ***\Qwen2.5-3B-Instruct-FP16.gguf ***\Qwen2.5-3B-Instruct-Q2_K.gguf Q2_K
Model testing script:
llama-cli.exe -m ***\Qwen2.5-3B-Instruct-Q2_K.gguf
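Before suspecting the Q2_K kernels themselves, one quick check is to confirm that the quantized output file at least has a valid GGUF header. The sketch below assumes only the documented GGUF header layout (magic `GGUF`, little-endian uint32 version, uint64 tensor count, uint64 metadata key/value count); `check_gguf_header` is a hypothetical helper for illustration, not a llama.cpp API, and the file path is a placeholder.

```python
import struct

def check_gguf_header(data: bytes) -> dict:
    """Parse the fixed-size GGUF header prefix and return its fields.

    GGUF files begin with: magic b"GGUF", uint32 version,
    uint64 tensor count, uint64 metadata k/v count (all little-endian).
    """
    if len(data) < 24 or data[:4] != b"GGUF":
        raise ValueError("not a GGUF file")
    version, n_tensors, n_kv = struct.unpack_from("<IQQ", data, 4)
    return {"version": version, "tensors": n_tensors, "metadata_kv": n_kv}

# Usage against a real file (path is a placeholder):
# with open(r"***\Qwen2.5-3B-Instruct-Q2_K.gguf", "rb") as f:
#     print(check_gguf_header(f.read(24)))
```

If the header parses but generation is still garbage only at Q2_K, the file itself is probably intact and the problem lies in the quantization rather than a truncated or corrupted write.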