Commit 84c4ee1

new gguf parsing for Q40 that conforms with pytorch's quantization stack (#150)

* new gguf parsing for Q40 that conforms with pytorch's quantization stack
* updates
* add q6_k and clean up q40
* fixes to unpack_q40

1 parent e49d36e commit 84c4ee1

4 files changed, +222 -360 lines changed

gguf_util
- ggml_quantization_type
  - Q4_0.py
- loader.py
- tests
  - test_ggml_q40_subclass.py
- unpack.py

`gguf_util/ggml_quantization_type/Q4_0.py`

This file was deleted.
