
Commit fc27b4c

committed: add src comments
1 parent 1c1ef0e commit fc27b4c

File tree

1 file changed: +16 -8 lines changed


packages/gguf/src/quant_descriptions.ts

Lines changed: 16 additions & 8 deletions
@@ -21,12 +21,20 @@ export const QUANT_DESCRIPTIONS: Record<GGMLQuantizationType, string> = {
 	[GGMLQuantizationType.Q5_K]: `5-bit quantization (q). Super-blocks with 8 blocks, each block has 32 weights. Weight formula: w = q * block_scale(6-bit) + block_min(6-bit), resulting in 5.5 bits-per-weight.`, // src: https://github.com/ggerganov/llama.cpp/pull/1684#issue-1739619305
 	[GGMLQuantizationType.Q6_K]: `6-bit quantization (q). Super-blocks with 16 blocks, each block has 16 weights. Weight formula: w = q * block_scale(8-bit), resulting in 6.5625 bits-per-weight.`, // src: https://github.com/ggerganov/llama.cpp/pull/1684#issue-1739619305
 	[GGMLQuantizationType.Q8_K]: `8-bit quantization (q). Each block has 256 weights. Only used for quantizing intermediate results. All 2-6 bit dot products are implemented for this quantization type. Weight formula: w = q * block_scale.`, // src: https://github.com/ggerganov/llama.cpp/pull/1684#issue-1739619305
-	[GGMLQuantizationType.IQ2_XXS]: "", // todo: add description
-	[GGMLQuantizationType.IQ2_XS]: "", // todo: add description
-	[GGMLQuantizationType.IQ3_XXS]: "", // todo: add description
-	[GGMLQuantizationType.IQ1_S]: "", // todo: add description
-	[GGMLQuantizationType.IQ4_NL]: "", // todo: add description
-	[GGMLQuantizationType.IQ3_S]: "", // todo: add description
-	[GGMLQuantizationType.IQ2_S]: "", // todo: add description
-	[GGMLQuantizationType.IQ4_XS]: "", // todo: add description
+	[GGMLQuantizationType.IQ2_XXS]:
+		"2-bit quantization (q). Super-blocks with 256 weights. Weight w is obtained using super_block_scale & importance matrix, resulting in 2.06 bits-per-weight.", // src: https://huggingface.co/CISCai/OpenCodeInterpreter-DS-6.7B-SOTA-GGUF/blob/main/README.md?code=true#L59-L70
+	[GGMLQuantizationType.IQ2_XS]:
+		"2-bit quantization (q). Super-blocks with 256 weights. Weight w is obtained using super_block_scale & importance matrix, resulting in 2.31 bits-per-weight.", // src: https://huggingface.co/CISCai/OpenCodeInterpreter-DS-6.7B-SOTA-GGUF/blob/main/README.md?code=true#L59-L70
+	[GGMLQuantizationType.IQ3_XXS]:
+		"3-bit quantization (q). Super-blocks with 256 weights. Weight w is obtained using super_block_scale & importance matrix, resulting in 3.06 bits-per-weight.", // src: https://huggingface.co/CISCai/OpenCodeInterpreter-DS-6.7B-SOTA-GGUF/blob/main/README.md?code=true#L59-L70
+	[GGMLQuantizationType.IQ1_S]:
+		"1-bit quantization (q). Super-blocks with 256 weights. Weight w is obtained using super_block_scale & importance matrix, resulting in 1.56 bits-per-weight.", // src: https://huggingface.co/CISCai/OpenCodeInterpreter-DS-6.7B-SOTA-GGUF/blob/main/README.md?code=true#L59-L70
+	[GGMLQuantizationType.IQ4_NL]:
+		"4-bit quantization (q). Super-blocks with 256 weights. Weight w is obtained using super_block_scale & importance matrix",
+	[GGMLQuantizationType.IQ3_S]:
+		"3-bit quantization (q). Super-blocks with 256 weights. Weight w is obtained using super_block_scale & importance matrix, resulting in 3.44 bits-per-weight.", // src: https://huggingface.co/CISCai/OpenCodeInterpreter-DS-6.7B-SOTA-GGUF/blob/main/README.md?code=true#L59-L70
+	[GGMLQuantizationType.IQ2_S]:
+		"2-bit quantization (q). Super-blocks with 256 weights. Weight w is obtained using super_block_scale & importance matrix, resulting in 2.5 bits-per-weight.", // src: https://huggingface.co/CISCai/OpenCodeInterpreter-DS-6.7B-SOTA-GGUF/blob/main/README.md?code=true#L59-L70
+	[GGMLQuantizationType.IQ4_XS]:
+		"4-bit quantization (q). Super-blocks with 256 weights. Weight w is obtained using super_block_scale & importance matrix, resulting in 4.25 bits-per-weight.", // src: https://huggingface.co/CISCai/OpenCodeInterpreter-DS-6.7B-SOTA-GGUF/blob/main/README.md?code=true#L59-L70
 };
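The bits-per-weight figures quoted in these descriptions follow directly from the super-block layouts: total storage is the quantized weights plus the per-block scales/mins plus the super-block overhead, divided by the number of weights. Below is a minimal TypeScript sketch of that accounting, checked against the Q5_K (5.5 bpw) and Q6_K (6.5625 bpw) descriptions above. The `KQuantLayout` interface and `bitsPerWeight` helper are hypothetical illustrations, not part of the gguf package or llama.cpp; the assumption of fp16 (16-bit) super-block scale/min fields matches the k-quant layouts described in the llama.cpp PR cited in the diff.

```typescript
// Hypothetical helper: derive bits-per-weight for a k-quant super-block
// from its layout. Not part of @huggingface/gguf or llama.cpp.
interface KQuantLayout {
	blocksPerSuperBlock: number; // blocks in one super-block
	weightsPerBlock: number; // weights in one block
	quantBits: number; // bits per quantized weight (q)
	blockScaleBits: number; // bits for each per-block scale
	blockMinBits: number; // bits for each per-block min (0 if unused)
	superBlockOverheadBits: number; // super-block scale/min (assumed fp16 = 16 bits each)
}

function bitsPerWeight(l: KQuantLayout): number {
	const weights = l.blocksPerSuperBlock * l.weightsPerBlock;
	const totalBits =
		weights * l.quantBits +
		l.blocksPerSuperBlock * (l.blockScaleBits + l.blockMinBits) +
		l.superBlockOverheadBits;
	return totalBits / weights;
}

// Q5_K: 8 blocks x 32 weights, 5-bit q, 6-bit block scale + 6-bit block min,
// plus an assumed fp16 super-block scale and fp16 super-block min (2 x 16 bits).
const q5k = bitsPerWeight({
	blocksPerSuperBlock: 8,
	weightsPerBlock: 32,
	quantBits: 5,
	blockScaleBits: 6,
	blockMinBits: 6,
	superBlockOverheadBits: 32,
}); // (256*5 + 8*12 + 32) / 256 = 5.5

// Q6_K: 16 blocks x 16 weights, 6-bit q, 8-bit block scale, no block min,
// plus an assumed fp16 super-block scale (16 bits).
const q6k = bitsPerWeight({
	blocksPerSuperBlock: 16,
	weightsPerBlock: 16,
	quantBits: 6,
	blockScaleBits: 8,
	blockMinBits: 0,
	superBlockOverheadBits: 16,
}); // (256*6 + 16*8 + 16) / 256 = 6.5625
```

Both results are exact in IEEE doubles since the denominator (256) is a power of two, which is why the descriptions can quote figures like 6.5625 without rounding.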
