
quantize : make output filename optional again #2823


Merged: 2 commits merged into ggml-org:master on Aug 28, 2023

Conversation

cebtenzzre (Collaborator)

Commit 5c7b0e7 removed the ability to simply call quantize with an input filename and a quantization type, like this:

./quantize some_ggml_model.bin Q4_0

Currently, the output filename may only be omitted if you also pass a value for nthreads. In other words, the tool works as long as you pass at least three positional arguments, even though two of the four are optional.

This was presumably unintentional. Changing the minimum argument count to two makes the behavior of quantize match the usage string:

usage: ./quantize [--help] [--allow-requantize] [--leave-output-tensor] model-f32.bin [model-quant.bin] type [nthreads]
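
For context, here is a minimal self-contained sketch (simplified, not the upstream source; the exact default name pattern is an assumption) of how the output filename can be derived from the input path and the quantization type when the output argument is omitted:

// sketch only: synthesize a default output name such as
// "<input dir>/ggml-model-<TYPE>.gguf" when no explicit output filename is given
#include <iostream>
#include <string>

static std::string default_output_name(const std::string & fname_inp, const std::string & ftype_str) {
    std::string fpath;
    const size_t pos = fname_inp.find_last_of('/');  // directory part of the input path
    if (pos != std::string::npos) {
        fpath = fname_inp.substr(0, pos + 1);
    }
    return fpath + "ggml-model-" + ftype_str + ".gguf";  // illustrative naming pattern
}

int main() {
    // e.g. ./quantize models/some_ggml_model.bin Q4_0
    std::cout << default_output_name("models/some_ggml_model.bin", "Q4_0") << "\n";
    // prints: models/ggml-model-Q4_0.gguf
    return 0;
}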

cebtenzzre requested a review from KerfuffleV2 on August 27, 2023 05:46
@ghost commented on Aug 27, 2023

Hi,

Unrelated to this PR specifically: I'm getting an error from quantize while trying to convert to GGUF v2. wizardLM & llama2chat converted fine; it seems that Vicuna/Pygmalion-Vicuna has a q6_k tensor, but conversion is disabled for it:

llama_model_loader: - type  f32:   65 tensors
llama_model_loader: - type q4_0:  225 tensors
llama_model_loader: - type q6_K:    1 tensors
llama_model_quantize_internal: meta size = 773344 bytes
[   1/ 291]                    token_embd.weight - [ 4096, 32000,     1,     1], type =   q4_0, size =   70.312 MB
[   2/ 291]                   output_norm.weight - [ 4096,     1,     1,     1], type =    f32, size =    0.016 MB
[   3/ 291]                        output.weight - [ 4096, 32000,     1,     1], type =   q6_K, llama_model_quantize: failed to quantize: requantizing from type q6_K is disabled
main: failed to quantize model from '/data/data/com.termux/files/home/Pygmalion-Vicuna.gguf'

Here's my attempt: ./quantize ~/Pygmalion-Vicuna.gguf 2 3 --allow-requantization

I understand if it cannot be enabled, but figured I'd ask anyway.

Thank you.

@ggerganov (Member)

Try ./quantize --allow-requantize ~/Pygmalion-Vicuna.gguf 2 3 instead
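
The order matters because quantize only treats leading "--" arguments as options; once it reaches the first positional argument, nothing after it is parsed as a flag. A rough sketch of that pattern (simplified, not the exact upstream loop):

// sketch only: leading-flag parsing, where options placed after the first
// positional argument are never recognized
#include <cstdio>
#include <cstring>

int main(int argc, char ** argv) {
    bool allow_requantize = false;

    int arg_idx = 1;
    for (; arg_idx < argc && strncmp(argv[arg_idx], "--", 2) == 0; arg_idx++) {
        if (strcmp(argv[arg_idx], "--allow-requantize") == 0) {
            allow_requantize = true;
        } else {
            fprintf(stderr, "unknown option: %s\n", argv[arg_idx]);
            return 1;
        }
    }

    // everything from argv[arg_idx] onward is positional (model path, type, nthreads),
    // so a flag appended after the model path is treated as a positional argument instead
    printf("allow_requantize = %d, first positional = %s\n",
           allow_requantize, arg_idx < argc ? argv[arg_idx] : "(none)");
    return 0;
}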

@slaren (Member) commented on Aug 27, 2023

While fixing this, can we also fix the destination path when using \ as the directory separator under Windows? Otherwise, it writes the file to the current directory.

diff --git a/examples/quantize/quantize.cpp b/examples/quantize/quantize.cpp
index 0e3a26b..df9a214 100644
--- a/examples/quantize/quantize.cpp
+++ b/examples/quantize/quantize.cpp
@@ -114,7 +114,7 @@ int main(int argc, char ** argv) {
     std::string ftype_str;
     if (try_parse_ftype(argv[arg_idx], params.ftype, ftype_str)) {
         std::string fpath;
-        const size_t pos = fname_inp.find_last_of('/');
+        const size_t pos = fname_inp.find_last_of("/\\");
         if (pos != std::string::npos) {
             fpath = fname_inp.substr(0, pos + 1);
         }
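
For illustration (a hedged sketch with a hypothetical path), the effect of matching both separators: with only '/' the directory prefix comes back empty on a Windows-style path, so the output file lands in the current working directory instead of next to the input model.

// sketch only: splitting off the directory prefix with one vs. both separators
#include <iostream>
#include <string>

static std::string dir_of(const std::string & fname, const std::string & seps) {
    const size_t pos = fname.find_last_of(seps);
    return pos == std::string::npos ? std::string() : fname.substr(0, pos + 1);
}

int main() {
    const std::string win_path = "C:\\models\\some-model.gguf";  // hypothetical input path
    std::cout << "'/' only  -> '" << dir_of(win_path, "/")   << "'\n";  // '' (empty, output goes to CWD)
    std::cout << "'/' or '\\' -> '" << dir_of(win_path, "/\\") << "'\n";  // 'C:\models\'
    return 0;
}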

@ghost commented on Aug 27, 2023

Try ./quantize --allow-requantize ~/Pygmalion-Vicuna.gguf 2 3 instead

I successfully converted, thank you.

@KerfuffleV2 (Collaborator) left a comment

Thanks for fixing this, I think you're right about it being an oversight. I reaaaallly hate the way the quantize tool handles arguments in general. I think it's better to be explicit than have weird magical behavior with positional arguments.

@KerfuffleV2 (Collaborator)

it seems that Vicuna/Pygmalion-Vicuna has a q6_k tensor, but conversion is disabled for it

It was probably quantized with --leave-output-tensor. Quantizing to q4_0 from the q6_k tensor there is probably going to noticeably hurt quality (fine if you were just testing the arguments, but I thought I'd mention that).

@ggerganov ggerganov merged commit ebcee20 into ggml-org:master Aug 28, 2023
akawrykow pushed a commit to akawrykow/llama.cpp that referenced this pull request Aug 29, 2023
* quantize : make output filename optional again

* quantize : fix path parsing on Windows

suggested by @slaren