Update Quant call using llama.cpp (#868)

Jack-Khuu · Jack-Khuu · commit 9bcab7fc6fdc · 2024-06-20T17:05:02.000-07:00
llama.cpp made a backward-compatibility-breaking refactor (ggml-org/llama.cpp@1c641e6) that renamed the quantize binary to llama-quantize, which broke some of our CI. This updates our CI workflow to call the binary by its new name.
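For scripts that may run against llama.cpp checkouts from either side of the rename, the following is a minimal sketch of a lookup that prefers the new binary name and falls back to the old one. The helper name pick_quantize_bin is hypothetical and is not part of this commit, which simply hard-codes the new name in the workflow.

```shell
# Hypothetical helper (not part of this commit): print the path of the
# quantize binary inside a llama.cpp build directory, preferring the
# post-rename name and falling back to the pre-rename one.
pick_quantize_bin() {
  dir="$1"
  for name in llama-quantize quantize; do
    # Accept only regular files that are executable.
    if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
      printf '%s\n' "$dir/$name"
      return 0
    fi
  done
  echo "no quantize binary found in $dir" >&2
  return 1
}

# Example use in a CI step (paths taken from the workflow in this diff):
#   QUANTIZE_BIN="$(pick_quantize_bin ./llama.cpp)"
#   "$QUANTIZE_BIN" --allow-requantize \
#     gguf_files/llama-2-7b.Q4_0.gguf \
#     gguf_files/llama-2-7b.Q4_0.requant_F32.gguf F32
```

Hard-coding the new name, as this commit does, is the simpler choice when CI always builds a recent llama.cpp; a fallback like the one sketched here is only worthwhile if older checkouts must keep working.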
diff --git a/.github/workflows/pull.yml b/.github/workflows/pull.yml
@@ -725,7 +725,7 @@ jobs:
         run: |
           mkdir gguf_files
           wget -O gguf_files/llama-2-7b.Q4_0.gguf "https://huggingface.co/TheBloke/Llama-2-7B-GGUF/resolve/main/llama-2-7b.Q4_0.gguf?download=true"
-          ./llama.cpp/quantize --allow-requantize gguf_files/llama-2-7b.Q4_0.gguf gguf_files/llama-2-7b.Q4_0.requant_F32.gguf F32
+          ./llama.cpp/llama-quantize --allow-requantize gguf_files/llama-2-7b.Q4_0.gguf gguf_files/llama-2-7b.Q4_0.requant_F32.gguf F32
 
       - name: Load files
         run: |