Releases: ggml-org/llama.cpp
b4384
server : fix missing model id in /model endpoint (#10957)
* server : fix missing model id in /model endpoint
* fix ci
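For context, a minimal sketch of reading a model id back from a running llama-server. It assumes a local server on port 8080 and the OpenAI-compatible /v1/models listing route; the exact route touched by this fix may differ (the PR title refers to the /model endpoint).

```python
import requests

# Minimal sketch: query a locally running llama-server for its model listing.
# Assumes localhost:8080 and the OpenAI-compatible /v1/models route; the exact
# endpoint referenced by the fix (/model) may differ from this assumption.
resp = requests.get("http://localhost:8080/v1/models", timeout=10)
resp.raise_for_status()

for model in resp.json().get("data", []):
    # After the fix, each entry should carry its model id.
    print(model.get("id"))
```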
b4383
server : add system_fingerprint to chat/completion (#10917)
* server : add system_fingerprint to chat/completion
* update README
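A minimal sketch of where the new field shows up, assuming a local llama-server on port 8080 serving the OpenAI-compatible /v1/chat/completions route; the field placement follows the OpenAI response schema.

```python
import requests

# Minimal sketch: issue a chat completion against a local llama-server and
# read back the system_fingerprint field added by this release. The server
# address and the /v1/chat/completions route are assumptions for illustration.
payload = {
    "messages": [{"role": "user", "content": "Say hello in one word."}],
    "max_tokens": 8,
}
resp = requests.post("http://localhost:8080/v1/chat/completions", json=payload, timeout=60)
resp.raise_for_status()

body = resp.json()
print(body.get("system_fingerprint"))           # identifies the build/backend configuration
print(body["choices"][0]["message"]["content"])  # the generated reply
```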
b4382
rpc-server : add support for the SYCL backend (#10934)
b4381
llama : support InfiniAI Megrez 3b (#10893)
* Support InfiniAI Megrez 3b
* Fix tokenizer_clean_spaces for megrez
b4380
llama : support for Llama-3_1-Nemotron-51B (#10669)
* conflict resolution
* move comments after bracket to its own line
b4379
llama-run : include temperature option (#10899)
This commit updates the `examples/run/README.md` file to include a new option for setting the temperature and updates the `run.cpp` file to parse this option.
Signed-off-by: Eric Curtin <[email protected]>
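A minimal usage sketch for the new option. The flag spelling (`--temp`) and the model argument are assumptions for illustration; consult `examples/run/README.md` for the exact option name and model syntax.

```python
import subprocess

# Minimal sketch: invoke llama-run with a sampling temperature.
# The --temp flag spelling and the model path are illustrative assumptions,
# not confirmed by the release note itself.
subprocess.run(
    ["llama-run", "--temp", "0.8", "my-model.gguf", "Write a one-line greeting"],
    check=True,
)
```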
b4378
ggml : fix run-time on FreeBSD in get_executable_path() (#10948)
b4376
llama : add Falcon3 support (#10883)
* Add Falcon3 model support
* Add fix for adding bos to added special tokens
* Add comment explaining the logic behind the if statement
* Add a log message to better track when the following line of code is triggered
* Update log to only print when input and output characters are different
* Fix handling pre-normalized tokens
* Refactoring
b4375
vulkan: build fixes for 32b (#10927)
* vulkan: build fixes for 32b (should fix #10923)
* vulkan: initialize some buffer/offset variables
b4372
ggml-cpu: replace NEON asm with intrinsics in ggml_gemv_q4_0_4x8_q8_0…