Releases: ggml-org/llama.cpp

b5576

02 Jun 17:42
c9bbc77
`server`: update deepseek reasoning format (pass reasoning_content as…

b5575

02 Jun 15:06
bfd3227
mtmd : fix memory leak in mtmd_helper_eval_chunk_single (#13961)

* mtmd : fix memory in mtmd_helper_eval_chunk_single

* mtmd-cli : fix mem leak

* Update tools/mtmd/mtmd-cli.cpp

Co-authored-by: Georgi Gerganov

---------

Co-authored-by: Georgi Gerganov

b5574

02 Jun 12:39
093e3f1
cmake : Handle mixed-case 'Power' strings in POWER CPU detection (#13…

b5573

02 Jun 09:48
663445b
sycl: quantize and reorder the input to q8_1 when reorder is enabled …

b5572

01 Jun 16:49
7675c55
gguf: fix failure on version == 0 (#13956)

b5571

01 Jun 16:43
5e1c3ae
convert : fix nomic-bert-moe mask token (#13757)

b5569

01 Jun 15:10
e57bb87
ggml: check if non-native endian model is being loaded (#13943)

* gguf: prevent non-native endian models from being loaded

Signed-off-by: Aaron Teo

* gguf: update error message

Signed-off-by: Aaron Teo

* gguf: make the non-native endian check more verbose

Signed-off-by: Aaron Teo

* ggml: move ggml_assert location

Signed-off-by: Aaron Teo

* ggml: reword the endianness check error message

Signed-off-by: Aaron Teo

---------

Signed-off-by: Aaron Teo

b5568

01 Jun 12:27
sync : ggml

ggml-ci

b5560

01 Jun 10:42
c046217
parallel : fix n_junk == 0 (#13952)

b5559

01 Jun 09:32
0fc16b4
kv-cache : split implementation in separate sources (#13920)

ggml-ci