Releases: ggml-org/llama.cpp

b5573

02 Jun 09:48
663445b
sycl: quantize and reorder the input to q8_1 when reorder is enabled …

b5572

01 Jun 16:49
7675c55
gguf: fix failure on version == 0 (#13956)

b5571

01 Jun 16:43
5e1c3ae
convert : fix nomic-bert-moe mask token (#13757)

b5569

01 Jun 15:10
e57bb87
ggml: check if non-native endian model is being loaded (#13943)

* gguf: prevent non-native endian models from being loaded
* gguf: update error message
* gguf: make the non-native endian check more verbose
* ggml: move ggml_assert location
* ggml: reword the endianness check error message

---------

Signed-off-by: Aaron Teo <[email protected]>

b5568

01 Jun 12:27
sync : ggml

ggml-ci

b5560

01 Jun 10:42
c046217
parallel : fix n_junk == 0 (#13952)

b5559

01 Jun 09:32
0fc16b4
kv-cache : split implementation in separate sources (#13920)

ggml-ci

b5558

31 May 23:57
053b153
threading: support for GGML_SCHED_PRIO_LOW, update thread info on Win…

b5556

31 May 15:55
e15898d
server: allow unclosed thinking tags (#13931)

b5555

31 May 13:47
803f8ba
llama : deprecate explicit kv_self defrag/update calls (#13921)

ggml-ci