Releases · ngxson/llama.cpp

02 Jun 19:06

3637576

b5579

server : disable speculative decoding for SWA models (#13970)

* server : use swa-full for draft context

ggml-ci

* server : disable speculative decoding for SWA models

02 Jun 18:11

github-actions

b5577

5582c49

gemma : more consistent attention scaling for v2 and v3 (#13951)

* gemma : fix attn scale for 27B

* cont : apply scale before attn

* cont : consistent attention scaling

02 Jun 17:46

github-actions

b5576

c9bbc77

`server`: update deepseek reasoning format (pass reasoning_content as…

02 Jun 14:49

github-actions

b5575

bfd3227

mtmd : fix memory leak in mtmd_helper_eval_chunk_single (#13961)

* mtmd : fix memory in mtmd_helper_eval_chunk_single

* mtmd-cli : fix mem leak

* Update tools/mtmd/mtmd-cli.cpp

Co-authored-by: Georgi Gerganov <[email protected]>

---------

Co-authored-by: Georgi Gerganov <[email protected]>

02 Jun 12:40

github-actions

b5574

093e3f1

cmake : Handle mixed-case 'Power' strings in POWER CPU detection (#13…

02 Jun 09:32

github-actions

b5573

663445b

sycl: quantize and reorder the input to q8_1 when reorder is enabled …

01 Jun 16:42

github-actions

b5572

7675c55

gguf: fix failure on version == 0 (#13956)

01 Jun 16:30

github-actions

b5571

5e1c3ae

convert : fix nomic-bert-moe mask token (#13757)

01 Jun 15:13

github-actions

b5569

e57bb87

ggml: check if non-native endian model is being loaded (#13943)

* gguf: prevent non-native endian models from being loaded

Signed-off-by: Aaron Teo <[email protected]>

* gguf: update error message

Signed-off-by: Aaron Teo <[email protected]>

* gguf: make the non-native endian check more verbose

Signed-off-by: Aaron Teo <[email protected]>

* ggml: move ggml_assert location

Signed-off-by: Aaron Teo <[email protected]>

* ggml: reword the endianness check error message

Signed-off-by: Aaron Teo <[email protected]>

---------

Signed-off-by: Aaron Teo <[email protected]>

01 Jun 11:44

github-actions

b5568

f3a4b16

sync : ggml

ggml-ci
