Releases: ngxson/llama.cpp

b5492

26 May 10:29
2222931
llama : clarify deprecation message (#13794)

b5490

26 May 04:31
fef693d
vulkan: mark IM2COL as supporting non-contig (#13783)

b5489

26 May 02:34
2d38b6e
CANN: Add the basic supports of Flash Attention kernel (#13627)

* cann: add the basic FA support

* cann: update the readme

* cann: update the FlashAttention with PSEShift

* cann: update the input parameters in FA

* cann: update the alibi with max_bias

* cann: add the constraints of softcap

* cann: update the docs CANN.md

* cann: update the docs CANN.md

* cann: fix typo of CANN.md

* cann: add some comments and update the CANN.md

* cann: update the CANN.md

* cann: update the inner precise for fusedInferAttention

* cann: update the constraints of flash_attn_ext on ggml-cann.cpp

* cann: clean the whitespace

* cann: clean the whitespace

* cann: add a new endline

b5488

25 May 23:51
e121edc
`server`: add `--reasoning-budget 0` to disable thinking (incl. qwen3…

b5486

25 May 14:46
aa50ba4
tests : improve UGM tokenizer test coverage (#13773)

b5484

25 May 12:58
c508256
rpc : Fix build on OpenBSD (#13541)

b5483

25 May 12:35
40aaa8a
mtmd : add support for Qwen2-Audio and SeaLLM-Audio (#13760)

* mtmd : add Qwen2-Audio support

* small clean up

* update discussion link

* clarify mtmd_get_output_embd

* clarification in multimodal.md

* fix ultravox bug

* ggml_cont

b5481

25 May 10:04
d785f9c
server: fix/test add_generation_prompt (#13770)

Co-authored-by: ochafik

b5480

25 May 08:54
4032ca4
llama : add support for Qwen3 MoE tied word embeddings (#13768)

b5479

25 May 07:35
515fdbf
SYCL: revert "sycl: simplify bin_bcast_kernel (#13383)" (#13752)

Temporarily reverted due to failing fp16 DIV operation

This reverts commit 02cdd2d8b092b5a4bb18e013c6887ce49ba20ac5.

ggml-ci