Releases: ggml-org/llama.cpp

b5529

29 May 07:00
1b8fb81
ggml: aarch64: Implement SVE F32 kernels for vector functions (#13843)

* F32-Mamba-SVE

* F32-Mamba-SVE

* Resolve test errors-1

* Resolve test errors-2

* F32-vec-SVE

* F32-vec-SVE

* F32-vec-SVE

b5527

28 May 21:04
763d06e
llama : fix KV shift for qwen2vl (#13870)

* llama : fix KV shift for qwen2vl

* add ref to the PR

b5526

28 May 20:58
1096133
mtmd : move helpers to dedicated library (⚠️ breaking change) (#13866)

* mtmd : move helpers to dedicated library

* fix server build

* rm leftover cmakelist code

b5524

28 May 17:22
e0e3aa2
llama : add support for BertForSequenceClassification reranker (#13858)

* convert: add support for BertForSequenceClassification

* add support for reranking using BertForSequenceClassification

* merge checks of eos and sep

* fix lint

---------

Co-authored-by: dinhhuy

b5522

28 May 14:55
c962ae3
server: fix remove 'image_url'/'input_audio' json-object effectively fo…

b5519

28 May 13:06
a682474
CUDA: fix FA tg at long context for CC >= 8.9 (#13852)

b5517

28 May 04:13
1e8659e
CANN: Add SOC TYPE printing in cmake configuration (#13837)

b5516

27 May 20:31
a3c3084
opencl: add new ops - `argsort`, `div`, `sub`, `addrows`, `sigmoid`, …

b5515

27 May 20:21
1701d4c
opencl: mark `mul_mat` `f32f32` as supporting non-contiguous tensors …

b5514

27 May 18:47
bef8176
vulkan: use timestamp queries for GGML_VULKAN_PERF (#13817)

Also change it to be controlled by an environment variable rather than a CMake flag.