Releases · ggml-org/llama.cpp
b5529
ggml: aarch64: Implement SVE F32 kernels for vector functions (#13843)
* F32-Mamba-SVE
* F32-Mamba-SVE
* Resolve test errors-1
* Resolve test errors-2
* F32-vec-SVE
* F32-vec-SVE
* F32-vec-SVE
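SVE kernels typically use predicated loops so one code path handles full vectors and the tail alike. Below is a minimal sketch of that pattern for an F32 axpy-style update; the function name and the exact operation are illustrative assumptions, not the kernels added in #13843.

```c
// Illustrative sketch of a predicated SVE F32 loop (not the PR's code).
// Build on aarch64 with SVE enabled, e.g. -march=armv8-a+sve.
#include <arm_sve.h>
#include <stddef.h>
#include <stdint.h>

// y[i] += a * x[i] for i in [0, n)
void vec_mad_f32(float * y, const float * x, float a, size_t n) {
    for (size_t i = 0; i < n; i += svcntw()) {
        // the predicate masks off lanes past n, so no scalar tail loop is needed
        svbool_t    pg = svwhilelt_b32_u64((uint64_t) i, (uint64_t) n);
        svfloat32_t vx = svld1_f32(pg, x + i);
        svfloat32_t vy = svld1_f32(pg, y + i);
        vy = svmla_n_f32_x(pg, vy, vx, a);   // vy + vx * a
        svst1_f32(pg, y + i, vy);
    }
}
```

Because the vector length is queried at run time via svcntw(), the same binary runs unchanged on hardware with 128-bit through 2048-bit SVE registers.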
b5527
llama : fix KV shift for qwen2vl (#13870)
* llama : fix KV shift for qwen2vl
* add ref to the PR
b5526
mtmd : move helpers to dedicated library (⚠️ breaking change) (#13866)
* mtmd : move helpers to dedicated library
* fix server build
* rm leftover cmakelist code
b5524
llama : add support for BertForSequenceClassification reranker (#13858)
* convert: add support for BertForSequenceClassification
* add support for reranking using BertForSequenceClassification
* merge checks of eos and sep
* fix lint

Co-authored-by: dinhhuy <[email protected]>
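With a BertForSequenceClassification model converted to GGUF, reranking is usually exercised through llama-server's rerank endpoint. The libcurl sketch below shows such a request, assuming a server instance is already serving the reranker on localhost:8080; the endpoint path, JSON field names, and model name are assumptions based on the server's existing Jina-style rerank convention, not details stated in this release note.

```c
// Hypothetical rerank request against a local llama-server (sketch, assumed API).
#include <curl/curl.h>
#include <stdio.h>

int main(void) {
    curl_global_init(CURL_GLOBAL_DEFAULT);
    CURL *curl = curl_easy_init();
    if (!curl) return 1;

    // Query plus candidate documents; the server is expected to return
    // a relevance score per document (field names are an assumption).
    const char *body =
        "{\"model\":\"reranker\","
        "\"query\":\"what is a panda?\","
        "\"documents\":[\"pandas are bears native to china\","
        "\"the capital of france is paris\"]}";

    struct curl_slist *hdrs = curl_slist_append(NULL, "Content-Type: application/json");
    curl_easy_setopt(curl, CURLOPT_URL, "http://localhost:8080/v1/rerank");
    curl_easy_setopt(curl, CURLOPT_HTTPHEADER, hdrs);
    curl_easy_setopt(curl, CURLOPT_POSTFIELDS, body);

    // The default write callback prints the JSON response to stdout.
    CURLcode res = curl_easy_perform(curl);
    if (res != CURLE_OK) {
        fprintf(stderr, "request failed: %s\n", curl_easy_strerror(res));
    }

    curl_slist_free_all(hdrs);
    curl_easy_cleanup(curl);
    curl_global_cleanup();
    return res == CURLE_OK ? 0 : 1;
}
```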
b5522
server: fix remove 'image_url'/'input_audio' json-object effectively fo…
b5519
CUDA: fix FA tg at long context for CC >= 8.9 (#13852)
b5517
CANN: Add SOC TYPE printing in cmake configuration (#13837)
b5516
opencl: add new ops - `argsort`, `div`, `sub`, `addrows`, `sigmoid`, …
b5515
opencl: mark `mul_mat` `f32f32` as supporting non-contiguous tensors …
b5514
vulkan: use timestamp queries for GGML_VULKAN_PERF (#13817)
Also change it to be controlled by an env var rather than cmake flag
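Moving the switch from a CMake flag to an environment variable means the perf reporting can be toggled per run without rebuilding. A minimal sketch of that kind of check, assuming the variable is read once at backend initialization (the helper name is illustrative):

```c
// Sketch of an env-var toggle; only the variable name comes from the release note.
#include <stdbool.h>
#include <stdlib.h>

// Returns true when GGML_VULKAN_PERF is set to a non-empty, non-"0" value.
static bool vk_perf_enabled(void) {
    const char * v = getenv("GGML_VULKAN_PERF");
    return v != NULL && v[0] != '\0' && v[0] != '0';
}
```

In use, running a binary with GGML_VULKAN_PERF=1 in the environment would enable the timestamp-query based reporting, while unset or "0" leaves it off.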