Releases · ggml-org/llama.cpp
b5513
cmake : add llama-cparams.cpp to build (#13832)
b5512
SYCL: add gelu_erf kernel (#13749)
* SYCL: add gelu_erf kernel
* refactor code
* Use scope_op_debug_print
Co-authored-by: Atharva Dubey <[email protected]>
b5510
ggml : add ggml_repeat_4d (#13824)
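The entry above adds a new public ggml operator. As a minimal sketch of how it might be used, assuming the signature mirrors ggml_new_tensor_4d (context, source tensor, then explicit ne0..ne3 target dimensions), the example below builds a graph node that repeats a 2D tensor into a larger 4D shape without the template tensor that plain ggml_repeat requires:

```cpp
// Minimal sketch, assuming ggml_repeat_4d(ctx, a, ne0, ne1, ne2, ne3) takes
// the target shape directly; verify against ggml.h before relying on it.
#include "ggml.h"

int main(void) {
    struct ggml_init_params params = {
        /*.mem_size   =*/ 16 * 1024 * 1024,
        /*.mem_buffer =*/ nullptr,
        /*.no_alloc   =*/ false,
    };
    struct ggml_context * ctx = ggml_init(params);

    // source tensor: 4 x 3
    struct ggml_tensor * a = ggml_new_tensor_2d(ctx, GGML_TYPE_F32, 4, 3);

    // repeat it out to an explicit 4D shape (8 x 6 x 2 x 1); each target
    // dimension is assumed to be a multiple of the source dimension
    struct ggml_tensor * b = ggml_repeat_4d(ctx, a, 8, 6, 2, 1);

    // record the op in a compute graph (no backend execution shown here)
    struct ggml_cgraph * gf = ggml_new_graph(ctx);
    ggml_build_forward_expand(gf, b);

    ggml_free(ctx);
    return 0;
}
```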
b5509
ggml : riscv: add xtheadvector support (#13720)
* ggml : riscv: add xtheadvector support
* ggml : clean up some macro usage
b5508
mtmd : support Qwen 2.5 Omni (input audio+vision, no audio output) (#…
b5506
ggml-cpu: x86 feature detection is specific to x86 (#13811)
b5505
ggml : allow CUDA graphs when using pipeline parallelism (#13814)
b5504
kv-cells : track min/max used cells and per-sequence positions (#13808)
* kv-cells : track min/max used cells and per-sequence positions
* kv-cells : fix pos-modification updates for seq_pos
* kv-cells : add comments
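The commit titles above describe bookkeeping inside the KV-cache cells. The sketch below illustrates the general idea rather than the actual llama.cpp internals: remember the lowest and highest used cell index so scans can be bounded, and keep a min/max position per sequence. All names here (cell_tracker, seq_range) are hypothetical:

```cpp
// Hedged sketch of min/max cell and per-sequence position tracking.
// Not the real llama.cpp data structure; removal/compaction is omitted.
#include <algorithm>
#include <cstdint>
#include <map>
#include <vector>

// per-sequence position range
struct seq_range {
    int64_t pos_min = INT64_MAX;
    int64_t pos_max = INT64_MIN;
};

class cell_tracker {
    std::vector<bool>            used;   // which cells currently hold data
    uint32_t                     i_min;  // lowest used cell index (valid if n_used > 0)
    uint32_t                     i_max;  // highest used cell index
    size_t                       n_used = 0;
    std::map<int32_t, seq_range> seqs;   // seq_id -> position range

public:
    explicit cell_tracker(uint32_t n) : used(n, false), i_min(n), i_max(0) {}

    void set(uint32_t i, int32_t seq_id, int64_t pos) {
        if (!used[i]) { used[i] = true; n_used++; }
        i_min = std::min(i_min, i);
        i_max = std::max(i_max, i);
        seq_range & r = seqs[seq_id];
        r.pos_min = std::min(r.pos_min, pos);
        r.pos_max = std::max(r.pos_max, pos);
    }

    // bound scans to [begin, end) instead of walking the whole buffer
    uint32_t scan_begin() const { return n_used ? i_min : 0; }
    uint32_t scan_end()   const { return n_used ? i_max + 1 : 0; }
};
```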
b5503
sampling : make sure samplers return at least 1 token (#13822)
* sampling : min-p should always return at least one token
* sampling : same for typical sampling
* tests : sampling tests use min_keep == 0
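The entries above tighten a sampler invariant: filtering must never leave an empty candidate set. A self-contained sketch of the min-p idea with that guarantee, not the actual llama.cpp sampler code, might look like this (token_prob and apply_min_p are illustrative names):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct token_prob { int id; float p; };

// keep tokens with p >= min_p * p_max, but never fewer than min_keep,
// and clamp min_keep to at least 1 so the result is never empty
void apply_min_p(std::vector<token_prob> & cand, float min_p, size_t min_keep) {
    if (cand.empty()) {
        return;
    }
    min_keep = std::max<size_t>(min_keep, 1);

    // sort by probability, highest first
    std::sort(cand.begin(), cand.end(),
              [](const token_prob & a, const token_prob & b) { return a.p > b.p; });

    const float threshold = cand.front().p * min_p;

    size_t k = 0;
    while (k < cand.size() && (cand[k].p >= threshold || k < min_keep)) {
        k++;
    }
    cand.resize(k);
}
```

With min_keep == 0 (as the new tests exercise), the clamp still guarantees the highest-probability token survives.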
b5502
llama : validate seq id batch input (#13809)
* llama : validate seq id batch input
* cont : fix the fix
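The fix above adds up-front validation of sequence ids in a batch. A hedged sketch of that kind of check, with illustrative names rather than the real llama.cpp entry point, might look like:

```cpp
#include <cstdint>
#include <cstdio>

// return false (instead of failing later) if any token references a
// sequence id outside [0, n_seq_max)
bool validate_seq_ids(const int32_t * seq_id, int32_t n_tokens, int32_t n_seq_max) {
    for (int32_t i = 0; i < n_tokens; i++) {
        if (seq_id[i] < 0 || seq_id[i] >= n_seq_max) {
            fprintf(stderr, "%s: invalid seq_id %d at token %d (n_seq_max = %d)\n",
                    __func__, seq_id[i], i, n_seq_max);
            return false;
        }
    }
    return true;
}
```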