Releases · ngxson/llama.cpp
b5591
vulkan: automatically deduce size of push constants (#13936)
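This change infers the size of the push-constant block instead of hard-coding a byte count per pipeline. A minimal sketch of the general technique in C++, deducing the `VkPushConstantRange` size from the struct type with `sizeof` (the struct and helper names here are illustrative, not the actual llama.cpp code):

```cpp
#include <vulkan/vulkan.h>
#include <cstdint>

// Hypothetical push-constant block for a compute shader.
struct PushConstants {
    uint32_t ne00, ne01;   // input dimensions
    uint32_t ne10, ne11;   // output dimensions
    float    scale;
};

// Deduce the push-constant range from the struct type instead of
// hard-coding a byte count that can drift out of sync with the shader.
template <typename T>
VkPushConstantRange make_push_constant_range(VkShaderStageFlags stages) {
    VkPushConstantRange range{};
    range.stageFlags = stages;
    range.offset     = 0;
    range.size       = sizeof(T);   // size deduced automatically
    return range;
}

// Usage when building a pipeline layout:
//   VkPushConstantRange pcr =
//       make_push_constant_range<PushConstants>(VK_SHADER_STAGE_COMPUTE_BIT);
```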
b5590
ggml-vulkan: adds support for op CONV_TRANSPOSE_1D (#13813)
* ggml-vulkan: adds op CONV_TRANSPOSE_1D
* test-backend-ops: adds more sophisticated tests for CONV_TRANSPOSE_1D
* Missing barrier added to shader. Number of additional tests reduced to 108.
* Fixes typo in variable name.
* Removes extra whitespaces.
* Adds int64->int32 casts to prevent possible warnings.
* Problem size reduced in tests to pass tests with llvmpipe.
* supports_op condition moved from unintended position
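For context, the op being added is ggml's 1-D transposed convolution. A minimal sketch of building such a node with the ggml graph API (the tensor shapes and the stride value are illustrative assumptions; ggml currently requires padding 0 and dilation 1 for this op):

```cpp
#include "ggml.h"

int main() {
    struct ggml_init_params params = {
        /*.mem_size   =*/ 16 * 1024 * 1024,
        /*.mem_buffer =*/ NULL,
        /*.no_alloc   =*/ false,
    };
    struct ggml_context * ctx = ggml_init(params);

    // Illustrative shapes: kernel is [K, C_out, C_in], input is [L, C_in].
    struct ggml_tensor * kernel = ggml_new_tensor_3d(ctx, GGML_TYPE_F32, 3, 4, 2);
    struct ggml_tensor * input  = ggml_new_tensor_2d(ctx, GGML_TYPE_F32, 8, 2);

    // stride s0 = 2, padding p0 = 0, dilation d0 = 1
    struct ggml_tensor * out = ggml_conv_transpose_1d(ctx, kernel, input, 2, 0, 1);

    struct ggml_cgraph * gf = ggml_new_graph(ctx);
    ggml_build_forward_expand(gf, out);
    // ... dispatch the graph on a backend (e.g. Vulkan) and read back `out`

    ggml_free(ctx);
    return 0;
}
```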
b5589
kv-cache : refactor the update/defrag mechanism (#13988)
* kv-cache : refactor update mechanism
* memory : improve status handling
* defrag : reset head + add comments
* cont : minor fixes
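Defragmentation here means compacting the occupied KV cells toward the start of the buffer so the free space becomes contiguous, after which the write head can be reset. A toy sketch of that idea (deliberately not the actual llama.cpp data structures, which also move the K/V tensor data):

```cpp
#include <vector>
#include <cstdint>

// Toy stand-in for a KV cache slot; seq_id < 0 marks a free cell.
struct kv_cell {
    int32_t seq_id = -1;
    bool is_free() const { return seq_id < 0; }
};

// Compact used cells toward the front so free space is contiguous,
// then return the new write-head position (first free cell).
static size_t defrag(std::vector<kv_cell> & cells) {
    size_t head = 0;
    for (size_t i = 0; i < cells.size(); ++i) {
        if (!cells[i].is_free()) {
            if (i != head) {
                cells[head] = cells[i];   // the real code also moves K/V data
                cells[i]    = kv_cell{};
            }
            ++head;
        }
    }
    return head;
}
```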
b5588
ci : remove cuda 11.7 releases, switch runner to windows 2022 (#13997)
b5587
releases : use dl backend for linux release, remove arm64 linux relea…
b5586
llama-graph : use ggml_repeat_4d (#13998)
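`ggml_repeat` broadcasts a tensor to the shape of a second "template" tensor, whereas `ggml_repeat_4d` takes the four target dimensions directly, so no throwaway shape tensor has to be allocated. A minimal sketch of the difference (the shapes are illustrative, and the exact call sites in llama-graph are assumed):

```cpp
#include "ggml.h"

// Broadcast a [4, 1] tensor to [4, 3] without allocating a template tensor.
struct ggml_tensor * repeat_rows(struct ggml_context * ctx, struct ggml_tensor * a) {
    // Before: a dummy tensor was needed just to describe the target shape:
    //   struct ggml_tensor * shape = ggml_new_tensor_2d(ctx, GGML_TYPE_F32, 4, 3);
    //   return ggml_repeat(ctx, a, shape);

    // After: pass the target dimensions directly.
    return ggml_repeat_4d(ctx, a, 4, 3, 1, 1);
}
```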
b5585
CUDA: fix FTZ in FA for Gemma 3 (#13991)
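FTZ (flush-to-zero) is a floating-point mode in which subnormal values are replaced by zero, and FA is the flash-attention path. A small host-side C++ illustration of what a subnormal is and what flushing it does; this is purely explanatory and unrelated to the actual CUDA kernel change:

```cpp
#include <cstdio>
#include <cmath>
#include <limits>

int main() {
    // Smallest positive subnormal float: ~1.4e-45.
    float denorm = std::numeric_limits<float>::denorm_min();

    // With FTZ enabled, hardware treats this value as 0.0f, so any
    // expression that relies on it being nonzero changes its result.
    float ftz = (std::fpclassify(denorm) == FP_SUBNORMAL) ? 0.0f : denorm;

    std::printf("subnormal: %g  after FTZ: %g\n", denorm, ftz);
    return 0;
}
```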
b5584
kv-cache : fix unified::seq_rm to work with seq_id < 0 (#13985)
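The convention in the llama.cpp KV-cache API is that a negative `seq_id` matches all sequences (and negative `p0`/`p1` denote an open-ended range); this fix makes the unified cache honor that. A hedged usage sketch, assuming the `llama_kv_self_seq_rm` entry point from `llama.h` at this point in the series:

```cpp
#include "llama.h"

// Remove cached tokens in positions [p0, p1) for every sequence:
// seq_id < 0 matches all sequences.
void clear_range_all_seqs(struct llama_context * ctx, llama_pos p0, llama_pos p1) {
    llama_kv_self_seq_rm(ctx, /*seq_id=*/ -1, p0, p1);
}
```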
b5581
opencl: add `backend_synchronize` (#13939)
* This is not needed by the normal use where the result is read using `tensor_get`, but it allows perf mode of `test-backend-ops` to properly measure performance.
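On an asynchronous backend, enqueueing a graph can return before the device finishes, so a synchronize is needed for a wall-clock timer to measure the actual compute; that is what the perf mode of `test-backend-ops` relies on. A hedged sketch using the generic ggml backend API (graph construction elided):

```cpp
#include "ggml-backend.h"
#include <chrono>

// Time one graph evaluation on a possibly-asynchronous backend.
double time_graph_ms(ggml_backend_t backend, struct ggml_cgraph * graph) {
    ggml_backend_synchronize(backend);                 // drain prior work

    auto t0 = std::chrono::steady_clock::now();
    ggml_backend_graph_compute(backend, graph);        // may only enqueue
    ggml_backend_synchronize(backend);                 // wait for completion
    auto t1 = std::chrono::steady_clock::now();

    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}
```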
b5580
OpenCL: Add concat, tsembd, upscale, tanh, pad and repeat (#13840)
* add concat, pad, repeat, tsembd, tanh, upscale
* small fixes
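These are existing ggml ops that the OpenCL backend now covers. A minimal sketch chaining a few of them with the ggml API (the shapes and pad amounts are illustrative assumptions):

```cpp
#include "ggml.h"

// Build a tiny graph using some of the newly supported ops.
struct ggml_tensor * build(struct ggml_context * ctx,
                           struct ggml_tensor * a,    // e.g. [8, 4]
                           struct ggml_tensor * b) {  // e.g. [8, 4]
    struct ggml_tensor * t = ggml_concat(ctx, a, b, /*dim=*/ 1); // -> [8, 8]
    t = ggml_tanh(ctx, t);                                       // elementwise
    t = ggml_pad(ctx, t, /*p0=*/ 2, /*p1=*/ 0, 0, 0);            // -> [10, 8]
    return t;
}
```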