Releases · ggml-org/llama.cpp
b5601
context : fix SWA-related warning for multiple sequences (#14045)
b5600
llama : support multiple classifier outputs and labels (#13940)
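With multiple classifier outputs, each sequence can yield one raw score per label. As a minimal sketch of consuming such scores, the snippet below maps them to labeled probabilities with a softmax; the scores and label names are hypothetical placeholders, not the output of a specific llama.cpp call:

```cpp
// Conceptual sketch: convert N raw classifier scores into labeled
// probabilities with a numerically stable softmax. Scores/labels are
// hypothetical placeholders.
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <string>
#include <vector>

int main() {
    std::vector<float>       scores = { 1.2f, -0.3f, 0.7f };              // hypothetical raw outputs
    std::vector<std::string> labels = { "positive", "negative", "neutral" }; // hypothetical labels

    // Subtract the max score before exponentiating for numerical stability.
    float max_score = scores[0];
    for (float s : scores) max_score = std::max(max_score, s);

    float sum = 0.0f;
    std::vector<float> probs(scores.size());
    for (size_t i = 0; i < scores.size(); ++i) {
        probs[i] = std::exp(scores[i] - max_score);
        sum += probs[i];
    }
    for (size_t i = 0; i < probs.size(); ++i) {
        printf("%-8s : %.3f\n", labels[i].c_str(), probs[i] / sum);
    }
    return 0;
}
```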
b5598
vulkan: Enable VK_KHR_cooperative_matrix extension for Intel Xe2 GPUs…
b5596
memory : migrate from llama_kv_cache to more generic llama_memory (#1…
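The direction here, roughly, is to decouple callers from one concrete KV-cache type so other memory implementations can sit behind the same API. A conceptual C++ sketch of such an abstraction; the names (memory_i, kv_cache_impl) are hypothetical, not llama.cpp's actual types:

```cpp
// Conceptual illustration only: a generic memory interface of which a
// KV cache is one implementation. Names are hypothetical.
#include <cstdint>
#include <memory>

struct memory_i {
    virtual ~memory_i() = default;
    // Drop all state for one sequence (e.g. when a slot is reused).
    virtual void seq_rm(int32_t seq_id) = 0;
    // Release everything.
    virtual void clear() = 0;
};

struct kv_cache_impl : memory_i {
    void seq_rm(int32_t /*seq_id*/) override { /* evict the sequence's cells */ }
    void clear() override { /* reset all cells */ }
};

// Callers depend on the interface, so recurrent or hybrid memory types
// can be swapped in without touching call sites.
std::unique_ptr<memory_i> make_memory() {
    return std::make_unique<kv_cache_impl>();
}
```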
b5595
llama : allow using mmap without PrefetchVirtualMemory, apply GGML_WI…
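This follows a common Win32 pattern: PrefetchVirtualMemory only exists on Windows 8 and later, so resolving it at runtime lets an mmap-based loader still run where it is absent. A hedged sketch of that pattern (assumes building against a Windows 8+ SDK so the types are declared; not llama.cpp's exact code):

```cpp
// Sketch: resolve PrefetchVirtualMemory at runtime so mmap-based loading
// still works on Windows versions that lack it. Error handling is
// reduced to the essentials.
#include <windows.h>
#include <cstdio>

typedef BOOL (WINAPI *PrefetchVirtualMemory_t)(
    HANDLE, ULONG_PTR, PWIN32_MEMORY_RANGE_ENTRY, ULONG);

void prefetch_if_available(void * addr, SIZE_T size) {
    HMODULE kernel32 = GetModuleHandleW(L"kernel32.dll");
    auto prefetch = (PrefetchVirtualMemory_t)
        GetProcAddress(kernel32, "PrefetchVirtualMemory");
    if (prefetch == nullptr) {
        // Older Windows: skip prefetching; the mapping itself still works.
        fprintf(stderr, "PrefetchVirtualMemory unavailable, skipping\n");
        return;
    }
    WIN32_MEMORY_RANGE_ENTRY range;
    range.VirtualAddress = addr;
    range.NumberOfBytes  = size;
    prefetch(GetCurrentProcess(), 1, &range, 0);
}
```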
b5593
vocab : warn about missing mask token (#14022)
b5592
context : fix pos_min initialization upon decode error (#14008) ggml-ci
b5591
vulkan: automatically deduce size of push constants (#13936)
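For context, the push-constant size appears in two places in the Vulkan API, and a mismatch with the block declared in the shader is an easy bug; deducing the size from the shader removes that hand-maintained constant. A minimal sketch of where the size goes (the block layout below is a hypothetical example, not one of ggml's shaders):

```cpp
// Illustrates the push-constant size that gets deduced: the size passed
// to the pipeline layout and to vkCmdPushConstants must match the block
// the shader declares. A snippet, not a complete Vulkan program.
#include <vulkan/vulkan.h>
#include <cstdint>

// Hypothetical push-constant block matching a compute shader's layout.
struct push_constants {
    uint32_t ne0;
    uint32_t ne1;
    float    scale;
};

VkPushConstantRange make_range() {
    VkPushConstantRange range{};
    range.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT;
    range.offset     = 0;
    range.size       = sizeof(push_constants);  // the size being deduced
    return range;
}

void push(VkCommandBuffer cmd, VkPipelineLayout layout, const push_constants & pc) {
    vkCmdPushConstants(cmd, layout, VK_SHADER_STAGE_COMPUTE_BIT,
                       0, sizeof(pc), &pc);
}
```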
b5590
ggml-vulkan: adds support for op CONV_TRANSPOSE_1D (#13813)
* ggml-vulkan: adds op CONV_TRANSPOSE_1D
* test-backend-ops: adds more sophisticated tests for CONV_TRANSPOSE_1D
* Missing barrier added to shader. Number of additional tests reduced to 108.
* Fixes typo in variable name.
* Removes extra whitespaces.
* Adds int64->int32 casts to prevent possible warnings.
* Problem size reduced in tests to pass tests with llvmpipe.
* supports_op condition moved from unintended position
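To make the new op concrete, here is a minimal sketch of building a CONV_TRANSPOSE_1D node through the public ggml API; the shapes and stride are illustrative, and tensor data is left uninitialized for brevity:

```cpp
// Sketch: build and compute a CONV_TRANSPOSE_1D node with the ggml API.
// Shapes are illustrative: kernel [K=3, C_out=8, C_in=4], input [L=16, C_in=4],
// stride 2. ggml's conv_transpose_1d currently expects p0 == 0 and d0 == 1.
#include "ggml.h"

int main() {
    ggml_init_params params = {
        /*.mem_size   =*/ 16 * 1024 * 1024,
        /*.mem_buffer =*/ nullptr,
        /*.no_alloc   =*/ false,
    };
    ggml_context * ctx = ggml_init(params);

    ggml_tensor * kernel = ggml_new_tensor_3d(ctx, GGML_TYPE_F32, 3, 8, 4);
    ggml_tensor * input  = ggml_new_tensor_2d(ctx, GGML_TYPE_F32, 16, 4);

    // Output length per channel is (L - 1) * s0 + K = 33.
    ggml_tensor * out = ggml_conv_transpose_1d(ctx, kernel, input,
                                               /*s0 =*/ 2, /*p0 =*/ 0, /*d0 =*/ 1);

    ggml_cgraph * gf = ggml_new_graph(ctx);
    ggml_build_forward_expand(gf, out);
    ggml_graph_compute_with_ctx(ctx, gf, /*n_threads =*/ 4);

    ggml_free(ctx);
    return 0;
}
```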
b5589
kv-cache : refactor the update/defrag mechanism (#13988)
* kv-cache : refactor update mechanism ggml-ci
* memory : improve status handling
* defrag : reset head + add comments ggml-ci
* cont : minor fixes ggml-ci
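As a rough mental model of the defrag pass (not llama.cpp's actual code, which also has to move the corresponding K/V tensor data on the backend), a compaction over a hypothetical cell array:

```cpp
// Conceptual sketch of KV-cache defragmentation: occupied cells are
// compacted toward the head so free space becomes one contiguous block.
// The cell type and layout are hypothetical, not llama.cpp's internals.
#include <cstdint>
#include <vector>

struct cell {
    int32_t pos    = -1;   // -1 marks a free cell
    int32_t seq_id = -1;
};

// Returns the new head (first free index) after compaction.
size_t defrag(std::vector<cell> & cells) {
    size_t head = 0;
    for (size_t i = 0; i < cells.size(); ++i) {
        if (cells[i].pos == -1) continue;   // skip holes
        if (i != head) {
            cells[head] = cells[i];         // move occupied cell down
            cells[i] = cell{};              // old slot becomes free
        }
        ++head;
    }
    return head;                            // "reset head" from the notes above
}
```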