Releases: ggml-org/llama.cpp
b4304
bug-fix: snprintf prints NULL in place of the last character (#10419)
* bug-fix: snprintf prints NULL in place of the last character. We need to give snprintf enough space to print the last character and the null character, so we allocate one extra byte and then ignore it when converting to std::string.
* add comment about extra null-term byte requirement
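The extra-byte requirement described above can be sketched in isolation. This is a minimal illustration, not the code from the PR: `format_fixed` is a hypothetical helper, and only the standard `std::snprintf` behaviour is assumed (it writes at most `size - 1` characters plus a terminating null).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper (not part of llama.cpp): format a double with a fixed
// number of decimals and return it as a std::string.
std::string format_fixed(double value, int precision) {
    // First pass: ask snprintf how many characters are needed.
    // The returned length does NOT include the terminating '\0'.
    const int len = std::snprintf(nullptr, 0, "%.*f", precision, value);
    if (len < 0) {
        return std::string(); // encoding error
    }

    // Allocate one extra byte for the null terminator. With a buffer of only
    // `len` bytes, snprintf would write len - 1 characters and put '\0' where
    // the last character belongs.
    std::vector<char> buf(static_cast<std::size_t>(len) + 1);
    std::snprintf(buf.data(), buf.size(), "%.*f", precision, value);

    // Ignore the extra byte when converting to std::string.
    return std::string(buf.data(), static_cast<std::size_t>(len));
}

// Small self-check showing that the last character survives.
void format_fixed_self_check() {
    assert(format_fixed(3.14159, 2) == "3.14");
}
```

Sizing the buffer as `len` instead of `len + 1` would be expected to reproduce the symptom in the title: the final character is replaced by the terminator.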
b4302
ggml: load all backends from a user-provided search path (#10699)
* feat: load all backends from a user-provided search path
* fix: Windows search path
* refactor: rename `ggml_backend_load_all_in_search_path` to `ggml_backend_load_all_from_path`
* refactor: rename `search_path` to `dir_path`
* fix: change `NULL` to `nullptr`
* fix: change `NULL` to `nullptr`
Co-authored-by: Diego Devesa
b4301
vulkan: request round-to-even for fp16 in im2col/rope_head (#10767)
Vulkan doesn't mandate a specific rounding mode, but the shader_float_controls feature allows a rounding mode to be requested if the implementation supports it.
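For readers unfamiliar with the rounding mode named above, the tie-breaking rule can be illustrated on the CPU. This is only an analogy under stated assumptions: it rounds doubles to integers with the standard `<cfenv>` facilities rather than rounding to fp16 in a Vulkan shader, and `round_half_even` is a hypothetical helper name.

```cpp
#include <cfenv>
#include <cmath>

// Hypothetical helper: round to the nearest integer, breaking exact ties
// toward the even neighbour (round-to-nearest-even).
double round_half_even(double x) {
    const int old_mode = std::fegetround(); // remember the caller's mode
    std::fesetround(FE_TONEAREST);          // request round-to-nearest-even
    const double r = std::nearbyint(x);     // rounds using the current mode
    std::fesetround(old_mode);              // restore the previous mode
    return r;
}
```

Under this rule 0.5 rounds to 0, 1.5 and 2.5 both round to 2, and 3.5 rounds to 4, so ties do not drift consistently in one direction. The same rule applies to mantissa bits when a value is narrowed to fp16.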
b4300
vulkan: dynamic subgroup size for the remaining k quants (#10745)
* q5_k q4_k q3_k q2_k q6_k multi row example
* revert as multi row isn't faster for k quants
b4299
imatrix : Add imatrix to --no-context-shift (#10766)
This allows setting --no-context-shift in llama-imatrix, which is required for models like DeepSeek.
b4298
CUDA: rename macros to avoid conflicts with WinAPI (#10736)
* Renames NVIDIA GPU-architecture flags to avoid name clashes with WinAPI. (e.g. CC_PASCAL, GPU architecture or WinAPI pascal compiler flag?)
* Reverts erroneous rename in SYCL-code.
* Renames GGML_CUDA_MIN_CC_DP4A to GGML_CUDA_CC_DP4A.
* Renames the rest of the compute capability macros for consistency.
b4297
server : add flag to disable the web-ui (#10762) (#10751)
Co-authored-by: eugenio.segala
b4296
vulkan: disable spirv-opt for coopmat shaders (#10763)
There are some bugs in the 1.3.296 SDK, so disable this. It isn't strictly necessary anyway.
Add missing dependency on vulkan-shaders-gen, so shaders get recompiled when it changes.
Fix coopmat support reporting when glslc doesn't support NV_coopmat2.
b4295
CUDA: fix shared memory access condition for mmv (#10740)
b4293
vulkan: fix compile warnings (#10731)