Releases: ggml-org/llama.cpp

b4304

11 Dec 14:32
484d2f3
bug-fix: snprintf prints NULL in place of the last character (#10419)

* bug-fix: snprintf prints NULL in place of the last character

We need to give snprintf enough space to print the last character and the null character, thus we allocate one extra byte and then ignore it when converting to std::string.

* add comment about extra null-term byte requirement
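The off-by-one described above can be sketched as follows. This is a minimal illustration, not the code changed in #10419: `format_int` is a hypothetical helper, and the `"%d"` format is chosen only for the example. `snprintf` writes at most `size - 1` characters and always reserves the final byte for the null terminator, so a buffer sized to exactly the output length loses its last character.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper for illustration only (not a llama.cpp function).
static std::string format_int(int value) {
    // First pass: a null buffer with size 0 makes snprintf report how many
    // characters the output needs, excluding the terminating null.
    const int n = std::snprintf(nullptr, 0, "%d", value);
    if (n < 0) {
        return std::string(); // encoding error
    }

    // Allocate n + 1 bytes: n characters plus the null terminator.
    // With only n bytes, snprintf would overwrite the last character with '\0'.
    std::vector<char> buf(static_cast<std::size_t>(n) + 1);
    std::snprintf(buf.data(), buf.size(), "%d", value);

    // Ignore the extra null byte when converting to std::string.
    return std::string(buf.data(), static_cast<std::size_t>(n));
}
```

Calling `format_int(12345)` yields a five-character string; sizing the buffer to `n` instead of `n + 1` would truncate it to `1234`.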

b4302

11 Dec 01:37
43041d2
ggml: load all backends from a user-provided search path (#10699)

* feat: load all backends from a user-provided search path

* fix: Windows search path

* refactor: rename `ggml_backend_load_all_in_search_path` to `ggml_backend_load_all_from_path`

* refactor: rename `search_path` to `dir_path`

* fix: change `NULL` to `nullptr`

Co-authored-by: Diego Devesa <[email protected]>


b4301

10 Dec 21:08
b685daf
vulkan: request round-to-even for fp16 in im2col/rope_head (#10767)

Vulkan doesn't mandate a specific rounding mode, but the shader_float_controls
feature allows rounding mode to be requested if the implementation supports it.

b4300

10 Dec 20:19
dafae66
vulkan: dynamic subgroup size for the remaining k quants (#10745)

* q5_k

* q4_k

* q3_k

* q2_k

* q6_k multi row example

* revert as multi row isn't faster for k quants

b4299

10 Dec 18:58
ae4b922
imatrix : Add --no-context-shift to imatrix (#10766)

This allows setting the --no-context-shift value in llama-imatrix, which is required for models like DeepSeek.

b4298

10 Dec 18:51
750cb3e
CUDA: rename macros to avoid conflicts with WinAPI (#10736)

* Renames NVIDIA GPU-architecture flags to avoid name clashes with WinAPI (e.g. is CC_PASCAL a GPU architecture or the WinAPI pascal compiler flag?).

* Reverts erroneous rename in SYCL code.

* Renames GGML_CUDA_MIN_CC_DP4A to GGML_CUDA_CC_DP4A.

* Renames the rest of the compute capability macros for consistency.

b4297

10 Dec 18:36
a86ad84
server : add flag to disable the web-ui (#10762) (#10751)

Co-authored-by: eugenio.segala <[email protected]>

b4296

10 Dec 18:23
a05e2af
vulkan: disable spirv-opt for coopmat shaders (#10763)

There are some bugs in the 1.3.296 SDK, so disable spirv-opt for coopmat shaders. It isn't strictly
necessary anyway.

Add missing dependency on vulkan-shaders-gen, so shaders get recompiled when it
changes.

Fix coopmat support reporting when glslc doesn't support NV_coopmat2.

b4295

09 Dec 19:50
26a8406
CUDA: fix shared memory access condition for mmv (#10740)

b4293

09 Dec 08:09
3d98b4c
vulkan: fix compile warnings (#10731)