Releases · ggml-org/llama.cpp

11 Dec 14:32

484d2f3

b4304

bug-fix: snprintf prints NULL in place of the last character (#10419)

* bug-fix: snprintf prints NULL in place of the last character

snprintf needs enough space to print both the last character and the terminating null character, so we allocate one extra byte and then ignore it when converting to std::string.

* add comment about extra null-term byte requirement
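The extra-byte requirement described above can be sketched as follows. This is a minimal illustration, not the code from the commit; `format_double` is a hypothetical helper name introduced only for this example.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Format a double into a std::string via snprintf.
// NOTE: format_double is a hypothetical helper, not a llama.cpp function.
static std::string format_double(const char * fmt, double value) {
    // With a null buffer and size 0, snprintf returns the number of
    // characters it would write, excluding the terminating null.
    const int len = std::snprintf(nullptr, 0, fmt, value);
    if (len < 0) {
        return std::string();
    }
    // snprintf writes at most size - 1 characters plus the null terminator,
    // so a buffer of exactly len bytes would lose the last character.
    std::vector<char> buf(static_cast<size_t>(len) + 1);
    std::snprintf(buf.data(), buf.size(), fmt, value);
    // Ignore the extra null byte when converting to std::string.
    return std::string(buf.data(), static_cast<size_t>(len));
}
```

Sizing the buffer as `len` instead of `len + 1` would make snprintf overwrite the final character with the null terminator, which matches the symptom in the title.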

11 Dec 01:37

github-actions

b4302

43041d2

ggml: load all backends from a user-provided search path (#10699)

* feat: load all backends from a user-provided search path

* fix: Windows search path

* refactor: rename `ggml_backend_load_all_in_search_path` to `ggml_backend_load_all_from_path`

* refactor: rename `search_path` to `dir_path`

* fix: change `NULL` to `nullptr`

---------

Co-authored-by: Diego Devesa <[email protected]>

10 Dec 21:08

github-actions

b4301

b685daf

vulkan: request round-to-even for fp16 in im2col/rope_head (#10767)

Vulkan doesn't mandate a specific rounding mode, but the shader_float_controls
feature allows rounding mode to be requested if the implementation supports it.
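To show what round-to-even means for fp16, here is a CPU-side sketch that rounds a value to the 11 significant bits of a half-precision float, with ties going to the even mantissa. It is an illustration only: `round_to_fp16_precision` is a hypothetical helper, it ignores overflow, subnormals, and NaN, and it is unrelated to how the Vulkan shaders request the mode.

```cpp
#include <cfenv>
#include <cmath>

// Round x to 11 significant bits (fp16 precision) using round-to-nearest-even.
// NOTE: hypothetical helper; overflow, subnormals and NaN are not handled.
static double round_to_fp16_precision(double x) {
    int exponent = 0;
    // x == mantissa * 2^exponent, with |mantissa| in [0.5, 1) or mantissa == 0.
    const double mantissa = std::frexp(x, &exponent);
    // Scale so that the 11 kept bits form the integer part.
    const double scaled = std::ldexp(mantissa, 11);
    // Under FE_TONEAREST, nearbyint rounds ties to the even integer.
    const int saved_mode = std::fegetround();
    std::fesetround(FE_TONEAREST);
    const double rounded = std::nearbyint(scaled);
    std::fesetround(saved_mode);
    return std::ldexp(rounded, exponent - 11);
}
```

For example, fp16 values between 2048 and 4096 are spaced 2 apart, so 2049 is an exact tie between 2048 and 2050. Round-to-even picks 2048, while 2051 goes to 2052.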

10 Dec 20:19

github-actions

b4300

dafae66

vulkan: dynamic subgroup size for the remaining k quants (#10745)

* q5_k

* q4_k

* q3_k

* q2_k

* q6_k multi row example

* revert as multi row isn't faster for k quants

10 Dec 18:58

github-actions

b4299

ae4b922

imatrix : Add imatrix to --no-context-shift (#10766)

This allows the --no-context-shift option to be set in llama-imatrix, which is required for models like DeepSeek.

10 Dec 18:51

github-actions

b4298

750cb3e

CUDA: rename macros to avoid conflicts with WinAPI (#10736)

* Renames NVIDIA GPU-architecture flags to avoid name clashes with WinAPI (e.g. CC_PASCAL: is it the GPU architecture or the WinAPI pascal compiler flag?).

* Reverts an erroneous rename in the SYCL code.

* Renames GGML_CUDA_MIN_CC_DP4A to GGML_CUDA_CC_DP4A.

* Renames the rest of the compute capability macros for consistency.
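The reason for prefixing the macros can be sketched as below. `mock_callconv` is a hypothetical stand-in for a platform header that already uses the short name. Apart from `GGML_CUDA_CC_DP4A`, the macro names, all numeric values, and the helper `supports_dp4a` are assumptions made for this illustration.

```cpp
// Stand-in for a system header that declares an identifier named CC_PASCAL.
// NOTE: mock_callconv is hypothetical and only simulates the clash.
enum mock_callconv { CC_CDECL = 1, CC_PASCAL = 2 };

// Project-prefixed compute capability macros.
// NOTE: the values, and the PASCAL/VOLTA names, are assumed for illustration.
#define GGML_CUDA_CC_PASCAL 600
#define GGML_CUDA_CC_DP4A   610
#define GGML_CUDA_CC_VOLTA  700

// An unprefixed `#define CC_PASCAL 600` would make the preprocessor rewrite
// every later use of the enumerator above. The prefix avoids that.
// NOTE: supports_dp4a is a hypothetical helper.
static bool supports_dp4a(int compute_capability) {
    return compute_capability >= GGML_CUDA_CC_DP4A;
}
```

Because macros ignore scopes and namespaces, a unique prefix is the only reliable way to keep them from colliding with identifiers in headers the project does not control.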

10 Dec 18:36

github-actions

b4297

a86ad84

server : add flag to disable the web-ui (#10762) (#10751)

Co-authored-by: eugenio.segala <[email protected]>

10 Dec 18:23

github-actions

b4296

a05e2af

vulkan: disable spirv-opt for coopmat shaders (#10763)

There are some bugs in the 1.3.296 SDK, so disable this. It isn't strictly
necessary anyway.

Add missing dependency on vulkan-shaders-gen, so shaders get recompiled when it
changes.

Fix coopmat support reporting when glslc doesn't support NV_coopmat2.

09 Dec 19:50

github-actions

b4295

26a8406

CUDA: fix shared memory access condition for mmv (#10740)

09 Dec 08:09

github-actions

b4293

3d98b4c

vulkan: fix compile warnings (#10731)
