Releases: ggml-org/llama.cpp

b4280

07 Dec 10:05
3df784b
Vulkan: VK_KHR_cooperative_matrix support to speed up prompt processi…

b4279

07 Dec 08:33
86a1934
metal : Extend how Llama.cpp locates metal resources (#10676)

* metal : Extend how Llama.cpp locates metal resources (#10675)

  * It also searches for the resource file in the directory where the
    current binary is located.
  * It resolves symbolic links.

Rationale:

When we plug this dependency into a Bazel build and run it in the
context of Bazel (e.g. testing):

  * the execution directory is often very different from where the files
    are located, and there is no direct control over this (Bazel
    sandboxing);
  * the Bazel sandbox often uses symbolic links to make files available.

With this patch, we can add the resource file to the target and build
and run tests in the context of Bazel.

* Update ggml/src/ggml-metal/ggml-metal.m

Co-authored-by: Georgi Gerganov <[email protected]>

* Update ggml/src/ggml-metal/ggml-metal.m

Co-authored-by: Georgi Gerganov <[email protected]>

---------

Co-authored-by: Georgi Gerganov <[email protected]>
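The commit above describes the lookup strategy (search next to the binary, resolve symbolic links) without showing it. A minimal sketch of that idea in portable C++; the real code in ggml-metal.m is Objective-C, and `resource_near_binary` is an invented helper name, not the actual function:

```cpp
// Sketch (assumption, not the actual llama.cpp implementation):
// locate a resource file in the directory of the current binary,
// resolving any symbolic links first (e.g. those created by the
// Bazel sandbox).
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Returns the path of `name` next to `exe_path`, with symlinks in
// `exe_path` resolved; returns an empty path on failure.
fs::path resource_near_binary(const fs::path & exe_path, const std::string & name) {
    std::error_code ec;
    // canonical() follows symbolic links and normalizes the path,
    // so a sandboxed symlink to the binary resolves to its real home.
    fs::path resolved = fs::canonical(exe_path, ec);
    if (ec) {
        return {};
    }
    return resolved.parent_path() / name;
}
```

With this kind of fallback, the resource is found even when the process working directory (as under Bazel) has no relation to the install layout.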

b4276

06 Dec 13:10
f162d45
common : bring back --no-warmup to server (#10686)

b4273

05 Dec 19:59
c9c6e01
vulkan: Add VK_NV_cooperative_matrix2 support for mul_mat and flash a…

b4272

05 Dec 19:25
6fe6247
llama : add Minerva 7B model support (#10673)

* Support for Minerva 7B

* Update convert_hf_to_gguf_update.py

b4271

05 Dec 12:12
0cd182e
sync : ggml

b4267

04 Dec 23:16
f112d19
Update deprecation-warning.cpp (#10619)

Fixed Path Separator Handling for Cross-Platform Support (Windows File Systems)
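The patch itself is not shown here; a minimal sketch of the kind of fix the message describes, accepting both `/` and `\` as path separators (the helper name `basename_any_sep` is hypothetical, not from the actual deprecation-warning.cpp):

```cpp
// Sketch (assumption): extract the basename of an executable path in
// a way that works with both POSIX '/' and Windows '\\' separators.
#include <string>

std::string basename_any_sep(const std::string & path) {
    // find_last_of scans for any character in the set, so
    // "C:\\tools\\main.exe" and "/usr/bin/main" are both handled.
    const size_t pos = path.find_last_of("/\\");
    return pos == std::string::npos ? path : path.substr(pos + 1);
}
```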

b4266

04 Dec 21:22
1da7b76
server : fix speculative decoding with context shift (#10641)

* server : fix speculative decoding with context shift

ggml-ci

* server : take into account speculative limits

ggml-ci

* server : add tests

b4265

04 Dec 14:49
59f4db1
ggml : add predefined list of CPU backend variants to build (#10626)

* ggml : add predefined list of CPU backend variants to build

* update CPU dockerfiles

b4262

04 Dec 10:25
8d0cfd5
llama: Support MiniCPM-1B (with & w/o longrope) (#10559)