Releases · ggml-org/llama.cpp
b5585
CUDA: fix FTZ in FA for Gemma 3 (#13991)
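For readers unfamiliar with the acronyms: FTZ is flush-to-zero, a floating-point mode in which subnormal values are treated as zero, and FA is flash attention. The sketch below is a host-side C++ illustration of the general effect, not the CUDA fix itself: flushing a subnormal changes downstream results, such as logarithms or exponentials in a softmax path.

```cpp
#include <cstdio>
#include <cmath>
#include <limits>

// Subnormals are the tiny values between 0 and FLT_MIN. Under a
// flush-to-zero (FTZ) regime they become exactly 0, so any expression
// that depends on them changes value downstream.
int main() {
    float sub = std::numeric_limits<float>::denorm_min(); // smallest subnormal
    float ftz = 0.0f;                                     // what FTZ yields instead
    std::printf("subnormal: %g  log(subnormal): %g\n", sub, std::log(sub));
    std::printf("flushed:   %g  log(flushed):   %g\n", ftz, std::log(ftz)); // -inf
}
```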
b5584
kv-cache : fix unified::seq_rm to work with seq_id < 0 (#13985) ggml-ci
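For context, the KV-cache sequence-removal API treats a negative seq_id as "match any sequence". Below is a minimal usage sketch, assuming the `llama_kv_self_seq_rm` entry point from `llama.h` around these builds; the helper name is illustrative.

```cpp
#include "llama.h"

// Drop every cached token at position >= keep_up_to, across *all* sequences,
// by passing seq_id = -1, which is the seq_id < 0 case this release fixes
// for the unified cache. p1 < 0 means "up to the end of the cache".
static void drop_tail(llama_context * ctx, llama_pos keep_up_to) {
    llama_kv_self_seq_rm(ctx, /*seq_id=*/-1, /*p0=*/keep_up_to, /*p1=*/-1);
}
```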
b5581
opencl: add `backend_synchronize` (#13939)
* Not needed for normal use, where the result is read back with `tensor_get`, but it lets the perf mode of `test-backend-ops` measure performance properly.
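A rough sketch of why the synchronize hook matters for benchmarking: a wall-clock measurement is only valid once the queued device work has drained. This uses the public `ggml_backend_graph_compute` and `ggml_backend_synchronize` wrappers from `ggml-backend.h`; the timing loop itself is illustrative, not the actual `test-backend-ops` code.

```cpp
#include <chrono>
#include "ggml-backend.h"

// Time one graph evaluation on a backend. Without the synchronize call,
// the clock could stop while work is still queued on the device, making
// the measurement meaningless for asynchronous backends such as OpenCL.
static double time_graph(ggml_backend_t backend, ggml_cgraph * graph) {
    const auto t0 = std::chrono::high_resolution_clock::now();
    ggml_backend_graph_compute(backend, graph);
    ggml_backend_synchronize(backend); // wait for queued work to finish
    const auto t1 = std::chrono::high_resolution_clock::now();
    return std::chrono::duration<double>(t1 - t0).count();
}
```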
b5580
OpenCL: Add concat, tsembd, upscale, tanh, pad and repeat (#13840)
* add concat, pad, repeat, tsembd, tanh, upscale
* small fixes
b5579
server : disable speculative decoding for SWA models (#13970)
* server : use swa-full for draft context ggml-ci
* server : disable speculative decoding for SWA models
b5578
metal : use F32 accumulators in FA kernels (#13975) ggml-ci
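As background on why accumulator width matters in reductions like flash attention: once a low-precision running sum grows large, small addends are rounded away. The sketch below shows the effect with float vs. double accumulators, as an analogy for the F16 vs. F32 choice in the Metal kernels.

```cpp
#include <cstdio>

// Sum 1e7 copies of 0.1. Near 1e6 a float can no longer represent
// increments of 0.1 accurately, so the float accumulator drifts, while
// the wider accumulator stays close to the true value. The same failure
// mode, at a smaller scale, motivates F32 accumulators in F16 attention
// kernels.
int main() {
    float  acc32 = 0.0f;
    double acc64 = 0.0;
    for (int i = 0; i < 10000000; ++i) {
        acc32 += 0.1f;
        acc64 += 0.1f;
    }
    std::printf("float  accumulator: %f\n", acc32); // noticeably off from 1e6
    std::printf("double accumulator: %f\n", acc64); // ~1000000.0
}
```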
b5577
gemma : more consistent attention scaling for v2 and v3 (#13951)
* gemma : fix attn scale for 27B
* cont : apply scale before attn
* cont : consistent attention scaling
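The "apply scale before attn" item refers to where the attention scale is multiplied in: into Q before the Q·Kᵀ product, rather than into the resulting logits. Here is a minimal single-query sketch of the two orderings, with hypothetical names (Gemma uses a model-specific scale rather than the usual 1/sqrt(head_dim)).

```cpp
#include <cstddef>
#include <vector>

// Compute attention logits for one query against a set of keys, applying
// the scale either to q up front or to each logit afterwards. The two
// orderings are mathematically equivalent; consistency matters so that
// every code path rounds the same way.
static std::vector<float> attn_logits(const std::vector<float> & q,
                                      const std::vector<std::vector<float>> & keys,
                                      float scale, bool scale_q_first) {
    std::vector<float> logits;
    logits.reserve(keys.size());
    for (const auto & k : keys) {
        float dot = 0.0f;
        for (size_t i = 0; i < q.size(); ++i) {
            dot += (scale_q_first ? q[i] * scale : q[i]) * k[i];
        }
        logits.push_back(scale_q_first ? dot : dot * scale);
    }
    return logits;
}
```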
b5576
`server`: update deepseek reasoning format (pass reasoning_content as…
b5575
mtmd : fix memory leak in mtmd_helper_eval_chunk_single (#13961)
* mtmd : fix memory leak in mtmd_helper_eval_chunk_single
* mtmd-cli : fix mem leak
* Update tools/mtmd/mtmd-cli.cpp
Co-authored-by: Georgi Gerganov <[email protected]>
b5574
cmake : Handle mixed-case 'Power' strings in POWER CPU detection (#13…