Releases: ggml-org/llama.cpp

b5588 · 04 Jun 14:21 · 2589ad3
ci : remove cuda 11.7 releases, switch runner to windows 2022 (#13997)

b5587 · 04 Jun 11:53 · 4825487
releases : use dl backend for linux release, remove arm64 linux relea…

b5586 · 04 Jun 08:27 · 3ac6753
llama-graph : use ggml_repeat_4d (#13998)

b5585 · 04 Jun 07:50 · 0b4be4c
CUDA: fix FTZ in FA for Gemma 3 (#13991)

b5584 · 04 Jun 07:45 · e0e806f
kv-cache : fix unified::seq_rm to work with seq_id < 0 (#13985)

ggml-ci
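
For context, negative IDs act as wildcards in the KV-cache removal API. A minimal sketch below shows that convention, assuming the long-standing `llama_kv_cache_seq_rm` helper from `llama.h` (newer builds may expose the same call under a different name, so check the header for your build):

```cpp
// Hedged sketch of the "negative means all" convention this fix targets:
// seq_id < 0 matches every sequence, and p0/p1 < 0 give an open-ended
// position range. Exact function name is an assumption; verify in llama.h.
#include "llama.h"

void clear_prefix(llama_context * ctx) {
    // remove positions [0, 32) from every sequence in the cache
    llama_kv_cache_seq_rm(ctx, /*seq_id=*/-1, /*p0=*/0, /*p1=*/32);

    // remove everything: all sequences, all positions
    llama_kv_cache_seq_rm(ctx, -1, -1, -1);
}
```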

b5581 · 03 Jun 00:49 · 71e74a3
opencl: add `backend_synchronize` (#13939)

* This is not needed in normal use, where the result is read back
  using `tensor_get`, but it allows the perf mode of `test-backend-ops`
  to properly measure performance.
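
As a rough illustration of what a backend-level synchronize buys here, the sketch below blocks until all queued OpenCL work has finished, so a benchmark measures kernel execution time rather than just enqueue time. The struct and function names are illustrative, not the actual ggml-opencl identifiers:

```cpp
// Illustrative sketch only: drain the command queue so timing stops after
// the kernels have actually run. Names are invented for this example.
#include <CL/cl.h>

struct opencl_backend_ctx {
    cl_command_queue queue;   // queue all kernels are enqueued on
};

static void opencl_backend_synchronize(opencl_backend_ctx * ctx) {
    // clFinish blocks the host until every command in the queue completes
    clFinish(ctx->queue);
}
```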

b5580 · 03 Jun 00:38 · bfb1e01
OpenCL: Add concat, tsembd, upscale, tanh, pad and repeat (#13840)

* add concat, pad, repeat, tsembd, tanh, upscale

* small fixes

b5579 · 02 Jun 19:30 · 3637576
server : disable speculative decoding for SWA models (#13970)

* server : use swa-full for draft context

ggml-ci

* server : disable speculative decoding for SWA models
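
The shape of the change is roughly the guard sketched below; the field and parameter names are made up for illustration and are not the actual llama.cpp server identifiers:

```cpp
// Illustrative sketch: if the model uses sliding-window attention and the
// full KV cache is not kept, turn drafting off rather than risk a cache
// rollback past positions the sliding window has already dropped.
struct server_settings {
    bool swa_full    = false; // keep the full KV cache despite SWA
    int  n_draft_max = 16;    // max tokens to draft per decoding step
};

void maybe_disable_speculative(server_settings & s, bool model_uses_swa) {
    if (model_uses_swa && !s.swa_full) {
        // speculative decoding must remove rejected draft tokens from the
        // target cache; that is not generally possible once the window has
        // slid, so drafting is disabled instead
        s.n_draft_max = 0;
    }
}
```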

b5578 · 02 Jun 19:26 · ea394d7
metal : use F32 accumulators in FA kernels (#13975)

ggml-ci

b5577 · 02 Jun 18:16 · 5582c49
gemma : more consistent attention scaling for v2 and v3 (#13951)

* gemma : fix attn scale for 27B

* cont : apply scale before attn

* cont : consistent attention scaling
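
To make "apply scale before attn" concrete, here is a minimal sketch using the public ggml ops: the queries are scaled once, ahead of the Q·Kᵀ matmul, instead of scaling the KQ product afterwards. The helper name and the scale argument are illustrative; Gemma's actual scale comes from the model hyperparameters and is not necessarily 1/sqrt(head_dim):

```cpp
// Minimal sketch, assuming a single per-query scale. Not the actual
// llama-graph code; only ggml_scale and ggml_mul_mat are real API calls.
#include "ggml.h"

static ggml_tensor * attn_scores(
        ggml_context * ctx,
        ggml_tensor  * q,        // [head_dim, n_tokens, n_head]
        ggml_tensor  * k,        // [head_dim, n_kv,     n_head]
        float          q_scale) {
    q = ggml_scale(ctx, q, q_scale);   // pre-scale the queries
    return ggml_mul_mat(ctx, k, q);    // KQ^T -> [n_kv, n_tokens, n_head]
}
```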