Releases: ggml-org/llama.cpp

b4359 · 19 Dec 03:39 · 7909e85
llama-run : improve progress bar (#10821)

Set the default progress-bar width to the current terminal width. Also fixed a small bug
around the default n_gpu_layers value.

Signed-off-by: Eric Curtin <[email protected]>
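
The width-detection idea can be sketched in a few lines. This is a minimal POSIX illustration, not the actual llama-run code; it assumes ioctl/TIOCGWINSZ are available:

```cpp
#include <sys/ioctl.h>   // ioctl, TIOCGWINSZ, struct winsize
#include <unistd.h>      // STDOUT_FILENO

// Query the current terminal width; fall back to 80 columns when stdout
// is not a TTY (e.g. output redirected to a file).
static int get_terminal_width() {
    struct winsize ws = {};
    if (ioctl(STDOUT_FILENO, TIOCGWINSZ, &ws) == 0 && ws.ws_col > 0) {
        return ws.ws_col;
    }
    return 80;
}
```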

b4358 · 18 Dec 23:06 · 9177484
ggml : fix arm build (#10890)

* ggml: GGML_NATIVE uses -mcpu=native on ARM

Signed-off-by: Adrien Gallouët <[email protected]>

* ggml: Show detected features with GGML_NATIVE

Signed-off-by: Adrien Gallouët <[email protected]>

* remove msvc support, add GGML_CPU_ARM_ARCH option

* disable llamafile in android example

* march -> mcpu, skip adding feature macros

ggml-ci

---------

Signed-off-by: Adrien Gallouët <[email protected]>
Co-authored-by: Adrien Gallouët <[email protected]>
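
A rough way to see what -mcpu=native buys: the compiler defines ACLE feature macros for the host CPU, which code can check at compile time. A hedged sketch (the macros are standard ACLE; the printing is illustrative, not ggml's actual feature report):

```cpp
#include <cstdio>

// Compile-time ARM feature report. With GGML_NATIVE (-mcpu=native) the
// compiler defines ACLE macros matching the host CPU's capabilities.
int main() {
#ifdef __ARM_NEON
    std::puts("NEON");
#endif
#ifdef __ARM_FEATURE_DOTPROD
    std::puts("dotprod");
#endif
#ifdef __ARM_FEATURE_SVE
    std::puts("SVE");
#endif
    return 0;
}
```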

b4357 · 18 Dec 20:11 · 0bf2d10
tts : add OuteTTS support (#10784)

* server : add "tokens" output

ggml-ci

* server : output embeddings for all tokens when pooling = none

ggml-ci

* server : be explicit about the pooling type in the tests

ggml-ci

* server : do not normalize embeddings when there is no pooling

ggml-ci

* llama : add OuteTTS support (wip)

* wip

* extract features

* first conv

* group norm

* resnet conv

* resnet

* attn

* pos net

* layer norm

* convnext

* head

* hann window

* fix n_embd + remove llama.cpp hacks

* compute hann window

* fft

* spectrum processing

* clean-up

* tts : receive input text and generate codes

* clip : fix new conv name

* tts : minor fix

* tts : add header + minor fixes

ggml-ci

* tts : add mathematical constant

ggml-ci

* tts : fix sampling + cut initial noise

* tts : fixes

* tts : update default samplers

ggml-ci

* tts : text pre-processing

* tts : outetts-voc -> wavtokenizer-dec

* tts : remove hardcoded constants

ggml-ci

* tts : fix tensor shapes

* llama : refactor wavtokenizer tensors

ggml-ci

* cont

ggml-ci

* cont [no ci]

* llama : update WavTokenizer to non-causal attn

* llama : handle no-vocab detokenization

* tts : add Python example for OuteTTS (wip)

* tts : extend python example to generate spectrogram

ggml-ci

* server : fix rebase artifacts

* tts : enable "return_tokens" in Python example

ggml-ci

* tts : minor fixes

* common : support HF download for vocoder
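
The "hann window" steps in the list above are a standard DSP building block of the vocoder's spectrogram pipeline. A hedged sketch of the symmetric Hann window (the "mathematical constant" commit presumably refers to pi; the actual implementation may use the periodic variant instead):

```cpp
#include <cmath>
#include <vector>

// Symmetric Hann window: w[n] = 0.5 * (1 - cos(2*pi*n / (N-1))).
static std::vector<float> hann_window(int n) {
    const float pi = 3.14159265358979323846f;
    std::vector<float> w(n);
    for (int i = 0; i < n; ++i) {
        w[i] = 0.5f * (1.0f - std::cos(2.0f * pi * (float) i / (float) (n - 1)));
    }
    return w;
}
```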

b4354 · 18 Dec 11:17 · 0e70ba6
server : add "tokens" output (#10853)

* server : add "tokens" output

ggml-ci

* server : update readme

ggml-ci

* server : return token ids only if requested

ggml-ci

* tests : improve "tokens" type check

Co-authored-by: Xuan Son Nguyen <[email protected]>

* server : remove "tokens" from the OAI endpoint

ggml-ci

---------

Co-authored-by: Xuan Son Nguyen <[email protected]>
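
The opt-in described above suggests a request-side flag. A hedged sketch of such a request body using nlohmann::json (which llama.cpp vendors); the field name return_tokens is borrowed from the Python example mentioned in b4357 and is an assumption here, not a verified server parameter:

```cpp
#include <nlohmann/json.hpp>
#include <iostream>

int main() {
    // Hypothetical completion request: opt in to receiving the "tokens"
    // array alongside the generated text.
    nlohmann::json req = {
        {"prompt",        "Hello"},
        {"n_predict",     16},
        {"return_tokens", true},   // assumed field name, see lead-in
    };
    std::cout << req.dump(2) << std::endl;
    return 0;
}
```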

b4353 · 18 Dec 11:17 · 4682887
server : (embeddings) using same format for "input" and "content" (#1…
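
For the unified embeddings format, the practical effect is that the same value shapes are accepted under either field. A hedged illustration with nlohmann::json; the exact accepted shapes are an assumption based on the OAI-style "input" field:

```cpp
#include <nlohmann/json.hpp>
#include <iostream>

int main() {
    // Both a single string and an array of strings, under the same field.
    nlohmann::json single = {{"input", "hello world"}};
    nlohmann::json batch  = {{"input", {"hello", "world"}}};
    std::cout << single.dump() << "\n" << batch.dump() << std::endl;
    return 0;
}
```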

b4351 · 18 Dec 01:15 · 4da69d1
Revert "llama : add Falcon3 support (#10864)" (#10876)

This reverts commit 382bc7f2e8ffd0b89f23e840d097e21f301197ba.

b4350 · 17 Dec 23:17 · d62b532
Use model->gguf_kv for loading the template instead of using the C AP…
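
The idea behind the change can be sketched as a plain map lookup against the model's already-loaded metadata, rather than re-querying the file through the C API. Names below are illustrative; "tokenizer.chat_template" is the conventional GGUF key for chat templates:

```cpp
#include <map>
#include <string>

// Sketch only: read the chat template from an in-memory GGUF key/value
// map (mirroring model->gguf_kv), returning an empty string if absent.
static std::string get_chat_template(const std::map<std::string, std::string> & gguf_kv) {
    const auto it = gguf_kv.find("tokenizer.chat_template");
    return it != gguf_kv.end() ? it->second : "";
}
```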

b4349 · 17 Dec 21:24 · 081b29b
tests: add tests for GGUF (#10830)

b4348 · 17 Dec 20:27 · 5437d4a
sync : ggml

b4343 · 17 Dec 20:19 · 0006f5a
ggml : update ggml_backend_cpu_device_supports_op (#10867)

* ggml : fix cpy op for IQ-quants to use reference impl

ggml-ci

* ggml : disable tests involving i-matrix quantization

* ggml : update ggml_backend_cpu_device_supports_op

ggml-ci
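
Conceptually, supports_op is a per-op capability hook: the backend declines ops it cannot run correctly so they are routed elsewhere (here, CPY into i-quants goes to the reference implementation). A hedged sketch with made-up types, not the real ggml signatures:

```cpp
// Illustrative types only; the actual hook inspects ggml_tensor ops.
enum class op_kind    { ADD, MUL_MAT, CPY };
enum class quant_kind { F32, Q4_0, IQ2_XS };

// The backend reports per-op support so the scheduler can fall back for
// anything unsupported, e.g. copies into i-quant destination types.
static bool cpu_device_supports_op(op_kind op, quant_kind dst_type) {
    if (op == op_kind::CPY && dst_type == quant_kind::IQ2_XS) {
        return false; // route to the reference implementation instead
    }
    return true;
}
```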