Releases · ngxson/llama.cpp

30 May 17:24

b47ab7b

b5547

sched : avoid changing cur_copy when a graph is already allocated (#1…

Assets 18

30 May 17:08

github-actions

b5546

dd665cc

b5546

parallel : increase the variability of the prompt lengths (#13927)

ggml-ci

Assets 18

30 May 15:13

github-actions

b5545

df0c0c7

b5545

cuda : prevent using split buffers with 3d/4d matrices (#13919)

Assets 18

30 May 14:30

github-actions

b5544

b49a8ff

b5544

SYCL: Add mrope kernel (#13755)

* SYCL: Add mrope kernel

* feat: Optimize rope operations with vectorization

Uses `sycl::vec` to load and store two elements at a time,
significantly improving performance in `rope_norm`,
`rope_neox`, and `rope_multi`. This reduces the number of memory
accesses and leverages SIMD instructions for faster execution.

* Use ceil_div
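
The pair-at-a-time idea described above can be sketched in plain C++. This is an illustration under stated assumptions, not the SYCL kernel from the PR: `Pair`, `rope_norm_pairs`, and this `ceil_div` are hypothetical stand-ins, the rotation is the adjacent-pair form commonly used for standard RoPE, and other variants may pair elements differently.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: integer division rounded up, e.g. for sizing a launch grid.
constexpr std::size_t ceil_div(std::size_t n, std::size_t d) {
    return (n + d - 1) / d;
}

// Hypothetical two-element value, standing in for a 2-wide vector type.
struct Pair {
    float x0;
    float x1;
};

// Rotate each adjacent pair (x[2i], x[2i+1]) by the angle pos * theta_i,
// where theta_i = freq_base^(-2i / n_dims). Each pair is read once and written once.
void rope_norm_pairs(std::vector<float>& x, int pos, float freq_base) {
    assert(x.size() % 2 == 0);                      // assumption: even row length
    const std::size_t n_dims  = x.size();
    const std::size_t n_pairs = n_dims / 2;
    for (std::size_t i = 0; i < n_pairs; ++i) {
        const float theta = pos * std::pow(freq_base, -2.0f * i / n_dims);
        const float c = std::cos(theta);
        const float s = std::sin(theta);
        const Pair in{x[2 * i], x[2 * i + 1]};      // one two-element load
        const Pair out{in.x0 * c - in.x1 * s,       // 2D rotation of the pair
                       in.x0 * s + in.x1 * c};
        x[2 * i]     = out.x0;                      // one two-element store
        x[2 * i + 1] = out.x1;
    }
}
```

In a kernel setting, something like `ceil_div(n_pairs, block_size)` would give the number of work-groups needed to cover every pair when `n_pairs` is not a multiple of the group size; `block_size` here is likewise an assumed name.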

Assets 18

30 May 13:51

github-actions

b5543

53f9250

b5543

sync : vendor (#13901)

* sync : vendor

ggml-ci

* cont : fix httplib version

ggml-ci

* cont : fix lint

* cont : fix lint

* vendor : move to common folder /vendor

ggml-ci

* cont : fix lint

* cont : move httplib to /vendor + use json_fwd.hpp

ggml-ci

* cont : fix server build

ggml-ci

* cont : add missing headers

ggml-ci

* cont : header clean-up

ggml-ci

Assets 18

30 May 10:51

github-actions

b5541

07e4351

b5541

convert : allow partial update to the chkhsh pre-tokenizer list (#13847)

* convert : allow partial update to the chkhsh pre-tokenizer list

* code style

* update tokenizer out

* rm inp/out files for models not having gguf

* fixed hash for glm

* skip nomic-bert-moe test

* Update convert_hf_to_gguf_update.py

* fix minerva-7b hash

* rm redundant import

Assets 18

30 May 10:20

github-actions

b5540

291f2b6

b5540

llama : add support for DistilBert (#13907)

* add distilbert

* small fixes

* add note for LLM_ARCH_DISTIL_BERT

* Use MODEL_ARCH.BERT for DistilBert

---------

Co-authored-by: dinhhuy

Assets 18

30 May 08:57

github-actions

b5539

2c90da4

b5539

llama : use llm_build_granite for minicpm (#13911)

Assets 18

29 May 23:59

github-actions

b5538

ec9e030

b5538

cmake: Guard GGML_CPU_ALL_VARIANTS by architecture (#13890)

Assets 18

29 May 20:09

github-actions

b5537

e83ba3e

b5537

llama : add support for jina-reranker-v2 (#13900)

Assets 18
