Releases: ngxson/llama.cpp

b5547

30 May 17:24
b47ab7b
sched : avoid changing cur_copy when a graph is already allocated (#1…

b5546

30 May 17:08
dd665cc
parallel : increase the variability of the prompt lengths (#13927)

ggml-ci

b5545

30 May 15:13
df0c0c7
cuda : prevent using split buffers with 3d/4d matrices (#13919)

b5544

30 May 14:30
b49a8ff
SYCL: Add mrope kernel (#13755)

* SYCL: Add mrope kernel

* feat: Optimize rope operations with vectorization

Uses `sycl::vec` to load and store two elements at a time,
significantly improving performance in `rope_norm`,
`rope_neox`, and `rope_multi`. This reduces the number of memory
accesses and leverages SIMD instructions for faster execution.

* Use ceil_div
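
The paired load/store idea described above can be sketched in plain C++. This is only an illustration under stated assumptions, not the SYCL kernel from the PR: `pair2f` is a hypothetical stand-in for a 2-wide vector type such as `sycl::vec<float, 2>`, `rope_norm_pairs` is a hypothetical name, and the adjacent-pair layout with a per-pair angle of `pos * theta_scale^i` is assumed for the "norm" variant. The `ceil_div` helper shown is the usual round-up integer division, assumed here to be what the last commit refers to.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: integer division rounded up (e.g. for sizing a launch grid).
constexpr int ceil_div(int a, int b) { return (a + b - 1) / b; }

// Hypothetical stand-in for a 2-wide vector type such as sycl::vec<float, 2>.
struct pair2f { float x, y; };

// Rotate each adjacent pair (x[2i], x[2i+1]) by the angle pos * theta_scale^i.
// Each pair is read once and written once, mirroring a 2-element vector load/store.
void rope_norm_pairs(std::vector<float>& x, int pos, float theta_scale) {
    assert(x.size() % 2 == 0);  // the sketch assumes an even number of elements
    float theta = static_cast<float>(pos);
    for (std::size_t i = 0; i + 1 < x.size(); i += 2) {
        const pair2f v = {x[i], x[i + 1]};            // one paired load
        const float c = std::cos(theta);
        const float s = std::sin(theta);
        const pair2f r = {v.x * c - v.y * s,          // 2D rotation of the pair
                          v.x * s + v.y * c};
        x[i]     = r.x;                               // one paired store
        x[i + 1] = r.y;
        theta *= theta_scale;                         // angle for the next pair
    }
}
```

Calling `rope_norm_pairs(x, pos, theta_scale)` rotates the vector in place. In a real kernel each work-item would handle one pair, and `ceil_div(n_pairs, block_size)` would give the number of work-groups needed to cover all pairs.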

b5543

30 May 13:51
53f9250
sync : vendor (#13901)

* sync : vendor

ggml-ci

* cont : fix httplib version

ggml-ci

* cont : fix lint

* cont : fix lint

* vendor : move to common folder /vendor

ggml-ci

* cont : fix lint

* cont : move httplib to /vendor + use json_fwd.hpp

ggml-ci

* cont : fix server build

ggml-ci

* cont : add missing headers

ggml-ci

* cont : header clean-up

ggml-ci

b5541

30 May 10:51
07e4351
convert : allow partial update to the chkhsh pre-tokenizer list (#13847)

* convert : allow partial update to the chkhsh pre-tokenizer list

* code style

* update tokenizer out

* rm inp/out files for models not having gguf

* fixed hash for glm

* skip nomic-bert-moe test

* Update convert_hf_to_gguf_update.py

* fix minerva-7b hash

* rm redundant import

b5540

30 May 10:20
291f2b6
llama : add support for DistilBert (#13907)

* add distilbert

* small fixes

* add note for LLM_ARCH_DISTIL_BERT

* Use MODEL_ARCH.BERT for DistilBert

---------

Co-authored-by: dinhhuy <[email protected]>

b5539

30 May 08:57
2c90da4
llama : use llm_build_granite for minicpm (#13911)

b5538

29 May 23:59
ec9e030
cmake: Guard GGML_CPU_ALL_VARIANTS by architecture (#13890)

b5537

29 May 20:09
e83ba3e
llama : add support for jina-reranker-v2 (#13900)