
Releases: ngxson/llama.cpp

b4034

05 Nov 13:27
b8deef0
llama : add <|tool_call|> formatting to Granite template (#10177)

Branch: GraniteToolCallTemplate

Signed-off-by: Gabe Goodhart <[email protected]>

b4033

04 Nov 23:42
a9e8a9a
ggml : fix arch check in bf16_to_fp32 (#10164)

b4027

04 Nov 13:38
ea02c75
cuda : clear error after changing peer access (#10153)

b4024

04 Nov 12:54
329ed91
CANN: adjust backend registry refactor. (#10158)

Remove `buffer->iface.get_name`, which was used in the CANN backend, as it was removed in the backend registry refactor PR.

b4023

04 Nov 10:33
ce027ad
sync : ggml

b4020

03 Nov 20:35
9f40989
ggml : move CPU backend to a separate file (#10144)

b4019

03 Nov 14:32
08828a6
metal : minor fixup in FA kernel (#10143)

* metal : minor fixup in FA kernel

ggml-ci

* metal : use the unrolled loop variable

* metal : remove unused var

b4016

02 Nov 18:34
42cadc7
server : fix slot selection by lru (#10126)

* server : fix slot selection by lru, migrate lcs to `size_t`

* minor debug log fix
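The slot-selection fix above can be sketched as follows. This is a minimal, hypothetical illustration (the `Slot`, `common_prefix`, and `pick_slot` names are not from llama.cpp): pick the server slot whose cached prompt shares the longest common prefix with the incoming prompt, break ties by least-recently-used, and keep the prefix length in `size_t` so long prompts cannot overflow a signed int.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical slot record: cached prompt plus a last-use tick.
struct Slot {
    std::string cache;   // previously processed prompt (as text here)
    int64_t     t_used;  // last-use tick; lower = less recently used
};

// Length of the longest common prefix, returned as size_t so very long
// prompts cannot overflow a signed int.
static size_t common_prefix(const std::string & a, const std::string & b) {
    size_t n = 0;
    while (n < a.size() && n < b.size() && a[n] == b[n]) {
        n++;
    }
    return n;
}

// Pick the slot with the longest shared prefix; on a tie (including the
// all-zero case), fall back to the least recently used slot.
static size_t pick_slot(const std::vector<Slot> & slots, const std::string & prompt) {
    size_t best     = 0;
    size_t best_lcs = 0;
    for (size_t i = 0; i < slots.size(); i++) {
        const size_t lcs = common_prefix(slots[i].cache, prompt);
        const bool better =
            lcs > best_lcs ||
            (lcs == best_lcs && slots[i].t_used < slots[best].t_used);
        if (better) {
            best     = i;
            best_lcs = lcs;
        }
    }
    return best;
}
```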

b4014

02 Nov 14:36
1926d6e
llama : adjust default context size + print warnings (#10136)

* llama : adjust default context size + print warnings

ggml-ci

* ggml-ci : add missing gpu-layers + adjust context sizes
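The context-size change above can be sketched as follows. This is a hedged, hypothetical illustration (the `resolve_n_ctx` helper and its parameters are assumptions, not llama.cpp API): a requested context of 0 falls back to the model's training context, and a request larger than the training context is honored but triggers a warning, since quality usually degrades past that point.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical resolver: 0 means "use the model's training context";
// a larger request is kept but flagged with a warning message.
static uint32_t resolve_n_ctx(uint32_t n_ctx_req, uint32_t n_ctx_train, std::string & warn) {
    warn.clear();
    if (n_ctx_req == 0) {
        return n_ctx_train;
    }
    if (n_ctx_req > n_ctx_train) {
        warn = "requested n_ctx (" + std::to_string(n_ctx_req) +
               ") exceeds the model's training context (" +
               std::to_string(n_ctx_train) + ")";
    }
    return n_ctx_req;
}
```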

b4013

02 Nov 13:24
b634f8a
simple-chat : only add bos on first prompt (#10129)
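The BOS fix above can be sketched as follows. This is a minimal, hypothetical illustration (the `ChatState` struct and whitespace tokenizer are stand-ins, not llama.cpp code): the BOS token is prepended only when tokenizing the very first user prompt, and later turns continue the existing token stream without inserting a second BOS.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical tokenizer stand-in: prepends a BOS marker only when asked,
// then does a naive whitespace split, just for illustration.
static std::vector<std::string> tokenize(const std::string & text, bool add_bos) {
    std::vector<std::string> toks;
    if (add_bos) {
        toks.push_back("<s>");
    }
    std::string cur;
    for (char c : text) {
        if (c == ' ') {
            if (!cur.empty()) { toks.push_back(cur); cur.clear(); }
        } else {
            cur += c;
        }
    }
    if (!cur.empty()) {
        toks.push_back(cur);
    }
    return toks;
}

// Chat loop state: BOS is added only for the very first prompt.
struct ChatState {
    bool is_first = true;

    std::vector<std::string> next_tokens(const std::string & prompt) {
        auto toks = tokenize(prompt, /*add_bos=*/is_first);
        is_first = false;
        return toks;
    }
};
```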