Releases · ngxson/llama.cpp

28 Nov 13:53

76b27d2

b4209

ggml : fix row condition for i8mm kernels (#10561)

ggml-ci

28 Nov 12:47

github-actions

b4206

2025fa6

kompute : improve backend to pass test_backend_ops (#10542)

* kompute: op_unary: reject unsupported parameters

Signed-off-by: Sergio Lopez <[email protected]>

* kompute: softmax: implement ALiBi support

Signed-off-by: Sergio Lopez <[email protected]>

* kompute: rope: implement neox and phi3 support

Signed-off-by: Sergio Lopez <[email protected]>

* kompute: op_mul_mat_q4_k permuted support

Signed-off-by: Sergio Lopez <[email protected]>

* kompute: op_mul_mat_[q4_0|q4_1|q8_0] permuted support

Signed-off-by: Sergio Lopez <[email protected]>

* kompute: op_mul_mat_f16 permuted support

Signed-off-by: Sergio Lopez <[email protected]>

* kompute: op_mul_mat_q6_k permuted support

Signed-off-by: Sergio Lopez <[email protected]>

---------

Signed-off-by: Sergio Lopez <[email protected]>

28 Nov 08:50

github-actions

b4205

c6bc739

CANN: Update cann.md to display correctly in CLion (#10538)

28 Nov 07:47

github-actions

b4203

b742013

CANN: ROPE operator optimization (#10540)

* [cann] ROPE operator optimization

Co-authored-by: noemotiovon <[email protected]>

27 Nov 22:45

github-actions

b4202

9f91251

common : fix duplicated file name with hf_repo and hf_file (#10550)

27 Nov 17:51

github-actions

b4201

3ad5451

Add some minimal optimizations for CDNA (#10498)

* Add some minimal optimizations for CDNA

* ggml_cuda: set launch bounds also for GCN as it helps there too

27 Nov 10:53

github-actions

b4200

46c69e0

ci : faster CUDA toolkit installation method and use ccache (#10537)

* ci : faster CUDA toolkit installation method and use ccache

* remove fetch-depth

* only pack CUDA runtime on master

27 Nov 08:57

github-actions

b4196

c31ed2a

vulkan: define all quant data structures in types.comp (#10440)

26 Nov 12:04

github-actions

b4177

811872a

speculative : simplify the implementation (#10504)

ggml-ci

26 Nov 11:21

github-actions

b4175

7066b4c

CANN: RoPE and CONCAT operator optimization (#10488)

Co-authored-by: noemotiovon <[email protected]>
