Inference of Meta's [LLaMA](https://arxiv.org/abs/2302.13971) model (and others)

### Recent API changes
- [2024 Mar 13] Add `llama_synchronize()` + `llama_context_params.n_ubatch` https://github.com/ggerganov/llama.cpp/pull/6017
- [2024 Mar 8] `llama_kv_cache_seq_rm()` returns a `bool` instead of `void`, and new `llama_n_seq_max()` returns the upper limit of acceptable `seq_id` in batches (relevant when dealing with multiple sequences) https://github.com/ggerganov/llama.cpp/pull/5328
- [2024 Mar 4] Embeddings API updated https://github.com/ggerganov/llama.cpp/pull/5796
- [2024 Mar 3] `struct llama_context_params` https://github.com/ggerganov/llama.cpp/pull/5849
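To illustrate the Mar 8 change above: callers that previously ignored `llama_kv_cache_seq_rm()` can now check its `bool` result and bound sequence ids with `llama_n_seq_max()`. The following is a minimal sketch, assuming the `llama.h` declarations from the linked PR; `clear_sequence` is a hypothetical helper, not part of the library.

```c
#include "llama.h"

// Hypothetical helper: remove all KV-cache entries for one sequence.
static void clear_sequence(struct llama_context * ctx, llama_seq_id seq_id) {
    // seq_id must stay below the limit reported by llama_n_seq_max()
    if (seq_id < 0 || seq_id >= (llama_seq_id) llama_n_seq_max(ctx)) {
        return;
    }
    // previously returned void; now returns false if the removal fails
    if (!llama_kv_cache_seq_rm(ctx, seq_id, -1, -1)) {
        // fall back to clearing the entire cache (assumption: acceptable here)
        llama_kv_cache_clear(ctx);
    }
}
```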
### Hot topics
- Multi-GPU pipeline parallelism support https://github.com/ggerganov/llama.cpp/pull/6017
- Looking for contributions to add Deepseek support: https://github.com/ggerganov/llama.cpp/issues/5981