Releases: ggml-org/llama.cpp
b4371
SYCL: Migrate away from deprecated ggml_tensor->backend (#10840)

* Migrate to tensor->buffer for checking backend buffer type: 1
* SYCL: common.cpp try to migrate away from tensor->backend
* SYCL: fix assertions and add proper comments
* SYCL: remove extra space
* SYCL: Add back static to ggml_backend_buffer_is_sycl_split function
* SYCL: Add pragma directive to suppress warning spam
* SYCL: Integrate debug logs with GGML_LOG and other fixes
* Revert "SYCL: Integrate debug logs with GGML_LOG and other fixes"

  This reverts commit 2607b7de0f0d2f4f1f690226f86fa861aa39cb97. Let's keep the current SYCL-specific logging mechanism for now.
* SYCL: Use GGML_SYCL_DEBUG after reverting
* SYCL: reg_get_proc_address func, update to the current func signature
* SYCL: Refactor SYCL buffer checks in ggml_sycl_cpy_tensor_2d
b4369
ggml : add test for SVE and disable when it fails (#10906)
b4368
convert : fix RWKV v6 model conversion (#10913)

* Enable --no-context-shift for llama-perplexity example
* RWKV 6: Fix error in ggml_cuda_op_bin_bcast

Signed-off-by: Molly Sophia <[email protected]>
b4367
clip : disable GPU support (#10896) ggml-ci
b4366
llama : minor grammar refactor (#10897) ggml-ci
b4365
tts : small QoL for easy model fetch (#10903)
b4363
ggml: fix arm build with gcc (#10895)

Signed-off-by: Adrien Gallouët <[email protected]>
b4362
llama : fix Roberta embeddings (#10856)

* fix: Use gpt2 tokenizer for roberta and add eos/bos tokens (Branch: RobertaTokenizer)
* fixes to position embeddings
* map roberta-bpe to gpt-2
* fix linting

Signed-off-by: Gabe Goodhart <[email protected]>
Signed-off-by: Sukriti-Sharma4 <[email protected]>
Co-authored-by: Gabe Goodhart <[email protected]>
b4361
convert : Add support for Microsoft Phi-4 model (#10817)

* convert : use GPT2 vocab for Phi-4 model
* convert : use null value of sliding_window to distinguish Phi-4 from other Phi-3-based models
* llama : do not use sliding window attention mask for Phi-4 model

Co-authored-by: Stanisław Szymczyk <[email protected]>
b4360
tests: disable GGUF test for bad value size (#10886)