Commit c84a0d7

kompute : disable GPU offload for Mixtral

We haven't implemented the necessary GPU kernels yet. Fixes this crash:

ggml_vk_graph_compute: error: unsupported op 'ARGSORT'
GGML_ASSERT: /home/jared/src/forks/gpt4all/gpt4all-backend/llama.cpp-mainline/ggml-kompute.cpp:1508: !"unsupported op"

Signed-off-by: Jared Van Bortel <[email protected]>
1 parent f65df7e commit c84a0d7

File tree

1 file changed: +1 -0 lines


llama.cpp

Lines changed: 1 addition & 0 deletions
@@ -4735,6 +4735,7 @@ static int llama_model_load(const std::string & fname, llama_model & model, llam
 #ifdef GGML_USE_KOMPUTE
     if (params.n_gpu_layers > 0 && (
         !(model.arch == LLM_ARCH_LLAMA || model.arch == LLM_ARCH_FALCON)
+        || model.hparams.n_expert > 0
         || !(
             model.ftype == LLAMA_FTYPE_ALL_F32 ||
             model.ftype == LLAMA_FTYPE_MOSTLY_F16 ||

0 commit comments

Comments
 (0)