Commit e8bec6f

kompute : disable GPU offload for Mixtral
We haven't implemented the necessary GPU kernels yet. Fixes this crash:

    ggml_vk_graph_compute: error: unsupported op 'ARGSORT'
    GGML_ASSERT: /home/jared/src/forks/gpt4all/gpt4all-backend/llama.cpp-mainline/ggml-kompute.cpp:1508: !"unsupported op"
1 parent 16eae6e commit e8bec6f

File tree

1 file changed

+1
-0
lines changed


src/llama.cpp

Lines changed: 1 addition & 0 deletions
@@ -8912,6 +8912,7 @@ static int llama_model_load(const std::string & fname, llama_model & model, llam
 #ifdef GGML_USE_KOMPUTE
     if (params.n_gpu_layers > 0 && (
         !(model.arch == LLM_ARCH_LLAMA || model.arch == LLM_ARCH_FALCON)
+        || model.hparams.n_expert > 0
         || !(
             model.ftype == LLAMA_FTYPE_ALL_F32 ||
             model.ftype == LLAMA_FTYPE_MOSTLY_F16 ||

0 commit comments

Comments
 (0)