Commit 90c12e6

ggml : do not use BLAS with ggml_mul_mat_id
1 parent ea4402b, commit 90c12e6

1 file changed: +4 -1 lines changed


ggml.c

Lines changed: 4 additions & 1 deletion
@@ -9508,8 +9508,11 @@ static bool ggml_compute_forward_mul_mat_use_blas(
     const int64_t ne0 = dst->ne[0];
     const int64_t ne1 = dst->ne[1];
 
+    // NOTE: with GGML_OP_MUL_MAT_ID we don't want to go through the BLAS branch because it will dequantize (to_float)
+    //       all the experts for each batch element and the processing would become incredibly slow
     // TODO: find the optimal values for these
-    if (ggml_is_contiguous(src0) &&
+    if (dst->op != GGML_OP_MUL_MAT_ID &&
+        ggml_is_contiguous(src0) &&
         ggml_is_contiguous(src1) &&
       //src0->type == GGML_TYPE_F32 &&
         src1->type == GGML_TYPE_F32 &&
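
The effect of the new guard can be sketched in isolation. The snippet below is a minimal illustration, not the code in ggml.c: `mock_op`, `mock_tensor`, `mock_use_blas`, `mock_blas_dequant_calls` and the `min_dim` threshold are all hypothetical stand-ins introduced here. The real predicate works on `struct ggml_tensor` and has further conditions that this hunk does not show. The cost helper simply restates the NOTE comment as arithmetic, assuming one dequantization per expert per batch element.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for ggml's op enum and tensor struct (assumption). */
enum mock_op { MOCK_OP_MUL_MAT, MOCK_OP_MUL_MAT_ID };

struct mock_tensor {
    enum mock_op op;   /* operation that produces this tensor */
    bool contiguous;   /* stands in for ggml_is_contiguous() */
    bool is_f32;       /* stands in for type == GGML_TYPE_F32 */
    int64_t ne[2];     /* first two dimensions */
};

/* Same shape as the patched predicate: the op check comes first, so a
 * MUL_MAT_ID destination never reaches the BLAS branch. The min_dim size
 * threshold is an assumed placeholder for the "optimal values" TODO. */
static bool mock_use_blas(const struct mock_tensor *src0,
                          const struct mock_tensor *src1,
                          const struct mock_tensor *dst,
                          int64_t min_dim) {
    assert(src0 && src1 && dst);
    return dst->op != MOCK_OP_MUL_MAT_ID &&
           src0->contiguous &&
           src1->contiguous &&
           src1->is_f32 &&
           dst->ne[0] >= min_dim &&
           dst->ne[1] >= min_dim;
}

/* Rough cost model of the problem described in the NOTE: if the BLAS branch
 * were taken, every expert matrix would be dequantized once per batch
 * element (assumption: one to_float pass per expert per element). */
static int64_t mock_blas_dequant_calls(int64_t n_batch, int64_t n_expert) {
    return n_batch * n_expert;
}
```

With identical operands, `mock_use_blas` returns true when `dst->op` is `MOCK_OP_MUL_MAT` and false when it is `MOCK_OP_MUL_MAT_ID`. Under this model, a batch of 512 elements with 8 experts would trigger 4096 dequantization passes on the BLAS path, which is the slowdown the commit avoids.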
