Commit c766229

CUDA: fix MMQ stream-k for --split-mode row ggml-org#8167
Co-Authored-By: Johannes Gäßler <[email protected]>
1 parent 1801594 commit c766229

1 file changed: +1 -1 lines changed

ggml-cuda/mmq.cuh

Lines changed: 1 addition & 1 deletion
@@ -2476,7 +2476,7 @@ static void launch_mul_mat_q(ggml_backend_cuda_context & ctx, const mmq_args & a
 
     const dim3 block_nums_mmq(nsm, 1, 1);
 
-    ggml_cuda_pool & pool = ctx.pool();
+    ggml_cuda_pool & pool = ctx.pool(id);
     ggml_cuda_pool_alloc<float> tmp_fixup(pool, block_nums_mmq.x * mmq_x*mmq_y);
 
     if (args.ne01 % mmq_y == 0) {
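
The one-line change above can be illustrated with a small host-only sketch. It is not the ggml code: `toy_pool`, `toy_context`, `scratch_device_before`, and `scratch_device_after` are hypothetical names, and the sketch assumes that a no-argument `pool()` resolves to the pool of the context's own device while `pool(id)` resolves to the pool of the named device. Under that assumption, a launch for a device other than the context's own would take its `tmp_fixup` scratch buffer from the wrong pool before the fix.

```cpp
#include <array>
#include <cassert>
#include <memory>

// Hypothetical stand-in for a per-device memory pool.
// It only records which device owns it.
struct toy_pool {
    int device;
    explicit toy_pool(int device) : device(device) {}
};

// Hypothetical stand-in for a backend context that owns
// one lazily created pool per device.
struct toy_context {
    static constexpr int max_devices = 4;

    int device; // the context's own (main) device
    std::array<std::unique_ptr<toy_pool>, max_devices> pools;

    explicit toy_context(int device) : device(device) {}

    // Pool for an explicitly named device.
    toy_pool & pool(int id) {
        assert(id >= 0 && id < max_devices);
        if (!pools[id]) {
            pools[id] = std::make_unique<toy_pool>(id);
        }
        return *pools[id];
    }

    // Pool for the context's own device (assumed default behaviour).
    toy_pool & pool() { return pool(device); }
};

// Device whose pool would back the scratch buffer for a launch on device `id`.
// "before" ignores `id` and uses the default pool; "after" uses the pool of `id`.
int scratch_device_before(toy_context & ctx, int /*id*/) { return ctx.pool().device; }
int scratch_device_after (toy_context & ctx, int id)     { return ctx.pool(id).device; }
```

With a context created for device 0 and a launch for device 1, `scratch_device_before` yields 0 while `scratch_device_after` yields 1. The two agree only when the launch targets the context's own device, which would be consistent with the bug showing up only in multi-device row splitting.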
