Commit 4d3465c

ggml: Fix data race in ggml threadpool (#11736)
After the barrier in the last iteration is executed, the loop termination condition is still evaluated. By then the main thread may already have destroyed the cgraph object and its nodes, so another thread can access memory that is already gone. Trouble can also happen when n_nodes == 0 or abort is called, but I'm not sure whether the former situation is possible. The last synchronization should be done after the loop to ensure the cgraph/cplan won't be accessed after the main thread exits from the function.
1 parent d80be89 commit 4d3465c

1 file changed: +5 -1 lines changed

ggml/src/ggml-cpu/ggml-cpu.c

Lines changed: 5 additions & 1 deletion
@@ -13856,9 +13856,13 @@ static thread_ret_t ggml_graph_compute_thread(void * data) {
             tp->ec = GGML_STATUS_ABORTED;
         }
 
-        ggml_barrier(state->threadpool);
+        if (node_n + 1 < cgraph->n_nodes) {
+            ggml_barrier(state->threadpool);
+        }
     }
 
+    ggml_barrier(state->threadpool);
+
     return 0;
 }
 
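The pattern this change establishes can be sketched in isolation. The following is a minimal illustration, not the ggml code: `struct graph`, `struct pool`, `barrier`, `compute_thread`, and `run_graph` are hypothetical names, and the spinning barrier is a simplified stand-in assumed to behave like `ggml_barrier`. Each thread skips the barrier in the last iteration, evaluates the loop condition one final time, and only then reaches the barrier placed after the loop, so the main thread cannot return and free the graph while another thread is still reading it.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-ins for cgraph and the threadpool; not the ggml types. */
struct graph { int n_nodes; atomic_int *visits; };

struct pool {
    int           n_threads;
    atomic_int    n_arrived;  /* threads waiting at the current barrier */
    atomic_int    n_passed;   /* number of barriers completed so far */
    struct graph *g;
};

struct worker { struct pool *tp; int ith; };

/* Simplified spinning barrier (assumption: same role as ggml_barrier). */
static void barrier(struct pool *tp) {
    int passed = atomic_load(&tp->n_passed);
    if (atomic_fetch_add(&tp->n_arrived, 1) == tp->n_threads - 1) {
        atomic_store(&tp->n_arrived, 0);     /* last arriver resets the count */
        atomic_fetch_add(&tp->n_passed, 1);  /* and releases the waiters */
    } else {
        while (atomic_load(&tp->n_passed) == passed) {
            sched_yield();
        }
    }
}

static void *compute_thread(void *data) {
    struct worker *w  = data;
    struct pool   *tp = w->tp;
    struct graph  *g  = tp->g;

    for (int node_n = 0; node_n < g->n_nodes; node_n++) {
        atomic_fetch_add(&g->visits[node_n], 1);  /* stand-in for node work */
        if (node_n + 1 < g->n_nodes) {
            barrier(tp);  /* synchronize between nodes only */
        }
    }
    /* The loop condition above was this thread's last read of g. */
    barrier(tp);
    return NULL;
}

/* Runs the graph with the caller acting as thread 0; returns total visits. */
int run_graph(int n_threads, int n_nodes) {
    struct graph *g = malloc(sizeof *g);
    assert(g != NULL);
    g->n_nodes = n_nodes;
    g->visits  = calloc(n_nodes > 0 ? (size_t) n_nodes : 1, sizeof *g->visits);

    struct pool    tp = { .n_threads = n_threads, .g = g };
    struct worker *ws = malloc((size_t) n_threads * sizeof *ws);
    pthread_t     *ts = malloc((size_t) n_threads * sizeof *ts);
    assert(g->visits != NULL && ws != NULL && ts != NULL);

    for (int i = 0; i < n_threads; i++) {
        ws[i].tp  = &tp;
        ws[i].ith = i;
    }
    for (int i = 1; i < n_threads; i++) {
        int rc = pthread_create(&ts[i], NULL, compute_thread, &ws[i]);
        assert(rc == 0);
    }
    compute_thread(&ws[0]);

    /* Every thread has passed its last access to g, so the graph can be
     * read and freed here even though the workers are not yet joined. */
    int total = 0;
    for (int i = 0; i < n_nodes; i++) {
        total += atomic_load(&g->visits[i]);
    }
    free(g->visits);
    free(g);

    for (int i = 1; i < n_threads; i++) {
        pthread_join(ts[i], NULL);
    }
    free(ws);
    free(ts);
    return total;
}
```

Calling `run_graph(4, 8)` visits each of the 8 nodes once per thread. With `n_nodes == 0` the loop body never runs, yet every thread still meets at the barrier after the loop, which is the case the commit message flags as potentially troublesome.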