Skip to content

[OpenMP] Fix node destruction race in __kmpc_omp_taskwait_deps_51 #86130

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Mar 28, 2024

Conversation

uweigand
Copy link
Member

The __kmpc_omp_taskwait_deps_51 allocates a kmp_depnode_t node on its stack, and there is currently a race condition where another thread might still be accessing that node after the function has returned and its stack frame was released.

While the function does wait until the node's npredecessors count has reached zero before exiting, there is still a window where the function that last decremented the npredecessors count assumes the node is still accessible.

For heap-allocated kmp_depnode_t nodes, this normally works via a separate ndeps count that only reaches zero at the point where no accesses to the node are expected at all; in fact, at this point the heap allocation will be freed.

For this case of a stack-allocated kmp_depnode_t node, it therefore makes sense to similarly respect the ndeps count; we need to wait until this reaches 1 (not 0, because it is not heap-allocated so there's always one extra count to prevent it from being freed), before we can safely deallocate our stack frame.

As this is expected to be a short race window of only a few instructions, it should be fine to just use a busy wait loop checking the ndeps count.

Fixes: #85963

The __kmpc_omp_taskwait_deps_51 allocates a kmp_depnode_t node on
its stack, and there is currently a race condition where another
thread might still be accessing that node after the function has
returned and its stack frame was released.

While the function does wait until the node's npredecessors count
has reached zero before exiting, there is still a window where the
function that last decremented the npredecessors count assumes the
node is still accessible.

For heap-allocated kmp_depnode_t nodes, this normally works via a
separate ndeps count that only reaches zero at the point where no
accesses to the node are expected at all; in fact, at this point
the heap allocation will be freed.

For this case of a stack-allocated kmp_depnode_t node, it therefore
makes sense to similarly respect the ndeps count; we need to wait
until this reaches 1 (not 0, because it is not heap-allocated so
there's always one extra count to prevent it from being freed),
before we can safely deallocate our stack frame.

As this is expected to be a short race window of only a few
instructions, it should be fine to just use a busy wait loop
checking the ndeps count.

Fixes: llvm#85963
@llvmbot llvmbot added the openmp:libomp OpenMP host runtime label Mar 21, 2024
@uweigand uweigand requested review from jdoerfert and jhuber6 March 21, 2024 16:02
// Wait until the last __kmp_release_deps is finished before we free the
// current stack frame holding the "node" variable; once its nrefs count
// reaches 1, we're sure nobody else can try to reference it again.
while (node.dn.nrefs > 1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this is the right way to do that, especially using task yield. @jprotze might have a better idea.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An alternative solution would be to just not use a stack-allocated node, but allocate it on the heap like everywhere else. Then there'd still be a race which thread would be the one to free it, but that race would be harmless ... Let me know if you'd like me to try this (or any other) approach.

@jprotze
Copy link
Collaborator

jprotze commented Mar 28, 2024

This fix makes sense to me.

Looking at the code, I realized that the current implementation is still stupid. The function should really create an empty task to be inserted into the DAG, if has_no_wait is set.

@uweigand uweigand merged commit b999e63 into llvm:main Mar 28, 2024
@uweigand uweigand deleted the openmp-depnode-race branch March 28, 2024 11:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
openmp:libomp OpenMP host runtime
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[OpenMP] Race condition in __kmpc_omp_taskwait_deps_51
4 participants