-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[Libomptarget] Remove global ctor and use reference counting #80499
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -20,25 +20,37 @@ | |||
extern void llvm::omp::target::ompt::connectLibrary(); | |||
#endif | |||
|
|||
__attribute__((constructor(101))) void init() { | |||
static std::mutex PluginMtx; | |||
static std::atomic<uint32_t> RefCount = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this atomic, it seems all accesses are guarded by PluginMtx?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thought that myself, was erring on the side of caution in case someone somewhere wants to read it, but likely a non-issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather put the counter into PM, and wrap PM into ProtectedObj? That makes it clear. I dislike complexity that is not needed, but I get why you went here. Still, this way one thinks the RefCount is accessed concurrently. ProtectedObj has benefits over a standalone Mtx too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That seems more complicated to me, since we'd need to somehow put the reference count into it, whereas as far as I'm aware the protected object simply wraps a mutex around a single object.
delete PM; | ||
void deinitRuntime() { | ||
std::scoped_lock<decltype(PluginMtx)> Lock(PluginMtx); | ||
if (PM == nullptr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to be a bad idea as it might hide problems. assert(PM) rather. Maybe we should not even do PM as a pointer? just init/deinit on an object?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, honestly it's probably correct to just assert that deinit
should never be called without initializing it first, rather than handling it. We expect these things to be paired up, so we should assert as such.
|
||
// CHECK: PASS | ||
printf("PASS\n"); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test is not perfect but I'm unsure how to improve it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could add a test that dlopen
s a bunch of different libraries in parallel or something. Realistically this needs to be a unit test, but we're missing a bit of infrastructure right now. Figured that would come later.
ping |
It's weird to have a mutex and an atomic reference count. Suggest we drop the mutex and just reference count it. fetch_add or cas into it and finding a zero says you're the first thread to get there. I think std::mutex has a constructor, in which case you're replacing a global ctor with a global ctor. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refcount it correctly and drop the mutex, much simpler (in terms of runtime behaviour) and removes failure modes (in terms of runtime behaviour)
edit: ^ voted down on that in favour of more complicated runtime behaviour in exchange for more familiar programming constructs
@@ -48,7 +48,8 @@ void deinitRuntime() { | |||
std::scoped_lock<decltype(PluginMtx)> Lock(PluginMtx); | |||
assert(PM && "Runtime not initialized"); | |||
|
|||
if (RefCount-- == 0) { | |||
RefCount--; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks wrong - you're separating the decrement from the load, should be a fetch_sub followed by a branch on the old value
I've been persuaded that the rest of the team thinks mutex is much simpler than atomics so we should go with that. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. The init/deinit code should not be performance critical, so i think having a mutex is ok. Now that RefCount is not atomic, there should not be any confusion about any future usage, it would have to be protected by the mutex.
In addition, I did some testing with OMPT and this patch since OMPT has some ordering constraints. That appeared ok too.
4547c7d
to
ec4a76a
Compare
✅ With the latest revision this PR passed the C/C++ code formatter. |
Summary: Currently we rely on global constructors to initialize and shut down the OpenMP runtime library and plugin manager. This causes some issues because we do not have a defined lifetime that we can rely on to release and allocate resources. This patch instead adds some simple reference counted initialization and deinitialization function. A future patch will use the `deinit` interface to more intelligently handle plugin deinitilization. Right now we do nothing and rely on `atexit` inside of the plugins to tear them down. This isn't great because it limits our ability to control these things. Note that I made the `__tgt_register_lib` functions do the initialization instead of adding calls to the new runtime functions in the linker wrapper. The reason for this is because in the past it's been easier to not introduce a new function call, since sometimes the user's compiler will link against an older `libomptarget`. Maybe if we change the name with offloading in the future we can simplify this. Depends on llvm#80460 Fix decrement
) Summary: Currently we rely on global constructors to initialize and shut down the OpenMP runtime library and plugin manager. This causes some issues because we do not have a defined lifetime that we can rely on to release and allocate resources. This patch instead adds some simple reference counted initialization and deinitialization function. A future patch will use the `deinit` interface to more intelligently handle plugin deinitilization. Right now we do nothing and rely on `atexit` inside of the plugins to tear them down. This isn't great because it limits our ability to control these things. Note that I made the `__tgt_register_lib` functions do the initialization instead of adding calls to the new runtime functions in the linker wrapper. The reason for this is because in the past it's been easier to not introduce a new function call, since sometimes the user's compiler will link against an older `libomptarget`. Maybe if we change the name with offloading in the future we can simplify this. Depends on llvm#80460 Change-Id: I70815457fab9b5d68db8e48b3b5e1c75951c05f5
Summary:
Currently we rely on global constructors to initialize and shut down the
OpenMP runtime library and plugin manager. This causes some issues
because we do not have a defined lifetime that we can rely on to release
and allocate resources. This patch instead adds some simple reference
counted initialization and deinitialization function.
A future patch will use the
deinit
interface to more intelligentlyhandle plugin deinitilization. Right now we do nothing and rely on
atexit
inside of the plugins to tear them down. This isn't greatbecause it limits our ability to control these things.
Note that I made the
__tgt_register_lib
functions do theinitialization instead of adding calls to the new runtime functions in
the linker wrapper. The reason for this is because in the past it's been
easier to not introduce a new function call, since sometimes the user's
compiler will link against an older
libomptarget
. Maybe if we changethe name with offloading in the future we can simplify this.
Depends on #80460