-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[Clang] Prevent mlink-builtin-bitcode
from internalizing the RPC client
#118661
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-llvm-transforms @llvm/pr-subscribers-clang Author: Joseph Huber (jhuber6) ChangesSummary: Full diff: https://github.com/llvm/llvm-project/pull/118661.diff 2 Files Affected:
diff --git a/clang/lib/CodeGen/CodeGenAction.cpp b/clang/lib/CodeGen/CodeGenAction.cpp
index cc927f44e0326e..db762b22560f28 100644
--- a/clang/lib/CodeGen/CodeGenAction.cpp
+++ b/clang/lib/CodeGen/CodeGenAction.cpp
@@ -246,7 +246,8 @@ bool BackendConsumer::LinkInModules(llvm::Module *M) {
*M, std::move(LM.Module), LM.LinkFlags,
[](llvm::Module &M, const llvm::StringSet<> &GVS) {
internalizeModule(M, [&GVS](const llvm::GlobalValue &GV) {
- return !GV.hasName() || (GVS.count(GV.getName()) == 0);
+ return !GV.hasName() || (GVS.count(GV.getName()) == 0) ||
+ GV.getName().starts_with("__llvm_rpc_client");
});
});
} else
diff --git a/offload/DeviceRTL/src/Misc.cpp b/offload/DeviceRTL/src/Misc.cpp
index c1df477365bcb6..e6f1a341e4d769 100644
--- a/offload/DeviceRTL/src/Misc.cpp
+++ b/offload/DeviceRTL/src/Misc.cpp
@@ -113,10 +113,7 @@ void *indirectCallLookup(void *HstPtr) {
}
/// The openmp client instance used to communicate with the server.
-/// FIXME: This is marked as 'retain' so that it is not removed via
-/// `-mlink-builtin-bitcode`
-[[gnu::visibility("protected"), gnu::weak,
- gnu::retain]] rpc::Client Client asm("__llvm_rpc_client");
+[[gnu::visibility("protected"), gnu::weak]] rpc::Client Client asm("__llvm_rpc_client");
} // namespace impl
} // namespace ompx
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
I'm surprised there isn't one already. Also seems better if you're going to special case the symbol to special case it by just setting this rather than skipping internalize for it |
I'm not sure I want it in the generic case though, because I don't know the side-effects that LLVM attribute has. This is only necessary because of this stupid |
Ping, this is a little hacky since the real solution is to stop using |
…ient Summary: Currently, we only use `-mmlink-builtin-bitcode` for non-LTO NVIDIA compiliations. THis has the problem that it will internalize the RPC client symbol which needs to be visible to the host. To counteract that, I put `retain` on it, but this also prevents optimizations on the global itself, so the passes we have that remove the symbol don't work on OpenMP anymore. This patch does the dumbest solution, adding a special string check for it in clang. Not the best solution, the runner up would be to habe a clang attribute for `externally_inititliazed` because those can't be internalized, but that might have some unfortunate side-effects. Alternatively we could make NVIDIA compilations do LTO all the time, but that would affect some users and it's harder than I thought.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -233,6 +233,10 @@ bool InternalizePass::internalizeModule(Module &M) { | |||
else | |||
AlwaysPreserved.insert("__stack_chk_guard"); | |||
|
|||
// Preserve the RPC interface for GPU host callbacks when internalizing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like the FIXME suggested, this might be done by introducing a lambda argument when creating this pass, like MustPreserveGV, or extend MustPreserveGV to be able to preserve functions. Then when AMDGPU backend creates this pass, it may put the name of the functions that needs preserving in that lamba.
However, since there is precedence to preserve specific functions by name, I won't put a hard requirement on this PR. It is up to you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's what a previous version of this was, but I don't think it's correct in every case. The ideal solution would just be to remove the -mlink-builtin-bitcode
option entirely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Follows the "hacky" SOTA and gets us a little closer to what we want.
Summary:
Currently, we only use
-mlink-builtin-bitcode
for non-LTO NVIDIAcompiliations. This has the problem that it will internalize the RPC
client symbol which needs to be visible to the host. To counteract that,
I put
retain
on it, but this also prevents optimizations on the globalitself, so the passes we have that remove the symbol don't work on
OpenMP anymore. This patch does the dumbest solution, adding a special
string check for it in clang. Not the best solution, the runner up would
be to have a clang attribute for
externally_initialized
because thosecan't be internalized, but that might have some unfortunate
side-effects. Alternatively we could make NVIDIA compilations do LTO all
the time, but that would affect some users and it's harder than I
thought.