Skip to content

[Clang][AMDGPU] Enable avail-extern-to-local for ThinLTO in HIP #134476

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 14, 2025

Conversation

shiltian
Copy link
Contributor

@shiltian shiltian commented Apr 5, 2025

In HIP, the Clang driver already sets force-import-all when ThinLTO is
enabled. As a result, all imported functions get the available_externally
linkage. However, these functions are later removed by the
EliminateAvailableExternallyPass, effectively undoing the forced import and
eventually leading to link errors.

The EliminateAvailableExternallyPass provides an option to convert
available_externally functions into local functions, renaming them to avoid
conflicts. This behavior is exactly what we need for HIP. This PR enables that
option (avail-extern-to-local) alongside force-import-all when ThinLTO is
used.

With this change, ThinLTO almost works correctly on AMDGPU. The only remaining
issue is an undefined reference to __assert_fail, but that falls outside the
scope of this PR.

Copy link
Contributor Author

shiltian commented Apr 5, 2025

This stack of pull requests is managed by Graphite. Learn more about stacking.

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' labels Apr 5, 2025
@shiltian shiltian requested review from scchan, yxsamliu and jhuber6 April 5, 2025 02:59
@llvmbot
Copy link
Member

llvmbot commented Apr 5, 2025

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-driver

Author: Shilei Tian (shiltian)

Changes

In HIP, the Clang driver already sets force-import-all when ThinLTO is
enabled. As a result, all imported functions get the available_externally
linkage. However, these functions are later removed by the
EliminateAvailableExternallyPass, effectively undoing the forced import and
eventually leading to link errors.

The EliminateAvailableExternallyPass provides an option to convert
available_externally functions into local functions, renaming them to avoid
conflicts. This behavior is exactly what we need for HIP. This PR enables that
option (avail-extern-to-local) alongside force-import-all when ThinLTO is
used.

With this change, ThinLTO almost works correctly on AMDGPU. The only remaining
issue is an undefined reference to __assert_fail, but that falls outside the
scope of this PR.


Full diff: https://github.com/llvm/llvm-project/pull/134476.diff

2 Files Affected:

  • (modified) clang/lib/Driver/ToolChains/HIPAMD.cpp (+3-1)
  • (added) clang/test/Driver/hip-thinlto.hip (+6)
diff --git a/clang/lib/Driver/ToolChains/HIPAMD.cpp b/clang/lib/Driver/ToolChains/HIPAMD.cpp
index abb83701759ce..0c25a40d94a10 100644
--- a/clang/lib/Driver/ToolChains/HIPAMD.cpp
+++ b/clang/lib/Driver/ToolChains/HIPAMD.cpp
@@ -103,8 +103,10 @@ void AMDGCN::Linker::constructLldCommand(Compilation &C, const JobAction &JA,
   // ToDo: Remove this option after AMDGPU backend supports ISA-level linking.
   // Since AMDGPU backend currently does not support ISA-level linking, all
   // called functions need to be imported.
-  if (IsThinLTO)
+  if (IsThinLTO) {
     LldArgs.push_back(Args.MakeArgString("-plugin-opt=-force-import-all"));
+    LldArgs.push_back(Args.MakeArgString("-plugin-opt=-avail-extern-to-local"));
+  }
 
   for (const Arg *A : Args.filtered(options::OPT_mllvm)) {
     LldArgs.push_back(
diff --git a/clang/test/Driver/hip-thinlto.hip b/clang/test/Driver/hip-thinlto.hip
new file mode 100644
index 0000000000000..0f5ff58ac547f
--- /dev/null
+++ b/clang/test/Driver/hip-thinlto.hip
@@ -0,0 +1,6 @@
+// RUN: %clang -foffload-lto=thin --offload-link %s -### 2>&1 | FileCheck %s
+
+// CHECK: "-plugin-opt=-force-import-all"{{.*}}"-plugin-opt=-avail-extern-to-local"
+int main(int, char *[]) {
+  return 0;
+}

@shiltian
Copy link
Contributor Author

shiltian commented Apr 6, 2025

#134541 resolves the missing __assert_fail issue.

@shiltian shiltian force-pushed the users/shiltian/hip-thinlto-disable-elim-avail-extern branch from 1423792 to b537a91 Compare April 6, 2025 15:24
@shiltian
Copy link
Contributor Author

shiltian commented Apr 6, 2025

Hmm, the failure is weird. I can't reproduce it locally. Change it to something else and hopefully this can "resolve" the issue.

@shiltian shiltian force-pushed the users/shiltian/hip-thinlto-disable-elim-avail-extern branch from b537a91 to 4602513 Compare April 8, 2025 02:35
@shiltian
Copy link
Contributor Author

shiltian commented Apr 8, 2025

I literally have no idea why the newly added test case fails…It can all pass in zsh, bash, and sh locally.

@jhuber6
Copy link
Contributor

jhuber6 commented Apr 8, 2025

Implicit arguments reorganizing stuff?

@shiltian
Copy link
Contributor Author

shiltian commented Apr 8, 2025

Implicit arguments reorganizing stuff?

It doesn't look like that. The failure in BB looks like it outputs nothing.

In HIP, the Clang driver already sets `force-import-all` when ThinLTO is
enabled. As a result, all imported functions get the `available_externally`
linkage. However, these functions are later removed by the
`EliminateAvailableExternallyPass`, effectively undoing the forced import and
eventually leading to link errors.

The `EliminateAvailableExternallyPass` provides an option to convert
`available_externally` functions into local functions, renaming them to avoid
conflicts. This behavior is exactly what we need for HIP. This PR enables that
option (`avail-extern-to-local`) alongside `force-import-all` when ThinLTO is
used.

With this change, ThinLTO almost works correctly on AMDGPU. The only remaining
issue is an undefined reference to `__assert_fail`, but that falls outside the
scope of this PR.
@shiltian
Copy link
Contributor Author

ping

@shiltian shiltian merged commit fcf0f81 into main Apr 14, 2025
11 checks passed
@shiltian shiltian deleted the users/shiltian/hip-thinlto-disable-elim-avail-extern branch April 14, 2025 15:52
var-const pushed a commit to ldionne/llvm-project that referenced this pull request Apr 17, 2025
…vm#134476)

In HIP, the Clang driver already sets `force-import-all` when ThinLTO is
enabled. As a result, all imported functions get the
`available_externally`
linkage. However, these functions are later removed by the
`EliminateAvailableExternallyPass`, effectively undoing the forced
import and
eventually leading to link errors.

The `EliminateAvailableExternallyPass` provides an option to convert
`available_externally` functions into local functions, renaming them to
avoid
conflicts. This behavior is exactly what we need for HIP. This PR
enables that
option (`avail-extern-to-local`) alongside `force-import-all` when
ThinLTO is
used.

With this change, ThinLTO almost works correctly on AMDGPU. The only
remaining
issue is an undefined reference to `__assert_fail`, but that falls
outside the
scope of this PR.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants