[BOLT][AArch64] Add support for short LLD thunks/veneers #118422

Merged
merged 2 commits into from
Dec 3, 2024

maksfb (Contributor) commented Dec 3, 2024

When a callee function is less than 256MB away from its call site, the LLD linker can create a short thunk for the function that consists of a single branch instruction (which covers +/-128MB). Detect such thunks in BOLT and convert calls through them into direct calls.
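
To make the distances concrete, here is a minimal sketch of the range arithmetic. It is not code from BOLT or LLD. It assumes the standard AArch64 `B`/`BL` encoding (a signed 26-bit immediate scaled by 4), and it assumes the 256MB figure comes from two hops of up to 128MB each (call site to thunk, thunk to callee). The helper names `inBranchRange` and `shortThunkPossible` are hypothetical.

```cpp
#include <cstdint>

// AArch64 B/BL encode a signed 26-bit word offset, so the reachable byte
// displacement is [-2^27, 2^27 - 4], i.e. roughly +/-128MB.
constexpr int64_t BranchMin = -(int64_t(1) << 27);
constexpr int64_t BranchMax = (int64_t(1) << 27) - 4;

// Hypothetical helper: can a single B/BL located at From reach To?
constexpr bool inBranchRange(uint64_t From, uint64_t To) {
  const int64_t Disp = static_cast<int64_t>(To - From);
  return Disp % 4 == 0 && Disp >= BranchMin && Disp <= BranchMax;
}

// Hypothetical helper: a one-instruction thunk works when the call site can
// reach the thunk and the thunk can reach the callee, each with one branch.
constexpr bool shortThunkPossible(uint64_t Call, uint64_t Thunk,
                                  uint64_t Callee) {
  return inBranchRange(Call, Thunk) && inBranchRange(Thunk, Callee);
}
```

For example, a callee 200MB away is out of reach for a direct `bl`, but a thunk placed 100MB from the call site can still reach it with a plain `b`.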

llvmbot (Member) commented Dec 3, 2024

@llvm/pr-subscribers-bolt

Author: Maksim Panchenko (maksfb)

Changes

When a callee function is less than 256MB away from its call site, the LLD linker can create a short thunk for the function that consists of a single branch instruction (which covers +/-128MB). Detect such thunks in BOLT and convert calls through them into direct calls.


Full diff: https://github.com/llvm/llvm-project/pull/118422.diff

2 Files Affected:

  • (modified) bolt/lib/Passes/VeneerElimination.cpp (+6-5)
  • (modified) bolt/test/AArch64/veneer-lld-abs.s (+38-18)
diff --git a/bolt/lib/Passes/VeneerElimination.cpp b/bolt/lib/Passes/VeneerElimination.cpp
index b386b2756a2b87..99d0ffeca8cc2b 100644
--- a/bolt/lib/Passes/VeneerElimination.cpp
+++ b/bolt/lib/Passes/VeneerElimination.cpp
@@ -46,16 +46,17 @@ Error VeneerElimination::runOnFunctions(BinaryContext &BC) {
     if (BF.isIgnored())
       continue;
 
+    MCInst &FirstInstruction = *(BF.begin()->begin());
     const MCSymbol *VeneerTargetSymbol = 0;
     uint64_t TargetAddress;
-    if (BC.MIB->matchAbsLongVeneer(BF, TargetAddress)) {
+    if (BC.MIB->isTailCall(FirstInstruction)) {
+      VeneerTargetSymbol = BC.MIB->getTargetSymbol(FirstInstruction);
+    } else if (BC.MIB->matchAbsLongVeneer(BF, TargetAddress)) {
       if (BinaryFunction *TargetBF =
               BC.getBinaryFunctionAtAddress(TargetAddress))
         VeneerTargetSymbol = TargetBF->getSymbol();
-    } else {
-      MCInst &FirstInstruction = *(BF.begin()->begin());
-      if (BC.MIB->hasAnnotation(FirstInstruction, "AArch64Veneer"))
-        VeneerTargetSymbol = BC.MIB->getTargetSymbol(FirstInstruction, 1);
+    } else if (BC.MIB->hasAnnotation(FirstInstruction, "AArch64Veneer")) {
+      VeneerTargetSymbol = BC.MIB->getTargetSymbol(FirstInstruction, 1);
     }
 
     if (!VeneerTargetSymbol)
diff --git a/bolt/test/AArch64/veneer-lld-abs.s b/bolt/test/AArch64/veneer-lld-abs.s
index d10ff46e2cb016..5aca616d5d0c3b 100644
--- a/bolt/test/AArch64/veneer-lld-abs.s
+++ b/bolt/test/AArch64/veneer-lld-abs.s
@@ -1,5 +1,5 @@
-## Check that llvm-bolt correctly recognizes long absolute thunks generated
-## by LLD.
+## Check that llvm-bolt correctly recognizes veneers/thunks for absolute code
+## generated by LLD.
 
 # RUN: llvm-mc -filetype=obj -triple aarch64-unknown-unknown %s -o %t.o
 # RUN: %clang %cflags -fno-PIC -no-pie %t.o -o %t.exe -nostdlib \
@@ -12,40 +12,60 @@
 
 .text
 .balign 4
-.global foo
-.type foo, %function
-foo:
-  adrp x1, foo
+.global far_function
+.type far_function, %function
+far_function:
   ret
-.size foo, .-foo
+.size far_function, .-far_function
+
+.global near_function
+.type near_function, %function
+near_function:
+  ret
+.size near_function, .-near_function
 
 .section ".mytext", "ax"
 .balign 4
 
-.global __AArch64AbsLongThunk_foo
-.type __AArch64AbsLongThunk_foo, %function
-__AArch64AbsLongThunk_foo:
+## This version of a thunk is always generated by LLD for function calls
+## spanning more than 256MB.
+.global __AArch64AbsLongThunk_far_function
+.type __AArch64AbsLongThunk_far_function, %function
+__AArch64AbsLongThunk_far_function:
   ldr x16, .L1
   br x16
-# CHECK-INPUT-LABEL: <__AArch64AbsLongThunk_foo>:
+# CHECK-INPUT-LABEL: <__AArch64AbsLongThunk_far_function>:
 # CHECK-INPUT-NEXT:    ldr
 # CHECK-INPUT-NEXT:    br
 .L1:
-  .quad foo
-.size __AArch64AbsLongThunk_foo, .-__AArch64AbsLongThunk_foo
+  .quad far_function
+.size __AArch64AbsLongThunk_far_function, .-__AArch64AbsLongThunk_far_function
+
+## If a callee is closer than 256MB away, LLD may generate a thunk with a direct
+## jump to the callee. Note, that the name might still include "AbSLong".
+.global __AArch64AbsLongThunk_near_function
+.type __AArch64AbsLongThunk_near_function, %function
+__AArch64AbsLongThunk_near_function:
+  b near_function
+# CHECK-INPUT-LABEL: <__AArch64AbsLongThunk_near_function>:
+# CHECK-INPUT-NEXT:    b {{.*}} <near_function>
+.size __AArch64AbsLongThunk_near_function, .-__AArch64AbsLongThunk_near_function
 
-## Check that the thunk was removed from .text and _start() calls foo()
+## Check that thunks were removed from .text, and _start calls functions
 ## directly.
 
-# CHECK-OUTPUT-NOT: __AArch64AbsLongThunk_foo
+# CHECK-OUTPUT-NOT: __AArch64AbsLongThunk_{{.*}}
 
 .global _start
 .type _start, %function
 _start:
 # CHECK-INPUT-LABEL:  <_start>:
 # CHECK-OUTPUT-LABEL: <_start>:
-  bl __AArch64AbsLongThunk_foo
-# CHECK-INPUT-NEXT:     bl {{.*}} <__AArch64AbsLongThunk_foo>
-# CHECK-OUTPUT-NEXT:    bl {{.*}} <foo>
+  bl __AArch64AbsLongThunk_far_function
+  bl __AArch64AbsLongThunk_near_function
+# CHECK-INPUT-NEXT:     bl {{.*}} <__AArch64AbsLongThunk_far_function>
+# CHECK-INPUT-NEXT:     bl {{.*}} <__AArch64AbsLongThunk_near_function>
+# CHECK-OUTPUT-NEXT:    bl {{.*}} <far_function>
+# CHECK-OUTPUT-NEXT:    bl {{.*}} <near_function>
   ret
 .size _start, .-_start
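
The order of checks in the patched `VeneerElimination::runOnFunctions` can be modeled with a small toy, shown here only as an illustration. `Candidate` and `veneerTarget` are hypothetical names that do not exist in BOLT; the fields simply stand in for the results of `isTailCall`/`getTargetSymbol`, `matchAbsLongVeneer`, and the `"AArch64Veneer"` annotation lookup seen in the diff above.

```cpp
#include <optional>
#include <string_view>

// Toy stand-in for what the pass learns about one candidate function.
// These fields are assumptions for illustration, not BOLT data structures.
struct Candidate {
  // Set when the first instruction is a tail call (the short thunk case).
  std::optional<std::string_view> TailCallTarget;
  // True when the function matches the ldr/br long absolute thunk pattern.
  bool MatchesAbsLong = false;
  // Function found at the loaded address, if any.
  std::optional<std::string_view> AbsLongTarget;
  // Target recorded by the "AArch64Veneer" annotation, if present.
  std::optional<std::string_view> AnnotatedTarget;
};

// Mirrors the if / else-if chain after the patch: short thunk first, then
// the long absolute thunk, then the annotated veneer. An empty result means
// the function is not treated as a veneer.
constexpr std::optional<std::string_view> veneerTarget(const Candidate &C) {
  if (C.TailCallTarget)
    return C.TailCallTarget;
  if (C.MatchesAbsLong)
    return C.AbsLongTarget; // may be empty if no function was found
  return C.AnnotatedTarget;
}
```

As in the diff, a function that matches the long thunk pattern but whose target address has no known function is not reconsidered under the annotation check.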

smithp35 (Collaborator) left a comment

LGTM too

maksfb merged commit d5956fb into llvm:main on Dec 3, 2024 (7 checks passed)
maksfb deleted the gh-short-veneer branch on March 6, 2025