[BOLT] Set call to continuation count in pre-aggregated profile in BAT mode #115334


aaupov
Contributor

@aaupov aaupov commented Nov 7, 2024

Follow-up to #109486.

Encode call continuation landing pads in BAT so that it is possible to
determine whether a branch target is an entry point or a landing pad.
If it is neither, consider the branch to be a return and assign its
count to the call-to-continuation fallthrough.

Call continuation landing pads are added to the secondary entry points
table, offset by the function size. This way, old BOLT versions ignore
them: they parse the entries but never use them as secondary entry points.
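
A minimal sketch of the encoding described above, assuming a single sorted table per function. `FunctionRecord`, `addSecondaryEntry`, `addCallContLandingPad`, `isSecondaryEntryPoint`, and `isEntryOrCallContLandingPad` are hypothetical names for illustration only and are not BOLT's actual types or API:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one function's BAT record (not BOLT's real type).
struct FunctionRecord {
  uint32_t Size = 0;           // function size in bytes
  std::vector<uint32_t> Table; // sorted secondary entry points table
};

// Keep the table sorted so that lookups can use binary search.
static void insertSorted(std::vector<uint32_t> &Table, uint32_t Value) {
  Table.insert(std::lower_bound(Table.begin(), Table.end(), Value), Value);
}

static bool tableContains(const FunctionRecord &F, uint32_t Value) {
  return std::binary_search(F.Table.begin(), F.Table.end(), Value);
}

// A real secondary entry point is stored at its own offset.
void addSecondaryEntry(FunctionRecord &F, uint32_t Offset) {
  assert(Offset < F.Size && "offset must lie inside the function");
  insertSorted(F.Table, Offset);
}

// A call continuation landing pad is stored at Size + Offset, which is
// outside the range of valid in-function offsets.
void addCallContLandingPad(FunctionRecord &F, uint32_t Offset) {
  assert(Offset < F.Size && "offset must lie inside the function");
  insertSorted(F.Table, F.Size + Offset);
}

// What a reader unaware of the scheme sees: it only ever asks about
// in-function offsets, so shifted entries never match.
bool isSecondaryEntryPoint(const FunctionRecord &F, uint32_t Offset) {
  return Offset < F.Size && tableContains(F, Offset);
}

// What an updated reader checks before treating a branch target as a
// call continuation: neither the plain nor the shifted offset may be present.
bool isEntryOrCallContLandingPad(const FunctionRecord &F, uint32_t Offset) {
  return tableContains(F, Offset) || tableContains(F, F.Size + Offset);
}
```

With a function of size 0x100, a secondary entry at 0x20, and a landing pad at 0x38, the table would hold 0x20 and 0x138. Under these assumptions the old-style query for 0x38 misses, while the updated query finds it.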

Test Plan: updated callcont-fallthru.s

Created using spr 1.3.4
@llvmbot
Member

llvmbot commented Nov 7, 2024

@llvm/pr-subscribers-bolt

Author: Amir Ayupov (aaupov)

Changes

Use landing pad information encoded in BAT (#114602) to determine
whether a branch target is neither an entry point nor a landing pad.

Test Plan: updated callcont-fallthru.s


Full diff: https://github.com/llvm/llvm-project/pull/115334.diff

4 Files Affected:

  • (modified) bolt/include/bolt/Profile/BoltAddressTranslation.h (+3)
  • (modified) bolt/lib/Profile/BoltAddressTranslation.cpp (+11)
  • (modified) bolt/lib/Profile/DataAggregator.cpp (+1-3)
  • (modified) bolt/test/X86/callcont-fallthru.s (+25-10)
diff --git a/bolt/include/bolt/Profile/BoltAddressTranslation.h b/bolt/include/bolt/Profile/BoltAddressTranslation.h
index aaf361b093bfdb..77398eb7e82347 100644
--- a/bolt/include/bolt/Profile/BoltAddressTranslation.h
+++ b/bolt/include/bolt/Profile/BoltAddressTranslation.h
@@ -186,6 +186,9 @@ class BoltAddressTranslation {
   const static uint32_t LPENTRY = 0x1;
 
 public:
+  /// Returns whether a given \p Offset is a secondary entry point or a landing pad in function with address \p Address.
+  bool isSecondaryEntry(uint64_t Address, uint32_t Offset) const;
+
   /// Map basic block input offset to a basic block index and hash pair.
   class BBHashMapTy {
     struct EntryTy {
diff --git a/bolt/lib/Profile/BoltAddressTranslation.cpp b/bolt/lib/Profile/BoltAddressTranslation.cpp
index a423315df0793b..7ddf174f15fb8b 100644
--- a/bolt/lib/Profile/BoltAddressTranslation.cpp
+++ b/bolt/lib/Profile/BoltAddressTranslation.cpp
@@ -649,5 +649,16 @@ BoltAddressTranslation::translateSymbol(const BinaryContext &BC,
   return std::pair(ParentBF, SecondaryEntryId);
 }
 
+bool BoltAddressTranslation::isSecondaryEntry(uint64_t Address,
+                                              uint32_t Offset) const {
+  auto FunctionIt = SecondaryEntryPointsMap.find(Address);
+  if (FunctionIt == SecondaryEntryPointsMap.end())
+    return false;
+  const std::vector<uint32_t> &Offsets = FunctionIt->second;
+  uint64_t InputOffset = translate(Address, Offset, /*IsBranchSrc*/ false);
+  auto OffsetIt = llvm::lower_bound(Offsets, InputOffset << 1);
+  return OffsetIt != Offsets.end() && *OffsetIt >> 1 == InputOffset;
+}
+
 } // namespace bolt
 } // namespace llvm
diff --git a/bolt/lib/Profile/DataAggregator.cpp b/bolt/lib/Profile/DataAggregator.cpp
index f667fcd4f049ae..33a9c04422741d 100644
--- a/bolt/lib/Profile/DataAggregator.cpp
+++ b/bolt/lib/Profile/DataAggregator.cpp
@@ -794,10 +794,8 @@ bool DataAggregator::doBranch(uint64_t From, uint64_t To, uint64_t Count,
     if (!Offset)
       return false;
 
-    // FIXME: support BAT case where the function might be in empty state
-    // (split fragments declared non-simple).
     if (!Func.hasCFG())
-      return false;
+      return BAT && !BAT->isSecondaryEntry(Func.getAddress(), Offset);
 
     // The offset should not be an entry point or a landing pad.
     const BinaryBasicBlock *ContBB = Func.getBasicBlockAtOffset(Offset);
diff --git a/bolt/test/X86/callcont-fallthru.s b/bolt/test/X86/callcont-fallthru.s
index 641beb79ecf2ac..467600abfb6940 100644
--- a/bolt/test/X86/callcont-fallthru.s
+++ b/bolt/test/X86/callcont-fallthru.s
@@ -5,7 +5,6 @@
 # RUN: link_fdata %s %t %t.pa1 PREAGG
 # RUN: link_fdata %s %t %t.pa2 PREAGG2
 # RUN: link_fdata %s %t %t.pa3 PREAGG3
-# RUN: link_fdata %s %t %t.pa4 PREAGG4
 
 ## Check normal case: fallthrough is not LP or secondary entry.
 # RUN: llvm-strip --strip-unneeded %t -o %t.exe
@@ -18,12 +17,12 @@
 # RUN:   --print-cfg --print-only=main | FileCheck %s --check-prefix=CHECK2
 
 ## Check that we don't treat secondary entry points as call continuation sites.
-# RUN: llvm-bolt %t --pa -p %t.pa3 -o %t.out \
+# RUN: llvm-bolt %t --pa -p %t.pa3 -o %t.out3 \
 # RUN:   --print-cfg --print-only=main | FileCheck %s --check-prefix=CHECK3
 
 ## Check fallthrough to a landing pad case.
-# RUN: llvm-bolt %t.exe --pa -p %t.pa4 -o %t.out --enable-bat \
-# RUN:   --print-cfg --print-only=main | FileCheck %s --check-prefix=CHECK4
+# RUN: llvm-bolt %t.exe --pa -p %t.pa3 -o %t.out4 --enable-bat \
+# RUN:   --print-cfg --print-only=main | FileCheck %s --check-prefix=CHECK3
 
 ## Check that a landing pad is emitted in BAT
 # RUN: llvm-bat-dump %t.out --dump-all | FileCheck %s --check-prefix=CHECK-BAT
@@ -31,6 +30,26 @@
 # CHECK-BAT:      1 secondary entry points:
 # CHECK-BAT-NEXT: 0x38 (lp)
 
+## Check BAT case of a fallthrough to a call continuation
+# link_fdata %s %t.out3 %t.pa.bat PREAGG
+# RUN: perf2bolt %t.out3 -p %t.pa.bat --pa -o %t.fdata
+# RUN: FileCheck %s --check-prefix=CHECK-BAT-CC --input-file=%t.fdata
+# CHECK-BAT-CC: main
+
+## Check BAT case of a fallthrough to a secondary entry point or a landing pad
+# link_fdata %s %t.out3 %t.pa.bat2 PREAGG3
+
+## Secondary entry
+# RUN: perf2bolt %t.out3 -p %t.pa.bat2 --pa -o %t.fdata2
+# RUN: FileCheck %s --check-prefix=CHECK-BAT-ENTRY --input-file=%t.fdata2
+# CHECK-BAT-ENTRY: main
+
+## Landing pad
+# RUN: llvm-strip --strip-unneeded %t.out3
+# RUN: perf2bolt %t.out3 -p %t.pa.bat2 --pa -o %t.fdata3
+# RUN: FileCheck %s --check-prefix=CHECK-BAT-LP --input-file=%t.fdata3
+# CHECK-BAT-LP: main
+
   .globl foo
   .type foo, %function
 foo:
@@ -83,16 +102,12 @@ Ltmp4:
 # CHECK2:      callq foo
 # CHECK2-NEXT: count: 3
 
-## Target is a secondary entry point
+## Target is a secondary entry point (non-stripped) or a landing pad
+## (strip-unneeded)
 # PREAGG3: B X:0 #Ltmp3# 2 0
 # CHECK3:      callq foo
 # CHECK3-NEXT: count: 0
 
-## Target is a landing pad
-# PREAGG4: B X:0 #Ltmp3# 2 0
-# CHECK4:      callq puts@PLT
-# CHECK4-NEXT: count: 0
-
 Ltmp3:
 	cmpl	$0x0, -0x18(%rbp)
 Ltmp3_br:

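The `isSecondaryEntry` lookup in the diff above appears to assume that each table entry packs the input offset into the upper bits and a landing pad flag (`LPENTRY = 0x1`) into the low bit. A standalone sketch of that lookup, under that assumption, might look like the following. `packEntry` and `containsOffset` are hypothetical helpers, and `std::lower_bound` stands in for `llvm::lower_bound`:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Mirrors LPENTRY from the diff; the packing itself is an assumption.
constexpr uint32_t LPFlag = 0x1;

// Pack an offset with its landing pad flag in the low bit.
inline uint32_t packEntry(uint32_t Offset, bool IsLandingPad) {
  return (Offset << 1) | (IsLandingPad ? LPFlag : 0u);
}

// Returns true if Offset is present in the sorted packed table, regardless
// of whether it was recorded as a secondary entry or as a landing pad.
bool containsOffset(const std::vector<uint32_t> &Packed, uint32_t Offset) {
  // Searching for Offset << 1 finds the first entry whose offset is >= Offset,
  // because the flag bit only affects ordering between equal offsets.
  auto It = std::lower_bound(Packed.begin(), Packed.end(), Offset << 1);
  // Drop the flag bit before comparing.
  return It != Packed.end() && (*It >> 1) == Offset;
}
```

Because the flag occupies the least significant bit, a plain entry and a landing pad entry at the same offset sort next to each other, so one binary search covers both cases.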

github-actions bot commented Nov 7, 2024

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff 30af6fb163add17a6be515200881afdff91d213a 4310965f2e0fcf28d4a7f33c9e91cbeddeb2cae4 --extensions cpp,h -- bolt/include/bolt/Profile/BoltAddressTranslation.h bolt/lib/Core/BinaryEmitter.cpp bolt/lib/Profile/BoltAddressTranslation.cpp bolt/lib/Profile/DataAggregator.cpp
View the diff from clang-format here.
diff --git a/bolt/lib/Profile/DataAggregator.cpp b/bolt/lib/Profile/DataAggregator.cpp
index 32c1d82c8c..5fb9b70e26 100644
--- a/bolt/lib/Profile/DataAggregator.cpp
+++ b/bolt/lib/Profile/DataAggregator.cpp
@@ -816,7 +816,8 @@ bool DataAggregator::doBranch(uint64_t From, uint64_t To, uint64_t Count,
       // Check if offset is a secondary entry point or a call continuation
       // landing pad (offset shifted by function size).
       return !BAT->getSecondaryEntryPointId(Address, InputOffset) &&
-             !BAT->getSecondaryEntryPointId(Address, Func.getSize() + InputOffset);
+             !BAT->getSecondaryEntryPointId(Address,
+                                            Func.getSize() + InputOffset);
     }
 
     // The offset should not be an entry point or a landing pad.

@aaupov aaupov changed the base branch from users/aaupov/spr/main.boltwip-support-ret-converted-call-cont-fallthru-in-bat-mode to main November 27, 2024 15:08
…e them in secondary entry points table

Created using spr 1.3.4
@aaupov aaupov changed the title [BOLT][WIP] Support ret-converted call-cont fallthru in BAT mode [BOLT] Set call to continuation count in pre-aggregated profile in BAT mode Nov 27, 2024
@aaupov aaupov marked this pull request as draft November 27, 2024 18:24
@aaupov aaupov closed this Jun 20, 2025