Skip to content

[BOLT] Never call fixBranches() on non-simple functions #141112

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 22, 2025

Conversation

maksfb
Copy link
Contributor

@maksfb maksfb commented May 22, 2025

We should never call fixBranches() on a function with invalid CFG. E.g., ValidateInternalCalls modifies CFG for its internal analysis purposes. At the same time, it marks the function as non-simple with an assumption that fixBranches() will never run on that function.

However, calculateEmittedSize() by default calls fixBranches() which can lead to all sorts of issues, including assertions firing in fixBranches().

The fix is to use the original size for non-simple functions in calculateEmittedSize() since we are supposed to emit the function unmodified. Additionally, add an assertion at the start of fixBranches().

We should never call fixBranches() on a function with invalid CFG. E.g.,
ValidateInternalCalls modifies CFG for its internal analysis purposes.
At the same time, it marks the function as non-simple with an assumption
that fixBranches() will never run on that function.

However, calculateEmittedSize() by default calls fixBranches() which can
lead to all sorts of issues, including assertions firing in
fixBranches().

The fix is to use the original size for non-simple functions in
calculateEmittedSize() since we are supposed to emit the function
unmodified. Additionally, add an assertion at the start of
fixBranches().
@llvmbot
Copy link
Member

llvmbot commented May 22, 2025

@llvm/pr-subscribers-bolt

Author: Maksim Panchenko (maksfb)

Changes

We should never call fixBranches() on a function with invalid CFG. E.g., ValidateInternalCalls modifies CFG for its internal analysis purposes. At the same time, it marks the function as non-simple with an assumption that fixBranches() will never run on that function.

However, calculateEmittedSize() by default calls fixBranches() which can lead to all sorts of issues, including assertions firing in fixBranches().

The fix is to use the original size for non-simple functions in calculateEmittedSize() since we are supposed to emit the function unmodified. Additionally, add an assertion at the start of fixBranches().


Full diff: https://github.com/llvm/llvm-project/pull/141112.diff

3 Files Affected:

  • (modified) bolt/lib/Core/BinaryContext.cpp (+4)
  • (modified) bolt/lib/Core/BinaryFunction.cpp (+2)
  • (added) bolt/test/X86/fix-branches-broken-cfg.s (+42)
diff --git a/bolt/lib/Core/BinaryContext.cpp b/bolt/lib/Core/BinaryContext.cpp
index 73059ef23775a..36b789d429363 100644
--- a/bolt/lib/Core/BinaryContext.cpp
+++ b/bolt/lib/Core/BinaryContext.cpp
@@ -2425,6 +2425,10 @@ BinaryContext::createInstructionPatch(uint64_t Address,
 
 std::pair<size_t, size_t>
 BinaryContext::calculateEmittedSize(BinaryFunction &BF, bool FixBranches) {
+  // Use the original size for non-simple functions.
+  if (!BF.isSimple() || BF.isIgnored())
+    return std::make_pair(BF.getSize(), 0);
+
   // Adjust branch instruction to match the current layout.
   if (FixBranches)
     BF.fixBranches();
diff --git a/bolt/lib/Core/BinaryFunction.cpp b/bolt/lib/Core/BinaryFunction.cpp
index 22bd8f8ddb110..ceb9390b546e1 100644
--- a/bolt/lib/Core/BinaryFunction.cpp
+++ b/bolt/lib/Core/BinaryFunction.cpp
@@ -3579,6 +3579,8 @@ bool BinaryFunction::validateCFG() const {
 }
 
 void BinaryFunction::fixBranches() {
+  assert(isSimple() && "Expected function with valid CFG.");
+
   auto &MIB = BC.MIB;
   MCContext *Ctx = BC.Ctx.get();
 
diff --git a/bolt/test/X86/fix-branches-broken-cfg.s b/bolt/test/X86/fix-branches-broken-cfg.s
new file mode 100644
index 0000000000000..0de1d7ea92f9b
--- /dev/null
+++ b/bolt/test/X86/fix-branches-broken-cfg.s
@@ -0,0 +1,42 @@
+## Check that fixBranches() is not invoked on a broken CFG which could lead to
+## unintended consequences including a firing assertion.
+
+# RUN: llvm-mc --filetype=obj --triple x86_64-unknown-unknown %s -o %t.o
+# RUN: link_fdata %s %t.o %t.fdata
+# RUN: llvm-strip --strip-unneeded %t.o
+# RUN: %clang %cflags %t.o -o %t.exe -Wl,-q
+# RUN: llvm-bolt %t.exe -o %t.bolt --split-functions --split-strategy=cdsplit \
+# RUN:         --data=%t.fdata --reorder-blocks=ext-tsp 2>&1 | FileCheck %s
+
+# CHECK: internal call detected
+
+        .text
+
+        .globl  foo
+        .type   foo, @function
+foo:
+        ret
+        .size foo, .-foo
+
+## main contains an internal call. ValidateInternalCalls pass will modify CFG
+## (making it invalid) and mark the function as non-simple. After that, we
+## cannot make any assumption about the CFG.
+
+        .globl  main
+        .type   main, @function
+main:
+        call .L1
+        ret
+.L1:
+        pushq   %rbp
+        movq    %rsp, %rbp
+        movl    $1, %edi
+LLmain_foo1:
+        call    foo
+# FDATA: 1 main #LLmain_foo1# 1 foo 0 0 600
+        movl    $4, %edi
+        xorl    %eax, %eax
+        popq    %rbp
+        retq
+.Lmain_end:
+        .size   main, .Lmain_end-main

@maksfb maksfb merged commit 778801c into llvm:main May 22, 2025
10 of 11 checks passed
sivan-shani pushed a commit to sivan-shani/llvm-project that referenced this pull request Jun 3, 2025
We should never call fixBranches() on a function with invalid CFG. E.g.,
ValidateInternalCalls modifies CFG for its internal analysis purposes.
At the same time, it marks the function as non-simple with an assumption
that fixBranches() will never run on that function.

However, calculateEmittedSize() by default calls fixBranches() which can
lead to all sorts of issues, including assertions firing in
fixBranches().

The fix is to use the original size for non-simple functions in
calculateEmittedSize() since we are supposed to emit the function
unmodified. Additionally, add an assertion at the start of
fixBranches().
ajaden-codes pushed a commit to Jaddyen/llvm-project that referenced this pull request Jun 6, 2025
We should never call fixBranches() on a function with invalid CFG. E.g.,
ValidateInternalCalls modifies CFG for its internal analysis purposes.
At the same time, it marks the function as non-simple with an assumption
that fixBranches() will never run on that function.

However, calculateEmittedSize() by default calls fixBranches() which can
lead to all sorts of issues, including assertions firing in
fixBranches().

The fix is to use the original size for non-simple functions in
calculateEmittedSize() since we are supposed to emit the function
unmodified. Additionally, add an assertion at the start of
fixBranches().
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants