[RISCV] [MachineOutliner] Analyze all candidates #127659

svs-quic · 2025-02-18T16:02:50Z

#117700 made a change from analyzing all the candidates to analyzing just the first candidate before deciding to either delete or keep all of them.

Even though the candidates all have the same instructions, the basic blocks in which they are present are different and we will need to check each of them before deciding whether to keep or erase them. Particularly, isAvailableAcrossAndOutOfSeq checks to see if the register (x5 in this case) is available from the end of the MBB to the beginning of the candidate and not checking this for each candidate led to incorrect candidates being outlined resulting in correctness issues in a few downstream benchmarks.

Similarly, deleting all the candidates if the first one is not viable will result in missed outlining opportunities.

llvmbot · 2025-02-18T16:04:19Z

@llvm/pr-subscribers-backend-risc-v

Author: Sudharsan Veeravalli (svs-quic)

Changes

#117700 made a change from analyzing all the candidates to analyzing just the first candidate before deciding to either delete or keep all of them.

Even though the candidates all have the same instructions, the basic blocks in which they are present are different and we will need to check each of them before deciding whether to keep or erase them. Particularly, isAvailableAcrossAndOutOfSeq checks to see if the register (x5 in this case) is available from the end of the MBB to the beginning of the candidate and not checking this for each candidate led to incorrect candidates being outlined resulting in correctness issues in a few downstream benchmarks.

Similarly, deleting all the candidates if the first one is not viable will result in missed outlining opportunities.

Full diff: https://github.com/llvm/llvm-project/pull/127659.diff

2 Files Affected:

(modified) llvm/lib/Target/RISCV/RISCVInstrInfo.cpp (+22-30)
(added) llvm/test/CodeGen/RISCV/machine-outliner-call-x5-liveout.mir (+146)

diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
index 456fb66917216..bc859887c4aac 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
@@ -3015,30 +3015,25 @@ static bool cannotInsertTailCall(const MachineBasicBlock &MBB) {
   return false;
 }
 
-static std::optional<MachineOutlinerConstructionID>
-analyzeCandidate(outliner::Candidate &C) {
+static bool analyzeCandidate(outliner::Candidate &C) {
   // If last instruction is return then we can rely on
   // the verification already performed in the getOutliningTypeImpl.
   if (C.back().isReturn()) {
     assert(!cannotInsertTailCall(*C.getMBB()) &&
            "The candidate who uses return instruction must be outlined "
            "using tail call");
-    return MachineOutlinerTailCall;
+    return false;
   }
 
-  auto CandidateUsesX5 = [](outliner::Candidate &C) {
-    const TargetRegisterInfo *TRI = C.getMF()->getSubtarget().getRegisterInfo();
-    if (std::any_of(C.begin(), C.end(), [TRI](const MachineInstr &MI) {
-          return isMIModifiesReg(MI, TRI, RISCV::X5);
-        }))
-      return true;
-    return !C.isAvailableAcrossAndOutOfSeq(RISCV::X5, *TRI);
-  };
-
-  if (!CandidateUsesX5(C))
-    return MachineOutlinerDefault;
+  // Filter out candidates where the X5 register (t0) can't be used to setup
+  // the function call.
+  const TargetRegisterInfo *TRI = C.getMF()->getSubtarget().getRegisterInfo();
+  if (std::any_of(C.begin(), C.end(), [TRI](const MachineInstr &MI) {
+        return isMIModifiesReg(MI, TRI, RISCV::X5);
+      }))
+    return true;
 
-  return std::nullopt;
+  return !C.isAvailableAcrossAndOutOfSeq(RISCV::X5, *TRI);
 }
 
 std::optional<std::unique_ptr<outliner::OutlinedFunction>>
@@ -3047,35 +3042,32 @@ RISCVInstrInfo::getOutliningCandidateInfo(
     std::vector<outliner::Candidate> &RepeatedSequenceLocs,
     unsigned MinRepeats) const {
 
-  // Each RepeatedSequenceLoc is identical.
-  outliner::Candidate &Candidate = RepeatedSequenceLocs[0];
-  auto CandidateInfo = analyzeCandidate(Candidate);
-  if (!CandidateInfo)
-    RepeatedSequenceLocs.clear();
+  // Analyze each candidate and erase the ones that are not viable.
+  llvm::erase_if(RepeatedSequenceLocs, analyzeCandidate);
 
   // If the sequence doesn't have enough candidates left, then we're done.
   if (RepeatedSequenceLocs.size() < MinRepeats)
     return std::nullopt;
 
+  // Each RepeatedSequenceLoc is identical.
+  outliner::Candidate &Candidate = RepeatedSequenceLocs[0];
   unsigned InstrSizeCExt =
       Candidate.getMF()->getSubtarget<RISCVSubtarget>().hasStdExtCOrZca() ? 2
                                                                           : 4;
   unsigned CallOverhead = 0, FrameOverhead = 0;
 
-  MachineOutlinerConstructionID MOCI = CandidateInfo.value();
-  switch (MOCI) {
-  case MachineOutlinerDefault:
-    // call t0, function = 8 bytes.
-    CallOverhead = 8;
-    // jr t0 = 4 bytes, 2 bytes if compressed instructions are enabled.
-    FrameOverhead = InstrSizeCExt;
-    break;
-  case MachineOutlinerTailCall:
+  MachineOutlinerConstructionID MOCI = MachineOutlinerDefault;
+  if (Candidate.back().isReturn()) {
+    MOCI = MachineOutlinerTailCall;
     // tail call = auipc + jalr in the worst case without linker relaxation.
     CallOverhead = 4 + InstrSizeCExt;
     // Using tail call we move ret instruction from caller to callee.
     FrameOverhead = 0;
-    break;
+  } else {
+    // call t0, function = 8 bytes.
+    CallOverhead = 8;
+    // jr t0 = 4 bytes, 2 bytes if compressed instructions are enabled.
+    FrameOverhead = InstrSizeCExt;
   }
 
   for (auto &C : RepeatedSequenceLocs)
diff --git a/llvm/test/CodeGen/RISCV/machine-outliner-call-x5-liveout.mir b/llvm/test/CodeGen/RISCV/machine-outliner-call-x5-liveout.mir
new file mode 100644
index 0000000000000..5da06850e5a67
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/machine-outliner-call-x5-liveout.mir
@@ -0,0 +1,146 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=riscv32 -x mir -run-pass=machine-outliner -simplify-mir -verify-machineinstrs < %s \
+# RUN: | FileCheck -check-prefixes=RV32I-MO %s
+# RUN: llc -mtriple=riscv64 -x mir -run-pass=machine-outliner -simplify-mir -verify-machineinstrs < %s \
+# RUN: | FileCheck -check-prefixes=RV64I-MO %s
+
+# MIR has been edited by hand to have x5 as live out in @dont_outline
+
+--- |
+  define i32 @outline_0(i32 %a, i32 %b) { ret i32 0 }
+
+  ; Can't outline if x5 is live out of the MBB
+  define i32 @dont_outline(i32 %a, i32 %b) { ret i32 0 }
+
+  define i32 @outline_1(i32 %a, i32 %b) { ret i32 0 }
+
+  define i32 @outline_2(i32 %a, i32 %b) { ret i32 0 }
+...
+---
+
+name:            outline_0
+tracksRegLiveness: true
+isOutlined: false
+body:             |
+  bb.0:
+    liveins: $x10, $x11
+
+    ; RV32I-MO-LABEL: name: outline_0
+    ; RV32I-MO: liveins: $x10, $x11
+    ; RV32I-MO-NEXT: {{  $}}
+    ; RV32I-MO-NEXT: $x5 = PseudoCALLReg target-flags(riscv-call) @OUTLINED_FUNCTION_0, implicit-def $x5, implicit-def $x10, implicit-def $x11, implicit-def $x12, implicit $x10, implicit $x11
+    ; RV32I-MO-NEXT: PseudoRET implicit $x11
+    ;
+    ; RV64I-MO-LABEL: name: outline_0
+    ; RV64I-MO: liveins: $x10, $x11
+    ; RV64I-MO-NEXT: {{  $}}
+    ; RV64I-MO-NEXT: $x5 = PseudoCALLReg target-flags(riscv-call) @OUTLINED_FUNCTION_0, implicit-def $x5, implicit-def $x10, implicit-def $x11, implicit-def $x12, implicit $x10, implicit $x11
+    ; RV64I-MO-NEXT: PseudoRET implicit $x11
+    $x11 = ORI $x11, 1023
+    $x12 = ADDI $x10, 17
+    $x11 = AND $x12, $x11
+    $x10 = SUB $x10, $x11
+    PseudoRET implicit $x11
+
+...
+---
+
+name:            dont_outline
+tracksRegLiveness: true
+isOutlined: false
+body:             |
+  ; RV32I-MO-LABEL: name: dont_outline
+  ; RV32I-MO: bb.0:
+  ; RV32I-MO-NEXT:   liveins: $x10, $x11
+  ; RV32I-MO-NEXT: {{  $}}
+  ; RV32I-MO-NEXT:   $x11 = ORI $x11, 1023
+  ; RV32I-MO-NEXT:   $x12 = ADDI $x10, 17
+  ; RV32I-MO-NEXT:   $x11 = AND $x12, $x11
+  ; RV32I-MO-NEXT:   $x10 = SUB $x10, $x11
+  ; RV32I-MO-NEXT:   BEQ $x10, $x11, %bb.1
+  ; RV32I-MO-NEXT: {{  $}}
+  ; RV32I-MO-NEXT: bb.1:
+  ; RV32I-MO-NEXT:   liveins: $x10, $x5
+  ; RV32I-MO-NEXT: {{  $}}
+  ; RV32I-MO-NEXT:   PseudoRET implicit $x10, implicit $x5
+  ;
+  ; RV64I-MO-LABEL: name: dont_outline
+  ; RV64I-MO: bb.0:
+  ; RV64I-MO-NEXT:   liveins: $x10, $x11
+  ; RV64I-MO-NEXT: {{  $}}
+  ; RV64I-MO-NEXT:   $x11 = ORI $x11, 1023
+  ; RV64I-MO-NEXT:   $x12 = ADDI $x10, 17
+  ; RV64I-MO-NEXT:   $x11 = AND $x12, $x11
+  ; RV64I-MO-NEXT:   $x10 = SUB $x10, $x11
+  ; RV64I-MO-NEXT:   BEQ $x10, $x11, %bb.1
+  ; RV64I-MO-NEXT: {{  $}}
+  ; RV64I-MO-NEXT: bb.1:
+  ; RV64I-MO-NEXT:   liveins: $x10, $x5
+  ; RV64I-MO-NEXT: {{  $}}
+  ; RV64I-MO-NEXT:   PseudoRET implicit $x10, implicit $x5
+  bb.0:
+    liveins: $x10, $x11
+
+    $x11 = ORI $x11, 1023
+    $x12 = ADDI $x10, 17
+    $x11 = AND $x12, $x11
+    $x10 = SUB $x10, $x11
+    BEQ $x10, $x11, %bb.1
+
+  bb.1:
+    liveins: $x10, $x5
+    PseudoRET implicit $x10, implicit $x5
+
+...
+---
+
+name:            outline_1
+tracksRegLiveness: true
+isOutlined: false
+body:             |
+  bb.0:
+    liveins: $x10, $x11
+
+    ; RV32I-MO-LABEL: name: outline_1
+    ; RV32I-MO: liveins: $x10, $x11
+    ; RV32I-MO-NEXT: {{  $}}
+    ; RV32I-MO-NEXT: $x5 = PseudoCALLReg target-flags(riscv-call) @OUTLINED_FUNCTION_0, implicit-def $x5, implicit-def $x10, implicit-def $x11, implicit-def $x12, implicit $x10, implicit $x11
+    ; RV32I-MO-NEXT: PseudoRET implicit $x10
+    ;
+    ; RV64I-MO-LABEL: name: outline_1
+    ; RV64I-MO: liveins: $x10, $x11
+    ; RV64I-MO-NEXT: {{  $}}
+    ; RV64I-MO-NEXT: $x5 = PseudoCALLReg target-flags(riscv-call) @OUTLINED_FUNCTION_0, implicit-def $x5, implicit-def $x10, implicit-def $x11, implicit-def $x12, implicit $x10, implicit $x11
+    ; RV64I-MO-NEXT: PseudoRET implicit $x10
+    $x11 = ORI $x11, 1023
+    $x12 = ADDI $x10, 17
+    $x11 = AND $x12, $x11
+    $x10 = SUB $x10, $x11
+    PseudoRET implicit $x10
+
+...
+---
+name:            outline_2
+tracksRegLiveness: true
+isOutlined: false
+body:             |
+  bb.0:
+    liveins: $x10, $x11
+
+    ; RV32I-MO-LABEL: name: outline_2
+    ; RV32I-MO: liveins: $x10, $x11
+    ; RV32I-MO-NEXT: {{  $}}
+    ; RV32I-MO-NEXT: $x5 = PseudoCALLReg target-flags(riscv-call) @OUTLINED_FUNCTION_0, implicit-def $x5, implicit-def $x10, implicit-def $x11, implicit-def $x12, implicit $x10, implicit $x11
+    ; RV32I-MO-NEXT: PseudoRET implicit $x12
+    ;
+    ; RV64I-MO-LABEL: name: outline_2
+    ; RV64I-MO: liveins: $x10, $x11
+    ; RV64I-MO-NEXT: {{  $}}
+    ; RV64I-MO-NEXT: $x5 = PseudoCALLReg target-flags(riscv-call) @OUTLINED_FUNCTION_0, implicit-def $x5, implicit-def $x10, implicit-def $x11, implicit-def $x12, implicit $x10, implicit $x11
+    ; RV64I-MO-NEXT: PseudoRET implicit $x12
+    $x11 = ORI $x11, 1023
+    $x12 = ADDI $x10, 17
+    $x11 = AND $x12, $x11
+    $x10 = SUB $x10, $x11
+    PseudoRET implicit $x12
+...

llvm/test/CodeGen/RISCV/machine-outliner-call-x5-liveout.mir

lenary

Nice Find.

I think we can afford to be a little clearer about the difference between checks on the candidate's context (i.e. where it would be called from) vs its content (the instructions inside it), but maybe that's not helpful as when looking at the instructions most of our analysis is about e.g. liveness.

I'm not sure we need to fix the costing issue today, but I'm not sure.

llvm/lib/Target/RISCV/RISCVInstrInfo.cpp

lenary

I'm happy with how this works and fixes the issue.

I think this should also be backported to the release branch, as I think it will affect a lot of tests that use -Oz.

lenary · 2025-02-19T21:59:39Z

I've added this to the 20.x release as I do think this is a big enough bug that we should fix it on the branch. I hope that's not too controversial.

lenary · 2025-02-21T05:03:16Z

@mga-sc @wangpc-pp @resistor Please can one of you also take a look at this review? I feel relatively new to the MachineOutliner, so I would appreciate a second +1 before this is merged.

wangpc-pp

LGTM.

svs-quic · 2025-02-21T07:25:52Z

/cherry-pick 6757cf4

llvmbot · 2025-02-21T07:31:15Z

Failed to cherry-pick: 6757cf4

https://github.com/llvm/llvm-project/actions/runs/13452069337

Please manually backport the fix and push it to your github fork. Once this is done, please create a pull request

llvmbot · 2025-02-21T08:01:21Z

/pull-request #128146

llvm#117700 made a change from analyzing all the candidates to analyzing just the first candidate before deciding to either delete or keep all of them. Even though the candidates all have the same instructions, the basic blocks in which they are present are different and we will need to check each of them before deciding whether to keep or erase them. Particularly, `isAvailableAcrossAndOutOfSeq` checks to see if the register (x5 in this case) is available from the end of the MBB to the beginning of the candidate and not checking this for each candidate led to incorrect candidates being outlined resulting in correctness issues in a few downstream benchmarks. Similarly, deleting all the candidates if the first one is not viable will result in missed outlining opportunities. (cherry picked from commit 6757cf4)

svs-quic added 2 commits February 18, 2025 20:57

Pre commit test

0e094e1

Analyze all candidates

588fef8

llvmbot added the backend:RISC-V label Feb 18, 2025

svs-quic requested review from lenary, ilovepi, wangpc-pp and topperc February 18, 2025 16:06

topperc reviewed Feb 18, 2025

View reviewed changes

llvm/test/CodeGen/RISCV/machine-outliner-call-x5-liveout.mir Outdated Show resolved Hide resolved

lenary reviewed Feb 18, 2025

View reviewed changes

llvm/lib/Target/RISCV/RISCVInstrInfo.cpp Show resolved Hide resolved

llvm/lib/Target/RISCV/RISCVInstrInfo.cpp Show resolved Hide resolved

Remove IR from test

16e0ce7

lenary approved these changes Feb 19, 2025

View reviewed changes

lenary added this to the LLVM 20.X Release milestone Feb 19, 2025

wangpc-pp approved these changes Feb 21, 2025

View reviewed changes

svs-quic merged commit 6757cf4 into llvm:main Feb 21, 2025
8 checks passed

llvmbot added release:cherry-pick-failed and removed release:cherry-pick-failed labels Feb 21, 2025

svs-quic deleted the outlinerfix branch February 22, 2025 06:32

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[RISCV] [MachineOutliner] Analyze all candidates #127659

[RISCV] [MachineOutliner] Analyze all candidates #127659

Uh oh!

svs-quic commented Feb 18, 2025

Uh oh!

llvmbot commented Feb 18, 2025

Uh oh!

Uh oh!

lenary left a comment

Uh oh!

Uh oh!

Uh oh!

lenary left a comment

Uh oh!

lenary commented Feb 19, 2025

Uh oh!

lenary commented Feb 21, 2025

Uh oh!

wangpc-pp left a comment

Uh oh!

Uh oh!

svs-quic commented Feb 21, 2025 •

edited

Loading

Uh oh!

llvmbot commented Feb 21, 2025

Uh oh!

llvmbot commented Feb 21, 2025

Uh oh!

Uh oh!

[RISCV] [MachineOutliner] Analyze all candidates #127659

[RISCV] [MachineOutliner] Analyze all candidates #127659

Uh oh!

Conversation

svs-quic commented Feb 18, 2025

Uh oh!

llvmbot commented Feb 18, 2025

Uh oh!

Uh oh!

lenary left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

lenary left a comment

Choose a reason for hiding this comment

Uh oh!

lenary commented Feb 19, 2025

Uh oh!

lenary commented Feb 21, 2025

Uh oh!

wangpc-pp left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

svs-quic commented Feb 21, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented Feb 21, 2025

Uh oh!

llvmbot commented Feb 21, 2025

Uh oh!

Uh oh!

svs-quic commented Feb 21, 2025 •

edited

Loading