Skip to content

release/20.x: [RISCV] [MachineOutliner] Analyze all candidates (#127659) #128146

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Feb 21, 2025

Conversation

llvmbot
Copy link
Member

@llvmbot llvmbot commented Feb 21, 2025

Backport 6757cf4

Requested by: @svs-quic

@llvmbot llvmbot added this to the LLVM 20.X Release milestone Feb 21, 2025
@llvmbot
Copy link
Member Author

llvmbot commented Feb 21, 2025

@wangpc-pp What do you think about merging this PR to the release branch?

@llvmbot
Copy link
Member Author

llvmbot commented Feb 21, 2025

@llvm/pr-subscribers-backend-risc-v

Author: None (llvmbot)

Changes

Backport 6757cf4

Requested by: @svs-quic


Full diff: https://github.com/llvm/llvm-project/pull/128146.diff

2 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfo.cpp (+22-30)
  • (added) llvm/test/CodeGen/RISCV/machine-outliner-call-x5-liveout.mir (+136)
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
index 12a7af0750813..87f1f35835cbe 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
@@ -3010,30 +3010,25 @@ static bool cannotInsertTailCall(const MachineBasicBlock &MBB) {
   return false;
 }
 
-static std::optional<MachineOutlinerConstructionID>
-analyzeCandidate(outliner::Candidate &C) {
+static bool analyzeCandidate(outliner::Candidate &C) {
   // If last instruction is return then we can rely on
   // the verification already performed in the getOutliningTypeImpl.
   if (C.back().isReturn()) {
     assert(!cannotInsertTailCall(*C.getMBB()) &&
            "The candidate who uses return instruction must be outlined "
            "using tail call");
-    return MachineOutlinerTailCall;
+    return false;
   }
 
-  auto CandidateUsesX5 = [](outliner::Candidate &C) {
-    const TargetRegisterInfo *TRI = C.getMF()->getSubtarget().getRegisterInfo();
-    if (std::any_of(C.begin(), C.end(), [TRI](const MachineInstr &MI) {
-          return isMIModifiesReg(MI, TRI, RISCV::X5);
-        }))
-      return true;
-    return !C.isAvailableAcrossAndOutOfSeq(RISCV::X5, *TRI);
-  };
-
-  if (!CandidateUsesX5(C))
-    return MachineOutlinerDefault;
+  // Filter out candidates where the X5 register (t0) can't be used to setup
+  // the function call.
+  const TargetRegisterInfo *TRI = C.getMF()->getSubtarget().getRegisterInfo();
+  if (std::any_of(C.begin(), C.end(), [TRI](const MachineInstr &MI) {
+        return isMIModifiesReg(MI, TRI, RISCV::X5);
+      }))
+    return true;
 
-  return std::nullopt;
+  return !C.isAvailableAcrossAndOutOfSeq(RISCV::X5, *TRI);
 }
 
 std::optional<std::unique_ptr<outliner::OutlinedFunction>>
@@ -3042,35 +3037,32 @@ RISCVInstrInfo::getOutliningCandidateInfo(
     std::vector<outliner::Candidate> &RepeatedSequenceLocs,
     unsigned MinRepeats) const {
 
-  // Each RepeatedSequenceLoc is identical.
-  outliner::Candidate &Candidate = RepeatedSequenceLocs[0];
-  auto CandidateInfo = analyzeCandidate(Candidate);
-  if (!CandidateInfo)
-    RepeatedSequenceLocs.clear();
+  // Analyze each candidate and erase the ones that are not viable.
+  llvm::erase_if(RepeatedSequenceLocs, analyzeCandidate);
 
   // If the sequence doesn't have enough candidates left, then we're done.
   if (RepeatedSequenceLocs.size() < MinRepeats)
     return std::nullopt;
 
+  // Each RepeatedSequenceLoc is identical.
+  outliner::Candidate &Candidate = RepeatedSequenceLocs[0];
   unsigned InstrSizeCExt =
       Candidate.getMF()->getSubtarget<RISCVSubtarget>().hasStdExtCOrZca() ? 2
                                                                           : 4;
   unsigned CallOverhead = 0, FrameOverhead = 0;
 
-  MachineOutlinerConstructionID MOCI = CandidateInfo.value();
-  switch (MOCI) {
-  case MachineOutlinerDefault:
-    // call t0, function = 8 bytes.
-    CallOverhead = 8;
-    // jr t0 = 4 bytes, 2 bytes if compressed instructions are enabled.
-    FrameOverhead = InstrSizeCExt;
-    break;
-  case MachineOutlinerTailCall:
+  MachineOutlinerConstructionID MOCI = MachineOutlinerDefault;
+  if (Candidate.back().isReturn()) {
+    MOCI = MachineOutlinerTailCall;
     // tail call = auipc + jalr in the worst case without linker relaxation.
     CallOverhead = 4 + InstrSizeCExt;
     // Using tail call we move ret instruction from caller to callee.
     FrameOverhead = 0;
-    break;
+  } else {
+    // call t0, function = 8 bytes.
+    CallOverhead = 8;
+    // jr t0 = 4 bytes, 2 bytes if compressed instructions are enabled.
+    FrameOverhead = InstrSizeCExt;
   }
 
   for (auto &C : RepeatedSequenceLocs)
diff --git a/llvm/test/CodeGen/RISCV/machine-outliner-call-x5-liveout.mir b/llvm/test/CodeGen/RISCV/machine-outliner-call-x5-liveout.mir
new file mode 100644
index 0000000000000..f7bea33e52885
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/machine-outliner-call-x5-liveout.mir
@@ -0,0 +1,136 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=riscv32 -x mir -run-pass=machine-outliner -simplify-mir -verify-machineinstrs < %s \
+# RUN: | FileCheck -check-prefixes=RV32I-MO %s
+# RUN: llc -mtriple=riscv64 -x mir -run-pass=machine-outliner -simplify-mir -verify-machineinstrs < %s \
+# RUN: | FileCheck -check-prefixes=RV64I-MO %s
+
+# MIR has been edited by hand to have x5 as live out in @dont_outline
+
+---
+
+name:            outline_0
+tracksRegLiveness: true
+isOutlined: false
+body:             |
+  bb.0:
+    liveins: $x10, $x11
+
+    ; RV32I-MO-LABEL: name: outline_0
+    ; RV32I-MO: liveins: $x10, $x11
+    ; RV32I-MO-NEXT: {{  $}}
+    ; RV32I-MO-NEXT: $x5 = PseudoCALLReg target-flags(riscv-call) @OUTLINED_FUNCTION_0, implicit-def $x5, implicit-def $x10, implicit-def $x11, implicit-def $x12, implicit $x10, implicit $x11
+    ; RV32I-MO-NEXT: PseudoRET implicit $x11
+    ;
+    ; RV64I-MO-LABEL: name: outline_0
+    ; RV64I-MO: liveins: $x10, $x11
+    ; RV64I-MO-NEXT: {{  $}}
+    ; RV64I-MO-NEXT: $x5 = PseudoCALLReg target-flags(riscv-call) @OUTLINED_FUNCTION_0, implicit-def $x5, implicit-def $x10, implicit-def $x11, implicit-def $x12, implicit $x10, implicit $x11
+    ; RV64I-MO-NEXT: PseudoRET implicit $x11
+    $x11 = ORI $x11, 1023
+    $x12 = ADDI $x10, 17
+    $x11 = AND $x12, $x11
+    $x10 = SUB $x10, $x11
+    PseudoRET implicit $x11
+
+...
+---
+
+name:            dont_outline
+tracksRegLiveness: true
+isOutlined: false
+body:             |
+  ; RV32I-MO-LABEL: name: dont_outline
+  ; RV32I-MO: bb.0:
+  ; RV32I-MO-NEXT:   liveins: $x10, $x11
+  ; RV32I-MO-NEXT: {{  $}}
+  ; RV32I-MO-NEXT:   $x11 = ORI $x11, 1023
+  ; RV32I-MO-NEXT:   $x12 = ADDI $x10, 17
+  ; RV32I-MO-NEXT:   $x11 = AND $x12, $x11
+  ; RV32I-MO-NEXT:   $x10 = SUB $x10, $x11
+  ; RV32I-MO-NEXT:   BEQ $x10, $x11, %bb.1
+  ; RV32I-MO-NEXT: {{  $}}
+  ; RV32I-MO-NEXT: bb.1:
+  ; RV32I-MO-NEXT:   liveins: $x10, $x5
+  ; RV32I-MO-NEXT: {{  $}}
+  ; RV32I-MO-NEXT:   PseudoRET implicit $x10, implicit $x5
+  ;
+  ; RV64I-MO-LABEL: name: dont_outline
+  ; RV64I-MO: bb.0:
+  ; RV64I-MO-NEXT:   liveins: $x10, $x11
+  ; RV64I-MO-NEXT: {{  $}}
+  ; RV64I-MO-NEXT:   $x11 = ORI $x11, 1023
+  ; RV64I-MO-NEXT:   $x12 = ADDI $x10, 17
+  ; RV64I-MO-NEXT:   $x11 = AND $x12, $x11
+  ; RV64I-MO-NEXT:   $x10 = SUB $x10, $x11
+  ; RV64I-MO-NEXT:   BEQ $x10, $x11, %bb.1
+  ; RV64I-MO-NEXT: {{  $}}
+  ; RV64I-MO-NEXT: bb.1:
+  ; RV64I-MO-NEXT:   liveins: $x10, $x5
+  ; RV64I-MO-NEXT: {{  $}}
+  ; RV64I-MO-NEXT:   PseudoRET implicit $x10, implicit $x5
+  bb.0:
+    liveins: $x10, $x11
+
+    $x11 = ORI $x11, 1023
+    $x12 = ADDI $x10, 17
+    $x11 = AND $x12, $x11
+    $x10 = SUB $x10, $x11
+    BEQ $x10, $x11, %bb.1
+
+  bb.1:
+    liveins: $x10, $x5
+    PseudoRET implicit $x10, implicit $x5
+
+...
+---
+
+name:            outline_1
+tracksRegLiveness: true
+isOutlined: false
+body:             |
+  bb.0:
+    liveins: $x10, $x11
+
+    ; RV32I-MO-LABEL: name: outline_1
+    ; RV32I-MO: liveins: $x10, $x11
+    ; RV32I-MO-NEXT: {{  $}}
+    ; RV32I-MO-NEXT: $x5 = PseudoCALLReg target-flags(riscv-call) @OUTLINED_FUNCTION_0, implicit-def $x5, implicit-def $x10, implicit-def $x11, implicit-def $x12, implicit $x10, implicit $x11
+    ; RV32I-MO-NEXT: PseudoRET implicit $x10
+    ;
+    ; RV64I-MO-LABEL: name: outline_1
+    ; RV64I-MO: liveins: $x10, $x11
+    ; RV64I-MO-NEXT: {{  $}}
+    ; RV64I-MO-NEXT: $x5 = PseudoCALLReg target-flags(riscv-call) @OUTLINED_FUNCTION_0, implicit-def $x5, implicit-def $x10, implicit-def $x11, implicit-def $x12, implicit $x10, implicit $x11
+    ; RV64I-MO-NEXT: PseudoRET implicit $x10
+    $x11 = ORI $x11, 1023
+    $x12 = ADDI $x10, 17
+    $x11 = AND $x12, $x11
+    $x10 = SUB $x10, $x11
+    PseudoRET implicit $x10
+
+...
+---
+name:            outline_2
+tracksRegLiveness: true
+isOutlined: false
+body:             |
+  bb.0:
+    liveins: $x10, $x11
+
+    ; RV32I-MO-LABEL: name: outline_2
+    ; RV32I-MO: liveins: $x10, $x11
+    ; RV32I-MO-NEXT: {{  $}}
+    ; RV32I-MO-NEXT: $x5 = PseudoCALLReg target-flags(riscv-call) @OUTLINED_FUNCTION_0, implicit-def $x5, implicit-def $x10, implicit-def $x11, implicit-def $x12, implicit $x10, implicit $x11
+    ; RV32I-MO-NEXT: PseudoRET implicit $x12
+    ;
+    ; RV64I-MO-LABEL: name: outline_2
+    ; RV64I-MO: liveins: $x10, $x11
+    ; RV64I-MO-NEXT: {{  $}}
+    ; RV64I-MO-NEXT: $x5 = PseudoCALLReg target-flags(riscv-call) @OUTLINED_FUNCTION_0, implicit-def $x5, implicit-def $x10, implicit-def $x11, implicit-def $x12, implicit $x10, implicit $x11
+    ; RV64I-MO-NEXT: PseudoRET implicit $x12
+    $x11 = ORI $x11, 1023
+    $x12 = ADDI $x10, 17
+    $x11 = AND $x12, $x11
+    $x10 = SUB $x10, $x11
+    PseudoRET implicit $x12
+...

Copy link
Contributor

@wangpc-pp wangpc-pp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

llvm#117700 made a change from analyzing all the candidates to analyzing
just the first candidate before deciding to either delete or keep all of
them.

Even though the candidates all have the same instructions, the basic
blocks in which they are present are different and we will need to check
each of them before deciding whether to keep or erase them.
Particularly, `isAvailableAcrossAndOutOfSeq` checks to see if the
register (x5 in this case) is available from the end of the MBB to the
beginning of the candidate and not checking this for each candidate led
to incorrect candidates being outlined resulting in correctness issues
in a few downstream benchmarks.

Similarly, deleting all the candidates if the first one is not viable
will result in missed outlining opportunities.

(cherry picked from commit 6757cf4)
@tstellar tstellar merged commit 3076a68 into llvm:release/20.x Feb 21, 2025
7 of 10 checks passed
Copy link

@svs-quic (or anyone else). If you would like to add a note about this fix in the release notes (completely optional). Please reply to this comment with a one or two sentence description of the fix. When you are done, please add the release:note label to this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Development

Successfully merging this pull request may close these issues.

4 participants