Reapply "[RISCV] Implement tail call optimization in machine outliner" #117700
@ilovepi, @wangpc-pp, please give this a quick review and merge.
Hi @mga-sc: I reverted the original PR to unbreak the CI. In general, we revert unless we can land a fix-forward immediately.
Don't worry; please cherry-pick the reverted commit here and reland it with your fix.
Force-pushed: 4ebe64b to 8c3b63e
Following up on issue llvm#89822, this patch lets the machine outliner pass use tail calls. It also enables outlining of patterns that use the X5 (T0) register.
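To make the two call styles concrete, here is a minimal sketch of the decision the description implies. It is a toy model, not the LLVM implementation: `ToyInstr`, `OutlineKind`, `classify`, and the `X5FreeAroundSeq` flag are hypothetical names, and the sketch leaves out the extra conditions the real patch checks (patchable functions and reads of the register used to expand `PseudoTAIL`).

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Toy stand-in for a machine instruction (hypothetical, not an LLVM type).
struct ToyInstr {
  bool IsReturn;
  bool ModifiesX5;
};

enum class OutlineKind { TailCall, Default };

// Decide how a repeated sequence could be called once it is outlined.
// X5FreeAroundSeq models the liveness query the real pass performs on x5.
std::optional<OutlineKind> classify(const std::vector<ToyInstr> &Seq,
                                    bool X5FreeAroundSeq) {
  if (Seq.empty())
    return std::nullopt;
  // A sequence ending in a return can be replaced by a tail call, so the
  // outlined body keeps the return and x5 is not needed as a link register.
  if (Seq.back().IsReturn)
    return OutlineKind::TailCall;
  // Otherwise the call returns through t0 (x5): the sequence must not
  // clobber x5, and x5 must be free around the call site.
  for (const ToyInstr &I : Seq)
    if (I.ModifiesX5)
      return std::nullopt;
  if (!X5FreeAroundSeq)
    return std::nullopt;
  return OutlineKind::Default;
}
```

In this model a sequence that writes x5 but ends in a return is still classified as `OutlineKind::TailCall`, which is the sense in which patterns using X5 become outlinable.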
@llvm/pr-subscribers-backend-risc-v
Author: Mark Goncharov (mga-sc)
Changes
This test has been failing after a commit. Changes to the same file were merged earlier today.
Patch is 42.19 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/117700.diff
11 Files Affected:
diff --git a/llvm/lib/Target/RISCV/MCTargetDesc/RISCVBaseInfo.h b/llvm/lib/Target/RISCV/MCTargetDesc/RISCVBaseInfo.h
index 19103e219cb800..ca2f868cd4e764 100644
--- a/llvm/lib/Target/RISCV/MCTargetDesc/RISCVBaseInfo.h
+++ b/llvm/lib/Target/RISCV/MCTargetDesc/RISCVBaseInfo.h
@@ -208,6 +208,12 @@ static inline unsigned getVLOpNum(const MCInstrDesc &Desc) {
return Desc.getNumOperands() - Offset;
}
+static inline unsigned getTailExpandUseRegNo(const FeatureBitset &FeatureBits) {
+ // For Zicfilp, PseudoTAIL should be expanded to a software guarded branch.
+ // It means to use t2(x7) as rs1 of JALR to expand PseudoTAIL.
+ return FeatureBits[RISCV::FeatureStdExtZicfilp] ? RISCV::X7 : RISCV::X6;
+}
+
static inline unsigned getSEWOpNum(const MCInstrDesc &Desc) {
const uint64_t TSFlags = Desc.TSFlags;
assert(hasSEWOp(TSFlags));
diff --git a/llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCCodeEmitter.cpp b/llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCCodeEmitter.cpp
index 6bb49e2bb85fe1..a28bf1186589d9 100644
--- a/llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCCodeEmitter.cpp
+++ b/llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCCodeEmitter.cpp
@@ -124,11 +124,7 @@ void RISCVMCCodeEmitter::expandFunctionCall(const MCInst &MI,
MCRegister Ra;
if (MI.getOpcode() == RISCV::PseudoTAIL) {
Func = MI.getOperand(0);
- Ra = RISCV::X6;
- // For Zicfilp, PseudoTAIL should be expanded to a software guarded branch.
- // It means to use t2(x7) as rs1 of JALR to expand PseudoTAIL.
- if (STI.hasFeature(RISCV::FeatureStdExtZicfilp))
- Ra = RISCV::X7;
+ Ra = RISCVII::getTailExpandUseRegNo(STI.getFeatureBits());
} else if (MI.getOpcode() == RISCV::PseudoCALLReg) {
Func = MI.getOperand(1);
Ra = MI.getOperand(0).getReg();
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
index 933e776da47404..47273d6bc06d65 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
@@ -11,6 +11,7 @@
//===----------------------------------------------------------------------===//
#include "RISCVInstrInfo.h"
+#include "MCTargetDesc/RISCVBaseInfo.h"
#include "MCTargetDesc/RISCVMatInt.h"
#include "RISCV.h"
#include "RISCVMachineFunctionInfo.h"
@@ -2927,6 +2928,7 @@ bool RISCVInstrInfo::isMBBSafeToOutlineFrom(MachineBasicBlock &MBB,
// Enum values indicating how an outlined call should be constructed.
enum MachineOutlinerConstructionID {
+ MachineOutlinerTailCall,
MachineOutlinerDefault
};
@@ -2935,46 +2937,118 @@ bool RISCVInstrInfo::shouldOutlineFromFunctionByDefault(
return MF.getFunction().hasMinSize();
}
+static bool isCandidatePatchable(const MachineBasicBlock &MBB) {
+ const MachineFunction *MF = MBB.getParent();
+ const Function &F = MF->getFunction();
+ return F.getFnAttribute("fentry-call").getValueAsBool() ||
+ F.hasFnAttribute("patchable-function-entry");
+}
+
+static bool isMIReadsReg(const MachineInstr &MI, const TargetRegisterInfo *TRI,
+ unsigned RegNo) {
+ return MI.readsRegister(RegNo, TRI) ||
+ MI.getDesc().hasImplicitUseOfPhysReg(RegNo);
+}
+
+static bool isMIModifiesReg(const MachineInstr &MI,
+ const TargetRegisterInfo *TRI, unsigned RegNo) {
+ return MI.modifiesRegister(RegNo, TRI) ||
+ MI.getDesc().hasImplicitDefOfPhysReg(RegNo);
+}
+
+static bool cannotInsertTailCall(const MachineBasicBlock &MBB) {
+ if (!MBB.back().isReturn())
+ return true;
+ if (isCandidatePatchable(MBB))
+ return true;
+
+ // If the candidate reads the pre-set register
+ // that can be used for expanding PseudoTAIL instruction,
+ // then we cannot insert tail call.
+ const TargetSubtargetInfo &STI = MBB.getParent()->getSubtarget();
+ unsigned TailExpandUseRegNo =
+ RISCVII::getTailExpandUseRegNo(STI.getFeatureBits());
+ for (const MachineInstr &MI : MBB) {
+ if (isMIReadsReg(MI, STI.getRegisterInfo(), TailExpandUseRegNo))
+ return true;
+ if (isMIModifiesReg(MI, STI.getRegisterInfo(), TailExpandUseRegNo))
+ break;
+ }
+ return false;
+}
+
+static std::optional<MachineOutlinerConstructionID>
+analyzeCandidate(outliner::Candidate &C) {
+ // If last instruction is return then we can rely on
+ // the verification already performed in the getOutliningTypeImpl.
+ if (C.back().isReturn()) {
+ assert(!cannotInsertTailCall(*C.getMBB()) &&
+ "The candidate who uses return instruction must be outlined "
+ "using tail call");
+ return MachineOutlinerTailCall;
+ }
+
+ auto CandidateUsesX5 = [](outliner::Candidate &C) {
+ const TargetRegisterInfo *TRI = C.getMF()->getSubtarget().getRegisterInfo();
+ if (std::any_of(C.begin(), C.end(), [TRI](const MachineInstr &MI) {
+ return isMIModifiesReg(MI, TRI, RISCV::X5);
+ }))
+ return true;
+ return !C.isAvailableAcrossAndOutOfSeq(RISCV::X5, *TRI);
+ };
+
+ if (!CandidateUsesX5(C))
+ return MachineOutlinerDefault;
+
+ return std::nullopt;
+}
+
std::optional<std::unique_ptr<outliner::OutlinedFunction>>
RISCVInstrInfo::getOutliningCandidateInfo(
const MachineModuleInfo &MMI,
std::vector<outliner::Candidate> &RepeatedSequenceLocs,
unsigned MinRepeats) const {
- // First we need to filter out candidates where the X5 register (IE t0) can't
- // be used to setup the function call.
- auto CannotInsertCall = [](outliner::Candidate &C) {
- const TargetRegisterInfo *TRI = C.getMF()->getSubtarget().getRegisterInfo();
- return !C.isAvailableAcrossAndOutOfSeq(RISCV::X5, *TRI);
- };
-
- llvm::erase_if(RepeatedSequenceLocs, CannotInsertCall);
+ // Each RepeatedSequenceLoc is identical.
+ outliner::Candidate &Candidate = RepeatedSequenceLocs[0];
+ auto CandidateInfo = analyzeCandidate(Candidate);
+ if (!CandidateInfo)
+ RepeatedSequenceLocs.clear();
// If the sequence doesn't have enough candidates left, then we're done.
if (RepeatedSequenceLocs.size() < MinRepeats)
return std::nullopt;
- unsigned SequenceSize = 0;
-
- for (auto &MI : RepeatedSequenceLocs[0])
- SequenceSize += getInstSizeInBytes(MI);
+ unsigned InstrSizeCExt =
+ Candidate.getMF()->getSubtarget<RISCVSubtarget>().hasStdExtCOrZca() ? 2
+ : 4;
+ unsigned CallOverhead = 0, FrameOverhead = 0;
+
+ MachineOutlinerConstructionID MOCI = CandidateInfo.value();
+ switch (MOCI) {
+ case MachineOutlinerDefault:
+ // call t0, function = 8 bytes.
+ CallOverhead = 8;
+ // jr t0 = 4 bytes, 2 bytes if compressed instructions are enabled.
+ FrameOverhead = InstrSizeCExt;
+ break;
+ case MachineOutlinerTailCall:
+ // tail call = auipc + jalr in the worst case without linker relaxation.
+ CallOverhead = 4 + InstrSizeCExt;
+ // Using tail call we move ret instruction from caller to callee.
+ FrameOverhead = 0;
+ break;
+ }
- // call t0, function = 8 bytes.
- unsigned CallOverhead = 8;
for (auto &C : RepeatedSequenceLocs)
- C.setCallInfo(MachineOutlinerDefault, CallOverhead);
+ C.setCallInfo(MOCI, CallOverhead);
- // jr t0 = 4 bytes, 2 bytes if compressed instructions are enabled.
- unsigned FrameOverhead = 4;
- if (RepeatedSequenceLocs[0]
- .getMF()
- ->getSubtarget<RISCVSubtarget>()
- .hasStdExtCOrZca())
- FrameOverhead = 2;
+ unsigned SequenceSize = 0;
+ for (auto &MI : Candidate)
+ SequenceSize += getInstSizeInBytes(MI);
return std::make_unique<outliner::OutlinedFunction>(
- RepeatedSequenceLocs, SequenceSize, FrameOverhead,
- MachineOutlinerDefault);
+ RepeatedSequenceLocs, SequenceSize, FrameOverhead, MOCI);
}
outliner::InstrType
@@ -2995,15 +3069,8 @@ RISCVInstrInfo::getOutliningTypeImpl(const MachineModuleInfo &MMI,
return F.needsUnwindTableEntry() ? outliner::InstrType::Illegal
: outliner::InstrType::Invisible;
- // We need support for tail calls to outlined functions before return
- // statements can be allowed.
- if (MI.isReturn())
- return outliner::InstrType::Illegal;
-
- // Don't allow modifying the X5 register which we use for return addresses for
- // these outlined functions.
- if (MI.modifiesRegister(RISCV::X5, TRI) ||
- MI.getDesc().hasImplicitDefOfPhysReg(RISCV::X5))
+ if (cannotInsertTailCall(*MBB) &&
+ (MI.isReturn() || isMIModifiesReg(MI, TRI, RISCV::X5)))
return outliner::InstrType::Illegal;
// Make sure the operands don't reference something unsafe.
@@ -3039,6 +3106,9 @@ void RISCVInstrInfo::buildOutlinedFrame(
}
}
+ if (OF.FrameConstructionID == MachineOutlinerTailCall)
+ return;
+
MBB.addLiveIn(RISCV::X5);
// Add in a return instruction to the end of the outlined frame.
@@ -3052,6 +3122,13 @@ MachineBasicBlock::iterator RISCVInstrInfo::insertOutlinedCall(
Module &M, MachineBasicBlock &MBB, MachineBasicBlock::iterator &It,
MachineFunction &MF, outliner::Candidate &C) const {
+ if (C.CallConstructionID == MachineOutlinerTailCall) {
+ It = MBB.insert(It, BuildMI(MF, DebugLoc(), get(RISCV::PseudoTAIL))
+ .addGlobalAddress(M.getNamedValue(MF.getName()),
+ /*Offset=*/0, RISCVII::MO_CALL));
+ return It;
+ }
+
// Add in a call instruction to the outlined function at the given location.
It = MBB.insert(It,
BuildMI(MF, DebugLoc(), get(RISCV::PseudoCALLReg), RISCV::X5)
diff --git a/llvm/test/CodeGen/RISCV/compress-opt-select.ll b/llvm/test/CodeGen/RISCV/compress-opt-select.ll
index f9333a45016a06..733c84ac236133 100644
--- a/llvm/test/CodeGen/RISCV/compress-opt-select.ll
+++ b/llvm/test/CodeGen/RISCV/compress-opt-select.ll
@@ -13,11 +13,9 @@ define i32 @ne_small_pos(i32 %in0) minsize {
; RV32IFDC-NEXT: c.li a1, 20
; RV32IFDC-NEXT: bne a0, a1, .LBB0_2
; RV32IFDC-NEXT: # %bb.1:
-; RV32IFDC-NEXT: addi a0, zero, 42
-; RV32IFDC-NEXT: c.jr ra
+; RV32IFDC-NEXT: tail OUTLINED_FUNCTION_1
; RV32IFDC-NEXT: .LBB0_2:
-; RV32IFDC-NEXT: addi a0, zero, -99
-; RV32IFDC-NEXT: c.jr ra
+; RV32IFDC-NEXT: tail OUTLINED_FUNCTION_0
;
; RV32IFD-LABEL: ne_small_pos:
; RV32IFD: # %bb.0:
@@ -41,11 +39,9 @@ define i32 @ne_small_neg(i32 %in0) minsize {
; RV32IFDC-NEXT: c.li a1, -20
; RV32IFDC-NEXT: bne a0, a1, .LBB1_2
; RV32IFDC-NEXT: # %bb.1:
-; RV32IFDC-NEXT: addi a0, zero, 42
-; RV32IFDC-NEXT: c.jr ra
+; RV32IFDC-NEXT: tail OUTLINED_FUNCTION_1
; RV32IFDC-NEXT: .LBB1_2:
-; RV32IFDC-NEXT: addi a0, zero, -99
-; RV32IFDC-NEXT: c.jr ra
+; RV32IFDC-NEXT: tail OUTLINED_FUNCTION_0
;
; RV32IFD-LABEL: ne_small_neg:
; RV32IFD: # %bb.0:
@@ -69,11 +65,9 @@ define i32 @ne_small_edge_pos(i32 %in0) minsize {
; RV32IFDC-NEXT: c.li a1, 31
; RV32IFDC-NEXT: bne a0, a1, .LBB2_2
; RV32IFDC-NEXT: # %bb.1:
-; RV32IFDC-NEXT: addi a0, zero, 42
-; RV32IFDC-NEXT: c.jr ra
+; RV32IFDC-NEXT: tail OUTLINED_FUNCTION_1
; RV32IFDC-NEXT: .LBB2_2:
-; RV32IFDC-NEXT: addi a0, zero, -99
-; RV32IFDC-NEXT: c.jr ra
+; RV32IFDC-NEXT: tail OUTLINED_FUNCTION_0
;
; RV32IFD-LABEL: ne_small_edge_pos:
; RV32IFD: # %bb.0:
@@ -97,11 +91,9 @@ define i32 @ne_small_edge_neg(i32 %in0) minsize {
; RV32IFDC-NEXT: c.li a1, -32
; RV32IFDC-NEXT: bne a0, a1, .LBB3_2
; RV32IFDC-NEXT: # %bb.1:
-; RV32IFDC-NEXT: addi a0, zero, 42
-; RV32IFDC-NEXT: c.jr ra
+; RV32IFDC-NEXT: tail OUTLINED_FUNCTION_1
; RV32IFDC-NEXT: .LBB3_2:
-; RV32IFDC-NEXT: addi a0, zero, -99
-; RV32IFDC-NEXT: c.jr ra
+; RV32IFDC-NEXT: tail OUTLINED_FUNCTION_0
;
; RV32IFD-LABEL: ne_small_edge_neg:
; RV32IFD: # %bb.0:
@@ -126,11 +118,9 @@ define i32 @ne_medium_ledge_pos(i32 %in0) minsize {
; RV32IFDC-NEXT: addi a0, a0, -33
; RV32IFDC-NEXT: c.bnez a0, .LBB4_2
; RV32IFDC-NEXT: # %bb.1:
-; RV32IFDC-NEXT: addi a0, zero, 42
-; RV32IFDC-NEXT: c.jr ra
+; RV32IFDC-NEXT: tail OUTLINED_FUNCTION_1
; RV32IFDC-NEXT: .LBB4_2:
-; RV32IFDC-NEXT: addi a0, zero, -99
-; RV32IFDC-NEXT: c.jr ra
+; RV32IFDC-NEXT: tail OUTLINED_FUNCTION_0
;
; RV32IFD-LABEL: ne_medium_ledge_pos:
; RV32IFD: # %bb.0:
@@ -155,11 +145,9 @@ define i32 @ne_medium_ledge_neg(i32 %in0) minsize {
; RV32IFDC-NEXT: addi a0, a0, 33
; RV32IFDC-NEXT: c.bnez a0, .LBB5_2
; RV32IFDC-NEXT: # %bb.1:
-; RV32IFDC-NEXT: addi a0, zero, 42
-; RV32IFDC-NEXT: c.jr ra
+; RV32IFDC-NEXT: tail OUTLINED_FUNCTION_1
; RV32IFDC-NEXT: .LBB5_2:
-; RV32IFDC-NEXT: addi a0, zero, -99
-; RV32IFDC-NEXT: c.jr ra
+; RV32IFDC-NEXT: tail OUTLINED_FUNCTION_0
;
; RV32IFD-LABEL: ne_medium_ledge_neg:
; RV32IFD: # %bb.0:
@@ -184,11 +172,9 @@ define i32 @ne_medium_pos(i32 %in0) minsize {
; RV32IFDC-NEXT: addi a0, a0, -63
; RV32IFDC-NEXT: c.bnez a0, .LBB6_2
; RV32IFDC-NEXT: # %bb.1:
-; RV32IFDC-NEXT: addi a0, zero, 42
-; RV32IFDC-NEXT: c.jr ra
+; RV32IFDC-NEXT: tail OUTLINED_FUNCTION_1
; RV32IFDC-NEXT: .LBB6_2:
-; RV32IFDC-NEXT: addi a0, zero, -99
-; RV32IFDC-NEXT: c.jr ra
+; RV32IFDC-NEXT: tail OUTLINED_FUNCTION_0
;
; RV32IFD-LABEL: ne_medium_pos:
; RV32IFD: # %bb.0:
@@ -213,11 +199,9 @@ define i32 @ne_medium_neg(i32 %in0) minsize {
; RV32IFDC-NEXT: addi a0, a0, 63
; RV32IFDC-NEXT: c.bnez a0, .LBB7_2
; RV32IFDC-NEXT: # %bb.1:
-; RV32IFDC-NEXT: addi a0, zero, 42
-; RV32IFDC-NEXT: c.jr ra
+; RV32IFDC-NEXT: tail OUTLINED_FUNCTION_1
; RV32IFDC-NEXT: .LBB7_2:
-; RV32IFDC-NEXT: addi a0, zero, -99
-; RV32IFDC-NEXT: c.jr ra
+; RV32IFDC-NEXT: tail OUTLINED_FUNCTION_0
;
; RV32IFD-LABEL: ne_medium_neg:
; RV32IFD: # %bb.0:
@@ -242,11 +226,9 @@ define i32 @ne_medium_bedge_pos(i32 %in0) minsize {
; RV32IFDC-NEXT: addi a0, a0, -2047
; RV32IFDC-NEXT: c.bnez a0, .LBB8_2
; RV32IFDC-NEXT: # %bb.1:
-; RV32IFDC-NEXT: addi a0, zero, 42
-; RV32IFDC-NEXT: c.jr ra
+; RV32IFDC-NEXT: tail OUTLINED_FUNCTION_1
; RV32IFDC-NEXT: .LBB8_2:
-; RV32IFDC-NEXT: addi a0, zero, -99
-; RV32IFDC-NEXT: c.jr ra
+; RV32IFDC-NEXT: tail OUTLINED_FUNCTION_0
;
; RV32IFD-LABEL: ne_medium_bedge_pos:
; RV32IFD: # %bb.0:
@@ -271,11 +253,9 @@ define i32 @ne_medium_bedge_neg(i32 %in0) minsize {
; RV32IFDC-NEXT: addi a0, a0, 2047
; RV32IFDC-NEXT: c.bnez a0, .LBB9_2
; RV32IFDC-NEXT: # %bb.1:
-; RV32IFDC-NEXT: addi a0, zero, 42
-; RV32IFDC-NEXT: c.jr ra
+; RV32IFDC-NEXT: tail OUTLINED_FUNCTION_1
; RV32IFDC-NEXT: .LBB9_2:
-; RV32IFDC-NEXT: addi a0, zero, -99
-; RV32IFDC-NEXT: c.jr ra
+; RV32IFDC-NEXT: tail OUTLINED_FUNCTION_0
;
; RV32IFD-LABEL: ne_medium_bedge_neg:
; RV32IFD: # %bb.0:
@@ -300,11 +280,9 @@ define i32 @ne_big_ledge_pos(i32 %in0) minsize {
; RV32IFDC-NEXT: c.slli a1, 11
; RV32IFDC-NEXT: bne a0, a1, .LBB10_2
; RV32IFDC-NEXT: # %bb.1:
-; RV32IFDC-NEXT: addi a0, zero, 42
-; RV32IFDC-NEXT: c.jr ra
+; RV32IFDC-NEXT: tail OUTLINED_FUNCTION_1
; RV32IFDC-NEXT: .LBB10_2:
-; RV32IFDC-NEXT: addi a0, zero, -99
-; RV32IFDC-NEXT: c.jr ra
+; RV32IFDC-NEXT: tail OUTLINED_FUNCTION_0
;
; RV32IFD-LABEL: ne_big_ledge_pos:
; RV32IFD: # %bb.0:
@@ -329,11 +307,9 @@ define i32 @ne_big_ledge_neg(i32 %in0) minsize {
; RV32IFDC-NEXT: addi a1, zero, -2048
; RV32IFDC-NEXT: bne a0, a1, .LBB11_2
; RV32IFDC-NEXT: # %bb.1:
-; RV32IFDC-NEXT: addi a0, zero, 42
-; RV32IFDC-NEXT: c.jr ra
+; RV32IFDC-NEXT: tail OUTLINED_FUNCTION_1
; RV32IFDC-NEXT: .LBB11_2:
-; RV32IFDC-NEXT: addi a0, zero, -99
-; RV32IFDC-NEXT: c.jr ra
+; RV32IFDC-NEXT: tail OUTLINED_FUNCTION_0
;
; RV32IFD-LABEL: ne_big_ledge_neg:
; RV32IFD: # %bb.0:
@@ -360,11 +336,9 @@ define i32 @eq_small_pos(i32 %in0) minsize {
; RV32IFDC-NEXT: c.li a1, 20
; RV32IFDC-NEXT: beq a0, a1, .LBB12_2
; RV32IFDC-NEXT: # %bb.1:
-; RV32IFDC-NEXT: addi a0, zero, 42
-; RV32IFDC-NEXT: c.jr ra
+; RV32IFDC-NEXT: tail OUTLINED_FUNCTION_1
; RV32IFDC-NEXT: .LBB12_2:
-; RV32IFDC-NEXT: addi a0, zero, -99
-; RV32IFDC-NEXT: c.jr ra
+; RV32IFDC-NEXT: tail OUTLINED_FUNCTION_0
;
; RV32IFD-LABEL: eq_small_pos:
; RV32IFD: # %bb.0:
@@ -388,11 +362,9 @@ define i32 @eq_small_neg(i32 %in0) minsize {
; RV32IFDC-NEXT: c.li a1, -20
; RV32IFDC-NEXT: beq a0, a1, .LBB13_2
; RV32IFDC-NEXT: # %bb.1:
-; RV32IFDC-NEXT: addi a0, zero, 42
-; RV32IFDC-NEXT: c.jr ra
+; RV32IFDC-NEXT: tail OUTLINED_FUNCTION_1
; RV32IFDC-NEXT: .LBB13_2:
-; RV32IFDC-NEXT: addi a0, zero, -99
-; RV32IFDC-NEXT: c.jr ra
+; RV32IFDC-NEXT: tail OUTLINED_FUNCTION_0
;
; RV32IFD-LABEL: eq_small_neg:
; RV32IFD: # %bb.0:
@@ -416,11 +388,9 @@ define i32 @eq_small_edge_pos(i32 %in0) minsize {
; RV32IFDC-NEXT: c.li a1, 31
; RV32IFDC-NEXT: beq a0, a1, .LBB14_2
; RV32IFDC-NEXT: # %bb.1:
-; RV32IFDC-NEXT: addi a0, zero, 42
-; RV32IFDC-NEXT: c.jr ra
+; RV32IFDC-NEXT: tail OUTLINED_FUNCTION_1
; RV32IFDC-NEXT: .LBB14_2:
-; RV32IFDC-NEXT: addi a0, zero, -99
-; RV32IFDC-NEXT: c.jr ra
+; RV32IFDC-NEXT: tail OUTLINED_FUNCTION_0
;
; RV32IFD-LABEL: eq_small_edge_pos:
; RV32IFD: # %bb.0:
@@ -444,11 +414,9 @@ define i32 @eq_small_edge_neg(i32 %in0) minsize {
; RV32IFDC-NEXT: c.li a1, -32
; RV32IFDC-NEXT: beq a0, a1, .LBB15_2
; RV32IFDC-NEXT: # %bb.1:
-; RV32IFDC-NEXT: addi a0, zero, 42
-; RV32IFDC-NEXT: c.jr ra
+; RV32IFDC-NEXT: tail OUTLINED_FUNCTION_1
; RV32IFDC-NEXT: .LBB15_2:
-; RV32IFDC-NEXT: addi a0, zero, -99
-; RV32IFDC-NEXT: c.jr ra
+; RV32IFDC-NEXT: tail OUTLINED_FUNCTION_0
;
; RV32IFD-LABEL: eq_small_edge_neg:
; RV32IFD: # %bb.0:
@@ -473,11 +441,9 @@ define i32 @eq_medium_ledge_pos(i32 %in0) minsize {
; RV32IFDC-NEXT: addi a0, a0, -33
; RV32IFDC-NEXT: c.beqz a0, .LBB16_2
; RV32IFDC-NEXT: # %bb.1:
-; RV32IFDC-NEXT: addi a0, zero, 42
-; RV32IFDC-NEXT: c.jr ra
+; RV32IFDC-NEXT: tail OUTLINED_FUNCTION_1
; RV32IFDC-NEXT: .LBB16_2:
-; RV32IFDC-NEXT: addi a0, zero, -99
-; RV32IFDC-NEXT: c.jr ra
+; RV32IFDC-NEXT: tail OUTLINED_FUNCTION_0
;
; RV32IFD-LABEL: eq_medium_ledge_pos:
; RV32IFD: # %bb.0:
@@ -502,11 +468,9 @@ define i32 @eq_medium_ledge_neg(i32 %in0) minsize {
; RV32IFDC-NEXT: addi a0, a0, 33
; RV32IFDC-NEXT: c.beqz a0, .LBB17_2
; RV32IFDC-NEXT: # %bb.1:
-; RV32IFDC-NEXT: addi a0, zero, 42
-; RV32IFDC-NEXT: c.jr ra
+; RV32IFDC-NEXT: tail OUTLINED_FUNCTION_1
; RV32IFDC-NEXT: .LBB17_2:
-; RV32IFDC-NEXT: addi a0, zero, -99
-; RV32IFDC-NEXT: c.jr ra
+; RV32IFDC-NEXT: tail OUTLINED_FUNCTION_0
;
; RV32IFD-LABEL: eq_medium_ledge_neg:
; RV32IFD: # %bb.0:
@@ -531,11 +495,9 @@ define i32 @eq_medium_pos(i32 %in0) minsize {
; RV32IFDC-NEXT: addi a0, a0, -63
; RV32IFDC-NEXT: c.beqz a0, .LBB18_2
; RV32IFDC-NEXT: # %bb.1:
-; RV32IFDC-NEXT: addi a0, zero, 42
-; RV32IFDC-NEXT: c.jr ra
+; RV32IFDC-NEXT: tail OUTLINED_FUNCTION_1
; RV32IFDC-NEXT: .LBB18_2:
-; RV32IFDC-NEXT: addi a0, zero, -99
-; RV32IFDC-NEXT: c.jr ra
+; RV32IFDC-NEXT: tail OUTLINED_FUNCTION_0
;
; RV32IFD-LABEL: eq_medium_pos:
; RV32IFD: # %bb.0:
@@ -560,11 +522,9 @@ define i32 @eq_medium_neg(i32 %in0) minsize {
; RV32IFDC-NEXT: addi a0, a0, 63
; RV32IFDC-NEXT: c.beqz a0, .LBB19_2
; RV32IFDC-NEXT: # %bb.1:
-; RV32IFDC-NEXT: addi a0, zero, 42
-; RV32IFDC-NEXT: c.jr ra
+; RV32IFDC-NEXT: tail OUTLINED_FUNCTION_1
; RV32IFDC-NEXT: .LBB19_2:
-; RV32IFDC-NEXT: addi a0, zero, -99
-; RV32IFDC-NEXT: c.jr ra
+; RV32IFDC-NEXT: tail OUTLINED_FUNCTION_0
;
; RV32IFD-LABEL: eq_medium_neg:
; RV32IFD: # %bb.0:
@@ -589,11 +549,9 @@ define i32 @eq_medium_bedge_pos(i32 %in0) minsize {
; RV32IFDC-NEXT: addi a0, a0, -2047
; RV32IFDC-NEXT: c.beqz a0, .LBB20_2
; RV32IFDC-NEXT: # %bb.1:
-; RV32IFDC-NEXT: addi a0, zero, 42
-; RV32IFDC-NEXT: c.jr ra
+; RV32IFDC-NEXT: tail OUTLINED_FUNCTION_1
; RV32IFDC-NEXT: .LBB20_2:
-; RV32IFDC-NEXT: addi a0, zero, -99
-; RV32IFDC-NEXT: c.jr ra
+; RV32IFDC-NEXT: tail OUTLINED_FUNCTION_0
;
; RV32IFD-LABEL: eq_medium_bedge_pos:
; RV32IFD: # %bb.0:
@@ -618,11 +576,9 @@ define i32 @eq_medium_bedge_neg(i32 %in0) minsize {
; RV32IFDC-NEXT: addi a0, a0, 2047
; RV32IFDC-NEXT: c.beqz a0, .LBB21_2
; RV32IFDC-NEXT: # %bb.1:
-; RV32IFDC-NEXT: addi a...
[truncated]
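The size accounting in the `getOutliningCandidateInfo` hunk above can be sketched on its own. The byte counts are taken from the diff's comments. Everything else is an assumption made for illustration: the names `Overhead`, `overheadFor`, and `estimatedSavings` are hypothetical, and the savings formula is a simplified estimate rather than the outliner's exact benefit computation.

```cpp
#include <cassert>

enum class OutlineKind { TailCall, Default };

struct Overhead {
  unsigned Call;  // bytes added at each call site
  unsigned Frame; // bytes added once, inside the outlined function
};

// Mirrors the constants in the diff: HasCompressed stands for the
// hasStdExtCOrZca() query on the subtarget.
Overhead overheadFor(OutlineKind Kind, bool HasCompressed) {
  const unsigned InstrSizeCExt = HasCompressed ? 2 : 4;
  switch (Kind) {
  case OutlineKind::Default:
    // call t0, function = 8 bytes; jr t0 = 4 bytes, or 2 if compressed.
    return {8, InstrSizeCExt};
  case OutlineKind::TailCall:
    // auipc + jalr in the worst case; the ret moves into the callee.
    return {4 + InstrSizeCExt, 0};
  }
  return {0, 0};
}

// Simplified, assumed estimate of bytes saved by outlining a sequence of
// SeqBytes bytes that occurs Occurrences times.
long long estimatedSavings(unsigned SeqBytes, unsigned Occurrences,
                           Overhead O) {
  const long long Before = static_cast<long long>(SeqBytes) * Occurrences;
  const long long After = static_cast<long long>(O.Call) * Occurrences +
                          SeqBytes + O.Frame;
  return Before - After;
}
```

Under these assumptions, a 12-byte sequence repeated three times on a target with compressed instructions saves 6 bytes with a tail call (36 bytes before, 30 after) and loses 2 bytes with the default `t0` call (36 before, 38 after).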
Force-pushed: 8c3b63e to d09b4bd
@wangpc-pp, done; CI has passed.
LGTM.
@wangpc-pp, could you please merge it in that case?
#117700 changed the analysis from checking every candidate to checking only the first one before deciding whether to delete or keep all of them. Even though the candidates all contain the same instructions, they sit in different basic blocks, so each one must be checked before deciding whether to keep or erase it. In particular, `isAvailableAcrossAndOutOfSeq` checks whether the register (x5 in this case) is available from the end of the MBB to the beginning of the candidate. Skipping this check for each candidate led to incorrect candidates being outlined, which caused correctness issues in a few downstream benchmarks. Similarly, deleting all the candidates when the first one is not viable results in missed outlining opportunities.
(cherry picked from commit 6757cf4)
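The difference between the two filtering strategies described here can be shown with a small stand-alone sketch. It is a toy model: `ToyCandidate`, `filterFirstOnly`, and `filterEach` are hypothetical names, and the boolean `X5AvailableInBlock` merely stands in for the result of `isAvailableAcrossAndOutOfSeq` on each candidate's own basic block.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy stand-in for an outlining candidate (hypothetical, not an LLVM type).
struct ToyCandidate {
  int BlockId;             // which basic block the candidate lives in
  bool X5AvailableInBlock; // models the per-block liveness query on x5
};

// First-candidate-only check: one answer is applied to every candidate.
std::vector<ToyCandidate> filterFirstOnly(std::vector<ToyCandidate> Cands) {
  if (!Cands.empty() && !Cands.front().X5AvailableInBlock)
    Cands.clear();
  return Cands;
}

// Per-candidate check: each candidate is kept or erased on its own merits.
std::vector<ToyCandidate> filterEach(std::vector<ToyCandidate> Cands) {
  Cands.erase(std::remove_if(Cands.begin(), Cands.end(),
                             [](const ToyCandidate &C) {
                               return !C.X5AvailableInBlock;
                             }),
              Cands.end());
  return Cands;
}
```

With candidates `{ok, not ok, ok}`, the first-only filter keeps all three, including the unsafe one, while the per-candidate filter keeps two. With `{not ok, ok, ok}`, the first-only filter discards everything and misses two valid candidates.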
This MR fixes the failing test `CodeGen/RISCV/compress-opt-select.ll`. It failed because of the previously merged commit "[TTI][RISCV] Unconditionally break critical edges to sink ADDI" (PR #108889), so the `compress-opt-select` test has been regenerated.