
[RISCV][WIP] Let RA do the CSR saves. #90819

Open
wants to merge 2 commits into
base: main
6 changes: 6 additions & 0 deletions llvm/include/llvm/CodeGen/ReachingDefAnalysis.h
@@ -114,8 +114,11 @@ class ReachingDefAnalysis : public MachineFunctionPass {
private:
MachineFunction *MF = nullptr;
const TargetRegisterInfo *TRI = nullptr;
const TargetInstrInfo *TII = nullptr;
LoopTraversal::TraversalOrder TraversedMBBOrder;
unsigned NumRegUnits = 0;
unsigned NumStackObjects = 0;
Contributor

I think NumStackObjects and ObjectIndexBegin are available in enterBasicBlock through MBB->getParent()->getFrameInfo().getNumObjects() and MBB->getParent()->getFrameInfo().getObjectIndexBegin().

We could probably avoid storing them as member variables in ReachingDefAnalysis to keep the object smaller.

int ObjectIndexBegin = 0;
/// Instruction that defined each register, relative to the beginning of the
/// current basic block. When a LiveRegsDefInfo is used to represent a
/// live-out register, this value is relative to the end of the basic block,
@@ -138,6 +141,9 @@ class ReachingDefAnalysis : public MachineFunctionPass {
DenseMap<MachineInstr *, int> InstIds;

MBBReachingDefsInfo MBBReachingDefs;
using MBBFrameObjsReachingDefsInfo =
std::vector<std::vector<std::vector<int>>>;
MBBFrameObjsReachingDefsInfo MBBFrameObjsReachingDefs;
Contributor

It would be nice to add a docstring that describes this object, since the data type definition alone does not make its meaning clear. My understanding is that it represents a map of MBBNumber -> Frame Object Index -> reaching def instruction numbers.
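
To make the layout described in this comment concrete, here is a minimal standalone sketch of the three-level mapping. It uses plain `std::vector` and no LLVM types. `FrameObjDefs`, `recordDef`, `defsOf`, and `objectIndexBegin` are hypothetical names chosen for illustration, and the offset by `objectIndexBegin` assumes, as the patch does, that frame indices may start below zero.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for MBBFrameObjsReachingDefsInfo:
//   defs[MBBNumber][FrameIndex - objectIndexBegin] -> instruction ids, in
//   program order, that store to that frame object inside the block.
struct FrameObjDefs {
  int objectIndexBegin; // lowest valid frame index (may be negative)
  std::vector<std::vector<std::vector<int>>> defs;

  FrameObjDefs(std::size_t numBlocks, std::size_t numObjects, int indexBegin)
      : objectIndexBegin(indexBegin),
        defs(numBlocks, std::vector<std::vector<int>>(numObjects)) {}

  // Record that instruction `instId` in block `mbb` writes frame object `fi`.
  void recordDef(std::size_t mbb, int fi, int instId) {
    defs[mbb][slotOf(mbb, fi)].push_back(instId);
  }

  // All recorded definitions of frame object `fi` inside block `mbb`.
  const std::vector<int> &defsOf(std::size_t mbb, int fi) const {
    return defs[mbb][slotOf(mbb, fi)];
  }

private:
  // Translate a (possibly negative) frame index into a vector slot.
  std::size_t slotOf(std::size_t mbb, int fi) const {
    assert(mbb < defs.size() && "unexpected block number");
    assert(fi >= objectIndexBegin && "frame index below the first object");
    std::size_t slot = static_cast<std::size_t>(fi - objectIndexBegin);
    assert(slot < defs[mbb].size() && "frame index out of range");
    return slot;
  }
};
```

Wrapping the nested vectors in a small type like this is one way to give the docstring a natural home, although the patch itself keeps a bare type alias.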


/// Default values are 'nothing happened a long time ago'.
const int ReachingDefDefaultVal = -(1 << 21);
24 changes: 15 additions & 9 deletions llvm/include/llvm/CodeGen/TargetFrameLowering.h
@@ -24,15 +24,16 @@ namespace llvm {
class CalleeSavedInfo;
class MachineFunction;
class RegScavenger;

-namespace TargetStackID {
-enum Value {
-Default = 0,
-SGPRSpill = 1,
-ScalableVector = 2,
-WasmLocal = 3,
-NoAlloc = 255
-};
+class ReachingDefAnalysis;

+namespace TargetStackID {
+enum Value {
+Default = 0,
+SGPRSpill = 1,
+ScalableVector = 2,
+WasmLocal = 3,
+NoAlloc = 255
+};
}

/// Information about stack frame layout on the target. It holds the direction
@@ -210,6 +211,11 @@ class TargetFrameLowering {
/// for noreturn nounwind functions.
virtual bool enableCalleeSaveSkip(const MachineFunction &MF) const;

virtual void emitCFIsForCSRsHandledByRA(MachineFunction &MF,
Collaborator

We need a comment here to tell backend authors what they are supposed to do with this hook and when they should consider implementing it.

Contributor Author

Thanks @qcolombet, I'll make sure to add a comment when I create a separate MR for it.

ReachingDefAnalysis *RDA) const {
return;
}

/// emitProlog/emitEpilog - These methods insert prolog and epilog code into
/// the function.
virtual void emitPrologue(MachineFunction &MF,
2 changes: 2 additions & 0 deletions llvm/include/llvm/CodeGen/TargetSubtargetInfo.h
@@ -317,6 +317,8 @@ class TargetSubtargetInfo : public MCSubtargetInfo {
return false;
}

virtual bool doCSRSavesInRA() const;
Collaborator

Ditto: this hook is missing a comment.


/// Classify a global function reference. This mainly used to fetch target
/// special flags for lowering a function address. For example mark a function
/// call should be plt or pc-related addressing.
48 changes: 36 additions & 12 deletions llvm/lib/CodeGen/MachineLICM.cpp
@@ -262,15 +262,21 @@ namespace {
void HoistOutOfLoop(MachineDomTreeNode *HeaderN, MachineLoop *CurLoop,
MachineBasicBlock *CurPreheader);

-void InitRegPressure(MachineBasicBlock *BB);
+void InitRegPressure(MachineBasicBlock *BB, const MachineLoop *Loop);
Contributor

From the original shrink wrapping paper:

> There is one more consideration in performing the above shrink-wrap optimization. If the shrink-wrapped region lies completely inside a loop, there will be serious performance impact: instead of saving and restoring once per invocation of the procedure, this is now repeated for each iteration of the loop. To prevent this from occurring, whenever a register is used inside a loop, we propagate its APP attribute throughout the entire region of the loop. Thus, any shrink-wrap is not allowed to penetrate loop boundaries.

I'm not sure I fully understand what's going on here but I wanted to check and see if this patch allows shrink wrapping on a per register basis inside loops. If so, is this something we should avoid? Is this concept at all related to this code?
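
For readers unfamiliar with the rule quoted from the paper, the following standalone sketch shows one simplified reading of it: if any block of a loop uses the register, every block of that loop is treated as using it, so a save/restore pair can never be placed inside the loop. `BlockInfo`, `loopId`, and `propagateUseAcrossLoops` are hypothetical names, loops are modeled as flat ids with no nesting, and none of this is code from the patch or from LLVM's shrink-wrapping pass.

```cpp
#include <cstddef>
#include <set>
#include <vector>

struct BlockInfo {
  int loopId;   // -1 when the block is not inside any loop
  bool usesReg; // true when the callee-saved register is used in the block
};

// Return, per block, whether the save/restore region for the register must
// cover the block: either the block uses the register itself, or it shares a
// loop with a block that does.
std::vector<bool>
propagateUseAcrossLoops(const std::vector<BlockInfo> &blocks) {
  // First pass: collect every loop that contains at least one use.
  std::set<int> loopsWithUse;
  for (const BlockInfo &b : blocks)
    if (b.usesReg && b.loopId >= 0)
      loopsWithUse.insert(b.loopId);

  // Second pass: spread the "used" attribute over each such loop.
  std::vector<bool> covered(blocks.size());
  for (std::size_t i = 0; i < blocks.size(); ++i)
    covered[i] = blocks[i].usesReg ||
                 (blocks[i].loopId >= 0 &&
                  loopsWithUse.count(blocks[i].loopId) > 0);
  return covered;
}
```

A real implementation would also need to handle nested loops, for example by propagating to the outermost loop that contains the use.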

Contributor

Can you also help me understand why this patch (without this machine-licm change) leads to longer live ranges?

Contributor Author (@mgudim, Nov 7, 2024)

> From the original shrink wrapping paper:
>
> > There is one more consideration in performing the above shrink-wrap optimization. If the shrink-wrapped region lies completely inside a loop, there will be serious performance impact: instead of saving and restoring once per invocation of the procedure, this is now repeated for each iteration of the loop. To prevent this from occurring, whenever a register is used inside a loop, we propagate its APP attribute throughout the entire region of the loop. Thus, any shrink-wrap is not allowed to penetrate loop boundaries.
>
> I'm not sure I fully understand what's going on here but I wanted to check and see if this patch allows shrink wrapping on a per register basis inside loops. If so, is this something we should avoid? Is this concept at all related to this code?

With this patch, shrink-wrapping becomes regular spilling. The change to the register allocator was made because it gets rid of a spill inside a loop that I saw due to this patch. We will probably see other spills inside loops, but we will now be fixing them in RA.

Contributor Author

> Can you also help me understand why this patch (without this machine-licm change) leads to longer live ranges?

Because on entry to the function we copy each CSR into a virtual register, and that virtual register is live for the entire function.

Contributor

> > Can you also help me understand why this patch (without this machine-licm change) leads to longer live ranges?
>
> because on entry to the function we copy each CSR into a virtual register which will be live for the entire function.

When we spilled the CSR in PrologEpilogInserter, was the register not also live for the entire function?

Collaborator

There was no single virtual register corresponding to each callee-saved register. The PrologEpilogInserter approach runs after register allocation, checks which callee-saved registers are used in the function, and spills/restores them.


SmallDenseMap<unsigned, int> calcRegisterCost(const MachineInstr *MI,
bool ConsiderSeen,
-bool ConsiderUnseenAsDef);
+bool ConsiderUnseenAsDef,
+bool IgnoreDefs = false);

bool allDefsAreOnlyUsedOutsideOfTheLoop(const MachineInstr &MI,
const MachineLoop *Loop);
void UpdateRegPressure(const MachineInstr *MI,
-bool ConsiderUnseenAsDef = false);
+bool ConsiderUnseenAsDef = false,
+bool IgnoreDefs = false);

void UpdateRegPressureForUsesOnly(const MachineInstr *MI,
bool ConsiderUnseenAsDef = false);
MachineInstr *ExtractHoistableLoad(MachineInstr *MI, MachineLoop *CurLoop);

MachineInstr *LookForDuplicate(const MachineInstr *MI,
@@ -884,7 +890,7 @@ void MachineLICMImpl::HoistOutOfLoop(MachineDomTreeNode *HeaderN,
// Compute registers which are livein into the loop headers.
RegSeen.clear();
BackTrace.clear();
-InitRegPressure(Preheader);
+InitRegPressure(Preheader, CurLoop);

// Now perform LICM.
for (MachineDomTreeNode *Node : Scopes) {
@@ -934,7 +940,8 @@ static bool isOperandKill(const MachineOperand &MO, MachineRegisterInfo *MRI) {
/// Find all virtual register references that are liveout of the preheader to
/// initialize the starting "register pressure". Note this does not count live
/// through (livein but not used) registers.
-void MachineLICMImpl::InitRegPressure(MachineBasicBlock *BB) {
+void MachineLICMImpl::InitRegPressure(MachineBasicBlock *BB,
+const MachineLoop *Loop) {
std::fill(RegPressure.begin(), RegPressure.end(), 0);

// If the preheader has only a single predecessor and it ends with a
@@ -945,17 +952,34 @@ void MachineLICMImpl::InitRegPressure(MachineBasicBlock *BB) {
MachineBasicBlock *TBB = nullptr, *FBB = nullptr;
SmallVector<MachineOperand, 4> Cond;
if (!TII->analyzeBranch(*BB, TBB, FBB, Cond, false) && Cond.empty())
-InitRegPressure(*BB->pred_begin());
+InitRegPressure(*BB->pred_begin(), Loop);
}

-for (const MachineInstr &MI : *BB)
-UpdateRegPressure(&MI, /*ConsiderUnseenAsDef=*/true);
+for (const MachineInstr &MI : *BB) {
+bool IgnoreDefs = allDefsAreOnlyUsedOutsideOfTheLoop(MI, Loop);
+UpdateRegPressure(&MI, /*ConsiderUnseenAsDef=*/true, IgnoreDefs);
+}
}

bool MachineLICMImpl::allDefsAreOnlyUsedOutsideOfTheLoop(
const MachineInstr &MI, const MachineLoop *Loop) {
for (const MachineOperand DefMO : MI.all_defs()) {
if (!DefMO.isReg())
continue;
for (const MachineInstr &UseMI : MRI->use_instructions(DefMO.getReg())) {
if (Loop->contains(UseMI.getParent()))
return false;
}
}
return true;
}

/// Update estimate of register pressure after the specified instruction.
void MachineLICMImpl::UpdateRegPressure(const MachineInstr *MI,
-bool ConsiderUnseenAsDef) {
-auto Cost = calcRegisterCost(MI, /*ConsiderSeen=*/true, ConsiderUnseenAsDef);
+bool ConsiderUnseenAsDef,
+bool IgnoreDefs) {
+auto Cost = calcRegisterCost(MI, /*ConsiderSeen=*/true, ConsiderUnseenAsDef,
+IgnoreDefs);
for (const auto &RPIdAndCost : Cost) {
unsigned Class = RPIdAndCost.first;
if (static_cast<int>(RegPressure[Class]) < -RPIdAndCost.second)
@@ -973,7 +997,7 @@ void MachineLICMImpl::UpdateRegPressure(const MachineInstr *MI,
/// FIXME: Figure out a way to consider 'RegSeen' from all code paths.
SmallDenseMap<unsigned, int>
MachineLICMImpl::calcRegisterCost(const MachineInstr *MI, bool ConsiderSeen,
-bool ConsiderUnseenAsDef) {
+bool ConsiderUnseenAsDef, bool IgnoreDefs) {
SmallDenseMap<unsigned, int> Cost;
if (MI->isImplicitDef())
return Cost;
@@ -991,7 +1015,7 @@ MachineLICMImpl::calcRegisterCost(const MachineInstr *MI, bool ConsiderSeen,

RegClassWeight W = TRI->getRegClassWeight(RC);
int RCCost = 0;
-if (MO.isDef())
+if (MO.isDef() && !IgnoreDefs)
RCCost = W.RegWeight;
else {
bool isKill = isOperandKill(MO, MRI);
7 changes: 7 additions & 0 deletions llvm/lib/CodeGen/PrologEpilogInserter.cpp
@@ -36,6 +36,7 @@
#include "llvm/CodeGen/MachineOperand.h"
#include "llvm/CodeGen/MachineOptimizationRemarkEmitter.h"
#include "llvm/CodeGen/MachineRegisterInfo.h"
#include "llvm/CodeGen/ReachingDefAnalysis.h"
#include "llvm/CodeGen/RegisterScavenging.h"
#include "llvm/CodeGen/TargetFrameLowering.h"
#include "llvm/CodeGen/TargetInstrInfo.h"
@@ -95,6 +96,7 @@ class PEI : public MachineFunctionPass {
bool runOnMachineFunction(MachineFunction &MF) override;

private:
ReachingDefAnalysis *RDA = nullptr;
RegScavenger *RS = nullptr;

// MinCSFrameIndex, MaxCSFrameIndex - Keeps the range of callee saved
@@ -153,6 +155,7 @@ INITIALIZE_PASS_BEGIN(PEI, DEBUG_TYPE, "Prologue/Epilogue Insertion", false,
INITIALIZE_PASS_DEPENDENCY(MachineLoopInfoWrapperPass)
INITIALIZE_PASS_DEPENDENCY(MachineDominatorTreeWrapperPass)
INITIALIZE_PASS_DEPENDENCY(MachineOptimizationRemarkEmitterPass)
INITIALIZE_PASS_DEPENDENCY(ReachingDefAnalysis)
INITIALIZE_PASS_END(PEI, DEBUG_TYPE,
"Prologue/Epilogue Insertion & Frame Finalization", false,
false)
@@ -169,6 +172,7 @@ void PEI::getAnalysisUsage(AnalysisUsage &AU) const {
AU.addPreserved<MachineLoopInfoWrapperPass>();
AU.addPreserved<MachineDominatorTreeWrapperPass>();
AU.addRequired<MachineOptimizationRemarkEmitterPass>();
AU.addRequired<ReachingDefAnalysis>();
MachineFunctionPass::getAnalysisUsage(AU);
}

@@ -227,6 +231,7 @@ bool PEI::runOnMachineFunction(MachineFunction &MF) {
RS = TRI->requiresRegisterScavenging(MF) ? new RegScavenger() : nullptr;
FrameIndexVirtualScavenging = TRI->requiresFrameIndexScavenging(MF);
ORE = &getAnalysis<MachineOptimizationRemarkEmitterPass>().getORE();
RDA = &getAnalysis<ReachingDefAnalysis>();

// Spill frame pointer and/or base pointer registers if they are clobbered.
// It is placed before call frame instruction elimination so it will not mess
@@ -262,6 +267,7 @@ bool PEI::runOnMachineFunction(MachineFunction &MF) {
// called functions. Because of this, calculateCalleeSavedRegisters()
// must be called before this function in order to set the AdjustsStack
// and MaxCallFrameSize variables.
RDA->reset();
Contributor

Why is a call to reset() needed? Can the analysis be dirty on entry to this function, or after the getAnalysis<ReachingDefAnalysis>() call above? If it is needed, could we do this instead, so that the reset is skipped when the function has the Naked attribute:

  if (!F.hasFnAttribute(Attribute::Naked)) {
    RDA->reset();
    insertPrologEpilogCode(MF);
  }

if (!F.hasFnAttribute(Attribute::Naked))
insertPrologEpilogCode(MF);

@@ -1164,6 +1170,7 @@ void PEI::calculateFrameObjectOffsets(MachineFunction &MF) {
void PEI::insertPrologEpilogCode(MachineFunction &MF) {
const TargetFrameLowering &TFI = *MF.getSubtarget().getFrameLowering();

TFI.emitCFIsForCSRsHandledByRA(MF, RDA);
Contributor

It might be nice to add a comment explaining that CSRs handled by RA still need CFI information, and that emitPrologue does not emit CFI info for them because their saves may no longer be located in the prologue.

Contributor

As a separate note, RISCVFrameLowering::emitPrologue has code that iterates over the list of callee-saved registers and emits a .cfi_offset directive for each. Can you help me understand how we know to skip over those instructions?

Contributor Author (@mgudim, Nov 7, 2024)

Only the registers added by RISCVFrameLowering::determineMustCalleeSaves will be handled the same way as before (with the usual prologue/epilogue).

Contributor

Why is the CFIInserter pass needed if we are inserting CFI information here?

Contributor Author

Suppose we had the following code (before block layout is finalized):

ENTRY:
save x1
.cfi_offset x1, ...
...
beq ... %RET

BB:
...

RET:
ld x1, ...
.cfi_register x1, x1
...
ret

After block layout, the code may look like this (%RET is now the layout successor of %ENTRY):

ENTRY:
save x1
.cfi_offset x1, ...
...
bne ... %BB

RET:
ld x1, ...
.cfi_register x1, x1
...
ret

BB:
...
j %RET

A CFI directive applies to all the code below it in the layout, so the CFI information in effect for %BB is now wrong. The CFIInserter compares the CFI info inherited from the layout predecessor with what each block actually requires and corrects it if necessary.

I basically copied that pass from the existing one, CodeGen/CFIInstrInserter.cpp, and modified it.
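
The check described in this comment can be sketched in a few lines of standalone C++. This is only an illustration of the idea, not the code of CFIInstrInserter or of the RISC-V copy in this patch. `RaLoc`, `Block`, and `blocksNeedingFix` are hypothetical names, and the model tracks a single register (x1) with only two possible locations.

```cpp
#include <string>
#include <vector>

// Where the unwinder should look for the saved value of x1.
enum class RaLoc { InRegister, OnStack };

struct Block {
  std::string name;
  RaLoc required; // state that really holds on entry, derived from the CFG
  RaLoc atExit;   // state described by the directives in effect at block end
};

// Walk the blocks in layout order. The unwinder sees whatever the textually
// preceding block left in effect, so compare that with what each block
// requires and report the blocks that need a corrective directive on entry.
std::vector<std::string> blocksNeedingFix(const std::vector<Block> &layout) {
  std::vector<std::string> fixes;
  RaLoc inEffect = RaLoc::InRegister; // at function entry nothing is saved
  for (const Block &b : layout) {
    if (inEffect != b.required)
      fixes.push_back(b.name);
    inEffect = b.atExit;
  }
  return fixes;
}
```

With ENTRY (requires `InRegister`, exits `OnStack`), BB (`OnStack`, `OnStack`), and RET (`OnStack`, `InRegister`), the layout ENTRY, BB, RET needs no correction, while ENTRY, RET, BB reports BB, which matches the example in the comment.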

Contributor (@michaelmaitland, Nov 7, 2024)

> I basically copied that pass from the existing one CodeGen/CFIInstrInserter.cpp and modified it.

Is it possible to modify CodeGen/CFIInstrInserter.cpp instead of having a RISC-V specific copy? Maybe using target hooks where necessary.

Contributor Author

Yes, it's possible. In some cases (notably when we have vector spills), we need to use a CFI expression instead of a simple .cfi_offset. I can add a small CFI-expression decoder to CFIInserter for that.

Here, I recorded the CFI info in RISCVMachineFunctionInfo so that it can be used by RISCVCFIInserter.

// Add prologue to the function...
for (MachineBasicBlock *SaveBlock : SaveBlocks)
TFI.emitPrologue(MF, *SaveBlock);
59 changes: 55 additions & 4 deletions llvm/lib/CodeGen/ReachingDefAnalysis.cpp
@@ -10,6 +10,8 @@
#include "llvm/ADT/SetOperations.h"
#include "llvm/ADT/SmallSet.h"
#include "llvm/CodeGen/LiveRegUnits.h"
#include "llvm/CodeGen/MachineFrameInfo.h"
#include "llvm/CodeGen/TargetInstrInfo.h"
#include "llvm/CodeGen/TargetRegisterInfo.h"
#include "llvm/CodeGen/TargetSubtargetInfo.h"
#include "llvm/Support/Debug.h"
@@ -48,12 +50,28 @@ static bool isValidRegDefOf(const MachineOperand &MO, MCRegister PhysReg,
return TRI->regsOverlap(MO.getReg(), PhysReg);
}

static bool isFIDef(const MachineInstr &MI, int FrameIndex,
const TargetInstrInfo *TII) {
int DefFrameIndex = 0;
int SrcFrameIndex = 0;
if (TII->isStoreToStackSlot(MI, DefFrameIndex) ||
TII->isStackSlotCopy(MI, DefFrameIndex, SrcFrameIndex)) {
return DefFrameIndex == FrameIndex;
}
return false;
}

void ReachingDefAnalysis::enterBasicBlock(MachineBasicBlock *MBB) {
unsigned MBBNumber = MBB->getNumber();
assert(MBBNumber < MBBReachingDefs.numBlockIDs() &&
"Unexpected basic block number.");
MBBReachingDefs.startBasicBlock(MBBNumber, NumRegUnits);

MBBFrameObjsReachingDefs[MBBNumber].resize(NumStackObjects);
for (unsigned FOIdx = 0; FOIdx < NumStackObjects; ++FOIdx) {
MBBFrameObjsReachingDefs[MBBNumber][FOIdx].push_back(-1);
Contributor

Could we avoid initializing every element with -1 here, and instead return LatestDef below when MBBFrameObjsReachingDefs[MBBNumber][FOIdx].empty()?
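
One way to read this suggestion, as a standalone sketch rather than the patch's code: leave each per-object list empty at block entry and let the lookup fall back to a default when nothing has been recorded. `latestDefBefore` and `kDefaultVal` are hypothetical names. The default mirrors `ReachingDefDefaultVal` from the header.

```cpp
#include <vector>

// Mirrors ReachingDefDefaultVal: "nothing happened a long time ago".
constexpr int kDefaultVal = -(1 << 21);

// Return the last definition id strictly before `instId`, or kDefaultVal if
// the list is empty or every recorded definition is at or after `instId`.
// `defs` is assumed to be sorted in increasing program order.
int latestDefBefore(const std::vector<int> &defs, int instId) {
  int latest = kDefaultVal;
  for (int def : defs) {
    if (def >= instId)
      break;
    latest = def;
  }
  return latest;
}
```

Note that this changes what a caller sees when a block has no store to the slot: the default value rather than the -1 sentinel that the patch pushes at block entry, so any caller that relies on -1 would need to be checked.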

}

// Reset instruction counter in each basic block.
CurInstr = 0;

@@ -126,6 +144,13 @@ void ReachingDefAnalysis::processDefs(MachineInstr *MI) {
"Unexpected basic block number.");

for (auto &MO : MI->operands()) {
if (MO.isFI()) {
int FrameIndex = MO.getIndex();
if (!isFIDef(*MI, FrameIndex, TII))
continue;
MBBFrameObjsReachingDefs[MBBNumber][FrameIndex - ObjectIndexBegin]
.push_back(CurInstr);
}
if (!isValidRegDef(MO))
continue;
for (MCRegUnit Unit : TRI->regunits(MO.getReg().asMCReg())) {
@@ -211,7 +236,9 @@ void ReachingDefAnalysis::processBasicBlock(

bool ReachingDefAnalysis::runOnMachineFunction(MachineFunction &mf) {
MF = &mf;
-TRI = MF->getSubtarget().getRegisterInfo();
+const TargetSubtargetInfo &STI = MF->getSubtarget();
+TRI = STI.getRegisterInfo();
+TII = STI.getInstrInfo();
LLVM_DEBUG(dbgs() << "********** REACHING DEFINITION ANALYSIS **********\n");
init();
traverse();
@@ -222,6 +249,7 @@ void ReachingDefAnalysis::releaseMemory() {
// Clear the internal vectors.
MBBOutRegsInfos.clear();
MBBReachingDefs.clear();
MBBFrameObjsReachingDefs.clear();
InstIds.clear();
LiveRegs.clear();
}
@@ -234,7 +262,10 @@ void ReachingDefAnalysis::reset() {

void ReachingDefAnalysis::init() {
NumRegUnits = TRI->getNumRegUnits();
NumStackObjects = MF->getFrameInfo().getNumObjects();
ObjectIndexBegin = MF->getFrameInfo().getObjectIndexBegin();
MBBReachingDefs.init(MF->getNumBlockIDs());
MBBFrameObjsReachingDefs.resize(MF->getNumBlockIDs());
// Initialize the MBBOutRegsInfos
MBBOutRegsInfos.resize(MF->getNumBlockIDs());
LoopTraversal Traversal;
@@ -269,6 +300,19 @@ int ReachingDefAnalysis::getReachingDef(MachineInstr *MI,
assert(MBBNumber < MBBReachingDefs.numBlockIDs() &&
"Unexpected basic block number.");
int LatestDef = ReachingDefDefaultVal;

if (Register::isStackSlot(PhysReg)) {
int FrameIndex = Register::stackSlot2Index(PhysReg);
for (int Def :
MBBFrameObjsReachingDefs[MBBNumber][FrameIndex - ObjectIndexBegin]) {
if (Def >= InstId)
break;
DefRes = Def;
}
LatestDef = std::max(LatestDef, DefRes);
return LatestDef;
}

for (MCRegUnit Unit : TRI->regunits(PhysReg)) {
for (int Def : MBBReachingDefs.defs(MBBNumber, Unit)) {
if (Def >= InstId)
@@ -425,7 +469,7 @@ void ReachingDefAnalysis::getLiveOuts(MachineBasicBlock *MBB,
VisitedBBs.insert(MBB);
LiveRegUnits LiveRegs(*TRI);
LiveRegs.addLiveOuts(*MBB);
-if (LiveRegs.available(PhysReg))
+if (Register::isPhysicalRegister(PhysReg) && LiveRegs.available(PhysReg))
return;

if (auto *Def = getLocalLiveOutMIDef(MBB, PhysReg))
@@ -508,7 +552,7 @@ bool ReachingDefAnalysis::isReachingDefLiveOut(MachineInstr *MI,
MachineBasicBlock *MBB = MI->getParent();
LiveRegUnits LiveRegs(*TRI);
LiveRegs.addLiveOuts(*MBB);
-if (LiveRegs.available(PhysReg))
+if (Register::isPhysicalRegister(PhysReg) && LiveRegs.available(PhysReg))
return false;

auto Last = MBB->getLastNonDebugInstr();
@@ -529,14 +573,21 @@ ReachingDefAnalysis::getLocalLiveOutMIDef(MachineBasicBlock *MBB,
MCRegister PhysReg) const {
LiveRegUnits LiveRegs(*TRI);
LiveRegs.addLiveOuts(*MBB);
-if (LiveRegs.available(PhysReg))
+if (Register::isPhysicalRegister(PhysReg) && LiveRegs.available(PhysReg))
return nullptr;

auto Last = MBB->getLastNonDebugInstr();
if (Last == MBB->end())
return nullptr;

int Def = getReachingDef(&*Last, PhysReg);

if (Register::isStackSlot(PhysReg)) {
int FrameIndex = Register::stackSlot2Index(PhysReg);
if (isFIDef(*Last, FrameIndex, TII))
return &*Last;
}

for (auto &MO : Last->operands())
if (isValidRegDefOf(MO, PhysReg, TRI))
return &*Last;