Commit 5f94b0c

AMDGPU: Try to reuse dest reg for s_add_i32 frame indexes (#111201)
Hack around the register scavenger doing the wrong thing. It does not see the result register as available when the frame index add does not also read the dest register.

This is a quick fix for a regression where the scavenger would create a broken spill of an SGPR to memory. I believe this is still broken for cases where we cannot use the result register.

I'm confused about what position the scavenger iterator is supposed to be in, and what RestoreAfter is for. The scavenger is missing a full set of forward/backward APIs, and there seems to be an off-by-one somewhere.
1 parent 5fdda41 commit 5f94b0c

3 files changed: +123 -69 lines changed

llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp

Lines changed: 10 additions & 3 deletions
@@ -2678,12 +2678,18 @@ bool SIRegisterInfo::eliminateFrameIndex(MachineBasicBlock::iterator MI,

       Register TmpReg;

+      // FIXME: Scavenger should figure out that the result register is
+      // available. Also should do this for the v_add case.
+      if (OtherOp.isReg() && OtherOp.getReg() != DstOp.getReg())
+        TmpReg = DstOp.getReg();
+
       if (FrameReg && !ST.enableFlatScratch()) {
         // FIXME: In the common case where the add does not also read its result
         // (i.e. this isn't a reg += fi), it's not finding the dest reg as
         // available.
-        TmpReg = RS->scavengeRegisterBackwards(AMDGPU::SReg_32_XM0RegClass, MI,
-                                               false, 0);
+        if (!TmpReg)
+          TmpReg = RS->scavengeRegisterBackwards(AMDGPU::SReg_32_XM0RegClass,
+                                                 MI, false, 0);
         BuildMI(*MBB, *MI, DL, TII->get(AMDGPU::S_LSHR_B32))
             .addDef(TmpReg, RegState::Renamable)
             .addReg(FrameReg)
@@ -2711,7 +2717,8 @@ bool SIRegisterInfo::eliminateFrameIndex(MachineBasicBlock::iterator MI,

         if (!TmpReg && MaterializedReg == FrameReg) {
           TmpReg = RS->scavengeRegisterBackwards(AMDGPU::SReg_32_XM0RegClass,
-                                                 MI, false, 0);
+                                                 MI, /*RestoreAfter=*/false, 0,
+                                                 /*AllowSpill=*/false);
           DstReg = TmpReg;
         }
