AMDGPU: Try to reuse dest reg for s_add_i32 frame indexes #111201
Hack around the register scavenger doing the wrong thing. It does not find the result register as available when the frame index add isn't also reading the dest register. This is a quick fix for a regression where the scavenger would create a broken spill of an SGPR to memory. I believe this is still broken for cases where we cannot use the result register. I'm confused about what position the scavenger iterator is supposed to be in, and what RestoreAfter is for. The scavenger is missing a full set of forward/backward APIs, and there seems to be an off-by-one somewhere.
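The selection order the patch introduces can be sketched outside of LLVM as a toy model. This is only an illustration under stated assumptions: `Reg`, `Operand`, and `pickTmpReg` are hypothetical stand-ins, not LLVM types or functions, and the sketch ignores the extra conditions (`FrameReg`, `enableFlatScratch()`, `MaterializedReg`) that guard the real scavenger calls in `eliminateFrameIndex`.

```cpp
#include <cassert>
#include <functional>

// Toy stand-in for llvm::Register; 0 plays the role of "no register".
using Reg = unsigned;

// Toy stand-in for a MachineOperand: either a register or something else
// (for example an immediate).
struct Operand {
  bool IsReg;
  Reg R; // Only meaningful when IsReg is true.
};

// Hypothetical helper mirroring the order used in the patch:
//  1. If the other operand is a register different from the destination,
//     the add does not read its own result, so the destination can hold
//     the temporary.
//  2. Otherwise fall back to the scavenger, which may return 0 when it
//     is not allowed to spill and nothing is free.
Reg pickTmpReg(const Operand &Dst, const Operand &Other,
               const std::function<Reg()> &Scavenge) {
  if (Other.IsReg && Other.R != Dst.R)
    return Dst.R;
  return Scavenge();
}

// Small self-check of the three cases.
void checkPickTmpReg() {
  auto Scavenge = [] { return Reg{4}; };
  // dst = 7, other = 8: reuse the destination.
  assert(pickTmpReg(Operand{true, 7}, Operand{true, 8}, Scavenge) == 7);
  // dst = 7, other = 7 (a "reg += fi" form): must scavenge.
  assert(pickTmpReg(Operand{true, 7}, Operand{true, 7}, Scavenge) == 4);
  // Other operand is not a register: must scavenge.
  assert(pickTmpReg(Operand{true, 7}, Operand{false, 0}, Scavenge) == 4);
}
```

The first case corresponds to the updated test expectations, where `$sgpr7 = S_ADD_I32 $sgpr8, %stack.0` now uses `$sgpr7` for the intermediate `S_LSHR_B32` instead of a scavenged `$sgpr4`.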
@llvm/pr-subscribers-llvm-regalloc @llvm/pr-subscribers-backend-amdgpu
Author: Matt Arsenault (arsenm)
Patch is 29.09 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/111201.diff
3 Files Affected:
diff --git a/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp b/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
index a8cdbbd8a3c5be..de9cbe403ab618 100644
--- a/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
@@ -2678,12 +2678,18 @@ bool SIRegisterInfo::eliminateFrameIndex(MachineBasicBlock::iterator MI,
Register TmpReg;
+ // FIXME: Scavenger should figure out that the result register is
+ // available. Also should do this for the v_add case.
+ if (OtherOp.isReg() && OtherOp.getReg() != DstOp.getReg())
+ TmpReg = DstOp.getReg();
+
if (FrameReg && !ST.enableFlatScratch()) {
// FIXME: In the common case where the add does not also read its result
// (i.e. this isn't a reg += fi), it's not finding the dest reg as
// available.
- TmpReg = RS->scavengeRegisterBackwards(AMDGPU::SReg_32_XM0RegClass, MI,
- false, 0);
+ if (!TmpReg)
+ TmpReg = RS->scavengeRegisterBackwards(AMDGPU::SReg_32_XM0RegClass,
+ MI, false, 0);
BuildMI(*MBB, *MI, DL, TII->get(AMDGPU::S_LSHR_B32))
.addDef(TmpReg, RegState::Renamable)
.addReg(FrameReg)
@@ -2711,7 +2717,8 @@ bool SIRegisterInfo::eliminateFrameIndex(MachineBasicBlock::iterator MI,
if (!TmpReg && MaterializedReg == FrameReg) {
TmpReg = RS->scavengeRegisterBackwards(AMDGPU::SReg_32_XM0RegClass,
- MI, false, 0);
+ MI, /*RestoreAfter=*/false, 0,
+ /*AllowSpill=*/false);
DstReg = TmpReg;
}
diff --git a/llvm/test/CodeGen/AMDGPU/eliminate-frame-index-s-add-i32.mir b/llvm/test/CodeGen/AMDGPU/eliminate-frame-index-s-add-i32.mir
index 585bfb4c58eae2..001a72e3609768 100644
--- a/llvm/test/CodeGen/AMDGPU/eliminate-frame-index-s-add-i32.mir
+++ b/llvm/test/CodeGen/AMDGPU/eliminate-frame-index-s-add-i32.mir
@@ -258,31 +258,31 @@ body: |
; MUBUFW64-LABEL: name: s_add_i32__sgpr__fi_offset0
; MUBUFW64: liveins: $sgpr8
; MUBUFW64-NEXT: {{ $}}
- ; MUBUFW64-NEXT: renamable $sgpr4 = S_LSHR_B32 $sgpr32, 6, implicit-def dead $scc
- ; MUBUFW64-NEXT: renamable $sgpr7 = S_ADD_I32 killed $sgpr4, $sgpr8, implicit-def dead $scc
+ ; MUBUFW64-NEXT: renamable $sgpr7 = S_LSHR_B32 $sgpr32, 6, implicit-def dead $scc
+ ; MUBUFW64-NEXT: renamable $sgpr7 = S_ADD_I32 killed $sgpr7, $sgpr8, implicit-def dead $scc
; MUBUFW64-NEXT: renamable $sgpr7 = COPY killed renamable $sgpr7
; MUBUFW64-NEXT: SI_RETURN implicit $sgpr7
;
; MUBUFW32-LABEL: name: s_add_i32__sgpr__fi_offset0
; MUBUFW32: liveins: $sgpr8
; MUBUFW32-NEXT: {{ $}}
- ; MUBUFW32-NEXT: renamable $sgpr4 = S_LSHR_B32 $sgpr32, 5, implicit-def dead $scc
- ; MUBUFW32-NEXT: renamable $sgpr7 = S_ADD_I32 killed $sgpr4, $sgpr8, implicit-def dead $scc
+ ; MUBUFW32-NEXT: renamable $sgpr7 = S_LSHR_B32 $sgpr32, 5, implicit-def dead $scc
+ ; MUBUFW32-NEXT: renamable $sgpr7 = S_ADD_I32 killed $sgpr7, $sgpr8, implicit-def dead $scc
; MUBUFW32-NEXT: renamable $sgpr7 = COPY killed renamable $sgpr7
; MUBUFW32-NEXT: SI_RETURN implicit $sgpr7
;
; FLATSCRW64-LABEL: name: s_add_i32__sgpr__fi_offset0
; FLATSCRW64: liveins: $sgpr8
; FLATSCRW64-NEXT: {{ $}}
- ; FLATSCRW64-NEXT: renamable $sgpr4 = S_ADD_I32 killed $sgpr32, $sgpr8, implicit-def dead $scc
- ; FLATSCRW64-NEXT: renamable $sgpr7 = COPY killed renamable $sgpr4
+ ; FLATSCRW64-NEXT: renamable $sgpr7 = S_ADD_I32 killed $sgpr32, $sgpr8, implicit-def dead $scc
+ ; FLATSCRW64-NEXT: renamable $sgpr7 = COPY killed renamable $sgpr7
; FLATSCRW64-NEXT: SI_RETURN implicit $sgpr7
;
; FLATSCRW32-LABEL: name: s_add_i32__sgpr__fi_offset0
; FLATSCRW32: liveins: $sgpr8
; FLATSCRW32-NEXT: {{ $}}
- ; FLATSCRW32-NEXT: renamable $sgpr4 = S_ADD_I32 killed $sgpr32, $sgpr8, implicit-def dead $scc
- ; FLATSCRW32-NEXT: renamable $sgpr7 = COPY killed renamable $sgpr4
+ ; FLATSCRW32-NEXT: renamable $sgpr7 = S_ADD_I32 killed $sgpr32, $sgpr8, implicit-def dead $scc
+ ; FLATSCRW32-NEXT: renamable $sgpr7 = COPY killed renamable $sgpr7
; FLATSCRW32-NEXT: SI_RETURN implicit $sgpr7
renamable $sgpr7 = S_ADD_I32 $sgpr8, %stack.0, implicit-def dead $scc
SI_RETURN implicit $sgpr7
@@ -304,31 +304,31 @@ body: |
; MUBUFW64-LABEL: name: s_add_i32__fi_offset0__sgpr
; MUBUFW64: liveins: $sgpr8
; MUBUFW64-NEXT: {{ $}}
- ; MUBUFW64-NEXT: renamable $sgpr4 = S_LSHR_B32 $sgpr32, 6, implicit-def dead $scc
- ; MUBUFW64-NEXT: renamable $sgpr7 = S_ADD_I32 killed $sgpr4, $sgpr8, implicit-def dead $scc
+ ; MUBUFW64-NEXT: renamable $sgpr7 = S_LSHR_B32 $sgpr32, 6, implicit-def dead $scc
+ ; MUBUFW64-NEXT: renamable $sgpr7 = S_ADD_I32 killed $sgpr7, $sgpr8, implicit-def dead $scc
; MUBUFW64-NEXT: renamable $sgpr7 = COPY killed renamable $sgpr7
; MUBUFW64-NEXT: SI_RETURN implicit $sgpr7
;
; MUBUFW32-LABEL: name: s_add_i32__fi_offset0__sgpr
; MUBUFW32: liveins: $sgpr8
; MUBUFW32-NEXT: {{ $}}
- ; MUBUFW32-NEXT: renamable $sgpr4 = S_LSHR_B32 $sgpr32, 5, implicit-def dead $scc
- ; MUBUFW32-NEXT: renamable $sgpr7 = S_ADD_I32 killed $sgpr4, $sgpr8, implicit-def dead $scc
+ ; MUBUFW32-NEXT: renamable $sgpr7 = S_LSHR_B32 $sgpr32, 5, implicit-def dead $scc
+ ; MUBUFW32-NEXT: renamable $sgpr7 = S_ADD_I32 killed $sgpr7, $sgpr8, implicit-def dead $scc
; MUBUFW32-NEXT: renamable $sgpr7 = COPY killed renamable $sgpr7
; MUBUFW32-NEXT: SI_RETURN implicit $sgpr7
;
; FLATSCRW64-LABEL: name: s_add_i32__fi_offset0__sgpr
; FLATSCRW64: liveins: $sgpr8
; FLATSCRW64-NEXT: {{ $}}
- ; FLATSCRW64-NEXT: renamable $sgpr4 = S_ADD_I32 killed $sgpr32, $sgpr8, implicit-def dead $scc
- ; FLATSCRW64-NEXT: renamable $sgpr7 = COPY killed renamable $sgpr4
+ ; FLATSCRW64-NEXT: renamable $sgpr7 = S_ADD_I32 killed $sgpr32, $sgpr8, implicit-def dead $scc
+ ; FLATSCRW64-NEXT: renamable $sgpr7 = COPY killed renamable $sgpr7
; FLATSCRW64-NEXT: SI_RETURN implicit $sgpr7
;
; FLATSCRW32-LABEL: name: s_add_i32__fi_offset0__sgpr
; FLATSCRW32: liveins: $sgpr8
; FLATSCRW32-NEXT: {{ $}}
- ; FLATSCRW32-NEXT: renamable $sgpr4 = S_ADD_I32 killed $sgpr32, $sgpr8, implicit-def dead $scc
- ; FLATSCRW32-NEXT: renamable $sgpr7 = COPY killed renamable $sgpr4
+ ; FLATSCRW32-NEXT: renamable $sgpr7 = S_ADD_I32 killed $sgpr32, $sgpr8, implicit-def dead $scc
+ ; FLATSCRW32-NEXT: renamable $sgpr7 = COPY killed renamable $sgpr7
; FLATSCRW32-NEXT: SI_RETURN implicit $sgpr7
renamable $sgpr7 = S_ADD_I32 %stack.0, $sgpr8, implicit-def dead $scc
SI_RETURN implicit $sgpr7
@@ -351,31 +351,31 @@ body: |
; MUBUFW64-LABEL: name: s_add_i32__sgpr__fi_literal_offset
; MUBUFW64: liveins: $sgpr8
; MUBUFW64-NEXT: {{ $}}
- ; MUBUFW64-NEXT: renamable $sgpr4 = S_LSHR_B32 $sgpr32, 6, implicit-def dead $scc
- ; MUBUFW64-NEXT: renamable $sgpr7 = S_ADD_I32 killed $sgpr4, $sgpr8, implicit-def dead $scc
+ ; MUBUFW64-NEXT: renamable $sgpr7 = S_LSHR_B32 $sgpr32, 6, implicit-def dead $scc
+ ; MUBUFW64-NEXT: renamable $sgpr7 = S_ADD_I32 killed $sgpr7, $sgpr8, implicit-def dead $scc
; MUBUFW64-NEXT: renamable $sgpr7 = S_ADD_I32 killed renamable $sgpr7, 80, implicit-def dead $scc
; MUBUFW64-NEXT: SI_RETURN implicit $sgpr7
;
; MUBUFW32-LABEL: name: s_add_i32__sgpr__fi_literal_offset
; MUBUFW32: liveins: $sgpr8
; MUBUFW32-NEXT: {{ $}}
- ; MUBUFW32-NEXT: renamable $sgpr4 = S_LSHR_B32 $sgpr32, 5, implicit-def dead $scc
- ; MUBUFW32-NEXT: renamable $sgpr7 = S_ADD_I32 killed $sgpr4, $sgpr8, implicit-def dead $scc
+ ; MUBUFW32-NEXT: renamable $sgpr7 = S_LSHR_B32 $sgpr32, 5, implicit-def dead $scc
+ ; MUBUFW32-NEXT: renamable $sgpr7 = S_ADD_I32 killed $sgpr7, $sgpr8, implicit-def dead $scc
; MUBUFW32-NEXT: renamable $sgpr7 = S_ADD_I32 killed renamable $sgpr7, 80, implicit-def dead $scc
; MUBUFW32-NEXT: SI_RETURN implicit $sgpr7
;
; FLATSCRW64-LABEL: name: s_add_i32__sgpr__fi_literal_offset
; FLATSCRW64: liveins: $sgpr8
; FLATSCRW64-NEXT: {{ $}}
- ; FLATSCRW64-NEXT: renamable $sgpr4 = S_ADD_I32 killed $sgpr32, $sgpr8, implicit-def dead $scc
- ; FLATSCRW64-NEXT: renamable $sgpr7 = S_ADD_I32 killed renamable $sgpr4, 80, implicit-def dead $scc
+ ; FLATSCRW64-NEXT: renamable $sgpr7 = S_ADD_I32 killed $sgpr32, $sgpr8, implicit-def dead $scc
+ ; FLATSCRW64-NEXT: renamable $sgpr7 = S_ADD_I32 killed renamable $sgpr7, 80, implicit-def dead $scc
; FLATSCRW64-NEXT: SI_RETURN implicit $sgpr7
;
; FLATSCRW32-LABEL: name: s_add_i32__sgpr__fi_literal_offset
; FLATSCRW32: liveins: $sgpr8
; FLATSCRW32-NEXT: {{ $}}
- ; FLATSCRW32-NEXT: renamable $sgpr4 = S_ADD_I32 killed $sgpr32, $sgpr8, implicit-def dead $scc
- ; FLATSCRW32-NEXT: renamable $sgpr7 = S_ADD_I32 killed renamable $sgpr4, 80, implicit-def dead $scc
+ ; FLATSCRW32-NEXT: renamable $sgpr7 = S_ADD_I32 killed $sgpr32, $sgpr8, implicit-def dead $scc
+ ; FLATSCRW32-NEXT: renamable $sgpr7 = S_ADD_I32 killed renamable $sgpr7, 80, implicit-def dead $scc
; FLATSCRW32-NEXT: SI_RETURN implicit $sgpr7
renamable $sgpr7 = S_ADD_I32 $sgpr8, %stack.1, implicit-def dead $scc
SI_RETURN implicit $sgpr7
@@ -398,31 +398,31 @@ body: |
; MUBUFW64-LABEL: name: s_add_i32__fi_literal_offset__sgpr
; MUBUFW64: liveins: $sgpr8
; MUBUFW64-NEXT: {{ $}}
- ; MUBUFW64-NEXT: renamable $sgpr4 = S_LSHR_B32 $sgpr32, 6, implicit-def dead $scc
- ; MUBUFW64-NEXT: renamable $sgpr7 = S_ADD_I32 killed $sgpr4, $sgpr8, implicit-def $scc
+ ; MUBUFW64-NEXT: renamable $sgpr7 = S_LSHR_B32 $sgpr32, 6, implicit-def dead $scc
+ ; MUBUFW64-NEXT: renamable $sgpr7 = S_ADD_I32 killed $sgpr7, $sgpr8, implicit-def $scc
; MUBUFW64-NEXT: renamable $sgpr7 = S_ADD_I32 80, killed renamable $sgpr7, implicit-def $scc
; MUBUFW64-NEXT: SI_RETURN implicit $sgpr7, implicit $scc
;
; MUBUFW32-LABEL: name: s_add_i32__fi_literal_offset__sgpr
; MUBUFW32: liveins: $sgpr8
; MUBUFW32-NEXT: {{ $}}
- ; MUBUFW32-NEXT: renamable $sgpr4 = S_LSHR_B32 $sgpr32, 5, implicit-def dead $scc
- ; MUBUFW32-NEXT: renamable $sgpr7 = S_ADD_I32 killed $sgpr4, $sgpr8, implicit-def $scc
+ ; MUBUFW32-NEXT: renamable $sgpr7 = S_LSHR_B32 $sgpr32, 5, implicit-def dead $scc
+ ; MUBUFW32-NEXT: renamable $sgpr7 = S_ADD_I32 killed $sgpr7, $sgpr8, implicit-def $scc
; MUBUFW32-NEXT: renamable $sgpr7 = S_ADD_I32 80, killed renamable $sgpr7, implicit-def $scc
; MUBUFW32-NEXT: SI_RETURN implicit $sgpr7, implicit $scc
;
; FLATSCRW64-LABEL: name: s_add_i32__fi_literal_offset__sgpr
; FLATSCRW64: liveins: $sgpr8
; FLATSCRW64-NEXT: {{ $}}
- ; FLATSCRW64-NEXT: renamable $sgpr4 = S_ADD_I32 killed $sgpr32, $sgpr8, implicit-def $scc
- ; FLATSCRW64-NEXT: renamable $sgpr7 = S_ADD_I32 80, killed renamable $sgpr4, implicit-def $scc
+ ; FLATSCRW64-NEXT: renamable $sgpr7 = S_ADD_I32 killed $sgpr32, $sgpr8, implicit-def $scc
+ ; FLATSCRW64-NEXT: renamable $sgpr7 = S_ADD_I32 80, killed renamable $sgpr7, implicit-def $scc
; FLATSCRW64-NEXT: SI_RETURN implicit $sgpr7, implicit $scc
;
; FLATSCRW32-LABEL: name: s_add_i32__fi_literal_offset__sgpr
; FLATSCRW32: liveins: $sgpr8
; FLATSCRW32-NEXT: {{ $}}
- ; FLATSCRW32-NEXT: renamable $sgpr4 = S_ADD_I32 killed $sgpr32, $sgpr8, implicit-def $scc
- ; FLATSCRW32-NEXT: renamable $sgpr7 = S_ADD_I32 80, killed renamable $sgpr4, implicit-def $scc
+ ; FLATSCRW32-NEXT: renamable $sgpr7 = S_ADD_I32 killed $sgpr32, $sgpr8, implicit-def $scc
+ ; FLATSCRW32-NEXT: renamable $sgpr7 = S_ADD_I32 80, killed renamable $sgpr7, implicit-def $scc
; FLATSCRW32-NEXT: SI_RETURN implicit $sgpr7, implicit $scc
renamable $sgpr7 = S_ADD_I32 %stack.1, $sgpr8, implicit-def $scc
SI_RETURN implicit $sgpr7, implicit $scc
@@ -702,31 +702,31 @@ body: |
; MUBUFW64-LABEL: name: s_add_i32__sgpr__fi_offset0__live_scc
; MUBUFW64: liveins: $sgpr8
; MUBUFW64-NEXT: {{ $}}
- ; MUBUFW64-NEXT: renamable $sgpr4 = S_LSHR_B32 $sgpr32, 6, implicit-def dead $scc
- ; MUBUFW64-NEXT: renamable $sgpr7 = S_ADD_I32 killed $sgpr4, $sgpr8, implicit-def $scc
+ ; MUBUFW64-NEXT: renamable $sgpr7 = S_LSHR_B32 $sgpr32, 6, implicit-def dead $scc
+ ; MUBUFW64-NEXT: renamable $sgpr7 = S_ADD_I32 killed $sgpr7, $sgpr8, implicit-def $scc
; MUBUFW64-NEXT: renamable $sgpr7 = S_ADD_I32 killed renamable $sgpr7, 0, implicit-def $scc
; MUBUFW64-NEXT: SI_RETURN implicit $sgpr7, implicit $scc
;
; MUBUFW32-LABEL: name: s_add_i32__sgpr__fi_offset0__live_scc
; MUBUFW32: liveins: $sgpr8
; MUBUFW32-NEXT: {{ $}}
- ; MUBUFW32-NEXT: renamable $sgpr4 = S_LSHR_B32 $sgpr32, 5, implicit-def dead $scc
- ; MUBUFW32-NEXT: renamable $sgpr7 = S_ADD_I32 killed $sgpr4, $sgpr8, implicit-def $scc
+ ; MUBUFW32-NEXT: renamable $sgpr7 = S_LSHR_B32 $sgpr32, 5, implicit-def dead $scc
+ ; MUBUFW32-NEXT: renamable $sgpr7 = S_ADD_I32 killed $sgpr7, $sgpr8, implicit-def $scc
; MUBUFW32-NEXT: renamable $sgpr7 = S_ADD_I32 killed renamable $sgpr7, 0, implicit-def $scc
; MUBUFW32-NEXT: SI_RETURN implicit $sgpr7, implicit $scc
;
; FLATSCRW64-LABEL: name: s_add_i32__sgpr__fi_offset0__live_scc
; FLATSCRW64: liveins: $sgpr8
; FLATSCRW64-NEXT: {{ $}}
- ; FLATSCRW64-NEXT: renamable $sgpr4 = S_ADD_I32 killed $sgpr32, $sgpr8, implicit-def $scc
- ; FLATSCRW64-NEXT: renamable $sgpr7 = S_ADD_I32 killed renamable $sgpr4, 0, implicit-def $scc
+ ; FLATSCRW64-NEXT: renamable $sgpr7 = S_ADD_I32 killed $sgpr32, $sgpr8, implicit-def $scc
+ ; FLATSCRW64-NEXT: renamable $sgpr7 = S_ADD_I32 killed renamable $sgpr7, 0, implicit-def $scc
; FLATSCRW64-NEXT: SI_RETURN implicit $sgpr7, implicit $scc
;
; FLATSCRW32-LABEL: name: s_add_i32__sgpr__fi_offset0__live_scc
; FLATSCRW32: liveins: $sgpr8
; FLATSCRW32-NEXT: {{ $}}
- ; FLATSCRW32-NEXT: renamable $sgpr4 = S_ADD_I32 killed $sgpr32, $sgpr8, implicit-def $scc
- ; FLATSCRW32-NEXT: renamable $sgpr7 = S_ADD_I32 killed renamable $sgpr4, 0, implicit-def $scc
+ ; FLATSCRW32-NEXT: renamable $sgpr7 = S_ADD_I32 killed $sgpr32, $sgpr8, implicit-def $scc
+ ; FLATSCRW32-NEXT: renamable $sgpr7 = S_ADD_I32 killed renamable $sgpr7, 0, implicit-def $scc
; FLATSCRW32-NEXT: SI_RETURN implicit $sgpr7, implicit $scc
renamable $sgpr7 = S_ADD_I32 $sgpr8, %stack.0, implicit-def $scc
SI_RETURN implicit $sgpr7, implicit $scc
@@ -795,31 +795,31 @@ body: |
; MUBUFW64-LABEL: name: s_add_i32__sgpr__fi_literal_offset__live_scc
; MUBUFW64: liveins: $sgpr8
; MUBUFW64-NEXT: {{ $}}
- ; MUBUFW64-NEXT: renamable $sgpr4 = S_LSHR_B32 $sgpr32, 6, implicit-def dead $scc
- ; MUBUFW64-NEXT: renamable $sgpr7 = S_ADD_I32 killed $sgpr4, $sgpr8, implicit-def $scc
+ ; MUBUFW64-NEXT: renamable $sgpr7 = S_LSHR_B32 $sgpr32, 6, implicit-def dead $scc
+ ; MUBUFW64-NEXT: renamable $sgpr7 = S_ADD_I32 killed $sgpr7, $sgpr8, implicit-def $scc
; MUBUFW64-NEXT: renamable $sgpr7 = S_ADD_I32 killed renamable $sgpr7, 96, implicit-def $scc
; MUBUFW64-NEXT: SI_RETURN implicit $sgpr7, implicit $scc
;
; MUBUFW32-LABEL: name: s_add_i32__sgpr__fi_literal_offset__live_scc
; MUBUFW32: liveins: $sgpr8
; MUBUFW32-NEXT: {{ $}}
- ; MUBUFW32-NEXT: renamable $sgpr4 = S_LSHR_B32 $sgpr32, 5, implicit-def dead $scc
- ; MUBUFW32-NEXT: renamable $sgpr7 = S_ADD_I32 killed $sgpr4, $sgpr8, implicit-def $scc
+ ; MUBUFW32-NEXT: renamable $sgpr7 = S_LSHR_B32 $sgpr32, 5, implicit-def dead $scc
+ ; MUBUFW32-NEXT: renamable $sgpr7 = S_ADD_I32 killed $sgpr7, $sgpr8, implicit-def $scc
; MUBUFW32-NEXT: renamable $sgpr7 = S_ADD_I32 killed renamable $sgpr7, 96, implicit-def $scc
; MUBUFW32-NEXT: SI_RETURN implicit $sgpr7, implicit $scc
;
; FLATSCRW64-LABEL: name: s_add_i32__sgpr__fi_literal_offset__live_scc
; FLATSCRW64: liveins: $sgpr8
; FLATSCRW64-NEXT: {{ $}}
- ; FLATSCRW64-NEXT: renamable $sgpr4 = S_ADD_I32 killed $sgpr32, $sgpr8, implicit-def $scc
- ; FLATSCRW64-NEXT: renamable $sgpr7 = S_ADD_I32 killed renamable $sgpr4, 96, implicit-def $scc
+ ; FLATSCRW64-NEXT: renamable $sgpr7 = S_ADD_I32 killed $sgpr32, $sgpr8, implicit-def $scc
+ ; FLATSCRW64-NEXT: renamable $sgpr7 = S_ADD_I32 killed renamable $sgpr7, 96, implicit-def $scc
; FLATSCRW64-NEXT: SI_RETURN implicit $sgpr7, implicit $scc
;
; FLATSCRW32-LABEL: name: s_add_i32__sgpr__fi_literal_offset__live_scc
; FLATSCRW32: liveins: $sgpr8
; FLATSCRW32-NEXT: {{ $}}
- ; FLATSCRW32-NEXT: renamable $sgpr4 = S_ADD_I32 killed $sgpr32, $sgpr8, implicit-def $scc
- ; FLATSCRW32-NEXT: renamable $sgpr7 = S_ADD_I32 killed renamable $sgpr4, 96, implicit-def $scc
+ ; FLATSCRW32-NEXT: renamable $sgpr7 = S_ADD_I32 killed $sgpr32, $sgpr8, implicit-def $scc
+ ; FLATSCRW32-NEXT: renamable $sgpr7 = S_ADD_I32 killed renamable $sgpr7, 96, implicit-def $scc
; FLATSCRW32-NEXT: SI_RETURN implicit $sgpr7, implicit $scc
renamable $sgpr7 = S_ADD_I32 $sgpr8, %stack.1, implicit-def $scc
SI_RETURN implicit $sgpr7, implicit $scc
@@ -1104,31 +1104,31 @@ body: |
; MUBUFW64-LABEL: name: s_add_i32__different_sgpr__fi_offset0
; MUBUFW64: liveins: $sgpr8
; MUBUFW64-NEXT: {{ $}}
- ; MUBUFW64-NEXT: renamable $sgpr4 = S_LSHR_B32 $sgpr32, 6, implicit-def dead $scc
- ; MUBUFW64-NEXT: renamable $sgpr7 = S_ADD_I32 killed $sgpr4, $sgpr8, implicit-def dead $scc
+ ; MUBUFW64-NEXT: renamable $sgpr7 = S_LSHR_B32 $sgpr32, 6, implicit-def dead $scc
+ ; MUBUFW64-NEXT: renamable $sgpr7 = S_ADD_I32 killed $sgpr7, $sgpr8, implicit-def dead $scc
; MUBUFW64-NEXT: renamable $sgpr7 = COPY killed renamable $sgpr7
; MUBUFW64-NEXT: SI_RETURN implicit $sgpr7
;
; MUBUFW32-LABEL: name: s_add_i32__different_sgpr__fi_offset0
; MUBUFW32: liveins: $sgpr8
; MUBUFW32-NEXT: {{ $}}
- ; MUBUFW32-NEXT: renamable $sgpr4 = S_LSHR_B32 $sgpr32, 5, implicit-def dead $scc
- ; MUBUFW32-NEXT: renamable $sgpr7 = S_ADD_I32 killed $sgpr4, $sgpr8, implicit-def dead $scc
+ ; MUBUFW32-NEXT: renamable $sgpr7 = S_LSHR_B32 $sgpr32, 5, implicit-def dead $scc
+ ; MUBUFW32-NEXT: renamable $sgpr7 = S_ADD_I32 killed $sgpr7, $sgpr8, implicit-def dead $scc
; MUBUFW32-NEXT: renamable $sgpr7 = COPY killed renamable $sgpr7
; MUBUFW32-NEXT: SI_RETURN implicit $sgpr7
;
; FLATSCRW64-LABEL: name: s_add_i32__different_sgpr__fi_offset0
; FLATSCRW64: liveins: $sgpr8
; FLATSCRW64-NEXT: {{ $}}
- ; FLATSCRW64-NEXT: renamable $sgpr4 = S_ADD_I32 killed $sgpr32, $sgpr8, implicit-def dead $scc
- ; FLATSCRW64-NEXT: renamable $sgpr7 = COPY killed renamable $sgpr4
+ ; FLATSCRW64-NEXT: renamable $sgpr7 = S_ADD_I32 killed $sgpr32, $sgpr8, implicit-def dead $scc
+ ; FLATSCRW64-NEXT: renamable $sgpr7 = COPY killed renamable $sgpr7
; FLATSCRW64-NEXT: SI_RETURN implicit $sgpr7
;
; FLATSCRW32-LABEL: name: s_add_i32__different_sgpr__fi_offset0
; FLATSCRW32: liveins: $sgpr8
; FLATSCRW32-NEXT: {{ $}}
- ; FLATSCRW32-NEXT: renamable $sgpr4 = S_ADD_I32 killed $sgpr32, $sgpr8, implicit-def dead $scc
- ; FLATSCRW32-NEXT: renamable $sgpr7 = COPY killed renamable $sgpr4
+ ; FLATSCRW32-NEXT: renamable $sgpr7 = S_ADD_I32 killed $sgpr32, $sgpr8, implicit-def dead $scc
+ ; FLATSCRW32-NEXT: renamable $sgpr7 = COPY killed renamable $sgpr7
; FLATSCRW32-NEXT: SI_RETURN implicit $sgpr7
renamable $sgpr7 = S_ADD_I32 $sgpr8, %stack.0, implicit-def dead $scc
SI_RETURN implicit $sgpr7
@@ -1150,31 +1150,31 @@ body: |
; MUBUFW64-LABEL: name: s_add_i32__different_sgpr__fi_offset0_live_after
; MUBUFW64: liveins: $sgpr8
; MUBUFW64-NEXT: {{ $}}
- ; MUBUFW64-NEXT: renamable $sgpr4 = S_LSHR_B32 $sgpr32, 6, implicit-def dead $scc
- ; MUBUFW64-NEXT: renamable $sgpr7 = S_ADD_I32 killed $sgpr4, $sgpr8, implicit-def dead $scc
+ ; MUBUFW64-NEXT: renamable $sgpr7 = S_LSHR_B32 $sgpr32, 6, implicit-def dead $scc...
[truncated]
LGTM