[AMDGPU] Fix GCNUpwardRPTracker. #74328
|
@@ -47,87 +47,46 @@ body: | | |
name: live_through_test | ||
tracksRegLiveness: true | ||
body: | | ||
; RPU-LABEL: name: live_through_test | ||
; RPU: bb.0: | ||
; RPU-NEXT: Live-in: | ||
; RPU-NEXT: SGPR VGPR | ||
; RPU-NEXT: 0 0 | ||
; RPU-NEXT: 3 0 %0:sgpr_128 = IMPLICIT_DEF | ||
; RPU-NEXT: 3 0 | ||
; RPU-NEXT: Live-out: %0:00000000000000F3 | ||
; RPU-NEXT: Live-thr: | ||
; RPU-NEXT: 0 0 | ||
; RPU-NEXT: bb.1: | ||
; RPU-NEXT: Live-in: %0:00000000000000F3 | ||
; RPU-NEXT: SGPR VGPR | ||
; RPU-NEXT: 3 0 | ||
; RPU-NEXT: 3 0 S_NOP 0, implicit %0.sub0:sgpr_128 | ||
; RPU-NEXT: 2 0 | ||
; RPU-NEXT: 3 0 %0.sub0:sgpr_128 = IMPLICIT_DEF | ||
; RPU-NEXT: 3 0 | ||
; RPU-NEXT: 3 0 %0.sub1:sgpr_128 = IMPLICIT_DEF | ||
; RPU-NEXT: 3 0 | ||
; RPU-NEXT: 3 0 S_NOP 0, implicit %0.sub2:sgpr_128 | ||
; RPU-NEXT: 2 0 | ||
; RPU-NEXT: 3 0 %0.sub2:sgpr_128 = IMPLICIT_DEF | ||
; RPU-NEXT: 3 0 | ||
; RPU-NEXT: 3 0 S_NOP 0, implicit %0.sub2:sgpr_128 | ||
; RPU-NEXT: 2 0 | ||
; RPU-NEXT: 2 0 S_NOP 0, implicit %0.sub3:sgpr_128 | ||
; RPU-NEXT: 2 0 | ||
; RPU-NEXT: Live-out: %0:00000000000000C3 | ||
; RPU-NEXT: Live-thr: %0:00000000000000C0 | ||
; RPU-NEXT: 1 0 | ||
; RPU-NEXT: bb.2: | ||
; RPU-NEXT: Live-in: %0:00000000000000C3 | ||
; RPU-NEXT: SGPR VGPR | ||
; RPU-NEXT: 2 0 | ||
; RPU-NEXT: 2 0 S_NOP 0, implicit %0.sub3:sgpr_128, implicit %0.sub0:sgpr_128 | ||
; RPU-NEXT: 0 0 | ||
; RPU-NEXT: Live-out: | ||
; RPU-NEXT: Live-thr: | ||
; RPU-NEXT: 0 0 | ||
; | ||
; RPD-LABEL: name: live_through_test | ||
; RPD: bb.0: | ||
; RPD-NEXT: Live-in: | ||
; RPD-NEXT: SGPR VGPR | ||
; RPD-NEXT: 0 0 | ||
; RPD-NEXT: 4 0 %0:sgpr_128 = IMPLICIT_DEF | ||
; RPD-NEXT: 3 0 | ||
; RPD-NEXT: Live-out: %0:00000000000000F3 | ||
; RPD-NEXT: Live-thr: | ||
; RPD-NEXT: 0 0 | ||
; RPD-NEXT: bb.1: | ||
; RPD-NEXT: Live-in: %0:00000000000000F3 | ||
; RPD-NEXT: SGPR VGPR | ||
; RPD-NEXT: 3 0 | ||
; RPD-NEXT: 3 0 S_NOP 0, implicit %0.sub0:sgpr_128 | ||
; RPD-NEXT: 2 0 | ||
; RPD-NEXT: 3 0 %0.sub0:sgpr_128 = IMPLICIT_DEF | ||
; RPD-NEXT: 3 0 | ||
; RPD-NEXT: 4 0 %0.sub1:sgpr_128 = IMPLICIT_DEF | ||
; RPD-NEXT: 3 0 | ||
; RPD-NEXT: 3 0 S_NOP 0, implicit %0.sub2:sgpr_128 | ||
; RPD-NEXT: 2 0 | ||
; RPD-NEXT: 3 0 %0.sub2:sgpr_128 = IMPLICIT_DEF | ||
; RPD-NEXT: 3 0 | ||
; RPD-NEXT: 3 0 S_NOP 0, implicit %0.sub2:sgpr_128 | ||
; RPD-NEXT: 2 0 | ||
; RPD-NEXT: 2 0 S_NOP 0, implicit %0.sub3:sgpr_128 | ||
; RPD-NEXT: 2 0 | ||
; RPD-NEXT: Live-out: %0:00000000000000C3 | ||
; RPD-NEXT: Live-thr: %0:00000000000000C0 | ||
; RPD-NEXT: 1 0 | ||
; RPD-NEXT: bb.2: | ||
; RPD-NEXT: Live-in: %0:00000000000000C3 | ||
; RPD-NEXT: SGPR VGPR | ||
; RPD-NEXT: 2 0 | ||
; RPD-NEXT: 2 0 S_NOP 0, implicit %0.sub3:sgpr_128, implicit %0.sub0:sgpr_128 | ||
; RPD-NEXT: 0 0 | ||
; RPD-NEXT: Live-out: | ||
; RPD-NEXT: Live-thr: | ||
; RPD-NEXT: 0 0 | ||
; RP-LABEL: name: live_through_test | ||
; RP: bb.0: | ||
; RP-NEXT: Live-in: | ||
; RP-NEXT: SGPR VGPR | ||
; RP-NEXT: 0 0 | ||
; RP-NEXT: 4 0 %0:sgpr_128 = IMPLICIT_DEF | ||
; RP-NEXT: 3 0 | ||
; RP-NEXT: Live-out: %0:00000000000000F3 | ||
; RP-NEXT: Live-thr: | ||
; RP-NEXT: 0 0 | ||
; RP-NEXT: bb.1: | ||
; RP-NEXT: Live-in: %0:00000000000000F3 | ||
; RP-NEXT: SGPR VGPR | ||
; RP-NEXT: 3 0 | ||
; RP-NEXT: 3 0 S_NOP 0, implicit %0.sub0:sgpr_128 | ||
; RP-NEXT: 2 0 | ||
; RP-NEXT: 3 0 %0.sub0:sgpr_128 = IMPLICIT_DEF | ||
; RP-NEXT: 3 0 | ||
; RP-NEXT: 4 0 %0.sub1:sgpr_128 = IMPLICIT_DEF | ||
; RP-NEXT: 3 0 | ||
; RP-NEXT: 3 0 S_NOP 0, implicit %0.sub2:sgpr_128 | ||
; RP-NEXT: 2 0 | ||
; RP-NEXT: 3 0 %0.sub2:sgpr_128 = IMPLICIT_DEF | ||
; RP-NEXT: 3 0 | ||
; RP-NEXT: 3 0 S_NOP 0, implicit %0.sub2:sgpr_128 | ||
; RP-NEXT: 2 0 | ||
; RP-NEXT: 2 0 S_NOP 0, implicit %0.sub3:sgpr_128 | ||
; RP-NEXT: 2 0 | ||
; RP-NEXT: Live-out: %0:00000000000000C3 | ||
; RP-NEXT: Live-thr: %0:00000000000000C0 | ||
; RP-NEXT: 1 0 | ||
; RP-NEXT: bb.2: | ||
; RP-NEXT: Live-in: %0:00000000000000C3 | ||
; RP-NEXT: SGPR VGPR | ||
; RP-NEXT: 2 0 | ||
; RP-NEXT: 2 0 S_NOP 0, implicit %0.sub3:sgpr_128, implicit %0.sub0:sgpr_128 | ||
; RP-NEXT: 0 0 | ||
; RP-NEXT: Live-out: | ||
; RP-NEXT: Live-thr: | ||
; RP-NEXT: 0 0 | ||
bb.0: | ||
%0:sgpr_128 = IMPLICIT_DEF | ||
bb.1: | ||
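The lane masks in the `live_through_test` expectations above can be cross-checked against the printed SGPR counts. The following is a minimal sketch, assuming each 32-bit subregister of `%0:sgpr_128` owns two adjacent lane bits (sub0 = 0x03, sub1 = 0x0C, sub2 = 0x30, sub3 = 0xC0); that layout is inferred from the masks in this test rather than stated in the diff, and `liveDwords` and `LaneBitsPerDword` are hypothetical names, not part of LLVM.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: every 32-bit subregister owns two adjacent lane bits.
constexpr unsigned LaneBitsPerDword = 2;

// Count how many 32-bit registers a lane mask keeps live: a dword counts
// as live if either of its two lane bits is set.
unsigned liveDwords(uint64_t LaneMask) {
  unsigned N = 0;
  for (; LaneMask != 0; LaneMask >>= LaneBitsPerDword)
    if (LaneMask & 0x3)
      ++N;
  return N;
}
```

Under that assumption, `liveDwords(0xF3)` gives 3, `liveDwords(0xC3)` gives 2, and `liveDwords(0xC0)` gives 1, which agrees with the SGPR pressure printed next to the `Live-out` and `Live-thr` masks of bb.0 and bb.1.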
|
@@ -223,7 +182,7 @@ body: | | |
; RPU-NEXT: 0 7 | ||
; RPU-NEXT: 0 7 %7:vgpr_32 = GLOBAL_LOAD_DWORD %5:vreg_64, 0, 0, implicit $exec | ||
; RPU-NEXT: 0 6 | ||
; RPU-NEXT: 0 7 %8:vreg_64 = IMPLICIT_DEF | ||
; RPU-NEXT: 0 8 %8:vreg_64 = IMPLICIT_DEF | ||
; RPU-NEXT: 0 7 | ||
; RPU-NEXT: 0 9 %9:vreg_64 = IMPLICIT_DEF | ||
; RPU-NEXT: 0 9 | ||
|
@@ -262,7 +221,7 @@ body: | | |
; RPU-NEXT: 0 12 | ||
; RPU-NEXT: 0 12 dead %21:vgpr_32 = GLOBAL_LOAD_DWORD %14:vreg_64, 0, 0, implicit $exec | ||
; RPU-NEXT: 0 10 | ||
; RPU-NEXT: 0 10 dead %22:vgpr_32 = GLOBAL_LOAD_DWORD %15:vreg_64, 0, 0, implicit $exec | ||
; RPU-NEXT: 0 11 dead %22:vgpr_32 = GLOBAL_LOAD_DWORD %15:vreg_64, 0, 0, implicit $exec | ||
; RPU-NEXT: 0 10 | ||
; RPU-NEXT: 0 10 %23:vreg_64 = V_LSHLREV_B64_e64 2, %8:vreg_64, implicit $exec | ||
; RPU-NEXT: 0 9 | ||
|
@@ -550,7 +509,7 @@ body: | | |
; RPU-NEXT: 0 0 | ||
; RPU-NEXT: 0 0 $sgpr0 = S_BUFFER_LOAD_DWORD_IMM $sgpr0_sgpr1_sgpr2_sgpr3, 0, 0 | ||
; RPU-NEXT: 0 0 | ||
; RPU-NEXT: 0 0 undef %0.sub5:vreg_512 = V_MOV_B32_e32 5, implicit $exec | ||
; RPU-NEXT: 0 1 undef %0.sub5:vreg_512 = V_MOV_B32_e32 5, implicit $exec | ||
; RPU-NEXT: 0 0 | ||
; RPU-NEXT: 0 0 S_CMP_GT_U32 $sgpr0, 15, implicit-def $scc | ||
; RPU-NEXT: 0 0 | ||
|
@@ -569,7 +528,7 @@ body: | | |
; RPU-NEXT: 0 1 | ||
; RPU-NEXT: 0 1 $m0 = S_MOV_B32 killed $sgpr0 | ||
; RPU-NEXT: 0 1 | ||
; RPU-NEXT: 0 1 %0:vreg_512 = V_INDIRECT_REG_WRITE_MOVREL_B32_V16 %0:vreg_512(tied-def 0), 42, 3, implicit $m0, implicit $exec | ||
; RPU-NEXT: 0 16 %0:vreg_512 = V_INDIRECT_REG_WRITE_MOVREL_B32_V16 %0:vreg_512(tied-def 0), 42, 3, implicit $m0, implicit $exec | ||
; RPU-NEXT: 0 1 | ||
; RPU-NEXT: Live-out: %0:0000000000000C00 | ||
; RPU-NEXT: Live-thr: | ||
|
@@ -666,3 +625,70 @@ body: | | |
EXP_DONE 0, %49:vgpr_32, undef %51:vgpr_32, undef %53:vgpr_32, undef %55:vgpr_32, -1, 0, 1, implicit $exec | ||
S_ENDPGM 0 | ||
... | ||
--- | ||
name: early_clobber_def_used_on_rhs | ||
registers: | ||
- { id: 0, class: vgpr_32 } | ||
body: | | ||
; RPU-LABEL: name: early_clobber_def_used_on_rhs | ||
; RPU: bb.0: | ||
; RPU-NEXT: Live-in: | ||
; RPU-NEXT: SGPR VGPR | ||
; RPU-NEXT: 0 0 | ||
; RPU-NEXT: 0 1 dead %3:vgpr_32 = COPY $vgpr0 | ||
; RPU-NEXT: 0 0 | ||
; RPU-NEXT: 0 1 early-clobber %2:vgpr_32 = COPY %0:vgpr_32 | ||
Comment on lines +638 to +640

Where did %2 and %3 come from? The input only has %0. Anyway, this PR seems to be failing its own tests, at least with assertions enabled.

The MIR parser renames those registers; I haven't managed to prevent it from doing that.

Actually, this isn't the MIR parser; it is done by the LiveIntervals analysis. Firstly, I forgot to add

The verifier complains that there is no live segment at slot index 32B, that is, at the entry to instruction 32. LiveIntervals correctly determined that the value produced at 16R is dead because it is clobbered by the def at 32E before the use at 32R. It renamed %0 at 16R to %1 and marked it as dead, but I think it should also mark the %0 use at 32R as 'undef' for this IR to be valid. If I add the 'undef' flag there manually, everything works fine:

So I think the current implementation of GCNUpwardRPTracker is correct as long as it works on correct IR. LiveIntervals should be fixed.

On the other hand

This may mean that the 'undef' flag should have been set manually in the test case for it to be valid, so I'm not sure. Maybe I should just add a test with the 'undef' flag and get away with it.
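The slot-index argument in the thread above (def at 16R, early-clobber def at 32E, use at 32R) can be sketched with a toy model. This is only an illustration of the ordering, not LLVM's `SlotIndex` class: `ToySlotIndex`, `Slot`, `before`, and `reachesUse` are hypothetical names, and the model assumes the slots within one instruction are ordered B < E < R.

```cpp
#include <cassert>

// Hypothetical toy model, not LLVM's SlotIndex class.
// Slots within one instruction are ordered B < E < R:
// block entry, early-clobber def, normal register def/use.
enum class Slot { B = 0, E = 1, R = 2 };

struct ToySlotIndex {
  unsigned Instr; // instruction number, e.g. 16 or 32
  Slot S;         // slot within that instruction
};

// Strict ordering: first by instruction number, then by slot.
constexpr bool before(ToySlotIndex LHS, ToySlotIndex RHS) {
  return LHS.Instr != RHS.Instr ? LHS.Instr < RHS.Instr : LHS.S < RHS.S;
}

// True if the value defined at Def is still the one read at Use, given that
// the same register is written again at Redef.
constexpr bool reachesUse(ToySlotIndex Def, ToySlotIndex Redef,
                          ToySlotIndex Use) {
  return before(Def, Use) && !(before(Def, Redef) && before(Redef, Use));
}

// Scenario from the thread: the early-clobber redef at 32E lands before the
// use at 32R, so the value defined at 16R never reaches that use.
static_assert(!reachesUse({16, Slot::R}, {32, Slot::E}, {32, Slot::R}), "");

// With an ordinary redef at 32R, the use at 32R still reads the 16R value.
static_assert(reachesUse({16, Slot::R}, {32, Slot::R}, {32, Slot::R}), "");
```

In this model the only difference between the two cases is the slot of the redefinition, which is why the early-clobber flag alone is enough to make the earlier def dead and leave the use without a reaching value.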
; RPU-NEXT: 0 1 | ||
; RPU-NEXT: 0 1 S_NOP 0, implicit %2:vgpr_32 | ||
; RPU-NEXT: 0 0 | ||
; RPU-NEXT: Live-out: | ||
; RPU-NEXT: Live-thr: | ||
; RPU-NEXT: 0 0 | ||
; RPU-NEXT: bb.1: | ||
; RPU-NEXT: Live-in: | ||
; RPU-NEXT: SGPR VGPR | ||
; RPU-NEXT: 0 0 | ||
; RPU-NEXT: 0 1 dead %1:vgpr_32 = COPY $vgpr0 | ||
; RPU-NEXT: 0 0 | ||
; RPU-NEXT: 0 1 dead %0:vgpr_32 = COPY $vgpr0 | ||
; RPU-NEXT: 0 0 | ||
; RPU-NEXT: Live-out: | ||
; RPU-NEXT: Live-thr: | ||
; RPU-NEXT: 0 0 | ||
; | ||
; RPD-LABEL: name: early_clobber_def_used_on_rhs | ||
; RPD: bb.0: | ||
; RPD-NEXT: Live-in: | ||
; RPD-NEXT: SGPR VGPR | ||
; RPD-NEXT: 0 0 | ||
; RPD-NEXT: 0 1 dead %3:vgpr_32 = COPY $vgpr0 | ||
; RPD-NEXT: 0 0 | ||
; RPD-NEXT: 0 1 early-clobber %2:vgpr_32 = COPY %0:vgpr_32 | ||
; RPD-NEXT: 0 0 | ||
; RPD-NEXT: 0 0 S_NOP 0, implicit %2:vgpr_32 | ||
; RPD-NEXT: 0 -1 | ||
; RPD-NEXT: Live-out: | ||
; RPD-NEXT: mis LIS: | ||
; RPD-NEXT: Live-thr: | ||
; RPD-NEXT: 0 0 | ||
; RPD-NEXT: bb.1: | ||
; RPD-NEXT: Live-in: | ||
; RPD-NEXT: SGPR VGPR | ||
; RPD-NEXT: 0 0 | ||
; RPD-NEXT: 0 1 dead %1:vgpr_32 = COPY $vgpr0 | ||
; RPD-NEXT: 0 0 | ||
; RPD-NEXT: 0 1 dead %0:vgpr_32 = COPY $vgpr0 | ||
; RPD-NEXT: 0 0 | ||
; RPD-NEXT: Live-out: | ||
; RPD-NEXT: Live-thr: | ||
; RPD-NEXT: 0 0 | ||
bb.0: | ||
liveins: $vgpr0 | ||
%0 = COPY $vgpr0 | ||
early-clobber %0 = COPY %0 | ||
S_NOP 0, implicit %0 | ||
bb.1: | ||
liveins: $vgpr0 | ||
%0 = COPY $vgpr0 | ||
%0 = COPY $vgpr0 ; Force isSSA = false | ||
... |
It is probably worth adding ECDefPressure conditionally, as early-clobbers are very rare (if any).