-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[AMDGPU][NFC] Precommit tests representing agpr spills. #115270
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AMDGPU][NFC] Precommit tests representing agpr spills. #115270
Conversation
Presently we are only marking implicit-def for the spilled AGPR tuple in the first spill instructions and not implicit.
@llvm/pr-subscribers-backend-amdgpu Author: Pravin Jagtap (pravinjagtap) ChangesPresently we are only marking implicit-def for the Full diff: https://github.com/llvm/llvm-project/pull/115270.diff 2 Files Affected:
diff --git a/llvm/test/CodeGen/AMDGPU/av-spill-expansion-with-machine-cp.mir b/llvm/test/CodeGen/AMDGPU/av-spill-expansion-with-machine-cp.mir
new file mode 100644
index 00000000000000..f7680c418b8fa9
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/av-spill-expansion-with-machine-cp.mir
@@ -0,0 +1,69 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
+# RUN: llc -mtriple=amdgcn -mcpu=gfx908 %s -o - -run-pass prologepilog,machine-cp -verify-machineinstrs | FileCheck -check-prefix=GFX908 %s
+
+# When VGPRs are available for spilling, prologepilog marks the tuple implicit-def as well as implicit in the first spill instruction.
+# As a consequence, machine-cp would NOT delete agpr2 copy here.
+
+---
+name: agpr-spill-to-vgpr-machine-cp
+tracksRegLiveness: true
+stack:
+ - { id: 0, name: '', type: spill-slot, offset: 0, size: 128, alignment: 4 }
+machineFunctionInfo:
+ scratchRSrcReg: $sgpr0_sgpr1_sgpr2_sgpr3
+ stackPtrOffsetReg: '$sgpr32'
+ hasSpilledVGPRs: true
+body: |
+ bb.0:
+ successors:
+ liveins: $vgpr0, $vgpr1
+ ; GFX908-LABEL: name: agpr-spill-to-vgpr-machine-cp
+ ; GFX908: liveins: $vgpr0, $vgpr1, $vgpr2, $vgpr3, $vgpr4, $vgpr5, $vgpr6, $vgpr7, $vgpr8, $vgpr9, $vgpr10, $vgpr11, $vgpr12, $vgpr13, $vgpr14, $vgpr15, $vgpr16, $vgpr17, $vgpr18, $vgpr19, $vgpr20, $vgpr21, $vgpr22, $vgpr23, $vgpr24, $vgpr25, $vgpr26, $vgpr27, $vgpr28, $vgpr29, $vgpr30, $vgpr31, $vgpr32, $vgpr33
+ ; GFX908-NEXT: {{ $}}
+ ; GFX908-NEXT: renamable $agpr0 = COPY renamable $vgpr0, implicit $exec
+ ; GFX908-NEXT: renamable $agpr2 = COPY renamable $vgpr1, implicit $exec
+ ; GFX908-NEXT: $vgpr33 = V_ACCVGPR_READ_B32_e64 $agpr0, implicit $exec, implicit-def $agpr0_agpr1_agpr2, implicit $agpr0_agpr1_agpr2
+ ; GFX908-NEXT: $vgpr32 = V_ACCVGPR_READ_B32_e64 $agpr1, implicit $exec
+ ; GFX908-NEXT: $vgpr31 = V_ACCVGPR_READ_B32_e64 $agpr2, implicit $exec, implicit $agpr0_agpr1_agpr2
+ ; GFX908-NEXT: S_ENDPGM 0
+ renamable $agpr0 = COPY renamable $vgpr0, implicit $exec
+ renamable $agpr2 = COPY renamable $vgpr1, implicit $exec
+ SI_SPILL_AV96_SAVE $agpr0_agpr1_agpr2, %stack.0, $sgpr32, 0, implicit $exec :: (store (s96) into %stack.0, align 4, addrspace 5)
+ S_ENDPGM 0
+...
+
+# When VGPRs are NOT available for spilling (stack is used), prologepilog marks the tuple implicit-def only and NOT implicit.
+# As a consequence, machine-cp would delete agpr2 copy here.
+
+---
+name: agpr-spill-to-vgpr-to-stack-machine-cp
+tracksRegLiveness: true
+stack:
+ - { id: 0, name: '', type: spill-slot, offset: 0, size: 128, alignment: 4 }
+machineFunctionInfo:
+ scratchRSrcReg: $sgpr0_sgpr1_sgpr2_sgpr3
+ stackPtrOffsetReg: '$sgpr32'
+ hasSpilledVGPRs: true
+body: |
+ bb.0:
+ successors:
+ liveins: $vgpr0, $vgpr1
+ ; GFX908-LABEL: name: agpr-spill-to-vgpr-to-stack-machine-cp
+ ; GFX908: liveins: $vgpr0, $vgpr1, $vgpr18, $vgpr19, $vgpr20, $vgpr21, $vgpr22, $vgpr23, $vgpr24, $vgpr25, $vgpr26, $vgpr27, $vgpr28, $vgpr29, $vgpr30, $vgpr31, $vgpr32, $vgpr33, $vgpr34, $vgpr35, $vgpr36, $vgpr37, $vgpr38, $vgpr39, $vgpr48, $vgpr49, $vgpr50, $vgpr51, $vgpr52, $vgpr53, $vgpr54, $vgpr55
+ ; GFX908-NEXT: {{ $}}
+ ; GFX908-NEXT: renamable $agpr0 = COPY renamable $vgpr0, implicit $exec
+ ; GFX908-NEXT: $vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7_vgpr8_vgpr9 = IMPLICIT_DEF
+ ; GFX908-NEXT: $vgpr10_vgpr11_vgpr12_vgpr13_vgpr14_vgpr15_vgpr16_vgpr17 = IMPLICIT_DEF
+ ; GFX908-NEXT: $vgpr40 = V_ACCVGPR_READ_B32_e64 $agpr0, implicit $exec, implicit-def $agpr0_agpr1_agpr2
+ ; GFX908-NEXT: BUFFER_STORE_DWORD_OFFSET $vgpr40, $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr32, 0, 0, 0, implicit $exec, implicit $agpr0_agpr1_agpr2 :: (store (s32) into %stack.0, addrspace 5)
+ ; GFX908-NEXT: $vgpr40 = V_ACCVGPR_READ_B32_e64 $agpr1, implicit $exec
+ ; GFX908-NEXT: BUFFER_STORE_DWORD_OFFSET $vgpr40, $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr32, 4, 0, 0, implicit $exec :: (store (s32) into %stack.0 + 4, addrspace 5)
+ ; GFX908-NEXT: $vgpr55 = V_ACCVGPR_READ_B32_e64 $agpr2, implicit $exec, implicit $agpr0_agpr1_agpr2
+ ; GFX908-NEXT: S_ENDPGM 0
+ renamable $agpr0 = COPY renamable $vgpr0, implicit $exec
+ renamable $agpr2 = COPY renamable $vgpr1, implicit $exec
+ $vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7_vgpr8_vgpr9 = IMPLICIT_DEF
+ $vgpr10_vgpr11_vgpr12_vgpr13_vgpr14_vgpr15_vgpr16_vgpr17 = IMPLICIT_DEF
+ SI_SPILL_AV96_SAVE $agpr0_agpr1_agpr2, %stack.0, $sgpr32, 0, implicit $exec :: (store (s96) into %stack.0, align 4, addrspace 5)
+ S_ENDPGM 0
+...
diff --git a/llvm/test/CodeGen/AMDGPU/av-spill-to-vgpr-and-stack.mir b/llvm/test/CodeGen/AMDGPU/av-spill-to-vgpr-and-stack.mir
new file mode 100644
index 00000000000000..50bba1baed85f4
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/av-spill-to-vgpr-and-stack.mir
@@ -0,0 +1,70 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
+# RUN: llc -mtriple=amdgcn -mcpu=gfx908 %s -o - -run-pass prologepilog -verify-machineinstrs | FileCheck -check-prefix=GFX908 %s
+
+# During spill expansion, when VGPRs are available for spilling (stack is unused), tuple is being marked as
+# implicit-def as well as implicit in the first spill instrunction.
+
+---
+name: agpr-spill-to-vgpr
+tracksRegLiveness: true
+stack:
+ - { id: 0, name: '', type: spill-slot, offset: 0, size: 128, alignment: 4 }
+machineFunctionInfo:
+ scratchRSrcReg: $sgpr0_sgpr1_sgpr2_sgpr3
+ stackPtrOffsetReg: '$sgpr32'
+ hasSpilledVGPRs: true
+body: |
+ bb.0:
+ successors:
+ liveins: $vgpr0, $vgpr1
+ ; GFX908-LABEL: name: agpr-spill-to-vgpr
+ ; GFX908: liveins: $vgpr0, $vgpr1, $vgpr2, $vgpr3, $vgpr4, $vgpr5, $vgpr6, $vgpr7, $vgpr8, $vgpr9, $vgpr10, $vgpr11, $vgpr12, $vgpr13, $vgpr14, $vgpr15, $vgpr16, $vgpr17, $vgpr18, $vgpr19, $vgpr20, $vgpr21, $vgpr22, $vgpr23, $vgpr24, $vgpr25, $vgpr26, $vgpr27, $vgpr28, $vgpr29, $vgpr30, $vgpr31, $vgpr32, $vgpr33
+ ; GFX908-NEXT: {{ $}}
+ ; GFX908-NEXT: renamable $agpr0 = COPY renamable $vgpr0, implicit $exec
+ ; GFX908-NEXT: renamable $agpr2 = COPY renamable $vgpr1, implicit $exec
+ ; GFX908-NEXT: $vgpr33 = V_ACCVGPR_READ_B32_e64 $agpr0, implicit $exec, implicit-def $agpr0_agpr1_agpr2, implicit $agpr0_agpr1_agpr2
+ ; GFX908-NEXT: $vgpr32 = V_ACCVGPR_READ_B32_e64 $agpr1, implicit $exec
+ ; GFX908-NEXT: $vgpr31 = V_ACCVGPR_READ_B32_e64 $agpr2, implicit $exec, implicit $agpr0_agpr1_agpr2
+ ; GFX908-NEXT: S_ENDPGM 0
+ renamable $agpr0 = COPY renamable $vgpr0, implicit $exec
+ renamable $agpr2 = COPY renamable $vgpr1, implicit $exec
+ SI_SPILL_AV96_SAVE $agpr0_agpr1_agpr2, %stack.0, $sgpr32, 0, implicit $exec :: (store (s96) into %stack.0, align 4, addrspace 5)
+ S_ENDPGM 0
+...
+
+# During spill expansion, when VGPRs are NOT available for spilling (stack is used), tuple is being marked as
+# implicit-def ONLY and NOT implicit in the first spill instrunction.
+
+---
+name: agpr-spill-to-vgpr-to-stack
+tracksRegLiveness: true
+stack:
+ - { id: 0, name: '', type: spill-slot, offset: 0, size: 128, alignment: 4 }
+machineFunctionInfo:
+ scratchRSrcReg: $sgpr0_sgpr1_sgpr2_sgpr3
+ stackPtrOffsetReg: '$sgpr32'
+ hasSpilledVGPRs: true
+body: |
+ bb.0:
+ successors:
+ liveins: $vgpr0, $vgpr1
+ ; GFX908-LABEL: name: agpr-spill-to-vgpr-to-stack
+ ; GFX908: liveins: $vgpr0, $vgpr1, $vgpr18, $vgpr19, $vgpr20, $vgpr21, $vgpr22, $vgpr23, $vgpr24, $vgpr25, $vgpr26, $vgpr27, $vgpr28, $vgpr29, $vgpr30, $vgpr31, $vgpr32, $vgpr33, $vgpr34, $vgpr35, $vgpr36, $vgpr37, $vgpr38, $vgpr39, $vgpr48, $vgpr49, $vgpr50, $vgpr51, $vgpr52, $vgpr53, $vgpr54, $vgpr55
+ ; GFX908-NEXT: {{ $}}
+ ; GFX908-NEXT: renamable $agpr0 = COPY renamable $vgpr0, implicit $exec
+ ; GFX908-NEXT: renamable $agpr2 = COPY renamable $vgpr1, implicit $exec
+ ; GFX908-NEXT: $vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7_vgpr8_vgpr9 = IMPLICIT_DEF
+ ; GFX908-NEXT: $vgpr10_vgpr11_vgpr12_vgpr13_vgpr14_vgpr15_vgpr16_vgpr17 = IMPLICIT_DEF
+ ; GFX908-NEXT: $vgpr40 = V_ACCVGPR_READ_B32_e64 $agpr0, implicit $exec, implicit-def $agpr0_agpr1_agpr2
+ ; GFX908-NEXT: BUFFER_STORE_DWORD_OFFSET $vgpr40, $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr32, 0, 0, 0, implicit $exec, implicit $agpr0_agpr1_agpr2 :: (store (s32) into %stack.0, addrspace 5)
+ ; GFX908-NEXT: $vgpr40 = V_ACCVGPR_READ_B32_e64 $agpr1, implicit $exec
+ ; GFX908-NEXT: BUFFER_STORE_DWORD_OFFSET $vgpr40, $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr32, 4, 0, 0, implicit $exec :: (store (s32) into %stack.0 + 4, addrspace 5)
+ ; GFX908-NEXT: $vgpr55 = V_ACCVGPR_READ_B32_e64 $agpr2, implicit $exec, implicit $agpr0_agpr1_agpr2
+ ; GFX908-NEXT: S_ENDPGM 0
+ renamable $agpr0 = COPY renamable $vgpr0, implicit $exec
+ renamable $agpr2 = COPY renamable $vgpr1, implicit $exec
+ $vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7_vgpr8_vgpr9 = IMPLICIT_DEF
+ $vgpr10_vgpr11_vgpr12_vgpr13_vgpr14_vgpr15_vgpr16_vgpr17 = IMPLICIT_DEF
+ SI_SPILL_AV96_SAVE $agpr0_agpr1_agpr2, %stack.0, $sgpr32, 0, implicit $exec :: (store (s96) into %stack.0, align 4, addrspace 5)
+ S_ENDPGM 0
+...
|
c886f81
to
8e9a718
Compare
@@ -0,0 +1,94 @@ | |||
# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py | |||
# RUN: llc -mtriple=amdgcn -mcpu=gfx908 %s -o - -run-pass prologepilog -verify-machineinstrs | FileCheck -check-prefix=GFX908-PEI-MACHINE-CP %s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Specify full triple.
11ee9fc
to
76bfb2f
Compare
76bfb2f
to
048c3d9
Compare
Presently we are only marking implicit-def for the spilled AGPR tuple in the first spill instructions and not implicit. Change-Id: I1667df1d89a54346dc55662f2c7f89d335376b77
Presently we are only marking implicit-def for the
spilled AGPR tuple in the first spill instructions
and not implicit.