[AMDGPU][NFC] Precommit tests representing agpr spills. #115270

Merged

pravinjagtap

Presently, when an AGPR tuple spill goes to the stack, the
first spill instruction marks the spilled tuple only as
implicit-def and not also as implicit.
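
To make the difference concrete, here is a small Python sketch (not part of this PR) that pulls the implicit and implicit-def operands out of the first spill instruction in each case. The two MIR strings are copied from the autogenerated GFX908 checks in the tests added here; the helper name `implicit_operands` and its regex are illustrative assumptions, not LLVM tooling.

```python
import re

# First spill instruction when VGPRs are available (copied from the test checks).
SPILL_TO_VGPR = (
    "$vgpr33 = V_ACCVGPR_READ_B32_e64 $agpr0, implicit $exec, "
    "implicit-def $agpr0_agpr1_agpr2, implicit $agpr0_agpr1_agpr2"
)
# First spill instruction when the stack is used (copied from the test checks).
SPILL_TO_STACK = (
    "$vgpr40 = V_ACCVGPR_READ_B32_e64 $agpr0, implicit $exec, "
    "implicit-def $agpr0_agpr1_agpr2"
)

# Hypothetical helper regex: matches "implicit $reg" and "implicit-def $reg".
_IMPLICIT_RE = re.compile(r"\b(implicit(?:-def)?)\s+(\$\w+)")


def implicit_operands(mir_line):
    """Return (implicit uses, implicit defs) as two sets of register names."""
    uses, defs = set(), set()
    for kind, reg in _IMPLICIT_RE.findall(mir_line):
        (defs if kind == "implicit-def" else uses).add(reg)
    return uses, defs


if __name__ == "__main__":
    for label, line in (("vgpr", SPILL_TO_VGPR), ("stack", SPILL_TO_STACK)):
        uses, defs = implicit_operands(line)
        print(label, "uses:", sorted(uses), "defs:", sorted(defs))
```

In the stack case the tuple shows up only in the defs set, which is the behavior these tests record.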

@llvmbot

llvmbot commented Nov 7, 2024

@llvm/pr-subscribers-backend-amdgpu

Author: Pravin Jagtap (pravinjagtap)

Changes

Presently we are only marking implicit-def for the
spilled AGPR tuple in the first spill instructions
and not implicit.


Full diff: https://github.com/llvm/llvm-project/pull/115270.diff

2 Files Affected:

  • (added) llvm/test/CodeGen/AMDGPU/av-spill-expansion-with-machine-cp.mir (+69)
  • (added) llvm/test/CodeGen/AMDGPU/av-spill-to-vgpr-and-stack.mir (+70)
diff --git a/llvm/test/CodeGen/AMDGPU/av-spill-expansion-with-machine-cp.mir b/llvm/test/CodeGen/AMDGPU/av-spill-expansion-with-machine-cp.mir
new file mode 100644
index 00000000000000..f7680c418b8fa9
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/av-spill-expansion-with-machine-cp.mir
@@ -0,0 +1,69 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
+# RUN: llc -mtriple=amdgcn -mcpu=gfx908 %s -o - -run-pass prologepilog,machine-cp -verify-machineinstrs | FileCheck -check-prefix=GFX908 %s
+
+# When VGPRs are available for spilling, prologepilog marks the tuple implicit-def as well as implicit in the first spill instruction.
+# As a consequence, machine-cp would NOT delete agpr2 copy here.
+
+---
+name:  agpr-spill-to-vgpr-machine-cp
+tracksRegLiveness: true
+stack:
+  - { id: 0, name: '', type: spill-slot, offset: 0, size: 128, alignment: 4 }
+machineFunctionInfo:
+  scratchRSrcReg:  $sgpr0_sgpr1_sgpr2_sgpr3
+  stackPtrOffsetReg: '$sgpr32'
+  hasSpilledVGPRs: true
+body: |
+  bb.0:
+    successors:
+    liveins: $vgpr0, $vgpr1
+    ; GFX908-LABEL: name: agpr-spill-to-vgpr-machine-cp
+    ; GFX908: liveins: $vgpr0, $vgpr1, $vgpr2, $vgpr3, $vgpr4, $vgpr5, $vgpr6, $vgpr7, $vgpr8, $vgpr9, $vgpr10, $vgpr11, $vgpr12, $vgpr13, $vgpr14, $vgpr15, $vgpr16, $vgpr17, $vgpr18, $vgpr19, $vgpr20, $vgpr21, $vgpr22, $vgpr23, $vgpr24, $vgpr25, $vgpr26, $vgpr27, $vgpr28, $vgpr29, $vgpr30, $vgpr31, $vgpr32, $vgpr33
+    ; GFX908-NEXT: {{  $}}
+    ; GFX908-NEXT: renamable $agpr0 = COPY renamable $vgpr0, implicit $exec
+    ; GFX908-NEXT: renamable $agpr2 = COPY renamable $vgpr1, implicit $exec
+    ; GFX908-NEXT: $vgpr33 = V_ACCVGPR_READ_B32_e64 $agpr0, implicit $exec, implicit-def $agpr0_agpr1_agpr2, implicit $agpr0_agpr1_agpr2
+    ; GFX908-NEXT: $vgpr32 = V_ACCVGPR_READ_B32_e64 $agpr1, implicit $exec
+    ; GFX908-NEXT: $vgpr31 = V_ACCVGPR_READ_B32_e64 $agpr2, implicit $exec, implicit $agpr0_agpr1_agpr2
+    ; GFX908-NEXT: S_ENDPGM 0
+    renamable $agpr0 = COPY renamable $vgpr0, implicit $exec
+    renamable $agpr2 = COPY renamable $vgpr1, implicit $exec
+    SI_SPILL_AV96_SAVE $agpr0_agpr1_agpr2, %stack.0, $sgpr32, 0, implicit $exec :: (store (s96) into %stack.0, align 4, addrspace 5)
+    S_ENDPGM 0
+...
+
+# When VGPRs are NOT available for spilling (stack is used), prologepilog marks the tuple implicit-def only and NOT implicit.
+# As a consequence, machine-cp would delete agpr2 copy here.
+
+---
+name:  agpr-spill-to-vgpr-to-stack-machine-cp
+tracksRegLiveness: true
+stack:
+  - { id: 0, name: '', type: spill-slot, offset: 0, size: 128, alignment: 4 }
+machineFunctionInfo:
+  scratchRSrcReg:  $sgpr0_sgpr1_sgpr2_sgpr3
+  stackPtrOffsetReg: '$sgpr32'
+  hasSpilledVGPRs: true
+body: |
+  bb.0:
+    successors:
+    liveins: $vgpr0, $vgpr1
+    ; GFX908-LABEL: name: agpr-spill-to-vgpr-to-stack-machine-cp
+    ; GFX908: liveins: $vgpr0, $vgpr1, $vgpr18, $vgpr19, $vgpr20, $vgpr21, $vgpr22, $vgpr23, $vgpr24, $vgpr25, $vgpr26, $vgpr27, $vgpr28, $vgpr29, $vgpr30, $vgpr31, $vgpr32, $vgpr33, $vgpr34, $vgpr35, $vgpr36, $vgpr37, $vgpr38, $vgpr39, $vgpr48, $vgpr49, $vgpr50, $vgpr51, $vgpr52, $vgpr53, $vgpr54, $vgpr55
+    ; GFX908-NEXT: {{  $}}
+    ; GFX908-NEXT: renamable $agpr0 = COPY renamable $vgpr0, implicit $exec
+    ; GFX908-NEXT: $vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7_vgpr8_vgpr9 = IMPLICIT_DEF
+    ; GFX908-NEXT: $vgpr10_vgpr11_vgpr12_vgpr13_vgpr14_vgpr15_vgpr16_vgpr17 = IMPLICIT_DEF
+    ; GFX908-NEXT: $vgpr40 = V_ACCVGPR_READ_B32_e64 $agpr0, implicit $exec, implicit-def $agpr0_agpr1_agpr2
+    ; GFX908-NEXT: BUFFER_STORE_DWORD_OFFSET $vgpr40, $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr32, 0, 0, 0, implicit $exec, implicit $agpr0_agpr1_agpr2 :: (store (s32) into %stack.0, addrspace 5)
+    ; GFX908-NEXT: $vgpr40 = V_ACCVGPR_READ_B32_e64 $agpr1, implicit $exec
+    ; GFX908-NEXT: BUFFER_STORE_DWORD_OFFSET $vgpr40, $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr32, 4, 0, 0, implicit $exec :: (store (s32) into %stack.0 + 4, addrspace 5)
+    ; GFX908-NEXT: $vgpr55 = V_ACCVGPR_READ_B32_e64 $agpr2, implicit $exec, implicit $agpr0_agpr1_agpr2
+    ; GFX908-NEXT: S_ENDPGM 0
+    renamable $agpr0 = COPY renamable $vgpr0, implicit $exec
+    renamable $agpr2 = COPY renamable $vgpr1, implicit $exec
+    $vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7_vgpr8_vgpr9 = IMPLICIT_DEF
+    $vgpr10_vgpr11_vgpr12_vgpr13_vgpr14_vgpr15_vgpr16_vgpr17 = IMPLICIT_DEF
+    SI_SPILL_AV96_SAVE $agpr0_agpr1_agpr2, %stack.0, $sgpr32, 0, implicit $exec :: (store (s96) into %stack.0, align 4, addrspace 5)
+    S_ENDPGM 0
+...
diff --git a/llvm/test/CodeGen/AMDGPU/av-spill-to-vgpr-and-stack.mir b/llvm/test/CodeGen/AMDGPU/av-spill-to-vgpr-and-stack.mir
new file mode 100644
index 00000000000000..50bba1baed85f4
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/av-spill-to-vgpr-and-stack.mir
@@ -0,0 +1,70 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
+# RUN: llc -mtriple=amdgcn -mcpu=gfx908 %s -o - -run-pass prologepilog -verify-machineinstrs | FileCheck -check-prefix=GFX908 %s
+
+# During spill expansion, when VGPRs are available for spilling (stack is unused), tuple is being marked as
+# implicit-def as well as implicit in the first spill instruction.
+
+---
+name:  agpr-spill-to-vgpr
+tracksRegLiveness: true
+stack:
+  - { id: 0, name: '', type: spill-slot, offset: 0, size: 128, alignment: 4 }
+machineFunctionInfo:
+  scratchRSrcReg:  $sgpr0_sgpr1_sgpr2_sgpr3
+  stackPtrOffsetReg: '$sgpr32'
+  hasSpilledVGPRs: true
+body: |
+  bb.0:
+    successors:
+    liveins: $vgpr0, $vgpr1
+    ; GFX908-LABEL: name: agpr-spill-to-vgpr
+    ; GFX908: liveins: $vgpr0, $vgpr1, $vgpr2, $vgpr3, $vgpr4, $vgpr5, $vgpr6, $vgpr7, $vgpr8, $vgpr9, $vgpr10, $vgpr11, $vgpr12, $vgpr13, $vgpr14, $vgpr15, $vgpr16, $vgpr17, $vgpr18, $vgpr19, $vgpr20, $vgpr21, $vgpr22, $vgpr23, $vgpr24, $vgpr25, $vgpr26, $vgpr27, $vgpr28, $vgpr29, $vgpr30, $vgpr31, $vgpr32, $vgpr33
+    ; GFX908-NEXT: {{  $}}
+    ; GFX908-NEXT: renamable $agpr0 = COPY renamable $vgpr0, implicit $exec
+    ; GFX908-NEXT: renamable $agpr2 = COPY renamable $vgpr1, implicit $exec
+    ; GFX908-NEXT: $vgpr33 = V_ACCVGPR_READ_B32_e64 $agpr0, implicit $exec, implicit-def $agpr0_agpr1_agpr2, implicit $agpr0_agpr1_agpr2
+    ; GFX908-NEXT: $vgpr32 = V_ACCVGPR_READ_B32_e64 $agpr1, implicit $exec
+    ; GFX908-NEXT: $vgpr31 = V_ACCVGPR_READ_B32_e64 $agpr2, implicit $exec, implicit $agpr0_agpr1_agpr2
+    ; GFX908-NEXT: S_ENDPGM 0
+    renamable $agpr0 = COPY renamable $vgpr0, implicit $exec
+    renamable $agpr2 = COPY renamable $vgpr1, implicit $exec
+    SI_SPILL_AV96_SAVE $agpr0_agpr1_agpr2, %stack.0, $sgpr32, 0, implicit $exec :: (store (s96) into %stack.0, align 4, addrspace 5)
+    S_ENDPGM 0
+...
+
+# During spill expansion, when VGPRs are NOT available for spilling (stack is used), tuple is being marked as
+# implicit-def ONLY and NOT implicit in the first spill instruction.
+
+---
+name:  agpr-spill-to-vgpr-to-stack
+tracksRegLiveness: true
+stack:
+  - { id: 0, name: '', type: spill-slot, offset: 0, size: 128, alignment: 4 }
+machineFunctionInfo:
+  scratchRSrcReg:  $sgpr0_sgpr1_sgpr2_sgpr3
+  stackPtrOffsetReg: '$sgpr32'
+  hasSpilledVGPRs: true
+body: |
+  bb.0:
+    successors:
+    liveins: $vgpr0, $vgpr1
+    ; GFX908-LABEL: name: agpr-spill-to-vgpr-to-stack
+    ; GFX908: liveins: $vgpr0, $vgpr1, $vgpr18, $vgpr19, $vgpr20, $vgpr21, $vgpr22, $vgpr23, $vgpr24, $vgpr25, $vgpr26, $vgpr27, $vgpr28, $vgpr29, $vgpr30, $vgpr31, $vgpr32, $vgpr33, $vgpr34, $vgpr35, $vgpr36, $vgpr37, $vgpr38, $vgpr39, $vgpr48, $vgpr49, $vgpr50, $vgpr51, $vgpr52, $vgpr53, $vgpr54, $vgpr55
+    ; GFX908-NEXT: {{  $}}
+    ; GFX908-NEXT: renamable $agpr0 = COPY renamable $vgpr0, implicit $exec
+    ; GFX908-NEXT: renamable $agpr2 = COPY renamable $vgpr1, implicit $exec
+    ; GFX908-NEXT: $vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7_vgpr8_vgpr9 = IMPLICIT_DEF
+    ; GFX908-NEXT: $vgpr10_vgpr11_vgpr12_vgpr13_vgpr14_vgpr15_vgpr16_vgpr17 = IMPLICIT_DEF
+    ; GFX908-NEXT: $vgpr40 = V_ACCVGPR_READ_B32_e64 $agpr0, implicit $exec, implicit-def $agpr0_agpr1_agpr2
+    ; GFX908-NEXT: BUFFER_STORE_DWORD_OFFSET $vgpr40, $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr32, 0, 0, 0, implicit $exec, implicit $agpr0_agpr1_agpr2 :: (store (s32) into %stack.0, addrspace 5)
+    ; GFX908-NEXT: $vgpr40 = V_ACCVGPR_READ_B32_e64 $agpr1, implicit $exec
+    ; GFX908-NEXT: BUFFER_STORE_DWORD_OFFSET $vgpr40, $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr32, 4, 0, 0, implicit $exec :: (store (s32) into %stack.0 + 4, addrspace 5)
+    ; GFX908-NEXT: $vgpr55 = V_ACCVGPR_READ_B32_e64 $agpr2, implicit $exec, implicit $agpr0_agpr1_agpr2
+    ; GFX908-NEXT: S_ENDPGM 0
+    renamable $agpr0 = COPY renamable $vgpr0, implicit $exec
+    renamable $agpr2 = COPY renamable $vgpr1, implicit $exec
+    $vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7_vgpr8_vgpr9 = IMPLICIT_DEF
+    $vgpr10_vgpr11_vgpr12_vgpr13_vgpr14_vgpr15_vgpr16_vgpr17 = IMPLICIT_DEF
+    SI_SPILL_AV96_SAVE $agpr0_agpr1_agpr2, %stack.0, $sgpr32, 0, implicit $exec :: (store (s96) into %stack.0, align 4, addrspace 5)
+    S_ENDPGM 0
+...
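
The machine-cp outcome described in the test comments can be illustrated with a toy model. The Python sketch below is a deliberate simplification and not LLVM's MachineCopyPropagation: it assumes a COPY is dead if every register it defines is overwritten before being read, it treats the tuple `agpr0_agpr1_agpr2` as covering three 32-bit registers, and it ignores `$exec` and the later spill instructions. The names `Inst`, `units`, `remove_dead_copies`, and `build_block` are hypothetical.

```python
from dataclasses import dataclass

# Toy overlap table (assumption): the 96-bit tuple covers three AGPRs.
SUBREGS = {"agpr0_agpr1_agpr2": {"agpr0", "agpr1", "agpr2"}}


def units(reg):
    """Return the set of 32-bit registers covered by reg."""
    return SUBREGS.get(reg, {reg})


@dataclass
class Inst:
    text: str
    defs: tuple = ()
    uses: tuple = ()
    is_copy: bool = False


def remove_dead_copies(block):
    """Drop COPYs whose result is redefined before any instruction reads it."""
    pending = {}  # register -> index of the COPY that wrote it, not yet read
    dead = set()
    for i, inst in enumerate(block):
        # A read of a register keeps the COPY that produced it alive.
        for reg in inst.uses:
            for u in units(reg):
                pending.pop(u, None)
        # A write to a still-unread register makes the earlier COPY dead.
        for reg in inst.defs:
            for u in units(reg):
                if u in pending:
                    dead.add(pending.pop(u))
        if inst.is_copy:
            for reg in inst.defs:
                for u in units(reg):
                    pending[u] = i
    return [inst for i, inst in enumerate(block) if i not in dead]


def build_block(tuple_is_implicit_use):
    """Model the two COPYs plus the first spill instruction from the tests."""
    spill_uses = ("agpr0",)
    if tuple_is_implicit_use:
        spill_uses += ("agpr0_agpr1_agpr2",)
    return [
        Inst("$agpr0 = COPY $vgpr0", defs=("agpr0",), uses=("vgpr0",), is_copy=True),
        Inst("$agpr2 = COPY $vgpr1", defs=("agpr2",), uses=("vgpr1",), is_copy=True),
        Inst(
            "V_ACCVGPR_READ_B32_e64 $agpr0",
            defs=("spill_vgpr", "agpr0_agpr1_agpr2"),  # implicit-def of the tuple
            uses=spill_uses,
        ),
    ]


if __name__ == "__main__":
    for flag in (True, False):
        kept = [inst.text for inst in remove_dead_copies(build_block(flag))]
        print("tuple implicit use:", flag, "->", kept)
```

Under these assumptions, the `$agpr2` copy survives only when the tuple is also an implicit use, which mirrors the difference between the two GFX908 check sequences in av-spill-expansion-with-machine-cp.mir.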

@pravinjagtap pravinjagtap force-pushed the prjagtap/precommit-agpr-spilling-tests branch from c886f81 to 8e9a718 Compare November 7, 2024 06:26
@cdevadas cdevadas changed the title [AMDGPU][NFC] Precommit tests representing spills. [AMDGPU][NFC] Precommit tests representing agpr spills. Nov 7, 2024
@@ -0,0 +1,94 @@
# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
# RUN: llc -mtriple=amdgcn -mcpu=gfx908 %s -o - -run-pass prologepilog -verify-machineinstrs | FileCheck -check-prefix=GFX908-PEI-MACHINE-CP %s

Specify full triple.

@pravinjagtap pravinjagtap force-pushed the prjagtap/precommit-agpr-spilling-tests branch 2 times, most recently from 11ee9fc to 76bfb2f Compare November 7, 2024 08:49
@pravinjagtap pravinjagtap force-pushed the prjagtap/precommit-agpr-spilling-tests branch from 76bfb2f to 048c3d9 Compare November 7, 2024 08:55
@pravinjagtap pravinjagtap merged commit 9b909b8 into llvm:main Nov 7, 2024
5 of 7 checks passed
searlmc1 pushed a commit to ROCm/llvm-project that referenced this pull request Dec 19, 2024
Presently we are only marking implicit-def for the
spilled AGPR tuple in the first spill instructions
and not implicit.

Change-Id: I1667df1d89a54346dc55662f2c7f89d335376b77