Skip to content

[AMDGPU] Fix s_singleuse_vdst not detecting exec mask changes #89401

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

ScottEgerton
Copy link
Member

Instead of erasing unused registers, setting them
to 0 fixes an issue where single use values were
being incorrectly detected across changes to the
exec mask.
This issue was occuring because erasing the
values was causing the algorithm to forget which
values had been marked as invalid single use
candidates.

Instead of erasing unused registers, setting them
to 0 fixes an issue where single use values were
being incorrectly detected across changes to the
exec mask.
This issue was occuring because erasing the
values was causing the algorithm to forget which
values had been marked as invalid single use
candidates.
@llvmbot
Copy link
Member

llvmbot commented Apr 19, 2024

@llvm/pr-subscribers-backend-amdgpu

Author: Scott Egerton (ScottEgerton)

Changes

Instead of erasing unused registers, setting them
to 0 fixes an issue where single use values were
being incorrectly detected across changes to the
exec mask.
This issue was occuring because erasing the
values was causing the algorithm to forget which
values had been marked as invalid single use
candidates.


Full diff: https://github.com/llvm/llvm-project/pull/89401.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/AMDGPUInsertSingleUseVDST.cpp (+1-1)
  • (modified) llvm/test/CodeGen/AMDGPU/insert-singleuse-vdst.mir (-3)
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUInsertSingleUseVDST.cpp b/llvm/lib/Target/AMDGPU/AMDGPUInsertSingleUseVDST.cpp
index f6a63502ec2cda..1a7a01c1ea1821 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUInsertSingleUseVDST.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUInsertSingleUseVDST.cpp
@@ -112,7 +112,7 @@ class AMDGPUInsertSingleUseVDST : public MachineFunctionPass {
 
           // Reset uses count when a register is no longer live.
           for (const MCRegUnit &Unit : RegUnits)
-            RegisterUseCount.erase(Unit);
+            RegisterUseCount[Unit] = 0;
         }
         if (AllProducerOperandsAreSingleUse && SIInstrInfo::isVALU(MI)) {
           // TODO: Replace with candidate logging for instruction grouping
diff --git a/llvm/test/CodeGen/AMDGPU/insert-singleuse-vdst.mir b/llvm/test/CodeGen/AMDGPU/insert-singleuse-vdst.mir
index 135a101822bfa6..f12a8e95593fe2 100644
--- a/llvm/test/CodeGen/AMDGPU/insert-singleuse-vdst.mir
+++ b/llvm/test/CodeGen/AMDGPU/insert-singleuse-vdst.mir
@@ -398,7 +398,6 @@ body: |
   ; CHECK-NEXT:   successors: %bb.1(0x80000000)
   ; CHECK-NEXT:   liveins: $sgpr0_sgpr1
   ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT:   S_SINGLEUSE_VDST 1
   ; CHECK-NEXT:   $vgpr0 = V_MOV_B32_e32 0, implicit $exec
   ; CHECK-NEXT:   $exec = COPY $sgpr0_sgpr1
   ; CHECK-NEXT:   $vgpr0 = V_MOV_B32_e32 $vgpr0, implicit $exec
@@ -424,7 +423,6 @@ body: |
   ; CHECK-NEXT:   successors: %bb.1(0x80000000)
   ; CHECK-NEXT:   liveins: $sgpr0
   ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT:   S_SINGLEUSE_VDST 1
   ; CHECK-NEXT:   $vgpr0 = V_MOV_B32_e32 0, implicit $exec
   ; CHECK-NEXT:   $exec_lo = COPY $sgpr0
   ; CHECK-NEXT:   $vgpr0 = V_MOV_B32_e32 $vgpr0, implicit $exec
@@ -450,7 +448,6 @@ body: |
   ; CHECK-NEXT:   successors: %bb.1(0x80000000)
   ; CHECK-NEXT:   liveins: $sgpr0
   ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT:   S_SINGLEUSE_VDST 1
   ; CHECK-NEXT:   $vgpr0 = V_MOV_B32_e32 0, implicit $exec
   ; CHECK-NEXT:   $exec_hi = COPY $sgpr0
   ; CHECK-NEXT:   $vgpr0 = V_MOV_B32_e32 $vgpr0, implicit $exec

@@ -112,7 +112,7 @@ class AMDGPUInsertSingleUseVDST : public MachineFunctionPass {

// Reset uses count when a register is no longer live.
for (const MCRegUnit &Unit : RegUnits)
RegisterUseCount.erase(Unit);
RegisterUseCount[Unit] = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's really hard to understand the effect of this change because of other problems in this loop-over-the-operands.

  1. The modifiesRegister check on line 104 is odd because it checks if any operand of the current instruction modifies Reg, not just the current operand. Can it just check Operand.isDef() instead?

  2. Really the loop should be split into two processing defs first and then uses. I think you had a downstream patch to do that.

Could you post a patch to fix either (1) or (2) first, and then we can revisit this one?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I do have a downstream patch for that. I'll post that first.

@jayfoad jayfoad changed the title [AMDGPU] Fix s_singleuse_vdst not detecing exec mask changes [AMDGPU] Fix s_singleuse_vdst not detecting exec mask changes May 1, 2024
@jayfoad
Copy link
Contributor

jayfoad commented May 1, 2024

Please rebase now that #89601 has been merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants