Skip to content

[MachineSink] Update register dependency correctly #109763

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Sep 25, 2024

Conversation

ruiling
Copy link
Contributor

@ruiling ruiling commented Sep 24, 2024

The accumulateUsedDefed() was missing if block prologue interference check does not pass. This would cause incorrect register dependency, which cause incorrect sinking.

The accumulateUsedDefed() was missing if block prologue interference
check does not pass. This would cause incorrect register dependency,
which cause incorrect sinking.
@llvmbot
Copy link
Member

llvmbot commented Sep 24, 2024

@llvm/pr-subscribers-backend-amdgpu

Author: Ruiling, Song (ruiling)

Changes

The accumulateUsedDefed() was missing if block prologue interference check does not pass. This would cause incorrect register dependency, which cause incorrect sinking.


Full diff: https://github.com/llvm/llvm-project/pull/109763.diff

2 Files Affected:

  • (modified) llvm/lib/CodeGen/MachineSink.cpp (+3-2)
  • (added) llvm/test/CodeGen/AMDGPU/postra-sink-update-dependency.mir (+66)
diff --git a/llvm/lib/CodeGen/MachineSink.cpp b/llvm/lib/CodeGen/MachineSink.cpp
index 609f9af9767f5d..658ebd47488c71 100644
--- a/llvm/lib/CodeGen/MachineSink.cpp
+++ b/llvm/lib/CodeGen/MachineSink.cpp
@@ -2152,8 +2152,9 @@ bool PostRAMachineSinking::tryToSinkCopy(MachineBasicBlock &CurBB,
     MachineBasicBlock::iterator InsertPos =
         SuccBB->SkipPHIsAndLabels(SuccBB->begin());
     if (blockPrologueInterferes(SuccBB, InsertPos, MI, TRI, TII, nullptr)) {
-      LLVM_DEBUG(
-          dbgs() << " *** Not sinking: prologue interference\n");
+      LiveRegUnits::accumulateUsedDefed(MI, ModifiedRegUnits, UsedRegUnits,
+                                        TRI);
+      LLVM_DEBUG(dbgs() << " *** Not sinking: prologue interference\n");
       continue;
     }
 
diff --git a/llvm/test/CodeGen/AMDGPU/postra-sink-update-dependency.mir b/llvm/test/CodeGen/AMDGPU/postra-sink-update-dependency.mir
new file mode 100644
index 00000000000000..14617e066f954c
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/postra-sink-update-dependency.mir
@@ -0,0 +1,66 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=amdgcn -mcpu=gfx1100 -run-pass=postra-machine-sink -verify-machineinstrs -o - %s | FileCheck %s
+#
+# In the example, the ` $sgpr4 = COPY $sgpr2` was incorrectly sunk into bb.3. This happened because we did not update
+# register uses when we found that `$sgpr2 = COPY $sgpr3` should not be sunk because of conflict with the successor's
+# prologue instructions.
+---
+name:            update_dependency_correctly
+tracksRegLiveness: true
+body:             |
+  ; CHECK-LABEL: name: update_dependency_correctly
+  ; CHECK: bb.0:
+  ; CHECK-NEXT:   successors: %bb.2(0x40000000), %bb.1(0x40000000)
+  ; CHECK-NEXT:   liveins: $sgpr0, $sgpr3, $sgpr2
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   $vgpr1 = IMPLICIT_DEF
+  ; CHECK-NEXT:   renamable $sgpr4 = COPY $sgpr2
+  ; CHECK-NEXT:   renamable $sgpr2 = COPY $sgpr3
+  ; CHECK-NEXT:   $vgpr1 = SI_SPILL_S32_TO_VGPR $sgpr0, 0, $vgpr1
+  ; CHECK-NEXT:   $sgpr1 = S_AND_SAVEEXEC_B32 $sgpr0, implicit-def $exec, implicit-def $scc, implicit $exec
+  ; CHECK-NEXT:   S_CBRANCH_EXECZ %bb.1, implicit $exec
+  ; CHECK-NEXT:   S_BRANCH %bb.2
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.1:
+  ; CHECK-NEXT:   S_ENDPGM 0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.2:
+  ; CHECK-NEXT:   successors: %bb.1(0x80000000)
+  ; CHECK-NEXT:   liveins: $sgpr0, $sgpr2, $sgpr4, $vgpr1
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   $sgpr3 = SI_RESTORE_S32_FROM_VGPR $vgpr1, 0
+  ; CHECK-NEXT:   renamable $sgpr0_sgpr1 = S_GETPC_B64_pseudo
+  ; CHECK-NEXT:   renamable $sgpr5 = COPY $sgpr1
+  ; CHECK-NEXT:   renamable $sgpr0_sgpr1 = S_LOAD_DWORDX2_IMM renamable $sgpr4_sgpr5, 32, 0
+  ; CHECK-NEXT:   S_BRANCH %bb.1
+  bb.0:
+    successors: %bb.3(0x40000000), %bb.2(0x40000000)
+    liveins: $sgpr0, $sgpr3, $sgpr2
+
+    $vgpr1 = IMPLICIT_DEF
+
+    renamable $sgpr4 = COPY $sgpr2
+    renamable $sgpr2 = COPY $sgpr3
+
+    $vgpr1 = SI_SPILL_S32_TO_VGPR $sgpr0, 0, $vgpr1
+
+    $sgpr1 = S_AND_SAVEEXEC_B32 $sgpr0, implicit-def $exec, implicit-def $scc, implicit $exec
+    S_CBRANCH_EXECZ %bb.2, implicit $exec
+    S_BRANCH %bb.3
+
+  bb.2:
+    S_ENDPGM 0
+
+  bb.3:
+    successors: %bb.2(0x40000000)
+    liveins: $sgpr0, $sgpr2, $sgpr4, $vgpr1
+
+    $sgpr3 = SI_RESTORE_S32_FROM_VGPR $vgpr1, 0
+
+    renamable $sgpr0_sgpr1 = S_GETPC_B64_pseudo
+    renamable $sgpr5 = COPY $sgpr1
+    renamable $sgpr0_sgpr1 = S_LOAD_DWORDX2_IMM renamable $sgpr4_sgpr5, 32, 0
+
+    S_BRANCH %bb.2
+
+...

@ruiling ruiling merged commit e33e087 into llvm:main Sep 25, 2024
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants