-
Notifications
You must be signed in to change notification settings - Fork 14.3k
RegisterCoalescer: Add undef flags in removePartialRedundancy #75152
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RegisterCoalescer: Add undef flags in removePartialRedundancy #75152
Conversation
If the copy being hoisted was undef, we have the same problems that eliminateUndefCopy needs to solve. We would effectively be introducing a new live out implicit_def. We need to add an undef flag to avoid artificially introducing a live through undef value. Previously, the verifier would fail due to the dead def inside the loop providing the live in value for the %1 use.
@llvm/pr-subscribers-backend-x86 @llvm/pr-subscribers-llvm-regalloc Author: Matt Arsenault (arsenm) ChangesIf the copy being hoisted was undef, we have the same problems that eliminateUndefCopy needs to solve. We would effectively be introducing a new live out implicit_def. We need to add an undef flag to avoid artificially introducing a live through undef value. Previously, the verifier would fail due to the dead def inside the loop providing the live in value for the %1 use. Full diff: https://github.com/llvm/llvm-project/pull/75152.diff 2 Files Affected:
diff --git a/llvm/lib/CodeGen/RegisterCoalescer.cpp b/llvm/lib/CodeGen/RegisterCoalescer.cpp
index c067d87a9fd810..1b6b94c27fc614 100644
--- a/llvm/lib/CodeGen/RegisterCoalescer.cpp
+++ b/llvm/lib/CodeGen/RegisterCoalescer.cpp
@@ -1214,6 +1214,19 @@ bool RegisterCoalescer::removePartialRedundancy(const CoalescerPair &CP,
LIS->pruneValue(*static_cast<LiveRange *>(&IntB), CopyIdx.getRegSlot(),
&EndPoints);
BValNo->markUnused();
+
+ if (CopyMI.getOperand(1).isUndef()) {
+ // We're introducing an undef phi def, and need to set undef on any users of
+ // the previously local def to avoid artifically extending the lifetime
+ // through the block.
+ for (MachineOperand &MO : MRI->use_nodbg_operands(IntB.reg())) {
+ const MachineInstr &MI = *MO.getParent();
+ SlotIndex UseIdx = LIS->getInstructionIndex(MI);
+ if (!IntB.liveAt(UseIdx))
+ MO.setIsUndef(true);
+ }
+ }
+
// Extend IntB to the EndPoints of its original live interval.
LIS->extendToIndices(IntB, EndPoints);
diff --git a/llvm/test/CodeGen/X86/coalescer-partial-redundancy-clear-dead-flag-undef-copy.mir b/llvm/test/CodeGen/X86/coalescer-partial-redundancy-clear-dead-flag-undef-copy.mir
new file mode 100644
index 00000000000000..5f33be0bc15559
--- /dev/null
+++ b/llvm/test/CodeGen/X86/coalescer-partial-redundancy-clear-dead-flag-undef-copy.mir
@@ -0,0 +1,47 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 4
+# RUN: llc -mtriple=x86_64-pc-linux-gnu -run-pass=register-coalescer -verify-coalescing -o - %s | FileCheck %s
+
+# Check for "Live range continues after dead def flag".
+
+# There are 2 copies of undef, but the registers also appear to be
+# live due to block live outs, and thus were not deleted as
+# eliminateUndefCopy only considered the live range, and not the undef
+# flag.
+#
+# removePartialRedundancy would move the COPY undef %0 in bb.1 to
+# bb.0. The live range of %1 would then be extended to be live out of
+# %bb.1 for the backedge phi. This would then fail the verifier, since
+# the dead flag was no longer valid. This was fixed by directly
+# considering the undef flag to avoid considering this special case.
+
+---
+name: partial_redundancy_coalesce_undef_copy_live_out
+tracksRegLiveness: true
+body: |
+ ; CHECK-LABEL: name: partial_redundancy_coalesce_undef_copy_live_out
+ ; CHECK: bb.0:
+ ; CHECK-NEXT: successors: %bb.1(0x80000000)
+ ; CHECK-NEXT: liveins: $rdi
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: [[COPY:%[0-9]+]]:gr32 = COPY $rdi
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: bb.1:
+ ; CHECK-NEXT: successors: %bb.1(0x80000000)
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: dead [[XOR32ri:%[0-9]+]]:gr32 = XOR32ri undef [[XOR32ri]], 1, implicit-def dead $eflags
+ ; CHECK-NEXT: dead [[MOV32rr:%[0-9]+]]:gr32 = MOV32rr [[COPY]]
+ ; CHECK-NEXT: [[COPY:%[0-9]+]]:gr32 = IMPLICIT_DEF
+ ; CHECK-NEXT: JMP_1 %bb.1
+ bb.0:
+ liveins: $rdi
+
+ %0:gr32 = COPY $rdi
+
+ bb.1:
+ %1:gr32 = COPY undef %0
+ dead %1:gr32 = XOR32ri %1, 1, implicit-def dead $eflags
+ dead %2:gr32 = MOV32rr killed %0
+ %0:gr32 = COPY killed undef %1
+ JMP_1 %bb.1
+
+...
|
If the copy being hoisted was undef, we have the same problems that eliminateUndefCopy needs to solve. We would effectively be introducing a new live out implicit_def. We need to add an undef flag to avoid artificially introducing a live through undef value. Previously, the verifier would fail due to the dead def inside the loop providing the live in value for the %1 use.