-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[Coalescer] Consider NewMI's subreg index when updating lanemask. #121780
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Coalescer] Consider NewMI's subreg index when updating lanemask. #121780
Conversation
@llvm/pr-subscribers-llvm-regalloc Author: Sander de Smalen (sdesmalen-arm) ChangesThe code added in #116191 that updated the lanemasks for rematerialized
which during rematerialization would have the following variables set:
When checking whether the lanemasks need to be generated, considering The added tests are a bit more involved, because I was not able to Full diff: https://github.com/llvm/llvm-project/pull/121780.diff 2 Files Affected:
diff --git a/llvm/lib/CodeGen/RegisterCoalescer.cpp b/llvm/lib/CodeGen/RegisterCoalescer.cpp
index 712c64bb9cf6d5..df7193eae6fbc3 100644
--- a/llvm/lib/CodeGen/RegisterCoalescer.cpp
+++ b/llvm/lib/CodeGen/RegisterCoalescer.cpp
@@ -1508,14 +1508,14 @@ bool RegisterCoalescer::reMaterializeTrivialDef(const CoalescerPair &CP,
// undef %2.subreg:reg = INST %1:reg ; DefMI (rematerializable),
// ; DefSubIdx = subreg
// %3:reg = COPY %2 ; SrcIdx = DstIdx = 0
- // .... = SOMEINSTR %3:reg
+ // .... = SOMEINSTR %3:reg, %2
//
// there are no subranges for %3 so after rematerialization we need
// to explicitly create them. Undefined subranges are removed later on.
- if (DefSubIdx && !CP.getSrcIdx() && !CP.getDstIdx() &&
- MRI->shouldTrackSubRegLiveness(DstReg) && !DstInt.hasSubRanges()) {
+ if (NewIdx && !DstInt.hasSubRanges() &&
+ MRI->shouldTrackSubRegLiveness(DstReg)) {
LaneBitmask FullMask = MRI->getMaxLaneMaskForVReg(DstReg);
- LaneBitmask UsedLanes = TRI->getSubRegIndexLaneMask(DefSubIdx);
+ LaneBitmask UsedLanes = TRI->getSubRegIndexLaneMask(NewIdx);
LaneBitmask UnusedLanes = FullMask & ~UsedLanes;
VNInfo::Allocator &Alloc = LIS->getVNInfoAllocator();
DstInt.createSubRangeFrom(Alloc, UsedLanes, DstInt);
diff --git a/llvm/test/CodeGen/AArch64/register-coalesce-update-subranges-remat.mir b/llvm/test/CodeGen/AArch64/register-coalesce-update-subranges-remat.mir
index b61fa4be040070..08fc47d9480ce9 100644
--- a/llvm/test/CodeGen/AArch64/register-coalesce-update-subranges-remat.mir
+++ b/llvm/test/CodeGen/AArch64/register-coalesce-update-subranges-remat.mir
@@ -1,5 +1,5 @@
+# RUN: llc -mtriple=aarch64 -o /dev/null -run-pass=register-coalescer -aarch64-enable-subreg-liveness-tracking -debug-only=regalloc %s 2>&1 | FileCheck %s --check-prefix=CHECK-DBG
# RUN: llc -mtriple=aarch64 -verify-machineinstrs -o - -run-pass=register-coalescer -aarch64-enable-subreg-liveness-tracking %s | FileCheck %s --check-prefix=CHECK
-# RUN: llc -mtriple=aarch64 -verify-machineinstrs -o /dev/null -run-pass=register-coalescer -aarch64-enable-subreg-liveness-tracking -debug-only=regalloc %s 2>&1 | FileCheck %s --check-prefix=CHECK-DBG
# REQUIRES: asserts
# CHECK-DBG: ********** REGISTER COALESCER **********
@@ -36,3 +36,94 @@ body: |
RET_ReallyLR
...
+# CHECK-DBG: ********** REGISTER COALESCER **********
+# CHECK-DBG: ********** Function: reproducer
+# CHECK-DBG: ********** JOINING INTERVALS ***********
+# CHECK-DBG: ********** INTERVALS **********
+# CHECK-DBG: %1 [32r,48B:2)[48B,320r:0)[320r,368B:1) 0@48B-phi 1@320r 2@32r
+# CHECK-DBG-SAME: weight:0.000000e+00
+# CHECK-DBG: %3 [80r,160B:2)[240r,272B:1)[288r,304B:0)[304B,320r:3) 0@288r 1@240r 2@80r 3@304B-phi
+# CHECK-DBG-SAME: L0000000000000080 [288r,304B:0)[304B,320r:3) 0@288r 1@x 2@x 3@304B-phi
+# CHECK-DBG-SAME: L0000000000000040 [80r,160B:2)[240r,272B:1)[288r,304B:0)[304B,320r:3) 0@288r 1@240r 2@80r 3@304B-phi
+# CHECK-DBG-SAME: weight:0.000000e+00
+---
+name: reproducer
+tracksRegLiveness: true
+body: |
+ bb.0:
+ %0:gpr32 = MOVi32imm 1
+ %1:gpr64 = IMPLICIT_DEF
+
+ bb.1:
+
+ bb.2:
+ %3:gpr64all = SUBREG_TO_REG 0, %0, %subreg.sub_32
+
+ bb.3:
+ $nzcv = IMPLICIT_DEF
+ %4:gpr64 = COPY killed %3
+ Bcc 1, %bb.7, implicit killed $nzcv
+
+ bb.4:
+ $nzcv = IMPLICIT_DEF
+ Bcc 1, %bb.6, implicit killed $nzcv
+
+ bb.5:
+ %5:gpr64all = SUBREG_TO_REG 0, %0, %subreg.sub_32
+ %4:gpr64 = COPY killed %5
+ B %bb.7
+
+ bb.6:
+ %4:gpr64 = COPY $xzr
+
+ bb.7:
+ %7:gpr64 = ADDXrs killed %1, killed %4, 1
+ %1:gpr64 = COPY killed %7
+ B %bb.1
+
+...
+# CHECK-DBG: ********** REGISTER COALESCER **********
+# CHECK-DBG: ********** Function: reproducer2
+# CHECK-DBG: ********** JOINING INTERVALS ***********
+# CHECK-DBG: ********** INTERVALS **********
+# CHECK-DBG: %1 [32r,48B:2)[48B,304r:0)[304r,352B:1) 0@48B-phi 1@304r 2@32r
+# CHECK-DBG-SAME: weight:0.000000e+00
+# CHECK-DBG: %3 [80r,160B:2)[224r,256B:1)[272r,288B:0)[288B,304r:3) 0@272r 1@224r 2@80r 3@288B-phi
+# CHECK-DBG-SAME: L0000000000000080 [224r,256B:1)[272r,288B:0)[288B,304r:3) 0@272r 1@224r 2@x 3@288B-phi
+# CHECK-DBG-SAME: L0000000000000040 [80r,160B:2)[224r,256B:1)[272r,288B:0)[288B,304r:3) 0@272r 1@224r 2@80r 3@288B-phi
+# CHECK-DBG-SAME: weight:0.000000e+00
+---
+name: reproducer2
+tracksRegLiveness: true
+body: |
+ bb.0:
+ %0:gpr32 = MOVi32imm 1
+ %1:gpr64 = IMPLICIT_DEF
+
+ bb.1:
+
+ bb.2:
+ %3:gpr64all = SUBREG_TO_REG 0, %0, %subreg.sub_32
+
+ bb.3:
+ $nzcv = IMPLICIT_DEF
+ %4:gpr64 = COPY killed %3
+ Bcc 1, %bb.7, implicit killed $nzcv
+
+ bb.4:
+ $nzcv = IMPLICIT_DEF
+ Bcc 1, %bb.6, implicit killed $nzcv
+
+ bb.5:
+ %4:gpr64 = IMPLICIT_DEF
+ B %bb.7
+
+ bb.6:
+ %4:gpr64 = COPY $xzr
+
+ bb.7:
+ %5:gpr64 = ADDXrs killed %1, killed %4, 1
+ %1:gpr64 = COPY killed %5
+ B %bb.1
+
+...
|
@llvm/pr-subscribers-backend-aarch64 Author: Sander de Smalen (sdesmalen-arm) ChangesThe code added in #116191 that updated the lanemasks for rematerialized
which during rematerialization would have the following variables set:
When checking whether the lanemasks need to be generated, considering The added tests are a bit more involved, because I was not able to Full diff: https://github.com/llvm/llvm-project/pull/121780.diff 2 Files Affected:
diff --git a/llvm/lib/CodeGen/RegisterCoalescer.cpp b/llvm/lib/CodeGen/RegisterCoalescer.cpp
index 712c64bb9cf6d5..df7193eae6fbc3 100644
--- a/llvm/lib/CodeGen/RegisterCoalescer.cpp
+++ b/llvm/lib/CodeGen/RegisterCoalescer.cpp
@@ -1508,14 +1508,14 @@ bool RegisterCoalescer::reMaterializeTrivialDef(const CoalescerPair &CP,
// undef %2.subreg:reg = INST %1:reg ; DefMI (rematerializable),
// ; DefSubIdx = subreg
// %3:reg = COPY %2 ; SrcIdx = DstIdx = 0
- // .... = SOMEINSTR %3:reg
+ // .... = SOMEINSTR %3:reg, %2
//
// there are no subranges for %3 so after rematerialization we need
// to explicitly create them. Undefined subranges are removed later on.
- if (DefSubIdx && !CP.getSrcIdx() && !CP.getDstIdx() &&
- MRI->shouldTrackSubRegLiveness(DstReg) && !DstInt.hasSubRanges()) {
+ if (NewIdx && !DstInt.hasSubRanges() &&
+ MRI->shouldTrackSubRegLiveness(DstReg)) {
LaneBitmask FullMask = MRI->getMaxLaneMaskForVReg(DstReg);
- LaneBitmask UsedLanes = TRI->getSubRegIndexLaneMask(DefSubIdx);
+ LaneBitmask UsedLanes = TRI->getSubRegIndexLaneMask(NewIdx);
LaneBitmask UnusedLanes = FullMask & ~UsedLanes;
VNInfo::Allocator &Alloc = LIS->getVNInfoAllocator();
DstInt.createSubRangeFrom(Alloc, UsedLanes, DstInt);
diff --git a/llvm/test/CodeGen/AArch64/register-coalesce-update-subranges-remat.mir b/llvm/test/CodeGen/AArch64/register-coalesce-update-subranges-remat.mir
index b61fa4be040070..08fc47d9480ce9 100644
--- a/llvm/test/CodeGen/AArch64/register-coalesce-update-subranges-remat.mir
+++ b/llvm/test/CodeGen/AArch64/register-coalesce-update-subranges-remat.mir
@@ -1,5 +1,5 @@
+# RUN: llc -mtriple=aarch64 -o /dev/null -run-pass=register-coalescer -aarch64-enable-subreg-liveness-tracking -debug-only=regalloc %s 2>&1 | FileCheck %s --check-prefix=CHECK-DBG
# RUN: llc -mtriple=aarch64 -verify-machineinstrs -o - -run-pass=register-coalescer -aarch64-enable-subreg-liveness-tracking %s | FileCheck %s --check-prefix=CHECK
-# RUN: llc -mtriple=aarch64 -verify-machineinstrs -o /dev/null -run-pass=register-coalescer -aarch64-enable-subreg-liveness-tracking -debug-only=regalloc %s 2>&1 | FileCheck %s --check-prefix=CHECK-DBG
# REQUIRES: asserts
# CHECK-DBG: ********** REGISTER COALESCER **********
@@ -36,3 +36,94 @@ body: |
RET_ReallyLR
...
+# CHECK-DBG: ********** REGISTER COALESCER **********
+# CHECK-DBG: ********** Function: reproducer
+# CHECK-DBG: ********** JOINING INTERVALS ***********
+# CHECK-DBG: ********** INTERVALS **********
+# CHECK-DBG: %1 [32r,48B:2)[48B,320r:0)[320r,368B:1) 0@48B-phi 1@320r 2@32r
+# CHECK-DBG-SAME: weight:0.000000e+00
+# CHECK-DBG: %3 [80r,160B:2)[240r,272B:1)[288r,304B:0)[304B,320r:3) 0@288r 1@240r 2@80r 3@304B-phi
+# CHECK-DBG-SAME: L0000000000000080 [288r,304B:0)[304B,320r:3) 0@288r 1@x 2@x 3@304B-phi
+# CHECK-DBG-SAME: L0000000000000040 [80r,160B:2)[240r,272B:1)[288r,304B:0)[304B,320r:3) 0@288r 1@240r 2@80r 3@304B-phi
+# CHECK-DBG-SAME: weight:0.000000e+00
+---
+name: reproducer
+tracksRegLiveness: true
+body: |
+ bb.0:
+ %0:gpr32 = MOVi32imm 1
+ %1:gpr64 = IMPLICIT_DEF
+
+ bb.1:
+
+ bb.2:
+ %3:gpr64all = SUBREG_TO_REG 0, %0, %subreg.sub_32
+
+ bb.3:
+ $nzcv = IMPLICIT_DEF
+ %4:gpr64 = COPY killed %3
+ Bcc 1, %bb.7, implicit killed $nzcv
+
+ bb.4:
+ $nzcv = IMPLICIT_DEF
+ Bcc 1, %bb.6, implicit killed $nzcv
+
+ bb.5:
+ %5:gpr64all = SUBREG_TO_REG 0, %0, %subreg.sub_32
+ %4:gpr64 = COPY killed %5
+ B %bb.7
+
+ bb.6:
+ %4:gpr64 = COPY $xzr
+
+ bb.7:
+ %7:gpr64 = ADDXrs killed %1, killed %4, 1
+ %1:gpr64 = COPY killed %7
+ B %bb.1
+
+...
+# CHECK-DBG: ********** REGISTER COALESCER **********
+# CHECK-DBG: ********** Function: reproducer2
+# CHECK-DBG: ********** JOINING INTERVALS ***********
+# CHECK-DBG: ********** INTERVALS **********
+# CHECK-DBG: %1 [32r,48B:2)[48B,304r:0)[304r,352B:1) 0@48B-phi 1@304r 2@32r
+# CHECK-DBG-SAME: weight:0.000000e+00
+# CHECK-DBG: %3 [80r,160B:2)[224r,256B:1)[272r,288B:0)[288B,304r:3) 0@272r 1@224r 2@80r 3@288B-phi
+# CHECK-DBG-SAME: L0000000000000080 [224r,256B:1)[272r,288B:0)[288B,304r:3) 0@272r 1@224r 2@x 3@288B-phi
+# CHECK-DBG-SAME: L0000000000000040 [80r,160B:2)[224r,256B:1)[272r,288B:0)[288B,304r:3) 0@272r 1@224r 2@80r 3@288B-phi
+# CHECK-DBG-SAME: weight:0.000000e+00
+---
+name: reproducer2
+tracksRegLiveness: true
+body: |
+ bb.0:
+ %0:gpr32 = MOVi32imm 1
+ %1:gpr64 = IMPLICIT_DEF
+
+ bb.1:
+
+ bb.2:
+ %3:gpr64all = SUBREG_TO_REG 0, %0, %subreg.sub_32
+
+ bb.3:
+ $nzcv = IMPLICIT_DEF
+ %4:gpr64 = COPY killed %3
+ Bcc 1, %bb.7, implicit killed $nzcv
+
+ bb.4:
+ $nzcv = IMPLICIT_DEF
+ Bcc 1, %bb.6, implicit killed $nzcv
+
+ bb.5:
+ %4:gpr64 = IMPLICIT_DEF
+ B %bb.7
+
+ bb.6:
+ %4:gpr64 = COPY $xzr
+
+ bb.7:
+ %5:gpr64 = ADDXrs killed %1, killed %4, 1
+ %1:gpr64 = COPY killed %5
+ B %bb.1
+
+...
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good.
@@ -1508,14 +1508,14 @@ bool RegisterCoalescer::reMaterializeTrivialDef(const CoalescerPair &CP, | |||
// undef %2.subreg:reg = INST %1:reg ; DefMI (rematerializable), | |||
// ; DefSubIdx = subreg | |||
// %3:reg = COPY %2 ; SrcIdx = DstIdx = 0 | |||
// .... = SOMEINSTR %3:reg | |||
// .... = SOMEINSTR %3:reg, %2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this part of the change intended?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It wasn't, thanks. Your question did trigger me to clarify the existing comment a bit though! :)
The code added in #116191 that updated the lanemasks for rematerialized values, checked if DefMI's destination reg had a subreg index. This seems to have missed the following case: %0:gpr32 = MOVi32imm 1 %1:gpr64 = SUBREG_TO_REG 0, %0:gpr32, %subreg.sub_32 which during rematerialization would have the following variables set: DefMI = %0:gpr32 = MOVi32imm 1 NewMI = %3.sub_32:gpr64 = MOVi32imm 1 (rematerialized value) When checking whether the lanemasks need to be generated, considering whether DefMI's destination has a subreg index is insufficient, we should look at DefMI's subreg index instead. The added tests are a bit more involved, because I was not able to reconstruct the issue without having some control flow in the test. These tests come from actual reproducers.
f9e3ada
to
912ab02
Compare
✅ With the latest revision this PR passed the C/C++ code formatter. |
912ab02
to
fa17b74
Compare
The code added in #116191 that updated the lanemasks for rematerialized
values checked if
DefMI
's destination register had a subreg index.This seems to have missed the following case:
which during rematerialization would have the following variables set:
When checking whether the lanemasks need to be generated, considering
whether DefMI's destination has a subreg index is insufficient, we should
look at DefMI's subreg index instead.
The added tests are a bit more involved, because I was not able to
reconstruct the issue without having some control flow in the test.
These tests come from actual reproducers.