Commit 82ec2d6

[Coalescer] Consider NewMI's subreg index when updating lanemask. (#121780)
The code added in #116191 that updated the lanemasks for rematerialized values checked whether `DefMI`'s destination register had a subreg index. This seems to have missed the following case:

```
%0:gpr32 = MOVi32imm 1
%1:gpr64 = SUBREG_TO_REG 0, %0:gpr32, %subreg.sub_32
```

which during rematerialization would have the following variables set:

```
DefMI = %0:gpr32 = MOVi32imm 1
NewMI = %3.sub_32:gpr64 = MOVi32imm 1  (rematerialized value)
```

When checking whether the lanemasks need to be generated, considering whether DefMI's destination has a subreg index is insufficient; we should look at NewMI's subreg index instead.

The added tests are a bit more involved because I was not able to reconstruct the issue without some control flow in the test. These tests come from actual reproducers.
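The difference between the two checks can be sketched in isolation. The snippet below is only a model, not LLVM code: `RematState`, `needsSubRangesOld`, `needsSubRangesNew`, and the placeholder index `Sub32` are hypothetical names, and the fields merely mirror the variables that appear in the patch. It assumes `SrcIdx` and `DstIdx` are 0 in the `SUBREG_TO_REG` case, which the commit message does not state; the old predicate is false there regardless, because `DefSubIdx` is 0.

```cpp
// Hypothetical model of the state inspected when deciding whether
// subranges must be created after rematerialization. Not LLVM code.
struct RematState {
  unsigned DefSubIdx;        // subreg index on DefMI's destination (0 = none)
  unsigned NewIdx;           // subreg index on NewMI's destination (0 = none)
  unsigned SrcIdx;           // models CP.getSrcIdx()
  unsigned DstIdx;           // models CP.getDstIdx()
  bool TracksSubRegLiveness; // models MRI->shouldTrackSubRegLiveness(DstReg)
  bool HasSubRanges;         // models DstInt.hasSubRanges()
};

// Shape of the condition before the patch: keyed on DefMI's subreg index.
constexpr bool needsSubRangesOld(const RematState &S) {
  return S.DefSubIdx && !S.SrcIdx && !S.DstIdx && S.TracksSubRegLiveness &&
         !S.HasSubRanges;
}

// Shape of the condition after the patch: keyed on NewMI's subreg index.
constexpr bool needsSubRangesNew(const RematState &S) {
  return S.NewIdx && !S.HasSubRanges && S.TracksSubRegLiveness;
}

// Arbitrary nonzero placeholder standing in for a subreg index like sub_32.
constexpr unsigned Sub32 = 1;

// Case from the code comment: DefMI itself defines a subreg, so
// DefSubIdx = NewIdx = subreg. Both conditions fire.
constexpr RematState PartialDef{Sub32, Sub32, 0, 0, true, false};
static_assert(needsSubRangesOld(PartialDef), "old check covers this case");
static_assert(needsSubRangesNew(PartialDef), "new check covers this case");

// Case from the commit message: DefMI writes all of %0:gpr32 (no subreg
// index), but NewMI writes %3.sub_32. Only the new condition fires.
constexpr RematState SubregToReg{0, Sub32, 0, 0, true, false};
static_assert(!needsSubRangesOld(SubregToReg), "old check misses this case");
static_assert(needsSubRangesNew(SubregToReg), "new check covers this case");
```

The `static_assert`s make the point at compile time: with `DefSubIdx == 0` the old predicate short-circuits to false, so no subranges would be created even though the rematerialized instruction defines only part of the destination register.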
1 parent c3fc41c commit 82ec2d6

2 files changed: +100 -8 lines changed

llvm/lib/CodeGen/RegisterCoalescer.cpp

Lines changed: 8 additions & 7 deletions
@@ -1526,17 +1526,18 @@ bool RegisterCoalescer::reMaterializeTrivialDef(const CoalescerPair &CP,
 
   // In a situation like the following:
   //
-  //     undef %2.subreg:reg = INST %1:reg    ; DefMI (rematerializable),
-  //                                          ; DefSubIdx = subreg
-  //     %3:reg = COPY %2                     ; SrcIdx = DstIdx = 0
-  //     .... = SOMEINSTR %3:reg
+  //     undef %2.subreg:reg = INST %1:reg    ; DefMI (rematerializable),
+  //                                          ; Defines only some of lanes,
+  //                                          ; so DefSubIdx = NewIdx = subreg
+  //     %3:reg = COPY %2                     ; Copy full reg
+  //     .... = SOMEINSTR %3:reg              ; Use full reg
   //
   // there are no subranges for %3 so after rematerialization we need
   // to explicitly create them. Undefined subranges are removed later on.
-  if (DefSubIdx && !CP.getSrcIdx() && !CP.getDstIdx() &&
-      MRI->shouldTrackSubRegLiveness(DstReg) && !DstInt.hasSubRanges()) {
+  if (NewIdx && !DstInt.hasSubRanges() &&
+      MRI->shouldTrackSubRegLiveness(DstReg)) {
     LaneBitmask FullMask = MRI->getMaxLaneMaskForVReg(DstReg);
-    LaneBitmask UsedLanes = TRI->getSubRegIndexLaneMask(DefSubIdx);
+    LaneBitmask UsedLanes = TRI->getSubRegIndexLaneMask(NewIdx);
     LaneBitmask UnusedLanes = FullMask & ~UsedLanes;
     VNInfo::Allocator &Alloc = LIS->getVNInfoAllocator();
     DstInt.createSubRangeFrom(Alloc, UsedLanes, DstInt);
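
As a rough illustration of the `FullMask & ~UsedLanes` split in the hunk above, the following standalone sketch models lane masks as plain 64-bit integers. `LaneMask` and `unusedLanes` are hypothetical stand-ins, not the LLVM `LaneBitmask` API. The constants are read off the `L0000000000000040` and `L0000000000000080` subranges printed in the CHECK-DBG lines of the test added by this commit; treating `0x40` as the `sub_32` lane and `0xC0` as the full mask of the 64-bit register is an inference from that output, not something the commit states.

```cpp
#include <cstdint>

// Simplified stand-in for a lane bitmask. Not the LLVM type.
using LaneMask = std::uint64_t;

// Lanes of the full register that the rematerialized def does not write.
constexpr LaneMask unusedLanes(LaneMask Full, LaneMask Used) {
  return Full & ~Used;
}

// Assumed values, inferred from the test's expected debug output.
constexpr LaneMask Sub32Lanes = 0x40; // lanes written via sub_32 (assumption)
constexpr LaneMask FullMask = 0xC0;   // all lanes of the vreg (assumption)

// The two resulting masks are disjoint and together cover the register,
// matching the two subranges shown for %3 in the test output.
static_assert(unusedLanes(FullMask, Sub32Lanes) == 0x80,
              "remaining lanes form the second subrange");
static_assert((Sub32Lanes | unusedLanes(FullMask, Sub32Lanes)) == FullMask,
              "used and unused lanes cover the full mask");
static_assert((Sub32Lanes & unusedLanes(FullMask, Sub32Lanes)) == 0,
              "used and unused lanes do not overlap");
```

With the pre-patch code, `UsedLanes` was derived from `DefSubIdx`; in the `SUBREG_TO_REG` case that index is 0, so the guard never ran and no such split was produced.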

llvm/test/CodeGen/AArch64/register-coalesce-update-subranges-remat.mir

Lines changed: 92 additions & 1 deletion
@@ -1,5 +1,5 @@
+# RUN: llc -mtriple=aarch64 -o /dev/null -run-pass=register-coalescer -aarch64-enable-subreg-liveness-tracking -debug-only=regalloc %s 2>&1 | FileCheck %s --check-prefix=CHECK-DBG
 # RUN: llc -mtriple=aarch64 -verify-machineinstrs -o - -run-pass=register-coalescer -aarch64-enable-subreg-liveness-tracking %s | FileCheck %s --check-prefix=CHECK
-# RUN: llc -mtriple=aarch64 -verify-machineinstrs -o /dev/null -run-pass=register-coalescer -aarch64-enable-subreg-liveness-tracking -debug-only=regalloc %s 2>&1 | FileCheck %s --check-prefix=CHECK-DBG
 # REQUIRES: asserts
 
 # CHECK-DBG: ********** REGISTER COALESCER **********
@@ -36,3 +36,94 @@ body: |
     RET_ReallyLR
 
 ...
+# CHECK-DBG: ********** REGISTER COALESCER **********
+# CHECK-DBG: ********** Function: reproducer
+# CHECK-DBG: ********** JOINING INTERVALS ***********
+# CHECK-DBG: ********** INTERVALS **********
+# CHECK-DBG: %1 [32r,48B:2)[48B,320r:0)[320r,368B:1) 0@48B-phi 1@320r 2@32r
+# CHECK-DBG-SAME: weight:0.000000e+00
+# CHECK-DBG: %3 [80r,160B:2)[240r,272B:1)[288r,304B:0)[304B,320r:3) 0@288r 1@240r 2@80r 3@304B-phi
+# CHECK-DBG-SAME: L0000000000000080 [288r,304B:0)[304B,320r:3) 0@288r 1@x 2@x 3@304B-phi
+# CHECK-DBG-SAME: L0000000000000040 [80r,160B:2)[240r,272B:1)[288r,304B:0)[304B,320r:3) 0@288r 1@240r 2@80r 3@304B-phi
+# CHECK-DBG-SAME: weight:0.000000e+00
+---
+name: reproducer
+tracksRegLiveness: true
+body: |
+  bb.0:
+    %0:gpr32 = MOVi32imm 1
+    %1:gpr64 = IMPLICIT_DEF
+
+  bb.1:
+
+  bb.2:
+    %3:gpr64all = SUBREG_TO_REG 0, %0, %subreg.sub_32
+
+  bb.3:
+    $nzcv = IMPLICIT_DEF
+    %4:gpr64 = COPY killed %3
+    Bcc 1, %bb.7, implicit killed $nzcv
+
+  bb.4:
+    $nzcv = IMPLICIT_DEF
+    Bcc 1, %bb.6, implicit killed $nzcv
+
+  bb.5:
+    %5:gpr64all = SUBREG_TO_REG 0, %0, %subreg.sub_32
+    %4:gpr64 = COPY killed %5
+    B %bb.7
+
+  bb.6:
+    %4:gpr64 = COPY $xzr
+
+  bb.7:
+    %7:gpr64 = ADDXrs killed %1, killed %4, 1
+    %1:gpr64 = COPY killed %7
+    B %bb.1
+
+...
+# CHECK-DBG: ********** REGISTER COALESCER **********
+# CHECK-DBG: ********** Function: reproducer2
+# CHECK-DBG: ********** JOINING INTERVALS ***********
+# CHECK-DBG: ********** INTERVALS **********
+# CHECK-DBG: %1 [32r,48B:2)[48B,304r:0)[304r,352B:1) 0@48B-phi 1@304r 2@32r
+# CHECK-DBG-SAME: weight:0.000000e+00
+# CHECK-DBG: %3 [80r,160B:2)[224r,256B:1)[272r,288B:0)[288B,304r:3) 0@272r 1@224r 2@80r 3@288B-phi
+# CHECK-DBG-SAME: L0000000000000080 [224r,256B:1)[272r,288B:0)[288B,304r:3) 0@272r 1@224r 2@x 3@288B-phi
+# CHECK-DBG-SAME: L0000000000000040 [80r,160B:2)[224r,256B:1)[272r,288B:0)[288B,304r:3) 0@272r 1@224r 2@80r 3@288B-phi
+# CHECK-DBG-SAME: weight:0.000000e+00
+---
+name: reproducer2
+tracksRegLiveness: true
+body: |
+  bb.0:
+    %0:gpr32 = MOVi32imm 1
+    %1:gpr64 = IMPLICIT_DEF
+
+  bb.1:
+
+  bb.2:
+    %3:gpr64all = SUBREG_TO_REG 0, %0, %subreg.sub_32
+
+  bb.3:
+    $nzcv = IMPLICIT_DEF
+    %4:gpr64 = COPY killed %3
+    Bcc 1, %bb.7, implicit killed $nzcv
+
+  bb.4:
+    $nzcv = IMPLICIT_DEF
+    Bcc 1, %bb.6, implicit killed $nzcv
+
+  bb.5:
+    %4:gpr64 = IMPLICIT_DEF
+    B %bb.7
+
+  bb.6:
+    %4:gpr64 = COPY $xzr
+
+  bb.7:
+    %5:gpr64 = ADDXrs killed %1, killed %4, 1
+    %1:gpr64 = COPY killed %5
+    B %bb.1
+
+...
