Skip to content

[LiveIntervals] Ignore artificial regs when adding kill flags #116963

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Dec 4, 2024

Conversation

sdesmalen-arm
Copy link
Collaborator

If parts of a physical register for a given liverange, as assigned by the
register allocator, can be used to store other values not represented by
this liverange, then LiveIntervals::addKillFlags normally avoids adding a
kill flag on the use of this register when the value's liverange ends.

However, if all the other regunits are artificial, then we can still safely
add the kill flag, since those parts of the register can never be accessed
independently.

@llvmbot
Copy link
Member

llvmbot commented Nov 20, 2024

@llvm/pr-subscribers-mc
@llvm/pr-subscribers-tablegen
@llvm/pr-subscribers-backend-aarch64

@llvm/pr-subscribers-llvm-regalloc

Author: Sander de Smalen (sdesmalen-arm)

Changes

If parts of a physical register for a given liverange, as assigned by the
register allocator, can be used to store other values not represented by
this liverange, then LiveIntervals::addKillFlags normally avoids adding a
kill flag on the use of this register when the value's liverange ends.

However, if all the other regunits are artificial, then we can still safely
add the kill flag, since those parts of the register can never be accessed
independently.


Full diff: https://github.com/llvm/llvm-project/pull/116963.diff

4 Files Affected:

  • (modified) llvm/lib/CodeGen/LiveIntervals.cpp (+37-1)
  • (modified) llvm/test/CodeGen/AArch64/arm64-addrmode.ll (+1-1)
  • (modified) llvm/test/CodeGen/AArch64/nested-iv-regalloc.mir (+3-3)
  • (modified) llvm/test/CodeGen/AArch64/preserve_nonecc_varargs_darwin.ll (+6-6)
diff --git a/llvm/lib/CodeGen/LiveIntervals.cpp b/llvm/lib/CodeGen/LiveIntervals.cpp
index a0b6bf445fa8af..18059a1d384580 100644
--- a/llvm/lib/CodeGen/LiveIntervals.cpp
+++ b/llvm/lib/CodeGen/LiveIntervals.cpp
@@ -710,6 +710,30 @@ void LiveIntervals::pruneValue(LiveRange &LR, SlotIndex Kill,
 // Register allocator hooks.
 //
 
+/// Returns true if the physreg has multiple regunits that can be accessed
+/// as independent registers.
+///
+/// Returns 'true' for e.g.:
+///   gpr64_0_gpr64_1
+//               => two independently accessible registers gpr64_0 and gpr64_1.
+///
+/// Returns 'false' for e.g.:
+///   gpr64_0:   => accessible register, reads/writes 64bits
+///   gpr32_0:   => accessible sub-regsiter of gpr64_0, reads/writes 32bits
+//    gpr32_0_hi => top 32bits of gpr64_0, not independently accessible.
+static bool hasMultipleAddressableRegUnits(const TargetRegisterInfo *TRI,
+                                           MCPhysReg PhysReg) {
+  unsigned NumAddressableRegUnits = 0;
+  for (MCRegUnit U : TRI->regunits(PhysReg)) {
+    for (MCRegUnitRootIterator RI(U, TRI); RI.isValid(); ++RI)
+      if (!TRI->isArtificial(*RI) && TRI->isInAllocatableClass(*RI))
+        NumAddressableRegUnits++;
+    if (NumAddressableRegUnits > 1)
+      return true;
+  }
+  return false;
+}
+
 void LiveIntervals::addKillFlags(const VirtRegMap *VRM) {
   // Keep track of regunit ranges.
   SmallVector<std::pair<const LiveRange*, LiveRange::const_iterator>, 8> RU;
@@ -736,6 +760,18 @@ void LiveIntervals::addKillFlags(const VirtRegMap *VRM) {
         continue;
       RU.push_back(std::make_pair(&RURange, RURange.find(LI.begin()->end)));
     }
+
+    // If parts of a physical register for a given liverange, as assigned by the
+    // register allocator, can be used to store other values not represented by
+    // this liverange, then `LiveIntervals::addKillFlags` normally avoids adding
+    // a kill flag on the use of this register when the value's liverange ends.
+    //
+    // However, if all the other regunits are artificial, then we can still
+    // safely add the kill flag, since those parts of the register can never be
+    // accessed independently.
+    bool AssumeOtherUnitsCanBeUsed =
+        hasMultipleAddressableRegUnits(TRI, PhysReg);
+
     // Every instruction that kills Reg corresponds to a segment range end
     // point.
     for (LiveInterval::const_iterator RI = LI.begin(), RE = LI.end(); RI != RE;
@@ -780,7 +816,7 @@ void LiveIntervals::addKillFlags(const VirtRegMap *VRM) {
         // are actually never written by %2. After assignment the <kill>
         // flag at the read instruction is invalid.
         LaneBitmask DefinedLanesMask;
-        if (LI.hasSubRanges()) {
+        if (LI.hasSubRanges() && AssumeOtherUnitsCanBeUsed) {
           // Compute a mask of lanes that are defined.
           DefinedLanesMask = LaneBitmask::getNone();
           for (const LiveInterval::SubRange &SR : LI.subranges())
diff --git a/llvm/test/CodeGen/AArch64/arm64-addrmode.ll b/llvm/test/CodeGen/AArch64/arm64-addrmode.ll
index bfef61abd8c129..f8695b62619c09 100644
--- a/llvm/test/CodeGen/AArch64/arm64-addrmode.ll
+++ b/llvm/test/CodeGen/AArch64/arm64-addrmode.ll
@@ -1,5 +1,5 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc -mtriple=arm64-eabi < %s | FileCheck %s
+; RUN: llc -aarch64-enable-subreg-liveness-tracking -mtriple=arm64-eabi < %s | FileCheck %s
 ; rdar://10232252
 
 @object = external hidden global i64, section "__DATA, __objc_ivar", align 8
diff --git a/llvm/test/CodeGen/AArch64/nested-iv-regalloc.mir b/llvm/test/CodeGen/AArch64/nested-iv-regalloc.mir
index 3bd8f83d27c2da..ff29c78b5a0ce5 100644
--- a/llvm/test/CodeGen/AArch64/nested-iv-regalloc.mir
+++ b/llvm/test/CodeGen/AArch64/nested-iv-regalloc.mir
@@ -1,5 +1,5 @@
 # NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
-# RUN: llc -mtriple aarch64 --run-pass=greedy,virtregrewriter -verify-machineinstrs %s -o - | FileCheck %s
+# RUN: llc -mtriple aarch64 -aarch64-enable-subreg-liveness-tracking --run-pass=greedy,virtregrewriter -verify-machineinstrs %s -o - | FileCheck %s
 
 # We should ideally not spill around any of the SUBSWri in the loop exit blocks (if.end and if.end27).
 
@@ -219,7 +219,7 @@ body:             |
   ; CHECK-NEXT:   liveins: $w10, $w11, $x2, $x8
   ; CHECK-NEXT: {{  $}}
   ; CHECK-NEXT:   STRXui renamable $x8, %stack.1, 0 :: (store (s64) into %stack.1)
-  ; CHECK-NEXT:   renamable $w9 = MOVi32imm 36, implicit-def $x9
+  ; CHECK-NEXT:   renamable $w9 = MOVi32imm 36
   ; CHECK-NEXT:   renamable $x8 = MADDXrrr killed renamable $x8, killed renamable $x9, $xzr
   ; CHECK-NEXT:   renamable $x9 = MOVaddr target-flags(aarch64-page) @g, target-flags(aarch64-pageoff, aarch64-nc) @g
   ; CHECK-NEXT:   renamable $w8 = LDRWroX killed renamable $x9, killed renamable $x8, 0, 0 :: (load (s32) from %ir.arrayidx9)
@@ -244,7 +244,7 @@ body:             |
   ; CHECK-NEXT:   successors: %bb.5(0x50000000), %bb.8(0x30000000)
   ; CHECK-NEXT:   liveins: $w10, $w11, $x2, $x12
   ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT:   renamable $w8 = MOVi32imm 36, implicit-def $x8
+  ; CHECK-NEXT:   renamable $w8 = MOVi32imm 36
   ; CHECK-NEXT:   renamable $x8 = MADDXrrr renamable $x12, killed renamable $x8, $xzr
   ; CHECK-NEXT:   renamable $x9 = MOVaddr target-flags(aarch64-page) @g, target-flags(aarch64-pageoff, aarch64-nc) @g
   ; CHECK-NEXT:   renamable $w8 = LDRWroX killed renamable $x9, killed renamable $x8, 0, 0 :: (load (s32) from %ir.arrayidx14)
diff --git a/llvm/test/CodeGen/AArch64/preserve_nonecc_varargs_darwin.ll b/llvm/test/CodeGen/AArch64/preserve_nonecc_varargs_darwin.ll
index e227f14542cc11..2a77d4dd33fe53 100644
--- a/llvm/test/CodeGen/AArch64/preserve_nonecc_varargs_darwin.ll
+++ b/llvm/test/CodeGen/AArch64/preserve_nonecc_varargs_darwin.ll
@@ -1,5 +1,5 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
-; RUN: llc -mtriple=aarch64-apple-darwin < %s | FileCheck %s
+; RUN: llc -mtriple=aarch64-apple-darwin -aarch64-enable-subreg-liveness-tracking < %s | FileCheck %s
 
 define preserve_nonecc i32 @callee(i32 %a1, i32 %a2, i32 %a3, i32 %a4, i32 %a5, ...) nounwind noinline ssp {
 ; CHECK-LABEL: callee:
@@ -27,12 +27,11 @@ define i32 @caller() nounwind ssp {
 ; CHECK-NEXT:    sub sp, sp, #208
 ; CHECK-NEXT:    mov w8, #10 ; =0xa
 ; CHECK-NEXT:    mov w9, #9 ; =0x9
-; CHECK-NEXT:    mov w0, #1 ; =0x1
+; CHECK-NEXT:    mov w10, #8 ; =0x8
 ; CHECK-NEXT:    stp x9, x8, [sp, #24]
-; CHECK-NEXT:    mov w8, #8 ; =0x8
-; CHECK-NEXT:    mov w9, #6 ; =0x6
-; CHECK-NEXT:    str x8, [sp, #16]
 ; CHECK-NEXT:    mov w8, #7 ; =0x7
+; CHECK-NEXT:    mov w9, #6 ; =0x6
+; CHECK-NEXT:    mov w0, #1 ; =0x1
 ; CHECK-NEXT:    mov w1, #2 ; =0x2
 ; CHECK-NEXT:    mov w2, #3 ; =0x3
 ; CHECK-NEXT:    mov w3, #4 ; =0x4
@@ -47,7 +46,8 @@ define i32 @caller() nounwind ssp {
 ; CHECK-NEXT:    stp x22, x21, [sp, #160] ; 16-byte Folded Spill
 ; CHECK-NEXT:    stp x20, x19, [sp, #176] ; 16-byte Folded Spill
 ; CHECK-NEXT:    stp x29, x30, [sp, #192] ; 16-byte Folded Spill
-; CHECK-NEXT:    stp x9, x8, [sp]
+; CHECK-NEXT:    stp x8, x10, [sp, #8]
+; CHECK-NEXT:    str x9, [sp]
 ; CHECK-NEXT:    bl _callee
 ; CHECK-NEXT:    ldp x29, x30, [sp, #192] ; 16-byte Folded Reload
 ; CHECK-NEXT:    ldp x20, x19, [sp, #176] ; 16-byte Folded Reload

@sdesmalen-arm
Copy link
Collaborator Author

Gentle ping :)

Copy link

github-actions bot commented Dec 3, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@@ -780,7 +783,7 @@ void LiveIntervals::addKillFlags(const VirtRegMap *VRM) {
// are actually never written by %2. After assignment the <kill>
// flag at the read instruction is invalid.
LaneBitmask DefinedLanesMask;
if (LI.hasSubRanges()) {
if (LI.hasSubRanges() && TRI->getNumAllocatableSubRegs(PhysReg) > 1) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why this doesn't naturally fall out of the lanemasks for the register (e.g. above replace the TRI->regunits with MCRegUnitMaskIterator)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion @arsenm, this approach makes more sense!

LaneBitmask ArtificialLanes = LaneBitmask::getNone();
for (MCRegUnitMaskIterator UI(PhysReg, TRI); UI.isValid(); ++UI) {
auto [Unit, Bitmask] = *UI;
// Record lane mask for all artificial RegUnits for this physreg.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tablegen calculates the "aritificial" property for regunits as well as registers, but I guess we don't have direct access to that here?

// A register unit is artificial if at least one of its roots is

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not directly, unless I generate a table for it. Iterating through the roots shouldn't be much of a problem in practice, since there can never be more than two roots. (It's actually not clear to me why there would ever be more than one root? What's the use-case for this?)

In either case, I'll move this to a special isArtificial(MCRegUnit) method to query this info, to avoid having to write this loop in various places in the future. If you think it's worth generating this in a table though, I'm happy to do so!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's actually not clear to me why there would ever be more than one root? What's the use-case for this?

I think the second root is only ever an aritificial register used to represent ad hoc aliasing (the Aliases = [...] field in .td files), but I could be misremembering the details.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at ARMGenRegisterInfo.td, I think that's indeed right, thanks for explaining!

Comment on lines 737 to 742
for (MCRegUnitRootIterator Root(Unit, TRI); Root.isValid(); ++Root) {
if (TRI->isArtificial(*Root)) {
ArtificialLanes |= Bitmask;
break;
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move this to a helper function that returns the mask instead of breaking out of the loop

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps you reviewed an older revision of the PR, I had moved the check to a helper function in 5e5c20cfdb653ac7da2b17b46eab9dc694d558b7

@@ -730,7 +730,12 @@ void LiveIntervals::addKillFlags(const VirtRegMap *VRM) {
// Find the regunit intervals for the assigned register. They may overlap
// the virtual register live range, cancelling any kills.
RU.clear();
for (MCRegUnit Unit : TRI->regunits(PhysReg)) {
LaneBitmask ArtificialLanes = LaneBitmask::getNone();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't need the explicit getNone to initialize

@@ -659,3 +659,11 @@ bool MachineRegisterInfo::isReservedRegUnit(unsigned Unit) const {
}
return false;
}

bool MachineRegisterInfo::isArtificialRegUnit(unsigned Unit) const {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't belong in MRI, it does not involve virtual registers.

Copy link
Collaborator Author

@sdesmalen-arm sdesmalen-arm Dec 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved it there because there was already a isReservedRegUnit, which I guess also shouldn't live there?
I'll move the function to TRI instead (or perhaps MCRegisterInfo?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The set of reserved registers can be changed per function compiled so that needs to live in MRI. The artificial regs are a static property of the target so it can go in TRI.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should go in MCRegisterInfo, it's a constant property of the register. isReservedRegUnit is a stateful property depending on the current MachineFunction, so it should remain here

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes a lot of sense, thanks both!

@llvmbot llvmbot added the mc Machine (object) code label Dec 4, 2024
@@ -730,7 +730,12 @@ void LiveIntervals::addKillFlags(const VirtRegMap *VRM) {
// Find the regunit intervals for the assigned register. They may overlap
// the virtual register live range, cancelling any kills.
RU.clear();
for (MCRegUnit Unit : TRI->regunits(PhysReg)) {
LaneBitmask ArtificialLanes = LaneBitmask::getNone();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
LaneBitmask ArtificialLanes = LaneBitmask::getNone();
LaneBitmask ArtificialLanes;

If parts of a physical register for a given liverange, as assigned by the
register allocator, can be used to store other values not represented by
this liverange, then `LiveIntervals::addKillFlags` normally avoids adding a
kill flag on the use of this register when the value's liverange ends.

However, if all the other regunits are artificial, then we can still safely
add the kill flag, since those parts of the register can never be accessed
independently.
@sdesmalen-arm sdesmalen-arm merged commit 048fc2b into llvm:main Dec 4, 2024
5 of 7 checks passed
@llvm-ci
Copy link
Collaborator

llvm-ci commented Dec 4, 2024

LLVM Buildbot has detected a new failure on builder lldb-aarch64-ubuntu running on linaro-lldb-aarch64-ubuntu while building llvm at step 6 "test".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/59/builds/9243

Here is the relevant piece of the build log for the reference
Step 6 (test) failure: build (failure)
...
PASS: lldb-api :: functionalities/thread/concurrent_events/TestConcurrentDelaySignalWatch.py (614 of 2058)
PASS: lldb-api :: functionalities/thread/concurrent_events/TestConcurrentDelayedCrashWithBreakpointSignal.py (615 of 2058)
PASS: lldb-api :: functionalities/thread/concurrent_events/TestConcurrentDelayedCrashWithBreakpointWatchpoint.py (616 of 2058)
PASS: lldb-api :: functionalities/thread/concurrent_events/TestConcurrentManyBreakpoints.py (617 of 2058)
PASS: lldb-api :: functionalities/thread/concurrent_events/TestConcurrentNWatchNBreak.py (618 of 2058)
PASS: lldb-api :: functionalities/thread/concurrent_events/TestConcurrentSignalBreak.py (619 of 2058)
PASS: lldb-api :: functionalities/thread/concurrent_events/TestConcurrentSignalDelayBreak.py (620 of 2058)
PASS: lldb-api :: functionalities/thread/concurrent_events/TestConcurrentManyCrash.py (621 of 2058)
PASS: lldb-api :: functionalities/thread/concurrent_events/TestConcurrentManySignals.py (622 of 2058)
PASS: lldb-api :: functionalities/thread/concurrent_events/TestConcurrentManyWatchpoints.py (623 of 2058)
FAIL: lldb-api :: functionalities/thread/concurrent_events/TestConcurrentSignalWatch.py (624 of 2058)
******************** TEST 'lldb-api :: functionalities/thread/concurrent_events/TestConcurrentSignalWatch.py' FAILED ********************
Script:
--
/usr/bin/python3.10 /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/llvm-project/lldb/test/API/dotest.py -u CXXFLAGS -u CFLAGS --env LLVM_LIBS_DIR=/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./lib --env LLVM_INCLUDE_DIR=/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/include --env LLVM_TOOLS_DIR=/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./bin --arch aarch64 --build-dir /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lldb-test-build.noindex --lldb-module-cache-dir /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lldb-test-build.noindex/module-cache-lldb/lldb-api --clang-module-cache-dir /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lldb-test-build.noindex/module-cache-clang/lldb-api --executable /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./bin/lldb --compiler /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./bin/clang --dsymutil /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./bin/dsymutil --make /usr/bin/make --llvm-tools-dir /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./bin --lldb-obj-root /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/tools/lldb --lldb-libs-dir /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./lib /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/llvm-project/lldb/test/API/functionalities/thread/concurrent_events -p TestConcurrentSignalWatch.py
--
Exit Code: 1

Command Output (stdout):
--
lldb version 20.0.0git (https://github.com/llvm/llvm-project.git revision 048fc2bc102cff806613592829ff275c0f2b826f)
  clang revision 048fc2bc102cff806613592829ff275c0f2b826f
  llvm revision 048fc2bc102cff806613592829ff275c0f2b826f
Skipping the following test categories: ['libc++', 'dsym', 'gmodules', 'debugserver', 'objc']

Watchpoint 1 hit:
old value: 0
new value: 1

--
Command Output (stderr):
--
FAIL: LLDB (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/bin/clang-aarch64) :: test (TestConcurrentSignalWatch.ConcurrentSignalWatch)
======================================================================
FAIL: test (TestConcurrentSignalWatch.ConcurrentSignalWatch)
   Test a watchpoint and a signal in multiple threads.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/llvm-project/lldb/packages/Python/lldbsuite/test/decorators.py", line 148, in wrapper
    return func(*args, **kwargs)
  File "/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/llvm-project/lldb/test/API/functionalities/thread/concurrent_events/TestConcurrentSignalWatch.py", line 14, in test
    self.do_thread_actions(num_signal_threads=1, num_watchpoint_threads=1)
  File "/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/llvm-project/lldb/packages/Python/lldbsuite/test/concurrent_base.py", line 333, in do_thread_actions
    self.assertEqual(
AssertionError: 1 != 2 : Expected 1 stops due to signal delivery, but got 2
Config=aarch64-/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/bin/clang
----------------------------------------------------------------------
Ran 1 test in 0.659s


Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants