Skip to content

[AMDGPU] NFC: Provide RPTracker interface for external iterators #93088

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 21 commits into from

Conversation

jrbyrnes
Copy link
Contributor

This is part of a series of PRs which enable using the GCNRPTrackers during scheduling. I've split them up to (hopefully) make reviewing easier. For context see #88797 .

This PR adds adds an interface to the existing GCNRPTrackers which provides a way to use the trackers with an externally managed iterator.

Change-Id: I79b54722e6e858961486248d94766c3f3c161160
@llvmbot
Copy link
Member

llvmbot commented May 22, 2024

@llvm/pr-subscribers-backend-amdgpu

Author: Jeffrey Byrnes (jrbyrnes)

Changes

This is part of a series of PRs which enable using the GCNRPTrackers during scheduling. I've split them up to (hopefully) make reviewing easier. For context see #88797 .

This PR adds adds an interface to the existing GCNRPTrackers which provides a way to use the trackers with an externally managed iterator.


Full diff: https://github.com/llvm/llvm-project/pull/93088.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/GCNRegPressure.cpp (+46-24)
  • (modified) llvm/lib/Target/AMDGPU/GCNRegPressure.h (+10-8)
diff --git a/llvm/lib/Target/AMDGPU/GCNRegPressure.cpp b/llvm/lib/Target/AMDGPU/GCNRegPressure.cpp
index 5c394e6d6296d..f1c4c8b397ddc 100644
--- a/llvm/lib/Target/AMDGPU/GCNRegPressure.cpp
+++ b/llvm/lib/Target/AMDGPU/GCNRegPressure.cpp
@@ -343,24 +343,25 @@ void GCNRPTracker::reset(const MachineInstr &MI,
   MaxPressure = CurPressure = getRegPressure(*MRI, LiveRegs);
 }
 
-////////////////////////////////////////////////////////////////////////////////
-// GCNUpwardRPTracker
-
-void GCNUpwardRPTracker::reset(const MachineRegisterInfo &MRI_,
-                               const LiveRegSet &LiveRegs_) {
+void GCNRPTracker::reset(const MachineRegisterInfo &MRI_,
+                         const LiveRegSet &LiveRegs_) {
   MRI = &MRI_;
   LiveRegs = LiveRegs_;
   LastTrackedMI = nullptr;
   MaxPressure = CurPressure = getRegPressure(MRI_, LiveRegs_);
 }
 
-void GCNUpwardRPTracker::recede(const MachineInstr &MI) {
+////////////////////////////////////////////////////////////////////////////////
+// GCNUpwardRPTracker
+
+bool GCNUpwardRPTracker::recede(const MachineInstr &MI, bool ShouldTrackIt) {
   assert(MRI && "call reset first");
 
-  LastTrackedMI = &MI;
+  if (ShouldTrackIt)
+    LastTrackedMI = &MI;
 
   if (MI.isDebugInstr())
-    return;
+    return false;
 
   // Kill all defs.
   GCNRegPressure DefPressure, ECDefPressure;
@@ -412,6 +413,7 @@ void GCNUpwardRPTracker::recede(const MachineInstr &MI) {
                           : max(CurPressure, MaxPressure);
 
   assert(CurPressure == getRegPressure(*MRI, LiveRegs));
+  return false;
 }
 
 ////////////////////////////////////////////////////////////////////////////////
@@ -430,28 +432,44 @@ bool GCNDownwardRPTracker::reset(const MachineInstr &MI,
   return true;
 }
 
-bool GCNDownwardRPTracker::advanceBeforeNext() {
+bool GCNDownwardRPTracker::advanceBeforeNext(MachineInstr *MI,
+                                             bool ShouldTrackIt,
+                                             LiveIntervals *TheLIS) {
   assert(MRI && "call reset first");
-  if (!LastTrackedMI)
-    return NextMI == MBBEnd;
+  SlotIndex SI;
+  LiveIntervals *CurrLIS;
+  MachineInstr *CurrMI;
+  if (ShouldTrackIt) {
+    if (!LastTrackedMI)
+      return NextMI == MBBEnd;
+
+    assert(NextMI == MBBEnd || !NextMI->isDebugInstr());
+    CurrLIS = const_cast<LiveIntervals *>(&LIS);
+    CurrMI = const_cast<MachineInstr *>(LastTrackedMI);
+
+    SI = NextMI == MBBEnd
+             ? CurrLIS->getInstructionIndex(*LastTrackedMI).getDeadSlot()
+             : CurrLIS->getInstructionIndex(*NextMI).getBaseIndex();
+  }
 
-  assert(NextMI == MBBEnd || !NextMI->isDebugInstr());
+  else { //! ShouldTrackIt
+    CurrLIS = TheLIS;
+    SI = CurrLIS->getInstructionIndex(*MI).getBaseIndex();
+    CurrMI = MI;
+  }
 
-  SlotIndex SI = NextMI == MBBEnd
-                     ? LIS.getInstructionIndex(*LastTrackedMI).getDeadSlot()
-                     : LIS.getInstructionIndex(*NextMI).getBaseIndex();
   assert(SI.isValid());
 
   // Remove dead registers or mask bits.
   SmallSet<Register, 8> SeenRegs;
-  for (auto &MO : LastTrackedMI->operands()) {
+  for (auto &MO : CurrMI->operands()) {
     if (!MO.isReg() || !MO.getReg().isVirtual())
       continue;
     if (MO.isUse() && !MO.readsReg())
       continue;
     if (!SeenRegs.insert(MO.getReg()).second)
       continue;
-    const LiveInterval &LI = LIS.getInterval(MO.getReg());
+    const LiveInterval &LI = CurrLIS->getInterval(MO.getReg());
     if (LI.hasSubRanges()) {
       auto It = LiveRegs.end();
       for (const auto &S : LI.subranges()) {
@@ -481,15 +499,18 @@ bool GCNDownwardRPTracker::advanceBeforeNext() {
 
   LastTrackedMI = nullptr;
 
-  return NextMI == MBBEnd;
+  return ShouldTrackIt && (NextMI == MBBEnd);
 }
 
-void GCNDownwardRPTracker::advanceToNext() {
+void GCNDownwardRPTracker::advanceToNext(MachineInstr *MI, bool ShouldTrackIt) {
   LastTrackedMI = &*NextMI++;
   NextMI = skipDebugInstructionsForward(NextMI, MBBEnd);
 
+  MachineInstr *CurrMI =
+      ShouldTrackIt ? const_cast<MachineInstr *>(LastTrackedMI) : MI;
+
   // Add new registers or mask bits.
-  for (const auto &MO : LastTrackedMI->all_defs()) {
+  for (const auto &MO : CurrMI->all_defs()) {
     Register Reg = MO.getReg();
     if (!Reg.isVirtual())
       continue;
@@ -502,11 +523,12 @@ void GCNDownwardRPTracker::advanceToNext() {
   MaxPressure = max(MaxPressure, CurPressure);
 }
 
-bool GCNDownwardRPTracker::advance() {
-  if (NextMI == MBBEnd)
+bool GCNDownwardRPTracker::advance(MachineInstr *MI, bool ShouldTrackIt,
+                                   LiveIntervals *TheLIS) {
+  if (ShouldTrackIt && NextMI == MBBEnd)
     return false;
-  advanceBeforeNext();
-  advanceToNext();
+  advanceBeforeNext(MI, ShouldTrackIt, TheLIS);
+  advanceToNext(MI, ShouldTrackIt);
   return true;
 }
 
diff --git a/llvm/lib/Target/AMDGPU/GCNRegPressure.h b/llvm/lib/Target/AMDGPU/GCNRegPressure.h
index 752f53752fa68..8abbce138cf16 100644
--- a/llvm/lib/Target/AMDGPU/GCNRegPressure.h
+++ b/llvm/lib/Target/AMDGPU/GCNRegPressure.h
@@ -160,6 +160,9 @@ class GCNRPTracker {
              bool After);
 
 public:
+  // reset tracker and set live register set to the specified value.
+  void reset(const MachineRegisterInfo &MRI_, const LiveRegSet &LiveRegs_);
+
   // live regs for the current state
   const decltype(LiveRegs) &getLiveRegs() const { return LiveRegs; }
   const MachineInstr *getLastTrackedMI() const { return LastTrackedMI; }
@@ -180,12 +183,9 @@ class GCNUpwardRPTracker : public GCNRPTracker {
 public:
   GCNUpwardRPTracker(const LiveIntervals &LIS_) : GCNRPTracker(LIS_) {}
 
-  // reset tracker and set live register set to the specified value.
-  void reset(const MachineRegisterInfo &MRI_, const LiveRegSet &LiveRegs_);
-
   // reset tracker at the specified slot index.
   void reset(const MachineRegisterInfo &MRI, SlotIndex SI) {
-    reset(MRI, llvm::getLiveRegs(SI, LIS, MRI));
+    GCNRPTracker::reset(MRI, llvm::getLiveRegs(SI, LIS, MRI));
   }
 
   // reset tracker to the end of the MBB.
@@ -200,7 +200,7 @@ class GCNUpwardRPTracker : public GCNRPTracker {
   }
 
   // move to the state just before the MI (in program order).
-  void recede(const MachineInstr &MI);
+  bool recede(const MachineInstr &MI, bool ShouldTrackIt = true);
 
   // checks whether the tracker's state after receding MI corresponds
   // to reported by LIS.
@@ -242,13 +242,15 @@ class GCNDownwardRPTracker : public GCNRPTracker {
 
   // Move to the state right before the next MI or after the end of MBB.
   // Returns false if reached end of the block.
-  bool advanceBeforeNext();
+  bool advanceBeforeNext(MachineInstr *MI = nullptr, bool ShouldTrackIt = true,
+                         LiveIntervals *TheLIS = nullptr);
 
   // Move to the state at the MI, advanceBeforeNext has to be called first.
-  void advanceToNext();
+  void advanceToNext(MachineInstr *MI = nullptr, bool ShouldTrackIt = true);
 
   // Move to the state at the next MI. Returns false if reached end of block.
-  bool advance();
+  bool advance(MachineInstr *MI = nullptr, bool ShouldTrackIt = true,
+               LiveIntervals *TheLIS = nullptr);
 
   // Advance instructions until before End.
   bool advance(MachineBasicBlock::const_iterator End);

Change-Id: Ib798a9d6add7d6dd1ccd0e01bea09ac2f4eeb94f
@vpykhtin
Copy link
Contributor

I haven't found cases where advance or recede arecalled with ShouldTrackIt = false, and the semantics of the flag looks vague, can we avoid the flag?

Change-Id: I10242186f538359ff09110dd70b23e5136655849
@jrbyrnes
Copy link
Contributor Author

jrbyrnes commented May 27, 2024

I haven't found cases where advance or recede arecalled with ShouldTrackIt = false, and the semantics of the flag looks vague, can we avoid the flag?

Thanks for catching that, that was my mistake. We should be using the !ShouldTrackIt version during scheduling. I've corrected that in #93090 , and also corrected an issue with the way RP was calculated using the !ShouldTrackIt version.

I hope it is more clear in that latest version why ShouldTrackIt is needed. We cannot perform speculative RP queries (initCandidate) using the internal iterator, so we need a tracker which allows us to provide the NextMI. If we did not want to introduce a flag, we would need to implement a new tracker that doesn't use an internal iterator, as was done in #88797

jrbyrnes added 14 commits June 14, 2024 13:34
Change-Id: I5c25a2e899c46fc3026b215010bde32b6c6d24ac
Change-Id: I9633e160b23360da06520b63a0e9b7c0fd33647a
Change-Id: I3b23fd089a1350570162633a4318feca092e472a
Change-Id: I1d557aa00cfb57f7a0395131ef57ac19efb5d5cf
Change-Id: I1cc390ff9a10fac9e874b097e3279376e52209a3
Change-Id: Idba7806fd4d6d43ba0381d7731720ac358c108e6
Change-Id: I233f0d3b3ce89adbc5a5c22efa2f28d5fb438e4a
Change-Id: I78f62d348b90448fa934aee1e9900634bf8dc209
Change-Id: Ie030633023361be626a34a1867e7740777a575d9
hange-Id: Ib383230c9fc88fb59a4d3c44f4b1459050554aa5
Change-Id: I0333ac7f2000e9c8fccb369e330882530ca0c912
Change-Id: I523e493fc80247497fd53925f05b9a8ad5f3771a
Change-Id: Ib207ef130a024b3fd12f4c34e80f59dd23450ed2
Change-Id: I2e3c08b4487262cb290cd070f7c2042156217fb1
Change-Id: I2e3c08b4487262cb290cd070f7c2042156217fb1
@jrbyrnes
Copy link
Contributor Author

In the latest, I've ported over RP speculation code from the generic trackers to GCN Trackers. While this does introduce code duplication, I thought this to be the better approach than to try to support subreg liveness in generic speculation and introduce generic tracker copies per speculation.

I've tried to organize the git history to ease the burden of review, please let me know if there's something I should do to assist further.

Commits named "Port *" are simple copy/paste of code./
Commits named "Fix * port issues" are minor changes to resolve API / build errors

The most interesting is "Modify behavior to support RP subreg aware RP speculation" which slightly modifies the behavior to support accurate RP speculation.

Copy link

github-actions bot commented Aug 13, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Comment on lines +339 to +341
for (const MachineOperand &MO : MRI.use_nodbg_operands(Reg)) {
if (MO.isUndef())
continue;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This probably doesn't handle subregister defs that read the register correctly. I think you need to one of the other iterators, and MO.readsReg

Copy link
Contributor Author

@jrbyrnes jrbyrnes Aug 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I'm not quite sure what we're trying to accomplish here?

  // Definitions

  1.  undef %63.sub1:vreg_128 = COPY %49.sub18:areg_1024 
  2.  %49.sub1:areg_1024 = COPY %63.sub1:vreg_128
  3.  %63.sub0:vreg_128 = COPY %49.sub19:areg_1024

  // No uses of %49.sub19 or defs of %49.subx

If we were speculating the RP from schedule (1, 3, ..) we would invoke findUseBetween to find any uses of %49 between 3s original position and its candidate position. I think you are saying that we should consider the %49.sub1 def in 2. as an implicit use of %49.sub19? I know that the LIS requires subregister liveness to be modeled in this way somewhat (e.g. no multiple connected components), but from the perspective of RegisterPressure, It seems more accurate (and safe) to not model it this way -- that is, to only check use operands.

Copy link
Contributor

@arsenm arsenm Aug 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a use, and I don't see how it could possibly be safer to ignore it as such. Deviation from the liveness model requires a much stronger rationale than this. I've fixed too many bugs from missing uses on subreg defs.

All of the rest of the code is following ordinary liveness tracking

Copy link
Contributor Author

@jrbyrnes jrbyrnes Aug 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ultimately, RP should agree with RA interference. AFAICT, RA interference is determined by the subrange segments, rather than the main range with the strange liveness concerns -- I'm still analyzing.

In https://godbolt.org/z/YY4P7f9oe we have a use %49.sub0:areg_1024 occuring before a def of the same superreg %49.sub1:areg_1024. According to the subreg def is a use of other subregs, %49.sub0:areg_1024 should be live until the def %49.sub1:areg_1024. However, when assigning %2:agpr_32, RA sees there is no interference with %49.sub0:areg_1024 and assigns to the same PhysReg.

This is consistent with how the GCNTrackers currently calculate RP: we consider the use subrange and whether or not it is live at some given index

and don't check if there is some reaching subreg def.

Change-Id: I2de464b32d3c6ed9a77cbbc669d735dde63c2e47
if (MO.isUndef())
continue;
if (!MO.readsReg())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's this instead of isUndef, not with

Change-Id: I286c9ed1ae91a68da881c6fa27f5f391102d0a9c
@arsenm
Copy link
Contributor

arsenm commented Sep 15, 2024

In the latest, I've ported over RP speculation code from the generic trackers to GCN Trackers. While this does introduce code duplication, I thought this to be the better approach than to try to support subreg liveness in generic speculation and introduce generic tracker copies per speculation.

The generic tracker really just should support subregister liveness

Change-Id: I1446390178f8ed8705b0542e60a5f9f1649d8299
@jrbyrnes
Copy link
Contributor Author

In the latest, I've ported over RP speculation code from the generic trackers to GCN Trackers. While this does introduce code duplication, I thought this to be the better approach than to try to support subreg liveness in generic speculation and introduce generic tracker copies per speculation.

The generic tracker really just should support subregister liveness

That is most likely the next step of this work.

In order to enable the GCNTrackers by default, in order to save compile time, we would need to modify the scheduler to allow for a mode whcih enables RP tracking but disables the generic RPTracker.

I think it probably makes more sense to just improve the accuracy of the generic trackers as you suggest. The purpose of this PR then is to provide access to the improvements while the longer term solution is being worked out.

LaneBitmask DefLanes = P.LaneMask;
LaneBitmask LiveBefore = (LiveAfter & ~DefLanes) | UseLanes;

CurPressure.inc(Reg, LiveAfter, LiveAfter & LiveBefore, *MRI);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No decrease of pressure after?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jrbyrnes
Copy link
Contributor Author

jrbyrnes commented Oct 9, 2024

Landed with 17bc959

@jrbyrnes jrbyrnes closed this Oct 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants