Skip to content

[RFC][GlobalISel] InstructionSelect: Allow arbitrary instruction erasure #97670

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Aug 11, 2024

Conversation

tobias-stadler
Copy link
Contributor

@llvmbot
Copy link
Member

llvmbot commented Jul 4, 2024

@llvm/pr-subscribers-llvm-globalisel

Author: Tobias Stadler (tobias-stadler)

Changes

See https://discourse.llvm.org/t/rfc-globalisel-instructionselect-allow-arbitrary-instruction-erasure/79948


Full diff: https://github.com/llvm/llvm-project/pull/97670.diff

2 Files Affected:

  • (modified) llvm/include/llvm/CodeGen/GlobalISel/InstructionSelect.h (+6)
  • (modified) llvm/lib/CodeGen/GlobalISel/InstructionSelect.cpp (+119-82)
diff --git a/llvm/include/llvm/CodeGen/GlobalISel/InstructionSelect.h b/llvm/include/llvm/CodeGen/GlobalISel/InstructionSelect.h
index cada7f30072e2..037523eab407a 100644
--- a/llvm/include/llvm/CodeGen/GlobalISel/InstructionSelect.h
+++ b/llvm/include/llvm/CodeGen/GlobalISel/InstructionSelect.h
@@ -20,6 +20,7 @@
 
 namespace llvm {
 
+class InstructionSelector;
 class BlockFrequencyInfo;
 class ProfileSummaryInfo;
 
@@ -55,10 +56,15 @@ class InstructionSelect : public MachineFunctionPass {
   bool runOnMachineFunction(MachineFunction &MF) override;
 
 protected:
+  class MIIteratorMaintainer;
+
+  InstructionSelector *ISel = nullptr;
   BlockFrequencyInfo *BFI = nullptr;
   ProfileSummaryInfo *PSI = nullptr;
 
   CodeGenOptLevel OptLevel = CodeGenOptLevel::None;
+
+  bool select(MachineInstr &MI);
 };
 } // End namespace llvm.
 
diff --git a/llvm/lib/CodeGen/GlobalISel/InstructionSelect.cpp b/llvm/lib/CodeGen/GlobalISel/InstructionSelect.cpp
index 4cb1d01f3e8ca..6f360dcdffe85 100644
--- a/llvm/lib/CodeGen/GlobalISel/InstructionSelect.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/InstructionSelect.cpp
@@ -12,8 +12,10 @@
 #include "llvm/CodeGen/GlobalISel/InstructionSelect.h"
 #include "llvm/ADT/PostOrderIterator.h"
 #include "llvm/ADT/ScopeExit.h"
+#include "llvm/ADT/SetVector.h"
 #include "llvm/Analysis/LazyBlockFrequencyInfo.h"
 #include "llvm/Analysis/ProfileSummaryInfo.h"
+#include "llvm/CodeGen/GlobalISel/GISelChangeObserver.h"
 #include "llvm/CodeGen/GlobalISel/GISelKnownBits.h"
 #include "llvm/CodeGen/GlobalISel/InstructionSelector.h"
 #include "llvm/CodeGen/GlobalISel/LegalizerInfo.h"
@@ -62,6 +64,52 @@ INITIALIZE_PASS_END(InstructionSelect, DEBUG_TYPE,
                     "Select target instructions out of generic instructions",
                     false, false)
 
+/// This class observes instruction insertions/removals.
+/// InstructionSelect stores an iterator of the instruction prior to the one
+/// that is currently being selected to determine which instruction to select
+/// next. Previously this meant that selecting multiple instructions at once was
+/// illegal behavior due to potential invalidation of this iterator. This is
+/// a non-obvious limitation for selector implementers. Therefore, to allow
+/// deletion of arbitrary instructions, we detect this case and continue
+/// selection with the predecessor of the deleted instruction.
+class InstructionSelect::MIIteratorMaintainer
+    : public MachineFunction::Delegate {
+#ifndef NDEBUG
+  SetVector<const MachineInstr *> CreatedInstrs;
+#endif
+public:
+  MachineBasicBlock::reverse_iterator MII;
+
+  void MF_HandleInsertion(MachineInstr &MI) override {
+    LLVM_DEBUG(dbgs() << "Creating:  " << MI; CreatedInstrs.insert(&MI));
+  }
+
+  void MF_HandleRemoval(MachineInstr &MI) override {
+    LLVM_DEBUG(dbgs() << "Erasing:   " << MI; CreatedInstrs.remove(&MI));
+    if (MII.getInstrIterator().getNodePtr() == &MI) {
+      // If the iterator points to the MI that will be erased (i.e. the MI prior
+      // to the MI that is currently being selected), the iterator would be
+      // invalidated. Continue selection with its predecessor.
+      ++MII;
+      LLVM_DEBUG(dbgs() << "Instruction removal updated iterator.\n");
+    }
+  }
+
+  void reportFullyCreatedInstrs() {
+    LLVM_DEBUG({
+      if (CreatedInstrs.empty()) {
+        dbgs() << "Created no instructions.\n";
+      } else {
+        dbgs() << "Created:\n";
+        for (const auto *MI : CreatedInstrs) {
+          dbgs() << "  " << *MI;
+        }
+        CreatedInstrs.clear();
+      }
+    });
+  }
+};
+
 InstructionSelect::InstructionSelect(CodeGenOptLevel OL)
     : MachineFunctionPass(ID), OptLevel(OL) {}
 
@@ -93,7 +141,7 @@ bool InstructionSelect::runOnMachineFunction(MachineFunction &MF) {
   LLVM_DEBUG(dbgs() << "Selecting function: " << MF.getName() << '\n');
 
   const TargetPassConfig &TPC = getAnalysis<TargetPassConfig>();
-  InstructionSelector *ISel = MF.getSubtarget().getInstructionSelector();
+  ISel = MF.getSubtarget().getInstructionSelector();
   ISel->setTargetPassConfig(&TPC);
 
   CodeGenOptLevel OldOptLevel = OptLevel;
@@ -137,80 +185,36 @@ bool InstructionSelect::runOnMachineFunction(MachineFunction &MF) {
   // Keep track of selected blocks, so we can delete unreachable ones later.
   DenseSet<MachineBasicBlock *> SelectedBlocks;
 
-  for (MachineBasicBlock *MBB : post_order(&MF)) {
-    ISel->CurMBB = MBB;
-    SelectedBlocks.insert(MBB);
-    if (MBB->empty())
-      continue;
-
-    // Select instructions in reverse block order. We permit erasing so have
-    // to resort to manually iterating and recognizing the begin (rend) case.
-    bool ReachedBegin = false;
-    for (auto MII = std::prev(MBB->end()), Begin = MBB->begin();
-         !ReachedBegin;) {
-#ifndef NDEBUG
-      // Keep track of the insertion range for debug printing.
-      const auto AfterIt = std::next(MII);
-#endif
-      // Select this instruction.
-      MachineInstr &MI = *MII;
-
-      // And have our iterator point to the next instruction, if there is one.
-      if (MII == Begin)
-        ReachedBegin = true;
-      else
-        --MII;
-
-      LLVM_DEBUG(dbgs() << "Selecting: \n  " << MI);
-
-      // We could have folded this instruction away already, making it dead.
-      // If so, erase it.
-      if (isTriviallyDead(MI, MRI)) {
-        LLVM_DEBUG(dbgs() << "Is dead; erasing.\n");
-        salvageDebugInfo(MRI, MI);
-        MI.eraseFromParent();
-        continue;
-      }
-
-      // Eliminate hints or G_CONSTANT_FOLD_BARRIER.
-      if (isPreISelGenericOptimizationHint(MI.getOpcode()) ||
-          MI.getOpcode() == TargetOpcode::G_CONSTANT_FOLD_BARRIER) {
-        auto [DstReg, SrcReg] = MI.getFirst2Regs();
-
-        // At this point, the destination register class of the op may have
-        // been decided.
-        //
-        // Propagate that through to the source register.
-        const TargetRegisterClass *DstRC = MRI.getRegClassOrNull(DstReg);
-        if (DstRC)
-          MRI.setRegClass(SrcReg, DstRC);
-        assert(canReplaceReg(DstReg, SrcReg, MRI) &&
-               "Must be able to replace dst with src!");
-        MI.eraseFromParent();
-        MRI.replaceRegWith(DstReg, SrcReg);
-        continue;
-      }
-
-      if (MI.getOpcode() == TargetOpcode::G_INVOKE_REGION_START) {
-        MI.eraseFromParent();
-        continue;
-      }
-
-      if (!ISel->select(MI)) {
-        // FIXME: It would be nice to dump all inserted instructions.  It's
-        // not obvious how, esp. considering select() can insert after MI.
-        reportGISelFailure(MF, TPC, MORE, "gisel-select", "cannot select", MI);
-        return false;
+  {
+    // Observe IR insertions and removals during selection.
+    // We only install a MachineFunction::Delegate instead of a
+    // GISelChangeObserver, because we do not want notifications about changed
+    // instructions. This prevents significant compile-time regressions from
+    // e.g. constrainOperandRegClass().
+    MIIteratorMaintainer MIIMaintainer;
+    RAIIDelegateInstaller DelInstaller(MF, &MIIMaintainer);
+
+    for (MachineBasicBlock *MBB : post_order(&MF)) {
+      ISel->CurMBB = MBB;
+      SelectedBlocks.insert(MBB);
+
+      // Select instructions in reverse block order.
+      MIIMaintainer.MII = MBB->rbegin();
+      for (auto End = MBB->rend(); MIIMaintainer.MII != End;) {
+        MachineInstr &MI = *MIIMaintainer.MII;
+        // Increment early to skip instructions inserted by select().
+        ++MIIMaintainer.MII;
+
+        LLVM_DEBUG(dbgs() << "\nSelect:  " << MI);
+        if (!select(MI)) {
+          LLVM_DEBUG(dbgs() << "Selection failed!\n";
+                     MIIMaintainer.reportFullyCreatedInstrs());
+          reportGISelFailure(MF, TPC, MORE, "gisel-select", "cannot select",
+                             MI);
+          return false;
+        }
+        LLVM_DEBUG(MIIMaintainer.reportFullyCreatedInstrs());
       }
-
-      // Dump the range of instructions that MI expanded into.
-      LLVM_DEBUG({
-        auto InsertedBegin = ReachedBegin ? MBB->begin() : std::next(MII);
-        dbgs() << "Into:\n";
-        for (auto &InsertedMI : make_range(InsertedBegin, AfterIt))
-          dbgs() << "  " << InsertedMI;
-        dbgs() << '\n';
-      });
     }
   }
 
@@ -228,16 +232,10 @@ bool InstructionSelect::runOnMachineFunction(MachineFunction &MF) {
       continue;
     }
     // Try to find redundant copies b/w vregs of the same register class.
-    bool ReachedBegin = false;
-    for (auto MII = std::prev(MBB.end()), Begin = MBB.begin(); !ReachedBegin;) {
-      // Select this instruction.
+    for (auto MII = MBB.rbegin(), End = MBB.rend(); MII != End;) {
       MachineInstr &MI = *MII;
+      ++MII;
 
-      // And have our iterator point to the next instruction, if there is one.
-      if (MII == Begin)
-        ReachedBegin = true;
-      else
-        --MII;
       if (MI.getOpcode() != TargetOpcode::COPY)
         continue;
       Register SrcReg = MI.getOperand(1).getReg();
@@ -342,3 +340,42 @@ bool InstructionSelect::runOnMachineFunction(MachineFunction &MF) {
   // FIXME: Should we accurately track changes?
   return true;
 }
+
+bool InstructionSelect::select(MachineInstr &MI) {
+  MachineRegisterInfo &MRI = ISel->MF->getRegInfo();
+
+  // We could have folded this instruction away already, making it dead.
+  // If so, erase it.
+  if (isTriviallyDead(MI, MRI)) {
+    LLVM_DEBUG(dbgs() << "Is dead.\n");
+    salvageDebugInfo(MRI, MI);
+    MI.eraseFromParent();
+    return true;
+  }
+
+  // Eliminate hints or G_CONSTANT_FOLD_BARRIER.
+  if (isPreISelGenericOptimizationHint(MI.getOpcode()) ||
+      MI.getOpcode() == TargetOpcode::G_CONSTANT_FOLD_BARRIER) {
+    auto [DstReg, SrcReg] = MI.getFirst2Regs();
+
+    // At this point, the destination register class of the op may have
+    // been decided.
+    //
+    // Propagate that through to the source register.
+    const TargetRegisterClass *DstRC = MRI.getRegClassOrNull(DstReg);
+    if (DstRC)
+      MRI.setRegClass(SrcReg, DstRC);
+    assert(canReplaceReg(DstReg, SrcReg, MRI) &&
+           "Must be able to replace dst with src!");
+    MI.eraseFromParent();
+    MRI.replaceRegWith(DstReg, SrcReg);
+    return true;
+  }
+
+  if (MI.getOpcode() == TargetOpcode::G_INVOKE_REGION_START) {
+    MI.eraseFromParent();
+    return true;
+  }
+
+  return ISel->select(MI);
+}

Copy link
Collaborator

@qcolombet qcolombet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Could you add a test case that exercises the broken functionality that you described in discourse?

class InstructionSelect::MIIteratorMaintainer
: public MachineFunction::Delegate {
#ifndef NDEBUG
SetVector<const MachineInstr *> CreatedInstrs;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SetVector has a max static size of 32, which I think we should use here to reduce the chance of heap allocations.

Copy link

github-actions bot commented Jul 13, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@tobias-stadler
Copy link
Contributor Author

tobias-stadler commented Aug 1, 2024

Sorry for the rebase, had to fix a conflict. Can this be merged or does anyone have more feedback on the test changes?

@qcolombet
Copy link
Collaborator

Sorry for the rebase, had to fix a conflict. Can this be merged or does anyone have more feedback on the test changes?

Go ahead!

Copy link
Contributor

@aemerson aemerson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, LGTM too.

Copy link
Contributor

@holland11 holland11 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thank you for this patch.

@tobias-stadler tobias-stadler merged commit d2336fd into llvm:main Aug 11, 2024
7 checks passed
@tobias-stadler tobias-stadler deleted the gisel-isel-observer branch August 16, 2024 14:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants