[RISCV] Implement cross basic block VXRM write insertion. #70382

topperc · 2023-10-26T21:44:50Z

This adds a new pass to insert VXRM writes for vector instructions. With the goal of avoiding redundant writes.

The pass does 2 dataflow algorithms. The first is a forward data flow to calculate where a VXRM value is available. The second is a backwards dataflow to determine where a VXRM value is anticipated.

Finally, we use the results of these two dataflows to insert VXRM writes where a value is anticipated, but not available.

The pass does not split critical edges so we aren't always able to eliminate all redundancy.

The pass will only insert vxrm writes on paths that always require it.

This adds a new pass to insert VXRM writes for vector instructions. With the goal of avoiding redundant writes. The pass does 2 dataflow algorithms. The first is a forward data flow to calculate where a VXRM value is available. The second is a backwards dataflow to determine where a VXRM value is anticipated. Finally, we use the results of these two dataflows to insert VXRM writes where a value is anticipated, but not available. The pass does not split critical edges so we aren't always able to eliminate all redundancy. The pass will only insert vxrm writes on paths that always require it.

llvmbot · 2023-10-26T21:46:00Z

@llvm/pr-subscribers-backend-risc-v

Author: Craig Topper (topperc)

Changes

This adds a new pass to insert VXRM writes for vector instructions. With the goal of avoiding redundant writes.

The pass does 2 dataflow algorithms. The first is a forward data flow to calculate where a VXRM value is available. The second is a backwards dataflow to determine where a VXRM value is anticipated.

Finally, we use the results of these two dataflows to insert VXRM writes where a value is anticipated, but not available.

The pass does not split critical edges so we aren't always able to eliminate all redundancy.

The pass will only insert vxrm writes on paths that always require it.

Patch is 588.22 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/70382.diff

29 Files Affected:

(modified) llvm/lib/Target/RISCV/CMakeLists.txt (+1)
(modified) llvm/lib/Target/RISCV/RISCV.h (+3)
(modified) llvm/lib/Target/RISCV/RISCVInsertReadWriteCSR.cpp (+2-17)
(added) llvm/lib/Target/RISCV/RISCVInsertWriteVXRM.cpp (+457)
(modified) llvm/lib/Target/RISCV/RISCVTargetMachine.cpp (+2)
(modified) llvm/test/CodeGen/RISCV/O0-pipeline.ll (+1)
(modified) llvm/test/CodeGen/RISCV/O3-pipeline.ll (+1)
(modified) llvm/test/CodeGen/RISCV/rvv/copyprop.mir (+1-1)
(modified) llvm/test/CodeGen/RISCV/rvv/masked-tama.ll (+4-4)
(modified) llvm/test/CodeGen/RISCV/rvv/masked-tamu.ll (+4-4)
(modified) llvm/test/CodeGen/RISCV/rvv/masked-tuma.ll (+4-4)
(modified) llvm/test/CodeGen/RISCV/rvv/masked-tumu.ll (+4-4)
(modified) llvm/test/CodeGen/RISCV/rvv/mutate-prior-vsetvli-avl.ll (+6-6)
(modified) llvm/test/CodeGen/RISCV/rvv/rvv-peephole-vmerge-vops.ll (+1-1)
(modified) llvm/test/CodeGen/RISCV/rvv/unmasked-tu.ll (+11-11)
(modified) llvm/test/CodeGen/RISCV/rvv/vaadd.ll (+88-88)
(modified) llvm/test/CodeGen/RISCV/rvv/vaaddu.ll (+88-88)
(modified) llvm/test/CodeGen/RISCV/rvv/vasub.ll (+88-88)
(modified) llvm/test/CodeGen/RISCV/rvv/vasubu.ll (+88-88)
(modified) llvm/test/CodeGen/RISCV/rvv/vnclip.ll (+90-90)
(modified) llvm/test/CodeGen/RISCV/rvv/vnclipu.ll (+90-90)
(modified) llvm/test/CodeGen/RISCV/rvv/vsmul-rv32.ll (+80-80)
(modified) llvm/test/CodeGen/RISCV/rvv/vsmul-rv64.ll (+88-88)
(modified) llvm/test/CodeGen/RISCV/rvv/vssra-rv32.ll (+88-88)
(modified) llvm/test/CodeGen/RISCV/rvv/vssra-rv64.ll (+88-88)
(modified) llvm/test/CodeGen/RISCV/rvv/vssrl-rv32.ll (+88-88)
(modified) llvm/test/CodeGen/RISCV/rvv/vssrl-rv64.ll (+88-88)
(added) llvm/test/CodeGen/RISCV/rvv/vxrm-insert.ll (+473)
(modified) llvm/test/CodeGen/RISCV/rvv/vxrm.mir (+2-2)

diff --git a/llvm/lib/Target/RISCV/CMakeLists.txt b/llvm/lib/Target/RISCV/CMakeLists.txt
index 4d5fa79389ea68b..5b94ce3bea451a0 100644
--- a/llvm/lib/Target/RISCV/CMakeLists.txt
+++ b/llvm/lib/Target/RISCV/CMakeLists.txt
@@ -37,6 +37,7 @@ add_llvm_target(RISCVCodeGen
   RISCVGatherScatterLowering.cpp
   RISCVInsertVSETVLI.cpp
   RISCVInsertReadWriteCSR.cpp
+  RISCVInsertWriteVXRM.cpp
   RISCVInstrInfo.cpp
   RISCVISelDAGToDAG.cpp
   RISCVISelLowering.cpp
diff --git a/llvm/lib/Target/RISCV/RISCV.h b/llvm/lib/Target/RISCV/RISCV.h
index 3d8e33dc716ea44..6092d312e4f98b0 100644
--- a/llvm/lib/Target/RISCV/RISCV.h
+++ b/llvm/lib/Target/RISCV/RISCV.h
@@ -68,6 +68,9 @@ void initializeRISCVPostRAExpandPseudoPass(PassRegistry &);
 FunctionPass *createRISCVInsertReadWriteCSRPass();
 void initializeRISCVInsertReadWriteCSRPass(PassRegistry &);
 
+FunctionPass *createRISCVInsertWriteVXRMPass();
+void initializeRISCVInsertWriteVXRMPass(PassRegistry &);
+
 FunctionPass *createRISCVRedundantCopyEliminationPass();
 void initializeRISCVRedundantCopyEliminationPass(PassRegistry &);
 
diff --git a/llvm/lib/Target/RISCV/RISCVInsertReadWriteCSR.cpp b/llvm/lib/Target/RISCV/RISCVInsertReadWriteCSR.cpp
index 75f5ac3fbe0dd55..acd19bf1b8a162e 100644
--- a/llvm/lib/Target/RISCV/RISCVInsertReadWriteCSR.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInsertReadWriteCSR.cpp
@@ -9,7 +9,6 @@
 // of the RISC-V instructions.
 //
 // Currently the pass implements:
-// -Naive insertion of a write to vxrm before an RVV fixed-point instruction.
 // -Writing and saving frm before an RVV floating-point instruction with a
 //  static rounding mode and restores the value after.
 //
@@ -58,25 +57,11 @@ char RISCVInsertReadWriteCSR::ID = 0;
 INITIALIZE_PASS(RISCVInsertReadWriteCSR, DEBUG_TYPE,
                 RISCV_INSERT_READ_WRITE_CSR_NAME, false, false)
 
-// This function inserts a write to vxrm when encountering an RVV fixed-point
-// instruction. This function also swaps frm and restores it when encountering
-// an RVV floating point instruction with a static rounding mode.
+// This function also swaps frm and restores it when encountering an RVV
+// floating point instruction with a static rounding mode.
 bool RISCVInsertReadWriteCSR::emitWriteRoundingMode(MachineBasicBlock &MBB) {
   bool Changed = false;
   for (MachineInstr &MI : MBB) {
-    int VXRMIdx = RISCVII::getVXRMOpNum(MI.getDesc());
-    if (VXRMIdx >= 0) {
-      unsigned VXRMImm = MI.getOperand(VXRMIdx).getImm();
-
-      Changed = true;
-
-      BuildMI(MBB, MI, MI.getDebugLoc(), TII->get(RISCV::WriteVXRMImm))
-          .addImm(VXRMImm);
-      MI.addOperand(MachineOperand::CreateReg(RISCV::VXRM, /*IsDef*/ false,
-                                              /*IsImp*/ true));
-      continue;
-    }
-
     int FRMIdx = RISCVII::getFRMOpNum(MI.getDesc());
     if (FRMIdx < 0)
       continue;
diff --git a/llvm/lib/Target/RISCV/RISCVInsertWriteVXRM.cpp b/llvm/lib/Target/RISCV/RISCVInsertWriteVXRM.cpp
new file mode 100644
index 000000000000000..8313a809b84cbcb
--- /dev/null
+++ b/llvm/lib/Target/RISCV/RISCVInsertWriteVXRM.cpp
@@ -0,0 +1,457 @@
+//===-- RISCVInsertWriteVXRM.cpp - Insert Write of RISC-V VXRM CSR --------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This pass inserts writes to the VXRM CSR as needed by vector instructions.
+// Each instruction that uses VXRM carries an operand that contains its required
+// VXRM value. This pass tries to optimize placement to avoid redundant writes
+// to VXRM.
+//
+// This is done using 2 dataflow algorithms. The first is a forward data flow
+// to calculate where a VXRM value is available. The second is a backwards
+// dataflow to determine where a VXRM value is anticipated.
+//
+// Finally, we use the results of these two dataflows to insert VXRM writes
+// where a value is anticipated, but not available.
+//
+// FIXME: This pass does not split critical edges, so there can still be some
+// redundancy.
+//
+// FIXME: If we are willing to have writes that aren't always needed, we could
+// reduce the number of VXRM writes in some cases.
+//===----------------------------------------------------------------------===//
+
+#include "MCTargetDesc/RISCVBaseInfo.h"
+#include "RISCV.h"
+#include "RISCVSubtarget.h"
+#include "llvm/CodeGen/MachineFunctionPass.h"
+#include <queue>
+
+using namespace llvm;
+
+#define DEBUG_TYPE "riscv-insert-write-vxrm"
+#define RISCV_INSERT_WRITE_VXRM_NAME "RISC-V Insert Write VXRM Pass"
+
+namespace {
+
+class VXRMInfo {
+  uint8_t VXRMImm = 0;
+
+  enum : uint8_t {
+    Uninitialized,
+    Static,
+    Unknown,
+  } State = Uninitialized;
+
+public:
+  VXRMInfo() {}
+
+  static VXRMInfo getUnknown() {
+    VXRMInfo Info;
+    Info.setUnknown();
+    return Info;
+  }
+
+  bool isValid() const { return State != Uninitialized; }
+  void setUnknown() { State = Unknown; }
+  bool isUnknown() const { return State == Unknown; }
+
+  bool isStatic() const { return State == Static; }
+
+  void setVXRMImm(unsigned Imm) {
+    assert(Imm <= 3 && "Unexpected VXRM value");
+    VXRMImm = Imm;
+    State = Static;
+  }
+  unsigned getVXRMImm() const {
+    assert(isStatic() && VXRMImm <= 3 && "Unexpected state");
+    return VXRMImm;
+  }
+
+  bool operator==(const VXRMInfo &Other) const {
+    // Uninitialized is only equal to another Uninitialized.
+    if (!isValid())
+      return !Other.isValid();
+    if (!Other.isValid())
+      return !isValid();
+
+    // Unknown is only equal to another Unknown.
+    if (isUnknown())
+      return Other.isUnknown();
+    if (Other.isUnknown())
+      return isUnknown();
+
+    return VXRMImm == Other.VXRMImm;
+  }
+
+  bool operator!=(const VXRMInfo &Other) const { return !(*this == Other); }
+
+  // Calculate the VXRMInfo visible to a block assuming this and Other are
+  // both predecessors.
+  VXRMInfo intersect(const VXRMInfo &Other) const {
+    // If the new value isn't valid, ignore it.
+    if (!Other.isValid())
+      return *this;
+
+    // If this value isn't valid, this must be the first predecessor, use it.
+    if (!isValid())
+      return Other;
+
+    // If either is unknown, the result is unknown.
+    if (isUnknown() || Other.isUnknown())
+      return VXRMInfo::getUnknown();
+
+    // If we have an exact match, return this.
+    if (*this == Other)
+      return *this;
+
+    // Otherwise the result is unknown.
+    return VXRMInfo::getUnknown();
+  }
+
+#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
+  /// Support for debugging, callable in GDB: V->dump()
+  LLVM_DUMP_METHOD void dump() const {
+    print(dbgs());
+    dbgs() << "\n";
+  }
+
+  void print(raw_ostream &OS) const {
+    OS << '{';
+    if (!isValid())
+      OS << "Uninitialized";
+    else if (isUnknown())
+      OS << "Unknown";
+    else
+      OS << getVXRMImm();
+    OS << '}';
+  }
+#endif
+};
+
+#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
+LLVM_ATTRIBUTE_USED
+inline raw_ostream &operator<<(raw_ostream &OS, const VXRMInfo &V) {
+  V.print(OS);
+  return OS;
+}
+#endif
+
+struct BlockData {
+  // Indicates if the block uses VXRM. Uninitialized means no use.
+  VXRMInfo VXRMUse;
+
+  // Indicates the VXRM output from the block. Unitialized means transparent.
+  VXRMInfo VXRMOut;
+
+  // Keeps track of the available VXRM value at the start of the basic bloc.
+  VXRMInfo AvailableIn;
+
+  // Keeps track of the available VXRM value at the end of the basic block.
+  VXRMInfo AvailableOut;
+
+  // Keeps track of what VXRM is anticipated at the start of the basic block.
+  VXRMInfo AnticipatedIn;
+
+  // Keeps track of what VXRM is anticipated at the end of the basic block.
+  VXRMInfo AnticipatedOut;
+
+  // Keeps track of whether the block is already in the queue.
+  bool InQueue;
+
+  BlockData() = default;
+};
+
+class RISCVInsertWriteVXRM : public MachineFunctionPass {
+  const TargetInstrInfo *TII;
+
+  std::vector<BlockData> BlockInfo;
+  std::queue<const MachineBasicBlock *> WorkList;
+
+public:
+  static char ID;
+
+  RISCVInsertWriteVXRM() : MachineFunctionPass(ID) {
+    initializeRISCVInsertWriteVXRMPass(*PassRegistry::getPassRegistry());
+  }
+
+  bool runOnMachineFunction(MachineFunction &MF) override;
+
+  void getAnalysisUsage(AnalysisUsage &AU) const override {
+    AU.setPreservesCFG();
+    MachineFunctionPass::getAnalysisUsage(AU);
+  }
+
+  StringRef getPassName() const override {
+    return RISCV_INSERT_WRITE_VXRM_NAME;
+  }
+
+private:
+  bool computeVXRMChanges(const MachineBasicBlock &MBB);
+  void computeAvailable(const MachineBasicBlock &MBB);
+  void computeAnticipated(const MachineBasicBlock &MBB);
+  void emitWriteVXRM(MachineBasicBlock &MBB);
+};
+
+} // end anonymous namespace
+
+char RISCVInsertWriteVXRM::ID = 0;
+
+INITIALIZE_PASS(RISCVInsertWriteVXRM, DEBUG_TYPE, RISCV_INSERT_WRITE_VXRM_NAME,
+                false, false)
+
+bool RISCVInsertWriteVXRM::computeVXRMChanges(const MachineBasicBlock &MBB) {
+  BlockData &BBInfo = BlockInfo[MBB.getNumber()];
+
+  bool NeedVXRMWrite = false;
+  for (const MachineInstr &MI : MBB) {
+    int VXRMIdx = RISCVII::getVXRMOpNum(MI.getDesc());
+    if (VXRMIdx >= 0) {
+      unsigned NewVXRMImm = MI.getOperand(VXRMIdx).getImm();
+
+      if (!BBInfo.VXRMUse.isValid())
+        BBInfo.VXRMUse.setVXRMImm(NewVXRMImm);
+
+      BBInfo.VXRMOut.setVXRMImm(NewVXRMImm);
+      NeedVXRMWrite = true;
+    }
+
+    if (MI.isCall() || MI.isInlineAsm()) {
+      if (!BBInfo.VXRMUse.isValid())
+        BBInfo.VXRMUse.setUnknown();
+
+      BBInfo.VXRMOut.setUnknown();
+    }
+  }
+
+  return NeedVXRMWrite;
+}
+
+void RISCVInsertWriteVXRM::computeAvailable(const MachineBasicBlock &MBB) {
+  BlockData &BBInfo = BlockInfo[MBB.getNumber()];
+
+  BBInfo.InQueue = false;
+
+  VXRMInfo Available;
+  if (MBB.pred_empty()) {
+    Available.setUnknown();
+  } else {
+    for (const MachineBasicBlock *P : MBB.predecessors())
+      Available = Available.intersect(BlockInfo[P->getNumber()].AvailableOut);
+  }
+
+  // If we don't have any valid available info, wait until we do.
+  if (!Available.isValid())
+    return;
+
+  if (Available != BBInfo.AvailableIn) {
+    BBInfo.AvailableIn = Available;
+    LLVM_DEBUG(dbgs() << "AvailableIn state of " << printMBBReference(MBB)
+                      << " changed to " << BBInfo.AvailableIn << "\n");
+  }
+
+  if (BBInfo.VXRMOut.isValid())
+    Available = BBInfo.VXRMOut;
+
+  if (Available == BBInfo.AvailableOut)
+    return;
+
+  BBInfo.AvailableOut = Available;
+  LLVM_DEBUG(dbgs() << "AvailableOut state of " << printMBBReference(MBB)
+                    << " changed to " << BBInfo.AvailableOut << "\n");
+
+  // Add the successors to the work list so that we can propagate.
+  for (MachineBasicBlock *S : MBB.successors()) {
+    if (!BlockInfo[S->getNumber()].InQueue) {
+      BlockInfo[S->getNumber()].InQueue = true;
+      WorkList.push(S);
+    }
+  }
+}
+
+void RISCVInsertWriteVXRM::computeAnticipated(const MachineBasicBlock &MBB) {
+  BlockData &BBInfo = BlockInfo[MBB.getNumber()];
+
+  BBInfo.InQueue = false;
+
+  VXRMInfo Anticipated;
+  if (MBB.succ_empty()) {
+    Anticipated.setUnknown();
+  } else {
+    for (const MachineBasicBlock *S : MBB.successors())
+      Anticipated =
+          Anticipated.intersect(BlockInfo[S->getNumber()].AnticipatedIn);
+  }
+
+  // If we don't have any valid anticipated info, wait until we do.
+  if (!Anticipated.isValid())
+    return;
+
+  if (Anticipated != BBInfo.AnticipatedOut) {
+    BBInfo.AnticipatedOut = Anticipated;
+    LLVM_DEBUG(dbgs() << "AnticipatedOut state of " << printMBBReference(MBB)
+                      << " changed to " << BBInfo.AnticipatedOut << "\n");
+  }
+
+  // If this block reads VXRM, copy it.
+  if (BBInfo.VXRMUse.isValid())
+    Anticipated = BBInfo.VXRMUse;
+
+  if (Anticipated == BBInfo.AnticipatedIn)
+    return;
+
+  BBInfo.AnticipatedIn = Anticipated;
+  LLVM_DEBUG(dbgs() << "AnticipatedIn state of " << printMBBReference(MBB)
+                    << " changed to " << BBInfo.AnticipatedIn << "\n");
+
+  // Add the predecessors to the work list so that we can propagate.
+  for (MachineBasicBlock *P : MBB.predecessors()) {
+    if (!BlockInfo[P->getNumber()].InQueue) {
+      BlockInfo[P->getNumber()].InQueue = true;
+      WorkList.push(P);
+    }
+  }
+}
+
+void RISCVInsertWriteVXRM::emitWriteVXRM(MachineBasicBlock &MBB) {
+  const BlockData &BBInfo = BlockInfo[MBB.getNumber()];
+
+  VXRMInfo Info = BBInfo.AvailableIn;
+
+  // Insert VXRM write if anticipated and not available.
+  if (BBInfo.AnticipatedIn.isStatic()) {
+    bool NeedInsert = false;
+    // If there no predecessors and the value is anticipated, insert.
+    if (MBB.pred_empty()) {
+      NeedInsert = true;
+    } else {
+      // Search for any predecessors that wouldn't satisfy our requirement and
+      // insert a write VXRM if needed.
+      // NOTE: If one predecessor is able to provide the requirement, but
+      // another isn't, it means we have a critical edge. The better placement
+      // would be to split the critical edge.
+      for (MachineBasicBlock *P : MBB.predecessors()) {
+        const BlockData &PInfo = BlockInfo[P->getNumber()];
+        // If it's available out of the predecessor, then we're ok.
+        if (PInfo.AvailableOut.isStatic() &&
+            PInfo.AvailableOut.getVXRMImm() ==
+                BBInfo.AnticipatedIn.getVXRMImm())
+          continue;
+        // If the predecessor anticipated this value for all its succesors,
+        // then it should have already inserted.
+        if (PInfo.AnticipatedOut.isStatic() &&
+            PInfo.AnticipatedOut.getVXRMImm() ==
+                BBInfo.AnticipatedIn.getVXRMImm())
+          continue;
+        NeedInsert = true;
+        break;
+      }
+    }
+
+    if (NeedInsert) {
+      BuildMI(MBB, MBB.getFirstNonPHI(), DebugLoc(),
+              TII->get(RISCV::WriteVXRMImm))
+          .addImm(BBInfo.AnticipatedIn.getVXRMImm());
+      Info.setVXRMImm(BBInfo.AnticipatedIn.getVXRMImm());
+    }
+    Info = BBInfo.AnticipatedIn;
+  }
+
+  for (MachineInstr &MI : MBB) {
+    int VXRMIdx = RISCVII::getVXRMOpNum(MI.getDesc());
+    if (VXRMIdx >= 0) {
+      unsigned NewVXRMImm = MI.getOperand(VXRMIdx).getImm();
+
+      if (!Info.isStatic() || Info.getVXRMImm() != NewVXRMImm) {
+        LLVM_DEBUG(dbgs() << "Inserting before "; MI.print(dbgs()));
+        BuildMI(MBB, MI, MI.getDebugLoc(), TII->get(RISCV::WriteVXRMImm))
+            .addImm(NewVXRMImm);
+      }
+
+      MI.addOperand(MachineOperand::CreateReg(RISCV::VXRM, /*IsDef*/ false,
+                                              /*IsImp*/ true));
+      Info.setVXRMImm(NewVXRMImm);
+      continue;
+    }
+
+    if (MI.getOpcode() == RISCV::WriteVXRMImm) {
+      Info.setVXRMImm(MI.getOperand(0).getImm());
+      continue;
+    }
+
+    if (MI.isCall() || MI.isInlineAsm() || MI.modifiesRegister(RISCV::VXRM))
+      Info.setUnknown();
+  }
+
+  // If all our successors anticipate a value, do the insert.
+  // NOTE: It's possible that not all predecessors of our successor provide the
+  // correct value. This can occur on critical edges. If we don't split the
+  // critical edge we'll also have a write vxrm in the succesor that is
+  // redundant with this one.
+  if (BBInfo.AnticipatedOut.isStatic() &&
+      (!Info.isStatic() ||
+       Info.getVXRMImm() != BBInfo.AnticipatedOut.getVXRMImm())) {
+    LLVM_DEBUG(dbgs() << "Inserting at end of " << printMBBReference(MBB)
+                      << " changing to " << BBInfo.AnticipatedOut << "\n");
+    BuildMI(MBB, MBB.getFirstTerminator(), DebugLoc(),
+            TII->get(RISCV::WriteVXRMImm))
+        .addImm(BBInfo.AnticipatedOut.getVXRMImm());
+  }
+}
+
+bool RISCVInsertWriteVXRM::runOnMachineFunction(MachineFunction &MF) {
+  // Skip if the vector extension is not enabled.
+  const RISCVSubtarget &ST = MF.getSubtarget<RISCVSubtarget>();
+  if (!ST.hasVInstructions())
+    return false;
+
+  TII = ST.getInstrInfo();
+
+  assert(BlockInfo.empty() && "Expect empty block infos");
+  BlockInfo.resize(MF.getNumBlockIDs());
+
+  // Phase 1 - collect block information.
+  bool NeedVXRMChange = false;
+  for (const MachineBasicBlock &MBB : MF)
+    NeedVXRMChange |= computeVXRMChanges(MBB);
+
+  if (NeedVXRMChange) {
+    // Phase 2 - Compute available VXRM using a forward walk.
+    for (const MachineBasicBlock &MBB : MF) {
+      WorkList.push(&MBB);
+      BlockInfo[MBB.getNumber()].InQueue = true;
+    }
+    while (!WorkList.empty()) {
+      const MachineBasicBlock &MBB = *WorkList.front();
+      WorkList.pop();
+      computeAvailable(MBB);
+    }
+
+    // Phase 3 - Compute anticipated VXRM using a backwards walk.
+    for (const MachineBasicBlock &MBB : llvm::reverse(MF)) {
+      WorkList.push(&MBB);
+      BlockInfo[MBB.getNumber()].InQueue = true;
+    }
+    while (!WorkList.empty()) {
+      const MachineBasicBlock &MBB = *WorkList.front();
+      WorkList.pop();
+      computeAnticipated(MBB);
+    }
+
+    // Phase 4 - Emit VXRM writes at the earliest place possible.
+    for (MachineBasicBlock &MBB : MF)
+      emitWriteVXRM(MBB);
+  }
+
+  BlockInfo.clear();
+
+  return NeedVXRMChange;
+}
+
+FunctionPass *llvm::createRISCVInsertWriteVXRMPass() {
+  return new RISCVInsertWriteVXRM();
+}
diff --git a/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp b/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
index 953ac097b915044..137d82505d152db 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
@@ -103,6 +103,7 @@ extern "C" LLVM_EXTERNAL_VISIBILITY void LLVMInitializeRISCVTarget() {
   initializeRISCVExpandPseudoPass(*PR);
   initializeRISCVInsertVSETVLIPass(*PR);
   initializeRISCVInsertReadWriteCSRPass(*PR);
+  initializeRISCVInsertWriteVXRMPass(*PR);
   initializeRISCVDAGToDAGISelPass(*PR);
   initializeRISCVInitUndefPass(*PR);
   initializeRISCVMoveMergePass(*PR);
@@ -432,6 +433,7 @@ void RISCVPassConfig::addPreRegAlloc() {
       EnableRISCVDeadRegisterElimination)
     addPass(createRISCVDeadRegisterDefinitionsPass());
   addPass(createRISCVInsertReadWriteCSRPass());
+  addPass(createRISCVInsertWriteVXRMPass());
 }
 
 void RISCVPassConfig::addOptimizedRegAlloc() {
diff --git a/llvm/test/CodeGen/RISCV/O0-pipeline.ll b/llvm/test/CodeGen/RISCV/O0-pipeline.ll
index 1d9af9df2f718f0..e01d2d45263434e 100644
--- a/llvm/test/CodeGen/RISCV/O0-pipeline.ll
+++ b/llvm/test/CodeGen/RISCV/O0-pipeline.ll
@@ -42,6 +42,7 @@
 ; CHECK-NEXT:       RISC-V Pre-RA pseudo instruction expansion pass
 ; CHECK-NEXT:       RISC-V Insert VSETVLI pass
 ; CHECK-NEXT:       RISC-V Insert Read/Write CSR Pass
+; CHECK-NEXT:       RISC-V Insert Write VXRM Pass
 ; CHECK-NEXT:       RISC-V init undef pass
 ; CHECK-NEXT:       Eliminate PHI nodes for register allocation
 ; CHECK-NEXT:       Two-Address instruction pass
diff --git a/llvm/test/CodeGen/RISCV/O3-pipeline.ll b/llvm/test/CodeGen/RISCV/O3-pipeline.ll
index cf0826096bd41f8..f1d56e9e18e2bae 100644
--- a/llvm/test/CodeGen/RISCV/O3-pipeline.ll
+++ b/llvm/test/CodeGen/RISCV/O3-pipeline.ll
@@ -110,6 +110,7 @@
 ; CHECK-NEXT:       RISC-V Insert VSETVLI pass
 ; CHECK-NEXT:       RISC-V Dead register definitions
 ; CHECK-NEXT:       RISC-V Insert Read/Write CSR Pass
+; CHECK-NEXT:       RISC-V Insert Write VXRM Pass
 ; CHECK-NEXT:       Detect Dead Lanes
 ; CHECK-NEXT:       RISC-V init undef pass
 ; CHECK-NEXT:       Process Implicit Definitions
diff --git a/llvm/test/CodeGen/RISCV/rvv/copyprop.mir b/llvm/test/CodeGen/RISCV/rvv/copyprop.mir
index eb4c8bfdd67f9a6..7c9a943bd50b662 100644
--- a/llvm/test/CodeGen/RISCV/rvv/copyprop.mir
+++ b/llvm/test/CodeGen/RISCV/rvv/copyprop.mir
@@ -5,11 +5,11 @@
   define void @foo() {
   ; CHECK-LABEL: foo:
   ; CHECK:       # %bb.0: # %entry
+  ; CHECK-NEXT:    csrwi vxrm, 0
   ; CHECK-NEXT:    vsetivli zero, 1, e64, m1, ta, ma
   ; CHECK-NEXT:    vmsne.vi v0, v8, 0
   ; CHECK-NEXT:    vsll.vi v9, v8, 5
   ; CHECK-NEXT:    vmerge.vim v9, v9, -1, v0
-  ; CHECK-NEXT:    csrwi vxrm, 0
   ; CHECK-NEXT:    vssra.vi v8, v8, 2
   ; CHECK-NEXT:    bgeu a0, zero, .LBB0_3
   ; CHECK-NEXT:  # %bb.1: # %entry
diff --git a/llvm/test/CodeGen/RISCV/rvv/masked-tama.ll b/llvm/test/CodeGen/RISCV/rvv/ma...
[truncated]

rofirrim

LGTM. I admit dataflow isn't always my forte and I had to reach my compiler books for the anticipation part :)

The approach looks sensible to me within limits already acknowledged in the notes and fixmes.

I fail to see a test in vxrm-insert.ll that shows the redundant write case caused by a critical edge. If there isn't one, would it make sense to add it?

rofirrim · 2023-10-27T06:18:38Z

llvm/lib/Target/RISCV/RISCVInsertWriteVXRM.cpp

+    }
+
+    if (NeedInsert) {
+      BuildMI(MBB, MBB.getFirstNonPHI(), DebugLoc(),


You emit a debug message for the other insertions, perhaps it makes sense to do it here too?

rofirrim · 2023-10-27T06:19:31Z

llvm/lib/Target/RISCV/RISCVInsertWriteVXRM.cpp

+  // Insert VXRM write if anticipated and not available.
+  if (BBInfo.AnticipatedIn.isStatic()) {
+    bool NeedInsert = false;
+    // If there no predecessors and the value is anticipated, insert.


Typo: "If there are no…"

topperc · 2023-10-27T06:46:26Z

LGTM. I admit dataflow isn't always my forte and I had to reach my compiler books for the anticipation part :)

The approach looks sensible to me within limits already acknowledged in the notes and fixmes.

Thanks for the review.

I fail to see a test in vxrm-insert.ll that shows the redundant write case caused by a critical edge. If there isn't one, would it make sense to add it?

I will add one.

kito-cheng

One question in my mind is should we keep naive one for O0?

llvm/lib/Target/RISCV/RISCVInsertWriteVXRM.cpp

preames · 2023-11-01T20:10:29Z

llvm/lib/Target/RISCV/RISCVInsertWriteVXRM.cpp

+  // Insert VXRM write if anticipated and not available.
+  if (BBInfo.AnticipatedIn.isStatic()) {
+    bool NeedInsert = false;
+    // If there no predecessors and the value is anticipated, insert.


Why? This is by definition an unreachable block, so why do we need to insert anything?

Ah, this can validly be the function entry block. Maybe check for that explicitly?

preames · 2023-11-01T20:14:44Z

llvm/lib/Target/RISCV/RISCVInsertWriteVXRM.cpp

+      continue;
+    }
+
+    if (MI.getOpcode() == RISCV::WriteVXRMImm) {


This case is missing from computeVXRMChanges. You really want that code to be in sync. :)

I don't expect a WriteVXRMImm to exist before this pass runs, but I'll add it. It's needed here because we can insert a WriteVXRM at the beginning of the block before we start this loop.

llvm/lib/Target/RISCV/RISCVInsertWriteVXRM.cpp

preames · 2023-11-01T20:25:55Z

llvm/lib/Target/RISCV/RISCVInsertWriteVXRM.cpp

+
+  VXRMInfo Info = BBInfo.AvailableIn;
+
+  // Insert VXRM write if anticipated and not available.


The eager insert at beginning of the block causes a bunch of spurious test diffs. The usual problem of "maybe there's something not spurious" lurking in their applies.

I think you could compute the start value, but defer the actual emission to first instruction using vxrm and get a lot fewer diffs in the existing test files.

There might be not be a using instruction in the block since the algorithm calculate the earliest insertion point. So we'd still need to emit at the end in the worst case.

I'll try it.

llvm/lib/Target/RISCV/RISCVInsertWriteVXRM.cpp

preames

LGTM

preames · 2023-11-02T17:55:39Z

llvm/test/CodeGen/RISCV/rvv/vxrm-insert.ll

+; CHECK-NEXT:    csrwi vxrm, 0
+; CHECK-NEXT:    beqz a1, .LBB5_2
+; CHECK-NEXT:  # %bb.1: # %trueblock
+; CHECK-NEXT:    vsetvli zero, a0, e8, mf8, ta, ma


Pure aside, but this test case demonstrates that we could probably benefit from an anticipation based extension to vsetvli insertion, and maybe a more aggressive flattening or tail commoning.

topperc requested review from preames, rofirrim, kito-cheng, eopXD and luke957 October 26, 2023 21:44

llvmbot added the backend:RISC-V label Oct 26, 2023

rofirrim reviewed Oct 27, 2023

View reviewed changes

topperc added 2 commits October 30, 2023 10:53

Address code review comments.

834e97f

Add some critical edge tests.

c2c0b7c

kito-cheng reviewed Oct 31, 2023

View reviewed changes

llvm/lib/Target/RISCV/RISCVInsertWriteVXRM.cpp Outdated Show resolved Hide resolved

Address review comment

1dbc87c

preames reviewed Nov 1, 2023

View reviewed changes

topperc added 6 commits November 1, 2023 15:33

Use early out.

71c9f56

Reword comment

b7af9ca

Add modifesRegister to computeVXRMChanges

8faa430

[RISCV] Add check for WriteVXRMImm to computeVXRMChanges.

da77cf1

Change pred_empty() to isEntryBlock()

4ea78b4

Delay VXRM insertion to the latest point in a block.

b78b4d0

preames approved these changes Nov 2, 2023

View reviewed changes

topperc merged commit 014390d into llvm:main Nov 2, 2023

topperc deleted the pr/vxrm-pre branch November 2, 2023 21:09

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[RISCV] Implement cross basic block VXRM write insertion. #70382

[RISCV] Implement cross basic block VXRM write insertion. #70382

Uh oh!

topperc commented Oct 26, 2023

Uh oh!

llvmbot commented Oct 26, 2023

Uh oh!

rofirrim left a comment

Uh oh!

rofirrim Oct 27, 2023

Uh oh!

rofirrim Oct 27, 2023

Uh oh!

topperc commented Oct 27, 2023

Uh oh!

kito-cheng left a comment

Uh oh!

Uh oh!

Uh oh!

preames Nov 1, 2023

Uh oh!

preames Nov 1, 2023

Uh oh!

topperc Nov 1, 2023

Uh oh!

Uh oh!

preames Nov 1, 2023

Uh oh!

topperc Nov 1, 2023

Uh oh!

Uh oh!

preames left a comment

Uh oh!

preames Nov 2, 2023

Uh oh!

Uh oh!


		VXRMInfo Info = BBInfo.AvailableIn;

		// Insert VXRM write if anticipated and not available.

[RISCV] Implement cross basic block VXRM write insertion. #70382

[RISCV] Implement cross basic block VXRM write insertion. #70382

Uh oh!

Conversation

topperc commented Oct 26, 2023

Uh oh!

llvmbot commented Oct 26, 2023

Uh oh!

rofirrim left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

topperc commented Oct 27, 2023

Uh oh!

kito-cheng left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

preames left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!