Skip to content

[RISCV] Check VL dominates and potentially move in tryReduceVL #106753

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Aug 30, 2024

Conversation

lukel97
Copy link
Contributor

@lukel97 lukel97 commented Aug 30, 2024

Similar to what we do in foldVMV_V_V with the passthru, if we end up changing the Src's VL in tryReduceVL we need to make sure it dominates.

Fixes #106735

Similar to what we do in foldVMV_V_V with the passthru, if we end up changing the Src's VL in tryReduceVL we need to make sure it dominates.

Fixes llvm#106735
@llvmbot
Copy link
Member

llvmbot commented Aug 30, 2024

@llvm/pr-subscribers-backend-risc-v

Author: Luke Lau (lukel97)

Changes

Similar to what we do in foldVMV_V_V with the passthru, if we end up changing the Src's VL in tryReduceVL we need to make sure it dominates.

Fixes #106735


Full diff: https://github.com/llvm/llvm-project/pull/106753.diff

3 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVVectorPeephole.cpp (+26-9)
  • (added) llvm/test/CodeGen/RISCV/rvv/reduce-vl-peephole.ll (+19)
  • (added) llvm/test/CodeGen/RISCV/rvv/reduce-vl-peephole.mir (+15)
diff --git a/llvm/lib/Target/RISCV/RISCVVectorPeephole.cpp b/llvm/lib/Target/RISCV/RISCVVectorPeephole.cpp
index 34e5d9224f7150..4323d7806463f0 100644
--- a/llvm/lib/Target/RISCV/RISCVVectorPeephole.cpp
+++ b/llvm/lib/Target/RISCV/RISCVVectorPeephole.cpp
@@ -70,6 +70,7 @@ class RISCVVectorPeephole : public MachineFunctionPass {
 
   bool isAllOnesMask(const MachineInstr *MaskDef) const;
   std::optional<unsigned> getConstant(const MachineOperand &VL) const;
+  bool ensureDominates(const MachineOperand &Use, MachineInstr &Src) const;
 
   /// Maps uses of V0 to the corresponding def of V0.
   DenseMap<const MachineInstr *, const MachineInstr *> V0Defs;
@@ -165,6 +166,9 @@ bool RISCVVectorPeephole::tryToReduceVL(MachineInstr &MI) const {
   if (VL.isIdenticalTo(SrcVL) || !isVLKnownLE(VL, SrcVL))
     return false;
 
+  if (!ensureDominates(VL, *Src))
+    return false;
+
   if (VL.isImm())
     SrcVL.ChangeToImmediate(VL.getImm());
   else if (VL.isReg())
@@ -456,6 +460,26 @@ static bool dominates(MachineBasicBlock::const_iterator A,
   return &*I == A;
 }
 
+/// If the register in \p MO doesn't dominate \p Src, try to move \p Src so it
+/// does. Returns false if doesn't dominate and we can't move. \p MO must be in
+/// the same basic block as \Src.
+bool RISCVVectorPeephole::ensureDominates(const MachineOperand &MO,
+                                          MachineInstr &Src) const {
+  assert(MO.getParent()->getParent() == Src.getParent());
+  if (!MO.isReg() || MO.getReg() == RISCV::NoRegister)
+    return true;
+
+  MachineInstr *Def = MRI->getVRegDef(MO.getReg());
+  if (Def->getParent() == Src.getParent() && !dominates(Def, Src)) {
+    if (!isSafeToMove(Src, *Def->getNextNode()))
+      return false;
+    // FIXME: Update V0Defs
+    Src.moveBefore(Def->getNextNode());
+  }
+
+  return true;
+}
+
 /// If a PseudoVMV_V_V is the only user of its input, fold its passthru and VL
 /// into it.
 ///
@@ -501,15 +525,8 @@ bool RISCVVectorPeephole::foldVMV_V_V(MachineInstr &MI) {
     return false;
 
   // If the new passthru doesn't dominate Src, try to move Src so it does.
-  if (Passthru.getReg() != RISCV::NoRegister) {
-    MachineInstr *PassthruDef = MRI->getVRegDef(Passthru.getReg());
-    if (PassthruDef->getParent() == Src->getParent() &&
-        !dominates(PassthruDef, Src)) {
-      if (!isSafeToMove(*Src, *PassthruDef->getNextNode()))
-        return false;
-      Src->moveBefore(PassthruDef->getNextNode());
-    }
-  }
+  if (!ensureDominates(Passthru, *Src))
+    return false;
 
   if (SrcPassthru.getReg() != Passthru.getReg()) {
     SrcPassthru.setReg(Passthru.getReg());
diff --git a/llvm/test/CodeGen/RISCV/rvv/reduce-vl-peephole.ll b/llvm/test/CodeGen/RISCV/rvv/reduce-vl-peephole.ll
new file mode 100644
index 00000000000000..7f70b0ed224ec0
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/rvv/reduce-vl-peephole.ll
@@ -0,0 +1,19 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc < %s -mtriple=riscv64 -mattr=+v -verify-machineinstrs | FileCheck %s
+
+define void @avl_not_dominated(<vscale x 2 x i32> %v, ptr %p) {
+; CHECK-LABEL: avl_not_dominated:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetivli zero, 1, e32, m1, ta, ma
+; CHECK-NEXT:    vmv.x.s a1, v8
+; CHECK-NEXT:    slli a1, a1, 32
+; CHECK-NEXT:    srli a1, a1, 32
+; CHECK-NEXT:    vsetvli zero, a1, e32, m1, ta, ma
+; CHECK-NEXT:    vadd.vi v8, v8, 1
+; CHECK-NEXT:    vse32.v v8, (a0)
+; CHECK-NEXT:    ret
+  %w = add <vscale x 2 x i32> %v, splat (i32 1)
+  %evl = extractelement <vscale x 2 x i32> %v, i32 0
+  call void @llvm.vp.store(<vscale x 2 x i32> %w, ptr %p, <vscale x 2 x i1> splat(i1 true), i32 %evl)
+  ret void
+}
diff --git a/llvm/test/CodeGen/RISCV/rvv/reduce-vl-peephole.mir b/llvm/test/CodeGen/RISCV/rvv/reduce-vl-peephole.mir
new file mode 100644
index 00000000000000..5a223580821b75
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/rvv/reduce-vl-peephole.mir
@@ -0,0 +1,15 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc %s -o - -mtriple=riscv64 -mattr=+v -run-pass=riscv-vector-peephole \
+# RUN:     -verify-machineinstrs | FileCheck %s
+---
+name: avl_not_dominated
+body: |
+  bb.0:
+    ; CHECK-LABEL: name: avl_not_dominated
+    ; CHECK: %evl:gprnox0 = ADDI $x0, 1
+    ; CHECK-NEXT: %x:vr = PseudoVADD_VV_M1 $noreg, $noreg, $noreg, %evl, 5 /* e32 */, 0 /* tu, mu */
+    ; CHECK-NEXT: PseudoVSE32_V_M1 %x, $noreg, %evl, 5 /* e32 */
+    %x:vr = PseudoVADD_VV_M1 $noreg, $noreg, $noreg, -1, 5 /* e32 */, 0 /* tu, mu */
+    %evl:gprnox0 = ADDI $x0, 1
+    PseudoVSE32_V_M1 %x:vr, $noreg, %evl, 5 /* e32 */
+...

Copy link
Collaborator

@preames preames left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

This was really my bug, so thanks for jumping on it before I did.

if (Def->getParent() == Src.getParent() && !dominates(Def, Src)) {
if (!isSafeToMove(Src, *Def->getNextNode()))
return false;
// FIXME: Update V0Defs
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noticing the FIXME. I assume this is something you noticed when inspecting the code? How can a VL (scalar) be a VO (vector) register def? And if it's safe to move, doesn't that guarantee the reaching VO is the same?

Copy link
Contributor Author

@lukel97 lukel97 Aug 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

V0Defs stores the current definition of V0 at each instruction, so this is about updating the current value at Src if we move it.

isSafeToMove checks that any physical register uses aren't clobbered, so if Src doesn't use V0 then it might be moved past a new V0 def.

So masked pseudos won't get moved currently and I don't think we miscompile anything, but we should probably do our bookkeeping here anyway.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other option is to only add instructions which have a mask use to the mapping, then moving source wouldn't need to update the mapping. Would also keep the mapping (much) smaller.

@lukel97
Copy link
Contributor Author

lukel97 commented Aug 30, 2024

It slipped past my review too so I think I'm also partly to blame!

@lukel97 lukel97 merged commit 0efa386 into llvm:main Aug 30, 2024
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Error in RISC-V Vector Peephole Optimization (riscv-vector-peephole) pass
3 participants