Skip to content

[RISCV] Run DeadMachineInstructionElim after regalloc #90598

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

lukel97
Copy link
Contributor

@lukel97 lukel97 commented Apr 30, 2024

In a vector pseudo, an AVL immediate > 31 will be materialized with an ADDI. In RISCVInsertVSETVLI we may not end up using the ADDI and it will be dead, and currently we have code in there to manually clean it up if that's the case.

This becomes more complicated and involves fixups when we start running RISCVInsertVSETVLI post regalloc in #70549, so this patch delegates that work by adding a pass of DeadMachineInstructionElim.

In a vector pseudo, an AVL immediate > 31 will be materialized with an ADDI. In RISCVInsertVSETVLI we may not end up using the ADDI and it will be dead, and currently we have code in there to manually clean it up if that's the case.

This becomes more complicated and involves fixups when we start running RISCVInsertVSETVLI post regalloc in llvm#70549, so this patch delegates that work by adding a pass of DeadMachineInstructionElim.
@llvmbot
Copy link
Member

llvmbot commented Apr 30, 2024

@llvm/pr-subscribers-backend-risc-v

Author: Luke Lau (lukel97)

Changes

In a vector pseudo, an AVL immediate > 31 will be materialized with an ADDI. In RISCVInsertVSETVLI we may not end up using the ADDI and it will be dead, and currently we have code in there to manually clean it up if that's the case.

This becomes more complicated and involves fixups when we start running RISCVInsertVSETVLI post regalloc in #70549, so this patch delegates that work by adding a pass of DeadMachineInstructionElim.


Full diff: https://github.com/llvm/llvm-project/pull/90598.diff

7 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp (+1-20)
  • (modified) llvm/lib/Target/RISCV/RISCVTargetMachine.cpp (+2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/rv32-spill-vector-csr.ll (+6-7)
  • (modified) llvm/test/CodeGen/RISCV/rvv/rv32-spill-zvlsseg.ll (-11)
  • (modified) llvm/test/CodeGen/RISCV/rvv/rv64-spill-vector-csr.ll (+6-7)
  • (modified) llvm/test/CodeGen/RISCV/rvv/rv64-spill-zvlsseg.ll (-11)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.ll (-2)
diff --git a/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp b/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
index b27e1dd258ebd0..0edbe81cd02c4f 100644
--- a/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
@@ -1402,20 +1402,9 @@ void RISCVInsertVSETVLI::emitVSETVLIs(MachineBasicBlock &MBB) {
       if (RISCVII::hasVLOp(TSFlags)) {
         MachineOperand &VLOp = MI.getOperand(getVLOpNum(MI));
         if (VLOp.isReg()) {
-          Register Reg = VLOp.getReg();
-          MachineInstr *VLOpDef = MRI->getVRegDef(Reg);
-
           // Erase the AVL operand from the instruction.
           VLOp.setReg(RISCV::NoRegister);
           VLOp.setIsKill(false);
-
-          // If the AVL was an immediate > 31, then it would have been emitted
-          // as an ADDI. However, the ADDI might not have been used in the
-          // vsetvli, or a vsetvli might not have been emitted, so it may be
-          // dead now.
-          if (VLOpDef && TII->isAddImmediate(*VLOpDef, Reg) &&
-              MRI->use_nodbg_empty(Reg))
-            VLOpDef->eraseFromParent();
         }
         MI.addOperand(MachineOperand::CreateReg(RISCV::VL, /*isDef*/ false,
                                                 /*isImp*/ true));
@@ -1670,17 +1659,9 @@ bool RISCVCoalesceVSETVLI::coalesceVSETVLIs(MachineBasicBlock &MBB) {
           if (NextMI->getOperand(1).isReg())
             NextMI->getOperand(1).setReg(RISCV::NoRegister);
 
-          if (OldVLReg && OldVLReg.isVirtual()) {
+          if (OldVLReg && OldVLReg.isVirtual())
             // NextMI no longer uses OldVLReg so shrink its LiveInterval.
             LIS->shrinkToUses(&LIS->getInterval(OldVLReg));
-
-            MachineInstr *VLOpDef = MRI->getUniqueVRegDef(OldVLReg);
-            if (VLOpDef && TII->isAddImmediate(*VLOpDef, OldVLReg) &&
-                MRI->use_nodbg_empty(OldVLReg)) {
-              VLOpDef->eraseFromParent();
-              LIS->removeInterval(OldVLReg);
-            }
-          }
           MI.setDesc(NextMI->getDesc());
         }
         MI.getOperand(2).setImm(NextMI->getOperand(2).getImm());
diff --git a/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp b/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
index 0876f46728a10c..80e8b2ae2f7f27 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
@@ -553,6 +553,8 @@ void RISCVPassConfig::addPostRegAlloc() {
   if (TM->getOptLevel() != CodeGenOptLevel::None &&
       EnableRedundantCopyElimination)
     addPass(createRISCVRedundantCopyEliminationPass());
+  // RISCVInsertVSETVLI may leave some defs of AVLs dead, so remove them.
+  addPass(&DeadMachineInstructionElimID);
 }
 
 yaml::MachineFunctionInfo *
diff --git a/llvm/test/CodeGen/RISCV/rvv/rv32-spill-vector-csr.ll b/llvm/test/CodeGen/RISCV/rvv/rv32-spill-vector-csr.ll
index 129fbcfb88327b..9fca5f21746725 100644
--- a/llvm/test/CodeGen/RISCV/rvv/rv32-spill-vector-csr.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/rv32-spill-vector-csr.ll
@@ -29,13 +29,12 @@ define <vscale x 1 x double> @foo(<vscale x 1 x double> %a, <vscale x 1 x double
 ; SPILL-O0-NEXT:    lui a0, %hi(.L.str)
 ; SPILL-O0-NEXT:    addi a0, a0, %lo(.L.str)
 ; SPILL-O0-NEXT:    call puts
-; SPILL-O0-NEXT:    addi a1, sp, 16
-; SPILL-O0-NEXT:    vl1r.v v10, (a1) # Unknown-size Folded Reload
-; SPILL-O0-NEXT:    csrr a1, vlenb
-; SPILL-O0-NEXT:    add a1, sp, a1
-; SPILL-O0-NEXT:    addi a1, a1, 16
-; SPILL-O0-NEXT:    vl1r.v v9, (a1) # Unknown-size Folded Reload
-; SPILL-O0-NEXT:    # kill: def $x11 killed $x10
+; SPILL-O0-NEXT:    addi a0, sp, 16
+; SPILL-O0-NEXT:    vl1r.v v10, (a0) # Unknown-size Folded Reload
+; SPILL-O0-NEXT:    csrr a0, vlenb
+; SPILL-O0-NEXT:    add a0, sp, a0
+; SPILL-O0-NEXT:    addi a0, a0, 16
+; SPILL-O0-NEXT:    vl1r.v v9, (a0) # Unknown-size Folded Reload
 ; SPILL-O0-NEXT:    lw a0, 8(sp) # 4-byte Folded Reload
 ; SPILL-O0-NEXT:    vsetvli zero, a0, e64, m1, ta, ma
 ; SPILL-O0-NEXT:    # implicit-def: $v8
diff --git a/llvm/test/CodeGen/RISCV/rvv/rv32-spill-zvlsseg.ll b/llvm/test/CodeGen/RISCV/rvv/rv32-spill-zvlsseg.ll
index 407c782d3377a8..09c3591d5ad04d 100644
--- a/llvm/test/CodeGen/RISCV/rvv/rv32-spill-zvlsseg.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/rv32-spill-zvlsseg.ll
@@ -14,9 +14,7 @@ define <vscale x 1 x i32> @spill_zvlsseg_nxv1i32(ptr %base, i32 %vl) nounwind {
 ; SPILL-O0-NEXT:    slli a2, a2, 1
 ; SPILL-O0-NEXT:    sub sp, sp, a2
 ; SPILL-O0-NEXT:    # implicit-def: $v8
-; SPILL-O0-NEXT:    # implicit-def: $v9
 ; SPILL-O0-NEXT:    # implicit-def: $v10
-; SPILL-O0-NEXT:    # implicit-def: $v9
 ; SPILL-O0-NEXT:    # kill: def $v8 killed $v8 def $v8_v9
 ; SPILL-O0-NEXT:    vmv1r.v v9, v10
 ; SPILL-O0-NEXT:    vsetvli zero, a1, e32, mf2, ta, ma
@@ -96,9 +94,7 @@ define <vscale x 2 x i32> @spill_zvlsseg_nxv2i32(ptr %base, i32 %vl) nounwind {
 ; SPILL-O0-NEXT:    slli a2, a2, 1
 ; SPILL-O0-NEXT:    sub sp, sp, a2
 ; SPILL-O0-NEXT:    # implicit-def: $v8
-; SPILL-O0-NEXT:    # implicit-def: $v9
 ; SPILL-O0-NEXT:    # implicit-def: $v10
-; SPILL-O0-NEXT:    # implicit-def: $v9
 ; SPILL-O0-NEXT:    # kill: def $v8 killed $v8 def $v8_v9
 ; SPILL-O0-NEXT:    vmv1r.v v9, v10
 ; SPILL-O0-NEXT:    vsetvli zero, a1, e32, m1, ta, ma
@@ -178,9 +174,7 @@ define <vscale x 4 x i32> @spill_zvlsseg_nxv4i32(ptr %base, i32 %vl) nounwind {
 ; SPILL-O0-NEXT:    slli a2, a2, 1
 ; SPILL-O0-NEXT:    sub sp, sp, a2
 ; SPILL-O0-NEXT:    # implicit-def: $v8m2
-; SPILL-O0-NEXT:    # implicit-def: $v10m2
 ; SPILL-O0-NEXT:    # implicit-def: $v12m2
-; SPILL-O0-NEXT:    # implicit-def: $v10m2
 ; SPILL-O0-NEXT:    # kill: def $v8m2 killed $v8m2 def $v8m2_v10m2
 ; SPILL-O0-NEXT:    vmv2r.v v10, v12
 ; SPILL-O0-NEXT:    vsetvli zero, a1, e32, m2, ta, ma
@@ -263,9 +257,7 @@ define <vscale x 8 x i32> @spill_zvlsseg_nxv8i32(ptr %base, i32 %vl) nounwind {
 ; SPILL-O0-NEXT:    slli a2, a2, 2
 ; SPILL-O0-NEXT:    sub sp, sp, a2
 ; SPILL-O0-NEXT:    # implicit-def: $v8m4
-; SPILL-O0-NEXT:    # implicit-def: $v12m4
 ; SPILL-O0-NEXT:    # implicit-def: $v16m4
-; SPILL-O0-NEXT:    # implicit-def: $v12m4
 ; SPILL-O0-NEXT:    # kill: def $v8m4 killed $v8m4 def $v8m4_v12m4
 ; SPILL-O0-NEXT:    vmv4r.v v12, v16
 ; SPILL-O0-NEXT:    vsetvli zero, a1, e32, m4, ta, ma
@@ -348,11 +340,8 @@ define <vscale x 4 x i32> @spill_zvlsseg3_nxv4i32(ptr %base, i32 %vl) nounwind {
 ; SPILL-O0-NEXT:    slli a2, a2, 1
 ; SPILL-O0-NEXT:    sub sp, sp, a2
 ; SPILL-O0-NEXT:    # implicit-def: $v8m2
-; SPILL-O0-NEXT:    # implicit-def: $v10m2
 ; SPILL-O0-NEXT:    # implicit-def: $v16m2
-; SPILL-O0-NEXT:    # implicit-def: $v10m2
 ; SPILL-O0-NEXT:    # implicit-def: $v14m2
-; SPILL-O0-NEXT:    # implicit-def: $v10m2
 ; SPILL-O0-NEXT:    # kill: def $v8m2 killed $v8m2 def $v8m2_v10m2_v12m2
 ; SPILL-O0-NEXT:    vmv2r.v v10, v16
 ; SPILL-O0-NEXT:    vmv2r.v v12, v14
diff --git a/llvm/test/CodeGen/RISCV/rvv/rv64-spill-vector-csr.ll b/llvm/test/CodeGen/RISCV/rvv/rv64-spill-vector-csr.ll
index 34eb58ee4d1c4b..c24baf5ea7d4e4 100644
--- a/llvm/test/CodeGen/RISCV/rvv/rv64-spill-vector-csr.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/rv64-spill-vector-csr.ll
@@ -32,13 +32,12 @@ define <vscale x 1 x double> @foo(<vscale x 1 x double> %a, <vscale x 1 x double
 ; SPILL-O0-NEXT:    lui a0, %hi(.L.str)
 ; SPILL-O0-NEXT:    addi a0, a0, %lo(.L.str)
 ; SPILL-O0-NEXT:    call puts
-; SPILL-O0-NEXT:    addi a1, sp, 32
-; SPILL-O0-NEXT:    vl1r.v v10, (a1) # Unknown-size Folded Reload
-; SPILL-O0-NEXT:    csrr a1, vlenb
-; SPILL-O0-NEXT:    add a1, sp, a1
-; SPILL-O0-NEXT:    addi a1, a1, 32
-; SPILL-O0-NEXT:    vl1r.v v9, (a1) # Unknown-size Folded Reload
-; SPILL-O0-NEXT:    # kill: def $x11 killed $x10
+; SPILL-O0-NEXT:    addi a0, sp, 32
+; SPILL-O0-NEXT:    vl1r.v v10, (a0) # Unknown-size Folded Reload
+; SPILL-O0-NEXT:    csrr a0, vlenb
+; SPILL-O0-NEXT:    add a0, sp, a0
+; SPILL-O0-NEXT:    addi a0, a0, 32
+; SPILL-O0-NEXT:    vl1r.v v9, (a0) # Unknown-size Folded Reload
 ; SPILL-O0-NEXT:    ld a0, 16(sp) # 8-byte Folded Reload
 ; SPILL-O0-NEXT:    vsetvli zero, a0, e64, m1, ta, ma
 ; SPILL-O0-NEXT:    # implicit-def: $v8
diff --git a/llvm/test/CodeGen/RISCV/rvv/rv64-spill-zvlsseg.ll b/llvm/test/CodeGen/RISCV/rvv/rv64-spill-zvlsseg.ll
index 1c1544b4efa0b8..1dcf6479a2a603 100644
--- a/llvm/test/CodeGen/RISCV/rvv/rv64-spill-zvlsseg.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/rv64-spill-zvlsseg.ll
@@ -14,9 +14,7 @@ define <vscale x 1 x i32> @spill_zvlsseg_nxv1i32(ptr %base, i64 %vl) nounwind {
 ; SPILL-O0-NEXT:    slli a2, a2, 1
 ; SPILL-O0-NEXT:    sub sp, sp, a2
 ; SPILL-O0-NEXT:    # implicit-def: $v8
-; SPILL-O0-NEXT:    # implicit-def: $v9
 ; SPILL-O0-NEXT:    # implicit-def: $v10
-; SPILL-O0-NEXT:    # implicit-def: $v9
 ; SPILL-O0-NEXT:    # kill: def $v8 killed $v8 def $v8_v9
 ; SPILL-O0-NEXT:    vmv1r.v v9, v10
 ; SPILL-O0-NEXT:    vsetvli zero, a1, e32, mf2, ta, ma
@@ -96,9 +94,7 @@ define <vscale x 2 x i32> @spill_zvlsseg_nxv2i32(ptr %base, i64 %vl) nounwind {
 ; SPILL-O0-NEXT:    slli a2, a2, 1
 ; SPILL-O0-NEXT:    sub sp, sp, a2
 ; SPILL-O0-NEXT:    # implicit-def: $v8
-; SPILL-O0-NEXT:    # implicit-def: $v9
 ; SPILL-O0-NEXT:    # implicit-def: $v10
-; SPILL-O0-NEXT:    # implicit-def: $v9
 ; SPILL-O0-NEXT:    # kill: def $v8 killed $v8 def $v8_v9
 ; SPILL-O0-NEXT:    vmv1r.v v9, v10
 ; SPILL-O0-NEXT:    vsetvli zero, a1, e32, m1, ta, ma
@@ -178,9 +174,7 @@ define <vscale x 4 x i32> @spill_zvlsseg_nxv4i32(ptr %base, i64 %vl) nounwind {
 ; SPILL-O0-NEXT:    slli a2, a2, 1
 ; SPILL-O0-NEXT:    sub sp, sp, a2
 ; SPILL-O0-NEXT:    # implicit-def: $v8m2
-; SPILL-O0-NEXT:    # implicit-def: $v10m2
 ; SPILL-O0-NEXT:    # implicit-def: $v12m2
-; SPILL-O0-NEXT:    # implicit-def: $v10m2
 ; SPILL-O0-NEXT:    # kill: def $v8m2 killed $v8m2 def $v8m2_v10m2
 ; SPILL-O0-NEXT:    vmv2r.v v10, v12
 ; SPILL-O0-NEXT:    vsetvli zero, a1, e32, m2, ta, ma
@@ -263,9 +257,7 @@ define <vscale x 8 x i32> @spill_zvlsseg_nxv8i32(ptr %base, i64 %vl) nounwind {
 ; SPILL-O0-NEXT:    slli a2, a2, 2
 ; SPILL-O0-NEXT:    sub sp, sp, a2
 ; SPILL-O0-NEXT:    # implicit-def: $v8m4
-; SPILL-O0-NEXT:    # implicit-def: $v12m4
 ; SPILL-O0-NEXT:    # implicit-def: $v16m4
-; SPILL-O0-NEXT:    # implicit-def: $v12m4
 ; SPILL-O0-NEXT:    # kill: def $v8m4 killed $v8m4 def $v8m4_v12m4
 ; SPILL-O0-NEXT:    vmv4r.v v12, v16
 ; SPILL-O0-NEXT:    vsetvli zero, a1, e32, m4, ta, ma
@@ -348,11 +340,8 @@ define <vscale x 4 x i32> @spill_zvlsseg3_nxv4i32(ptr %base, i64 %vl) nounwind {
 ; SPILL-O0-NEXT:    slli a2, a2, 1
 ; SPILL-O0-NEXT:    sub sp, sp, a2
 ; SPILL-O0-NEXT:    # implicit-def: $v8m2
-; SPILL-O0-NEXT:    # implicit-def: $v10m2
 ; SPILL-O0-NEXT:    # implicit-def: $v16m2
-; SPILL-O0-NEXT:    # implicit-def: $v10m2
 ; SPILL-O0-NEXT:    # implicit-def: $v14m2
-; SPILL-O0-NEXT:    # implicit-def: $v10m2
 ; SPILL-O0-NEXT:    # kill: def $v8m2 killed $v8m2 def $v8m2_v10m2_v12m2
 ; SPILL-O0-NEXT:    vmv2r.v v10, v16
 ; SPILL-O0-NEXT:    vmv2r.v v12, v14
diff --git a/llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.ll b/llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.ll
index 4ff2fc7a5fff5d..1aa06b8020f2a7 100644
--- a/llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.ll
@@ -91,13 +91,11 @@ define <vscale x 1 x double> @test3(i64 %avl, i8 zeroext %cond, <vscale x 1 x do
 ; CHECK-NEXT:    vsetvli a0, a0, e64, m1, ta, ma
 ; CHECK-NEXT:    vfadd.vv v9, v8, v9
 ; CHECK-NEXT:    vfmul.vv v8, v9, v8
-; CHECK-NEXT:    # implicit-def: $x10
 ; CHECK-NEXT:    ret
 ; CHECK-NEXT:  .LBB2_2: # %if.else
 ; CHECK-NEXT:    vsetvli a0, a0, e64, m1, ta, ma
 ; CHECK-NEXT:    vfsub.vv v9, v8, v9
 ; CHECK-NEXT:    vfmul.vv v8, v9, v8
-; CHECK-NEXT:    # implicit-def: $x10
 ; CHECK-NEXT:    ret
 entry:
   %tobool = icmp eq i8 %cond, 0

@@ -553,6 +553,8 @@ void RISCVPassConfig::addPostRegAlloc() {
if (TM->getOptLevel() != CodeGenOptLevel::None &&
EnableRedundantCopyElimination)
addPass(createRISCVRedundantCopyEliminationPass());
// RISCVInsertVSETVLI may leave some defs of AVLs dead, so remove them.
addPass(&DeadMachineInstructionElimID);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update O3-pipeline.ll

@topperc
Copy link
Collaborator

topperc commented Apr 30, 2024

Are there any post-RA or even post-SSA uses of this pass on any other target? All I found were uses still in SSA form.

@lukel97
Copy link
Contributor Author

lukel97 commented Apr 30, 2024

Are there any post-RA or even post-SSA uses of this pass on any other target? All I found were uses still in SSA form.

No, it seemed ok from reading the pass implementation. But I actually think we can use LiveIntervals to check for dead immediates #90629. I'd prefer this approach, so closing this for now.

@lukel97 lukel97 closed this Apr 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants