[RISCV][InsertVSETVLI] Remove redundant vsetvli by coalescing blocks bottom-up #141298

Merged 4 commits on May 27, 2025

8 changes: 6 additions & 2 deletions llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
@@ -26,6 +26,7 @@

#include "RISCV.h"
#include "RISCVSubtarget.h"
+#include "llvm/ADT/PostOrderIterator.h"
#include "llvm/ADT/Statistic.h"
#include "llvm/CodeGen/LiveDebugVariables.h"
#include "llvm/CodeGen/LiveIntervals.h"
@@ -1840,8 +1841,11 @@ bool RISCVInsertVSETVLI::runOnMachineFunction(MachineFunction &MF) {
// any cross block analysis within the dataflow. We can't have both
// demanded fields based mutation and non-local analysis in the
// dataflow at the same time without introducing inconsistencies.
-  for (MachineBasicBlock &MBB : MF)
Contributor

Does traversing the blocks in post-order also fix this? E.g. can we do

for (MachineBasicBlock *MBB : post_order(&MF))
  coalesceVSETVLIs(*MBB);

Member Author

Good point, it indeed solves the problem. The patch is updated now.

-    coalesceVSETVLIs(MBB);
+  // We're visiting blocks from the bottom up because a VSETVLI in the
+  // earlier block might become dead when its uses in later blocks are
+  // optimized away.
+  for (MachineBasicBlock *MBB : post_order(&MF))
+    coalesceVSETVLIs(*MBB);

// Insert PseudoReadVL after VLEFF/VLSEGFF and replace it with the vl output
// of VLEFF/VLSEGFF.
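
The ordering argument in the new comment can be illustrated with a small standalone sketch. It does not use LLVM: `CFG`, `postOrder`, and `coalesceTestCFG` are hypothetical names introduced only for this illustration, and the successor lists are transcribed from the `coalesce` MIR test added in this PR. The sketch only shows that a depth-first post-order walk reaches bb.3 before bb.1, which is the property the patch appears to rely on.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy CFG (an assumption for illustration, not an LLVM type):
// succs[i] lists the successor block numbers of block i.
using CFG = std::vector<std::vector<int>>;

// Recursive DFS that appends a block only after all of its not-yet-visited
// successors have been appended, i.e. a post-order walk.
static void visit(const CFG &succs, int bb, std::vector<bool> &seen,
                  std::vector<int> &order) {
  seen[bb] = true;
  for (int s : succs[bb])
    if (!seen[s])
      visit(succs, s, seen, order);
  order.push_back(bb);
}

// Returns the blocks reachable from `entry` in post-order.
std::vector<int> postOrder(const CFG &succs, int entry) {
  assert(entry >= 0 && static_cast<std::size_t>(entry) < succs.size());
  std::vector<bool> seen(succs.size(), false);
  std::vector<int> order;
  visit(succs, entry, seen, order);
  return order;
}

// CFG of the `coalesce` MIR test:
// bb.0 -> bb.1 -> bb.2 -> {bb.3, bb.2}, bb.3 -> {bb.1, bb.4}.
CFG coalesceTestCFG() { return {{1}, {2}, {3, 2}, {1, 4}, {}}; }
```

Under these assumptions, `postOrder(coalesceTestCFG(), 0)` yields blocks 4, 3, 2, 1, 0, so the `vsetvli` users in bb.3 are handled before the defining block bb.1 is examined. LLVM's `post_order(&MF)` is an iterator range rather than a precomputed vector, so treat this as an illustration of the visiting order only.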
69 changes: 69 additions & 0 deletions llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-coalesce.mir
@@ -0,0 +1,69 @@
# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
# RUN: llc -mtriple=riscv64 -mattr=+v -run-pass=liveintervals,riscv-insert-vsetvli %s -o - | FileCheck %s

---
name: coalesce
tracksRegLiveness: true
noPhis: true
body: |
  ; CHECK-LABEL: name: coalesce
  ; CHECK: bb.0:
  ; CHECK-NEXT: successors: %bb.1(0x80000000)
  ; CHECK-NEXT: {{ $}}
  ; CHECK-NEXT: [[DEF:%[0-9]+]]:gprnox0 = IMPLICIT_DEF
  ; CHECK-NEXT: {{ $}}
  ; CHECK-NEXT: bb.1:
  ; CHECK-NEXT: successors: %bb.2(0x80000000)
  ; CHECK-NEXT: {{ $}}
  ; CHECK-NEXT: dead [[PseudoVSETVLIX0_:%[0-9]+]]:gpr = PseudoVSETVLIX0 killed $x0, 209 /* e32, m2, ta, ma */, implicit-def $vl, implicit-def $vtype
  ; CHECK-NEXT: renamable $v10m2 = PseudoVMV_V_I_M2 undef renamable $v10m2, 0, -1, 5 /* e32 */, 0 /* tu, mu */, implicit $vl, implicit $vtype
  ; CHECK-NEXT: {{ $}}
  ; CHECK-NEXT: bb.2:
  ; CHECK-NEXT: successors: %bb.3(0x04000000), %bb.2(0x7c000000)
  ; CHECK-NEXT: liveins: $v10m2, $v12m2
  ; CHECK-NEXT: {{ $}}
  ; CHECK-NEXT: BEQ undef %2:gpr, $x0, %bb.2
  ; CHECK-NEXT: PseudoBR %bb.3
  ; CHECK-NEXT: {{ $}}
  ; CHECK-NEXT: bb.3:
  ; CHECK-NEXT: successors: %bb.1(0x7c000000), %bb.4(0x04000000)
  ; CHECK-NEXT: liveins: $v8m2
  ; CHECK-NEXT: {{ $}}
  ; CHECK-NEXT: $x0 = PseudoVSETVLI [[DEF]], 209 /* e32, m2, ta, ma */, implicit-def $vl, implicit-def $vtype
  ; CHECK-NEXT: renamable $v10 = PseudoVMV_S_X undef renamable $v10, undef %2:gpr, $noreg, 5 /* e32 */, implicit $vl, implicit $vtype
  ; CHECK-NEXT: dead renamable $v8 = PseudoVREDSUM_VS_M2_E32 undef renamable $v8, killed undef renamable $v8m2, killed undef renamable $v10, $noreg, 5 /* e32 */, 0 /* tu, mu */, implicit $vl, implicit $vtype
  ; CHECK-NEXT: BNE undef %3:gpr, $x0, %bb.1
  ; CHECK-NEXT: PseudoBR %bb.4
  ; CHECK-NEXT: {{ $}}
  ; CHECK-NEXT: bb.4:
  ; CHECK-NEXT: PseudoRET
  bb.0:
    successors: %bb.1(0x80000000)

    %78:gprnox0 = IMPLICIT_DEF

  bb.1:
    successors: %bb.2(0x80000000)

    %46:gprnox0 = PseudoVSETVLI %78, 199 /* e8, mf2, ta, ma */, implicit-def dead $vl, implicit-def dead $vtype
    renamable $v10m2 = PseudoVMV_V_I_M2 undef renamable $v10m2, 0, -1, 5 /* e32 */, 0 /* tu, mu */

  bb.2:
    successors: %bb.3(0x04000000), %bb.2(0x7c000000)
    liveins: $v10m2, $v12m2

    BEQ undef %54:gpr, $x0, %bb.2
    PseudoBR %bb.3

  bb.3:
    successors: %bb.1(0x7c000000), %bb.4(0x04000000)
    liveins: $v8m2

    renamable $v10 = PseudoVMV_S_X undef renamable $v10, undef %54:gpr, %46, 5 /* e32 */
    dead renamable $v8 = PseudoVREDSUM_VS_M2_E32 undef renamable $v8, killed undef renamable $v8m2, killed undef renamable $v10, %46, 5 /* e32 */, 0 /* tu, mu */
    BNE undef %29:gpr, $x0, %bb.1
    PseudoBR %bb.4

  bb.4:
    PseudoRET
...