[RISCV] Add TuneDisableLatencySchedHeuristic #115858
Conversation
Created using spr 1.3.6-beta.1 [skip ci]
Created using spr 1.3.6-beta.1
@llvm/pr-subscribers-llvm-globalisel @llvm/pr-subscribers-backend-risc-v

Author: Pengcheng Wang (wangpc-pp)

Changes: This helps reduce register pressure in some cases.

Patch is 7.18 MiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/115858.diff

465 Files Affected:
diff --git a/llvm/lib/Target/RISCV/RISCVSubtarget.cpp b/llvm/lib/Target/RISCV/RISCVSubtarget.cpp
index 3eae2b9774203f..ac81d8980fd3e0 100644
--- a/llvm/lib/Target/RISCV/RISCVSubtarget.cpp
+++ b/llvm/lib/Target/RISCV/RISCVSubtarget.cpp
@@ -208,6 +208,13 @@ void RISCVSubtarget::overrideSchedPolicy(MachineSchedPolicy &Policy,
Policy.OnlyTopDown = false;
Policy.OnlyBottomUp = false;
+ // Enabling or disabling the latency heuristic is a close call: it seems to
+ // help nearly no benchmark on out-of-order architectures; on the other hand,
+ // it regresses register pressure on a few benchmarks.
+ // FIXME: This is from AArch64, but we haven't evaluated it on RISC-V.
+ // TODO: We may disable it for out-of-order architectures only.
+ Policy.DisableLatencyHeuristic = true;
+
// Spilling is generally expensive on all RISC-V cores, so always enable
// register-pressure tracking. This will increase compile time.
Policy.ShouldTrackPressure = true;
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/alu-roundtrip.ll b/llvm/test/CodeGen/RISCV/GlobalISel/alu-roundtrip.ll
index ee414992a5245c..330f8b16065f13 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/alu-roundtrip.ll
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/alu-roundtrip.ll
@@ -25,8 +25,8 @@ define i32 @add_i8_signext_i32(i8 %a, i8 %b) {
; RV32IM-LABEL: add_i8_signext_i32:
; RV32IM: # %bb.0: # %entry
; RV32IM-NEXT: slli a0, a0, 24
-; RV32IM-NEXT: slli a1, a1, 24
; RV32IM-NEXT: srai a0, a0, 24
+; RV32IM-NEXT: slli a1, a1, 24
; RV32IM-NEXT: srai a1, a1, 24
; RV32IM-NEXT: add a0, a0, a1
; RV32IM-NEXT: ret
@@ -34,8 +34,8 @@ define i32 @add_i8_signext_i32(i8 %a, i8 %b) {
; RV64IM-LABEL: add_i8_signext_i32:
; RV64IM: # %bb.0: # %entry
; RV64IM-NEXT: slli a0, a0, 56
-; RV64IM-NEXT: slli a1, a1, 56
; RV64IM-NEXT: srai a0, a0, 56
+; RV64IM-NEXT: slli a1, a1, 56
; RV64IM-NEXT: srai a1, a1, 56
; RV64IM-NEXT: add a0, a0, a1
; RV64IM-NEXT: ret
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/bitmanip.ll b/llvm/test/CodeGen/RISCV/GlobalISel/bitmanip.ll
index bce6dfacf8e82c..f33ba1d7a302ef 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/bitmanip.ll
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/bitmanip.ll
@@ -6,8 +6,8 @@ define i2 @bitreverse_i2(i2 %x) {
; RV32-LABEL: bitreverse_i2:
; RV32: # %bb.0:
; RV32-NEXT: slli a1, a0, 1
-; RV32-NEXT: andi a0, a0, 3
; RV32-NEXT: andi a1, a1, 2
+; RV32-NEXT: andi a0, a0, 3
; RV32-NEXT: srli a0, a0, 1
; RV32-NEXT: or a0, a1, a0
; RV32-NEXT: ret
@@ -15,8 +15,8 @@ define i2 @bitreverse_i2(i2 %x) {
; RV64-LABEL: bitreverse_i2:
; RV64: # %bb.0:
; RV64-NEXT: slli a1, a0, 1
-; RV64-NEXT: andi a0, a0, 3
; RV64-NEXT: andi a1, a1, 2
+; RV64-NEXT: andi a0, a0, 3
; RV64-NEXT: srli a0, a0, 1
; RV64-NEXT: or a0, a1, a0
; RV64-NEXT: ret
@@ -28,8 +28,8 @@ define i3 @bitreverse_i3(i3 %x) {
; RV32-LABEL: bitreverse_i3:
; RV32: # %bb.0:
; RV32-NEXT: slli a1, a0, 2
-; RV32-NEXT: andi a0, a0, 7
; RV32-NEXT: andi a1, a1, 4
+; RV32-NEXT: andi a0, a0, 7
; RV32-NEXT: andi a2, a0, 2
; RV32-NEXT: or a1, a1, a2
; RV32-NEXT: srli a0, a0, 2
@@ -39,8 +39,8 @@ define i3 @bitreverse_i3(i3 %x) {
; RV64-LABEL: bitreverse_i3:
; RV64: # %bb.0:
; RV64-NEXT: slli a1, a0, 2
-; RV64-NEXT: andi a0, a0, 7
; RV64-NEXT: andi a1, a1, 4
+; RV64-NEXT: andi a0, a0, 7
; RV64-NEXT: andi a2, a0, 2
; RV64-NEXT: or a1, a1, a2
; RV64-NEXT: srli a0, a0, 2
@@ -54,11 +54,11 @@ define i4 @bitreverse_i4(i4 %x) {
; RV32-LABEL: bitreverse_i4:
; RV32: # %bb.0:
; RV32-NEXT: slli a1, a0, 3
-; RV32-NEXT: slli a2, a0, 1
-; RV32-NEXT: andi a0, a0, 15
; RV32-NEXT: andi a1, a1, 8
+; RV32-NEXT: slli a2, a0, 1
; RV32-NEXT: andi a2, a2, 4
; RV32-NEXT: or a1, a1, a2
+; RV32-NEXT: andi a0, a0, 15
; RV32-NEXT: srli a2, a0, 1
; RV32-NEXT: andi a2, a2, 2
; RV32-NEXT: or a1, a1, a2
@@ -69,11 +69,11 @@ define i4 @bitreverse_i4(i4 %x) {
; RV64-LABEL: bitreverse_i4:
; RV64: # %bb.0:
; RV64-NEXT: slli a1, a0, 3
-; RV64-NEXT: slli a2, a0, 1
-; RV64-NEXT: andi a0, a0, 15
; RV64-NEXT: andi a1, a1, 8
+; RV64-NEXT: slli a2, a0, 1
; RV64-NEXT: andi a2, a2, 4
; RV64-NEXT: or a1, a1, a2
+; RV64-NEXT: andi a0, a0, 15
; RV64-NEXT: srli a2, a0, 1
; RV64-NEXT: andi a2, a2, 2
; RV64-NEXT: or a1, a1, a2
@@ -88,21 +88,21 @@ define i7 @bitreverse_i7(i7 %x) {
; RV32-LABEL: bitreverse_i7:
; RV32: # %bb.0:
; RV32-NEXT: slli a1, a0, 6
-; RV32-NEXT: slli a2, a0, 4
-; RV32-NEXT: slli a3, a0, 2
-; RV32-NEXT: andi a0, a0, 127
; RV32-NEXT: andi a1, a1, 64
+; RV32-NEXT: slli a2, a0, 4
; RV32-NEXT: andi a2, a2, 32
-; RV32-NEXT: andi a3, a3, 16
; RV32-NEXT: or a1, a1, a2
-; RV32-NEXT: andi a2, a0, 8
-; RV32-NEXT: or a2, a3, a2
-; RV32-NEXT: srli a3, a0, 2
+; RV32-NEXT: slli a2, a0, 2
+; RV32-NEXT: andi a2, a2, 16
+; RV32-NEXT: andi a0, a0, 127
+; RV32-NEXT: andi a3, a0, 8
+; RV32-NEXT: or a2, a2, a3
; RV32-NEXT: or a1, a1, a2
-; RV32-NEXT: srli a2, a0, 4
-; RV32-NEXT: andi a3, a3, 4
-; RV32-NEXT: andi a2, a2, 2
-; RV32-NEXT: or a2, a3, a2
+; RV32-NEXT: srli a2, a0, 2
+; RV32-NEXT: andi a2, a2, 4
+; RV32-NEXT: srli a3, a0, 4
+; RV32-NEXT: andi a3, a3, 2
+; RV32-NEXT: or a2, a2, a3
; RV32-NEXT: or a1, a1, a2
; RV32-NEXT: srli a0, a0, 6
; RV32-NEXT: or a0, a1, a0
@@ -111,21 +111,21 @@ define i7 @bitreverse_i7(i7 %x) {
; RV64-LABEL: bitreverse_i7:
; RV64: # %bb.0:
; RV64-NEXT: slli a1, a0, 6
-; RV64-NEXT: slli a2, a0, 4
-; RV64-NEXT: slli a3, a0, 2
-; RV64-NEXT: andi a0, a0, 127
; RV64-NEXT: andi a1, a1, 64
+; RV64-NEXT: slli a2, a0, 4
; RV64-NEXT: andi a2, a2, 32
-; RV64-NEXT: andi a3, a3, 16
; RV64-NEXT: or a1, a1, a2
-; RV64-NEXT: andi a2, a0, 8
-; RV64-NEXT: or a2, a3, a2
-; RV64-NEXT: srli a3, a0, 2
+; RV64-NEXT: slli a2, a0, 2
+; RV64-NEXT: andi a2, a2, 16
+; RV64-NEXT: andi a0, a0, 127
+; RV64-NEXT: andi a3, a0, 8
+; RV64-NEXT: or a2, a2, a3
; RV64-NEXT: or a1, a1, a2
-; RV64-NEXT: srli a2, a0, 4
-; RV64-NEXT: andi a3, a3, 4
-; RV64-NEXT: andi a2, a2, 2
-; RV64-NEXT: or a2, a3, a2
+; RV64-NEXT: srli a2, a0, 2
+; RV64-NEXT: andi a2, a2, 4
+; RV64-NEXT: srli a3, a0, 4
+; RV64-NEXT: andi a3, a3, 2
+; RV64-NEXT: or a2, a2, a3
; RV64-NEXT: or a1, a1, a2
; RV64-NEXT: srli a0, a0, 6
; RV64-NEXT: or a0, a1, a0
@@ -139,33 +139,33 @@ define i24 @bitreverse_i24(i24 %x) {
; RV32: # %bb.0:
; RV32-NEXT: slli a1, a0, 16
; RV32-NEXT: lui a2, 4096
-; RV32-NEXT: lui a3, 1048335
; RV32-NEXT: addi a2, a2, -1
-; RV32-NEXT: addi a3, a3, 240
; RV32-NEXT: and a0, a0, a2
; RV32-NEXT: srli a0, a0, 16
; RV32-NEXT: or a0, a0, a1
-; RV32-NEXT: and a1, a3, a2
-; RV32-NEXT: and a1, a0, a1
+; RV32-NEXT: lui a1, 1048335
+; RV32-NEXT: addi a1, a1, 240
+; RV32-NEXT: and a3, a1, a2
+; RV32-NEXT: and a3, a0, a3
+; RV32-NEXT: srli a3, a3, 4
; RV32-NEXT: slli a0, a0, 4
-; RV32-NEXT: and a0, a0, a3
-; RV32-NEXT: lui a3, 1047757
-; RV32-NEXT: addi a3, a3, -820
-; RV32-NEXT: srli a1, a1, 4
-; RV32-NEXT: or a0, a1, a0
-; RV32-NEXT: and a1, a3, a2
-; RV32-NEXT: and a1, a0, a1
+; RV32-NEXT: and a0, a0, a1
+; RV32-NEXT: or a0, a3, a0
+; RV32-NEXT: lui a1, 1047757
+; RV32-NEXT: addi a1, a1, -820
+; RV32-NEXT: and a3, a1, a2
+; RV32-NEXT: and a3, a0, a3
+; RV32-NEXT: srli a3, a3, 2
; RV32-NEXT: slli a0, a0, 2
-; RV32-NEXT: and a0, a0, a3
-; RV32-NEXT: lui a3, 1047211
-; RV32-NEXT: addi a3, a3, -1366
-; RV32-NEXT: and a2, a3, a2
-; RV32-NEXT: srli a1, a1, 2
-; RV32-NEXT: or a0, a1, a0
+; RV32-NEXT: and a0, a0, a1
+; RV32-NEXT: or a0, a3, a0
+; RV32-NEXT: lui a1, 1047211
+; RV32-NEXT: addi a1, a1, -1366
+; RV32-NEXT: and a2, a1, a2
; RV32-NEXT: and a2, a0, a2
-; RV32-NEXT: slli a0, a0, 1
; RV32-NEXT: srli a2, a2, 1
-; RV32-NEXT: and a0, a0, a3
+; RV32-NEXT: slli a0, a0, 1
+; RV32-NEXT: and a0, a0, a1
; RV32-NEXT: or a0, a2, a0
; RV32-NEXT: ret
;
@@ -173,33 +173,33 @@ define i24 @bitreverse_i24(i24 %x) {
; RV64: # %bb.0:
; RV64-NEXT: slli a1, a0, 16
; RV64-NEXT: lui a2, 4096
-; RV64-NEXT: lui a3, 1048335
; RV64-NEXT: addiw a2, a2, -1
-; RV64-NEXT: addiw a3, a3, 240
; RV64-NEXT: and a0, a0, a2
; RV64-NEXT: srli a0, a0, 16
; RV64-NEXT: or a0, a0, a1
-; RV64-NEXT: and a1, a3, a2
-; RV64-NEXT: and a1, a0, a1
+; RV64-NEXT: lui a1, 1048335
+; RV64-NEXT: addiw a1, a1, 240
+; RV64-NEXT: and a3, a1, a2
+; RV64-NEXT: and a3, a0, a3
+; RV64-NEXT: srli a3, a3, 4
; RV64-NEXT: slli a0, a0, 4
-; RV64-NEXT: and a0, a0, a3
-; RV64-NEXT: lui a3, 1047757
-; RV64-NEXT: addiw a3, a3, -820
-; RV64-NEXT: srli a1, a1, 4
-; RV64-NEXT: or a0, a1, a0
-; RV64-NEXT: and a1, a3, a2
-; RV64-NEXT: and a1, a0, a1
+; RV64-NEXT: and a0, a0, a1
+; RV64-NEXT: or a0, a3, a0
+; RV64-NEXT: lui a1, 1047757
+; RV64-NEXT: addiw a1, a1, -820
+; RV64-NEXT: and a3, a1, a2
+; RV64-NEXT: and a3, a0, a3
+; RV64-NEXT: srli a3, a3, 2
; RV64-NEXT: slli a0, a0, 2
-; RV64-NEXT: and a0, a0, a3
-; RV64-NEXT: lui a3, 1047211
-; RV64-NEXT: addiw a3, a3, -1366
-; RV64-NEXT: and a2, a3, a2
-; RV64-NEXT: srli a1, a1, 2
-; RV64-NEXT: or a0, a1, a0
+; RV64-NEXT: and a0, a0, a1
+; RV64-NEXT: or a0, a3, a0
+; RV64-NEXT: lui a1, 1047211
+; RV64-NEXT: addiw a1, a1, -1366
+; RV64-NEXT: and a2, a1, a2
; RV64-NEXT: and a2, a0, a2
-; RV64-NEXT: slli a0, a0, 1
; RV64-NEXT: srli a2, a2, 1
-; RV64-NEXT: and a0, a0, a3
+; RV64-NEXT: slli a0, a0, 1
+; RV64-NEXT: and a0, a0, a1
; RV64-NEXT: or a0, a2, a0
; RV64-NEXT: ret
%rev = call i24 @llvm.bitreverse.i24(i24 %x)
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/constbarrier-rv32.ll b/llvm/test/CodeGen/RISCV/GlobalISel/constbarrier-rv32.ll
index cf7cef83bcc135..70d1b25309c844 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/constbarrier-rv32.ll
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/constbarrier-rv32.ll
@@ -21,34 +21,34 @@ define void @constant_fold_barrier_i128(ptr %p) {
; RV32-LABEL: constant_fold_barrier_i128:
; RV32: # %bb.0: # %entry
; RV32-NEXT: li a1, 1
+; RV32-NEXT: slli a1, a1, 11
; RV32-NEXT: lw a2, 0(a0)
; RV32-NEXT: lw a3, 4(a0)
; RV32-NEXT: lw a4, 8(a0)
; RV32-NEXT: lw a5, 12(a0)
-; RV32-NEXT: slli a1, a1, 11
; RV32-NEXT: and a2, a2, a1
; RV32-NEXT: and a3, a3, zero
; RV32-NEXT: and a4, a4, zero
; RV32-NEXT: and a5, a5, zero
; RV32-NEXT: add a2, a2, a1
-; RV32-NEXT: add a6, a3, zero
; RV32-NEXT: sltu a1, a2, a1
+; RV32-NEXT: add a6, a3, zero
; RV32-NEXT: sltu a3, a6, a3
; RV32-NEXT: add a6, a6, a1
; RV32-NEXT: seqz a7, a6
; RV32-NEXT: and a1, a7, a1
-; RV32-NEXT: add a7, a4, zero
-; RV32-NEXT: add a5, a5, zero
-; RV32-NEXT: sltu a4, a7, a4
; RV32-NEXT: or a1, a3, a1
-; RV32-NEXT: add a7, a7, a1
-; RV32-NEXT: seqz a3, a7
-; RV32-NEXT: and a1, a3, a1
+; RV32-NEXT: add a3, a4, zero
+; RV32-NEXT: sltu a4, a3, a4
+; RV32-NEXT: add a3, a3, a1
+; RV32-NEXT: seqz a7, a3
+; RV32-NEXT: and a1, a7, a1
; RV32-NEXT: or a1, a4, a1
+; RV32-NEXT: add a5, a5, zero
; RV32-NEXT: add a1, a5, a1
; RV32-NEXT: sw a2, 0(a0)
; RV32-NEXT: sw a6, 4(a0)
-; RV32-NEXT: sw a7, 8(a0)
+; RV32-NEXT: sw a3, 8(a0)
; RV32-NEXT: sw a1, 12(a0)
; RV32-NEXT: ret
entry:
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/constbarrier-rv64.ll b/llvm/test/CodeGen/RISCV/GlobalISel/constbarrier-rv64.ll
index 2c3e3faddc3916..51e8b6da39d099 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/constbarrier-rv64.ll
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/constbarrier-rv64.ll
@@ -21,9 +21,9 @@ define i128 @constant_fold_barrier_i128(i128 %x) {
; RV64-LABEL: constant_fold_barrier_i128:
; RV64: # %bb.0: # %entry
; RV64-NEXT: li a2, 1
-; RV64-NEXT: and a1, a1, zero
; RV64-NEXT: slli a2, a2, 11
; RV64-NEXT: and a0, a0, a2
+; RV64-NEXT: and a1, a1, zero
; RV64-NEXT: add a0, a0, a2
; RV64-NEXT: sltu a2, a0, a2
; RV64-NEXT: add a1, a1, zero
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/iabs.ll b/llvm/test/CodeGen/RISCV/GlobalISel/iabs.ll
index 1156edffe91943..05989c310541b8 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/iabs.ll
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/iabs.ll
@@ -117,8 +117,8 @@ define i64 @abs64(i64 %x) {
; RV32I: # %bb.0:
; RV32I-NEXT: srai a2, a1, 31
; RV32I-NEXT: add a0, a0, a2
-; RV32I-NEXT: add a1, a1, a2
; RV32I-NEXT: sltu a3, a0, a2
+; RV32I-NEXT: add a1, a1, a2
; RV32I-NEXT: add a1, a1, a3
; RV32I-NEXT: xor a0, a0, a2
; RV32I-NEXT: xor a1, a1, a2
@@ -128,8 +128,8 @@ define i64 @abs64(i64 %x) {
; RV32ZBB: # %bb.0:
; RV32ZBB-NEXT: srai a2, a1, 31
; RV32ZBB-NEXT: add a0, a0, a2
-; RV32ZBB-NEXT: add a1, a1, a2
; RV32ZBB-NEXT: sltu a3, a0, a2
+; RV32ZBB-NEXT: add a1, a1, a2
; RV32ZBB-NEXT: add a1, a1, a3
; RV32ZBB-NEXT: xor a0, a0, a2
; RV32ZBB-NEXT: xor a1, a1, a2
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/rv32zbb-zbkb.ll b/llvm/test/CodeGen/RISCV/GlobalISel/rv32zbb-zbkb.ll
index 68bf9240ccd1df..c558639fda424e 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/rv32zbb-zbkb.ll
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/rv32zbb-zbkb.ll
@@ -302,8 +302,8 @@ define i64 @rori_i64(i64 %a) nounwind {
; CHECK-NEXT: slli a2, a0, 31
; CHECK-NEXT: srli a0, a0, 1
; CHECK-NEXT: slli a3, a1, 31
-; CHECK-NEXT: srli a1, a1, 1
; CHECK-NEXT: or a0, a0, a3
+; CHECK-NEXT: srli a1, a1, 1
; CHECK-NEXT: or a1, a2, a1
; CHECK-NEXT: ret
%1 = tail call i64 @llvm.fshl.i64(i64 %a, i64 %a, i64 63)
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/rv32zbb.ll b/llvm/test/CodeGen/RISCV/GlobalISel/rv32zbb.ll
index 7f22127ad3536c..1184905c17edea 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/rv32zbb.ll
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/rv32zbb.ll
@@ -12,31 +12,31 @@ define i32 @ctlz_i32(i32 %a) nounwind {
; RV32I-NEXT: beqz a0, .LBB0_2
; RV32I-NEXT: # %bb.1: # %cond.false
; RV32I-NEXT: srli a1, a0, 1
-; RV32I-NEXT: lui a2, 349525
; RV32I-NEXT: or a0, a0, a1
-; RV32I-NEXT: addi a1, a2, 1365
-; RV32I-NEXT: srli a2, a0, 2
-; RV32I-NEXT: or a0, a0, a2
-; RV32I-NEXT: srli a2, a0, 4
-; RV32I-NEXT: or a0, a0, a2
-; RV32I-NEXT: srli a2, a0, 8
-; RV32I-NEXT: or a0, a0, a2
-; RV32I-NEXT: srli a2, a0, 16
-; RV32I-NEXT: or a0, a0, a2
-; RV32I-NEXT: srli a2, a0, 1
-; RV32I-NEXT: and a1, a2, a1
-; RV32I-NEXT: lui a2, 209715
-; RV32I-NEXT: addi a2, a2, 819
+; RV32I-NEXT: srli a1, a0, 2
+; RV32I-NEXT: or a0, a0, a1
+; RV32I-NEXT: srli a1, a0, 4
+; RV32I-NEXT: or a0, a0, a1
+; RV32I-NEXT: srli a1, a0, 8
+; RV32I-NEXT: or a0, a0, a1
+; RV32I-NEXT: srli a1, a0, 16
+; RV32I-NEXT: or a0, a0, a1
+; RV32I-NEXT: srli a1, a0, 1
+; RV32I-NEXT: lui a2, 349525
+; RV32I-NEXT: addi a2, a2, 1365
+; RV32I-NEXT: and a1, a1, a2
; RV32I-NEXT: sub a0, a0, a1
; RV32I-NEXT: srli a1, a0, 2
-; RV32I-NEXT: and a0, a0, a2
+; RV32I-NEXT: lui a2, 209715
+; RV32I-NEXT: addi a2, a2, 819
; RV32I-NEXT: and a1, a1, a2
-; RV32I-NEXT: lui a2, 61681
-; RV32I-NEXT: addi a2, a2, -241
+; RV32I-NEXT: and a0, a0, a2
; RV32I-NEXT: add a0, a1, a0
; RV32I-NEXT: srli a1, a0, 4
; RV32I-NEXT: add a0, a1, a0
-; RV32I-NEXT: and a0, a0, a2
+; RV32I-NEXT: lui a1, 61681
+; RV32I-NEXT: addi a1, a1, -241
+; RV32I-NEXT: and a0, a0, a1
; RV32I-NEXT: slli a1, a0, 8
; RV32I-NEXT: add a0, a0, a1
; RV32I-NEXT: slli a1, a0, 16
@@ -63,11 +63,11 @@ define i64 @ctlz_i64(i64 %a) nounwind {
; RV32I-LABEL: ctlz_i64:
; RV32I: # %bb.0:
; RV32I-NEXT: lui a2, 349525
-; RV32I-NEXT: lui a3, 209715
-; RV32I-NEXT: lui a6, 61681
; RV32I-NEXT: addi a5, a2, 1365
-; RV32I-NEXT: addi a4, a3, 819
-; RV32I-NEXT: addi a3, a6, -241
+; RV32I-NEXT: lui a2, 209715
+; RV32I-NEXT: addi a4, a2, 819
+; RV32I-NEXT: lui a2, 61681
+; RV32I-NEXT: addi a3, a2, -241
; RV32I-NEXT: li a2, 32
; RV32I-NEXT: beqz a1, .LBB1_2
; RV32I-NEXT: # %bb.1:
@@ -155,22 +155,22 @@ define i32 @cttz_i32(i32 %a) nounwind {
; RV32I-NEXT: # %bb.1: # %cond.false
; RV32I-NEXT: not a1, a0
; RV32I-NEXT: addi a0, a0, -1
-; RV32I-NEXT: lui a2, 349525
; RV32I-NEXT: and a0, a1, a0
-; RV32I-NEXT: addi a1, a2, 1365
-; RV32I-NEXT: srli a2, a0, 1
-; RV32I-NEXT: and a1, a2, a1
-; RV32I-NEXT: lui a2, 209715
-; RV32I-NEXT: addi a2, a2, 819
+; RV32I-NEXT: srli a1, a0, 1
+; RV32I-NEXT: lui a2, 349525
+; RV32I-NEXT: addi a2, a2, 1365
+; RV32I-NEXT: and a1, a1, a2
; RV32I-NEXT: sub a0, a0, a1
; RV32I-NEXT: srli a1, a0, 2
-; RV32I-NEXT: and a0, a0, a2
+; RV32I-NEXT: lui a2, 209715
+; RV32I-NEXT: addi a2, a2, 819
; RV32I-NEXT: and a1, a1, a2
-; RV32I-NEXT: lui a2, 61681
+; RV32I-NEXT: and a0, a0, a2
; RV32I-NEXT: add a0, a1, a0
; RV32I-NEXT: srli a1, a0, 4
; RV32I-NEXT: add a0, a1, a0
-; RV32I-NEXT: addi a1, a2, -241
+; RV32I-NEXT: lui a1, 61681
+; RV32I-NEXT: addi a1, a1, -241
; RV32I-NEXT: and a0, a0, a1
; RV32I-NEXT: slli a1, a0, 8
; RV32I-NEXT: add a0, a0, a1
@@ -196,11 +196,11 @@ define i64 @cttz_i64(i64 %a) nounwind {
; RV32I-LABEL: cttz_i64:
; RV32I: # %bb.0:
; RV32I-NEXT: lui a2, 349525
-; RV32I-NEXT: lui a3, 209715
-; RV32I-NEXT: lui a5, 61681
; RV32I-NEXT: addi a4, a2, 1365
-; RV32I-NEXT: addi a3, a3, 819
-; RV32I-NEXT: addi a2, a5, -241
+; RV32I-NEXT: lui a2, 209715
+; RV32I-NEXT: addi a3, a2, 819
+; RV32I-NEXT: lui a2, 61681
+; RV32I-NEXT: addi a2, a2, -241
; RV32I-NEXT: beqz a0, .LBB3_2
; RV32I-NEXT: # %bb.1:
; RV32I-NEXT: not a1, a0
@@ -271,17 +271,17 @@ define i32 @ctpop_i32(i32 %a) nounwind {
; RV32I-NEXT: lui a2, 349525
; RV32I-NEXT: addi a2, a2, 1365
; RV32I-NEXT: and a1, a1, a2
-; RV32I-NEXT: lui a2, 209715
-; RV32I-NEXT: addi a2, a2, 819
; RV32I-NEXT: sub a0, a0, a1
; RV32I-NEXT: srli a1, a0, 2
-; RV32I-NEXT: and a0, a0, a2
+; RV32I-NEXT: lui a2, 209715
+; RV32I-NEXT: addi a2, a2, 819
; RV32I-NEXT: and a1, a1, a2
-; RV32I-NEXT: lui a2, 61681
+; RV32I-NEXT: and a0, a0, a2
; RV32I-NEXT: add a0, a1, a0
; RV32I-NEXT: srli a1, a0, 4
; RV32I-NEXT: add a0, a1, a0
-; RV32I-NEXT: addi a1, a2, -241
+; RV32I-NEXT: lui a1, 61681
+; RV32I-NEXT: addi a1, a1, -241
; RV32I-NEXT: and a0, a0, a1
; RV32I-NEXT: slli a1, a0, 8
; RV32I-NEXT: add a0, a0, a1
@@ -305,39 +305,39 @@ define i64 @ctpop_i64(i64 %a) nounwind {
; RV32I: # %bb.0:
; RV32I-NEXT: srli a2, a0, 1
; RV32I-NEXT: lui a3, 349525
-; RV32I-NEXT: lui a4, 209715
-; RV32I-NEXT: srli a5, a1, 1
; RV32I-NEXT: addi a3, a3, 1365
; RV32I-NEXT: and a2, a2, a3
-; RV32I-NEXT: and a3, a5, a3
-; RV32I-NEXT: lui a5, 61681
-; RV32I-NEXT: addi a4, a4, 819
-; RV32I-NEXT: addi a5, a5, -241
; RV32I-NEXT: sub a0, a0, a2
-; RV32I-NEXT: sub a1, a1, a3
; RV32I-NEXT: srli a2, a0, 2
+; RV32I-NEXT: lui a4, 209715
+; RV32I-NEXT: addi a4, a4, 819
+; RV32I-NEXT: and a2, a2, a4
; RV32I-NEXT: and a0, a0, a4
+; RV32I-NEXT: add a0, a2, a0
+; RV32I-NEXT: srli a2, a0, 4
+; RV32I-NEXT: add a0, a2, a0
+; RV32I-NEXT: lui a2, 61681
+; RV32I-NEXT: addi a2, a2, -241
+; RV32I-NEXT: and a0, a0, a2
+; RV32I-NEXT: slli a5, a0, 8
+; RV32I-NEXT: add a0, a0, a5
+; RV32I-NEXT: slli a5, a0, 16
+; RV32I-NEXT: add a0, a0, a5
+; RV32I-NEXT: srli a0, a0, 24
+; RV32I-NEXT: srli a5, a1, 1
+; RV32I-NEXT: and a3, a5, a3
+; RV32I-NEXT: sub a1, a1, a3
; RV32I-NEXT: srli a3, a1, 2
-; RV32I-NEXT: and a1, a1, a4
-; RV32I-NEXT: and a2, a2, a4
; RV32I-NEXT: and a3, a3, a4
-; RV32I-NEXT: add a0, a2, a0
+; RV32I-NEXT: and a1, a1, a4
; RV32I-NEXT: add a1, a3, a1
-; RV32I-NEXT: srli a2, a0, 4
; RV32I-NEXT: srli a3, a1, 4
-; RV32I-NEXT: add a0, a2, a0
; RV32I-NEXT: add a1, a3, a1
-; RV32I-NEXT: and a0...
[truncated]
I suggest we let this sit for 2-3 days after the prior patch has landed. As I said on the previous review, I think this is reasonable, but a) we had a bunch of discussion on this point and b) it'd be good to have some staging between the individual commits to simplify regression analysis.
For the recent scheduler patches, the common theme is that we saw another target do something and brought that functionality to RISC-V. How do we know that these changes are sensible defaults for RISC-V cores? Are you making measurements on any cores? Are they in-order, out-of-order, or both? In my experience tuning for different cores, there is often a difference between out-of-order and in-order cores.
Is it possible to provide some numbers to back this up? Preferably using some well-known benchmarks like SPEC and/or the llvm-test-suite.
I added two experimental options:
We can see that both options reduce the mean number of spills/reloads. I didn't run these tests on real hardware, so the data may not be entirely convincing. I'd appreciate it if you could evaluate this on some platforms; that would be helpful. If you find this common setting is not suitable for your microarchitectures, please let me know and we can make it a tune feature. All I want is to unify the common sched policy and turn part of the policy into tune features.
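For context, experimental knobs like the ones mentioned above are usually exposed as hidden cl::opt flags that feed overrideSchedPolicy. The sketch below is hypothetical: the flag name is illustrative and is not one of the actual options referenced above.

// Hypothetical sketch in the style of RISCVSubtarget.cpp; the flag name is
// illustrative only and does not correspond to the options mentioned above.
#include "llvm/Support/CommandLine.h"
using namespace llvm;

static cl::opt<bool> UseLatencySchedHeuristic(
    "riscv-use-latency-sched-heuristic", cl::Hidden,
    cl::desc("Keep the machine scheduler's latency heuristic enabled"),
    cl::init(false));

// Inside RISCVSubtarget::overrideSchedPolicy():
//   Policy.DisableLatencyHeuristic = !UseLatencySchedHeuristic;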
Created using spr 1.3.6-beta.1 [skip ci]
LGTM. I do have some minor questions, but they're not blocking.
; CHECK-NEXT: vse32.v v8, (a5)
; CHECK-NEXT: vse32.v v9, (a6)
; CHECK-NEXT: ret
; RV32-LABEL: buildvec_vid_step1o2_v4i32:
Why do we have different scheduling between RV32 and RV64?
; CHECK-NEXT: slli a1, a1, 4
; CHECK-NEXT: add a1, sp, a1
; CHECK-NEXT: addi a1, a1, 16
; CHECK-NEXT: vs8r.v v16, (a1) # Unknown-size Folded Spill
Do we have more spills/reloads in this function?
I measured this patch on SPEC for an in-order and an out-of-order core. For the out-of-order core, 557.xz_r saw a regression. I saw no other significant improvements or regressions. Given these findings, combined with the fact that the latency heuristic is so low on the heuristic list (above only program order), I'm not sure I see a strong argument for setting this to true by default on either in-order or out-of-order cores.
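For readers less familiar with the scheduler internals, the outline below is a rough paraphrase (not the upstream code) of the tie-breaker order in the generic machine scheduler's candidate comparison, which is where Policy.DisableLatencyHeuristic takes effect:

// Rough paraphrase of the generic scheduler's candidate tie-breakers,
// highest priority first (summary only, not verbatim):
//   1. register-pressure heuristics (avoid exceeding limits, critical sets)
//   2. memory-operation clustering and weak DAG edges
//   3. processor-resource balance
//   4. latency reduction, the step guarded by Policy.DisableLatencyHeuristic
//   5. original program order (node order) as the final fallback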
Given @michaelmaitland's data, @wangpc-pp, the burden shifts to you to clearly justify in which cases this is profitable and to figure out how to selectively enable it only in those cases. I agree with @michaelmaitland's conclusion that this should not move forward otherwise. @michaelmaitland Can you say anything about the magnitude of the regression in either case? I assume they were statistically significant given that you mention them, but are these small regressions or largish ones?
Thanks for evaluating this! The data is very helpful! @michaelmaitland
I don't have any data other than the spill/reload data above. I don't know how to dynamically determine whether a SchedDAG region will benefit from disabling it, because we can only know the impact after the region has been scheduled. Again, if the conclusion is that we shouldn't make it true by default, I can make it a tune feature. All I want is to make the scheduling infrastructure easy to tune for downstreams. :-)
Created using spr 1.3.6-beta.1
LGTM
Created using spr 1.3.6-beta.1
LGTM
Created using spr 1.3.6-beta.1 [skip ci]
This tune feature disables the latency scheduling heuristic. This can
reduce the number of spills/reloads but may cause regressions on some
cores. CPUs may add this tune feature if they find it profitable.
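A minimal sketch of how this plugs in, assuming the feature is a SubtargetFeature named TuneDisableLatencySchedHeuristic in RISCVFeatures.td that sets a DisableLatencySchedHeuristic subtarget member (the exact names in the landed patch may differ):

// Sketch only: consuming the assumed tune feature in RISCVSubtarget.cpp.
void RISCVSubtarget::overrideSchedPolicy(MachineSchedPolicy &Policy,
                                         unsigned NumRegionInstrs) const {
  Policy.OnlyTopDown = false;
  Policy.OnlyBottomUp = false;
  // Only cores that opt in via the tune feature drop the latency heuristic.
  Policy.DisableLatencyHeuristic = DisableLatencySchedHeuristic;
  // Spilling is generally expensive on all RISC-V cores, so keep
  // register-pressure tracking enabled.
  Policy.ShouldTrackPressure = true;
}

A core that finds this profitable would then list the feature among its tune features in RISCVProcessors.td.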