[RISCV] Remove unnecessary patterns for tail agnostic FP intrinsics with rounding mode. #94498

Merged: merged 1 commit into llvm:main from pr/unneeded-ta on Jun 5, 2024

Conversation

topperc (Collaborator) commented Jun 5, 2024

These patterns explicitly match an undef passthru operand. Similar patterns do not exist for the variants without a rounding mode, and the vsetvli insertion pass should be able to detect that the passthru is undef.

The test changes appear to stem from a deficiency in the vsetvli inserter's ability to identify an undef passthru at -O0.
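
For reference, a minimal IR sketch of the kind of call the removed patterns matched (the function name is hypothetical; the intrinsic signature follows the affected spill tests, where the first operand is the passthru and i64 7 selects dynamic rounding via the frm CSR):

declare <vscale x 1 x double> @llvm.riscv.vfadd.nxv1f64.nxv1f64(
  <vscale x 1 x double>, ; passthru
  <vscale x 1 x double>, ; op1
  <vscale x 1 x double>, ; op2
  i64,                   ; rounding mode (7 = DYN: use the frm CSR)
  i64)                   ; AVL

define <vscale x 1 x double> @undef_passthru_example(<vscale x 1 x double> %a, <vscale x 1 x double> %b, i64 %vl) {
  ; Passthru is undef, so no elements past vl need preserving: a tail-agnostic
  ; (ta) policy would be legal here.
  %r = call <vscale x 1 x double> @llvm.riscv.vfadd.nxv1f64.nxv1f64(
           <vscale x 1 x double> undef,
           <vscale x 1 x double> %a,
           <vscale x 1 x double> %b,
           i64 7, i64 %vl)
  ret <vscale x 1 x double> %r
}

With the dedicated undef pattern removed, such a call now selects through the tail-undisturbed pattern with an IMPLICIT_DEF passthru, and relaxing the policy back to ta is left to the vsetvli insertion pass.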

llvmbot (Member) commented Jun 5, 2024

@llvm/pr-subscribers-backend-risc-v

Author: Craig Topper (topperc)

Full diff: https://github.com/llvm/llvm-project/pull/94498.diff

3 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfoVPseudos.td (-23)
  • (modified) llvm/test/CodeGen/RISCV/rvv/rv32-spill-vector-csr.ll (+2-2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/rv64-spill-vector-csr.ll (+2-2)
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfoVPseudos.td b/llvm/lib/Target/RISCV/RISCVInstrInfoVPseudos.td
index 3af8b65291efc..230fd0aa1bf68 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfoVPseudos.td
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfoVPseudos.td
@@ -4153,27 +4153,6 @@ class VPatBinaryNoMaskTU<string intrinsic_name,
                    (op2_type op2_kind:$rs2),
                    GPR:$vl, sew, TU_MU)>;
 
-class VPatBinaryNoMaskRoundingMode<string intrinsic_name,
-                                   string inst,
-                                   ValueType result_type,
-                                   ValueType op1_type,
-                                   ValueType op2_type,
-                                   int sew,
-                                   VReg op1_reg_class,
-                                   DAGOperand op2_kind> :
-  Pat<(result_type (!cast<Intrinsic>(intrinsic_name)
-                   (result_type (undef)),
-                   (op1_type op1_reg_class:$rs1),
-                   (op2_type op2_kind:$rs2),
-                   (XLenVT timm:$round),
-                   VLOpFrag)),
-                   (!cast<Instruction>(inst)
-                   (result_type (IMPLICIT_DEF)),
-                   (op1_type op1_reg_class:$rs1),
-                   (op2_type op2_kind:$rs2),
-                   (XLenVT timm:$round),
-                   GPR:$vl, sew, TA_MA)>;
-
 class VPatBinaryNoMaskTURoundingMode<string intrinsic_name,
                                      string inst,
                                      ValueType result_type,
@@ -4827,8 +4806,6 @@ multiclass VPatBinaryRoundingMode<string intrinsic,
                                   VReg result_reg_class,
                                   VReg op1_reg_class,
                                   DAGOperand op2_kind> {
-  def : VPatBinaryNoMaskRoundingMode<intrinsic, inst, result_type, op1_type, op2_type,
-                                       sew, op1_reg_class, op2_kind>;
   def : VPatBinaryNoMaskTURoundingMode<intrinsic, inst, result_type, op1_type, op2_type,
                                        sew, result_reg_class, op1_reg_class, op2_kind>;
   def : VPatBinaryMaskTARoundingMode<intrinsic, inst, result_type, op1_type, op2_type,
diff --git a/llvm/test/CodeGen/RISCV/rvv/rv32-spill-vector-csr.ll b/llvm/test/CodeGen/RISCV/rvv/rv32-spill-vector-csr.ll
index 8210ea22a6ee9..ac74a82e79e6d 100644
--- a/llvm/test/CodeGen/RISCV/rvv/rv32-spill-vector-csr.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/rv32-spill-vector-csr.ll
@@ -22,7 +22,7 @@ define <vscale x 1 x double> @foo(<vscale x 1 x double> %a, <vscale x 1 x double
 ; SPILL-O0-NEXT:    addi a1, a1, 16
 ; SPILL-O0-NEXT:    vs1r.v v9, (a1) # Unknown-size Folded Spill
 ; SPILL-O0-NEXT:    # implicit-def: $v8
-; SPILL-O0-NEXT:    vsetvli zero, a0, e64, m1, ta, ma
+; SPILL-O0-NEXT:    vsetvli zero, a0, e64, m1, tu, ma
 ; SPILL-O0-NEXT:    vfadd.vv v8, v9, v10
 ; SPILL-O0-NEXT:    addi a0, sp, 16
 ; SPILL-O0-NEXT:    vs1r.v v8, (a0) # Unknown-size Folded Spill
@@ -38,7 +38,7 @@ define <vscale x 1 x double> @foo(<vscale x 1 x double> %a, <vscale x 1 x double
 ; SPILL-O0-NEXT:    # kill: def $x11 killed $x10
 ; SPILL-O0-NEXT:    lw a0, 8(sp) # 4-byte Folded Reload
 ; SPILL-O0-NEXT:    # implicit-def: $v8
-; SPILL-O0-NEXT:    vsetvli zero, a0, e64, m1, ta, ma
+; SPILL-O0-NEXT:    vsetvli zero, a0, e64, m1, tu, ma
 ; SPILL-O0-NEXT:    vfadd.vv v8, v9, v10
 ; SPILL-O0-NEXT:    csrr a0, vlenb
 ; SPILL-O0-NEXT:    slli a0, a0, 1
diff --git a/llvm/test/CodeGen/RISCV/rvv/rv64-spill-vector-csr.ll b/llvm/test/CodeGen/RISCV/rvv/rv64-spill-vector-csr.ll
index 3523629088983..9054048f2f747 100644
--- a/llvm/test/CodeGen/RISCV/rvv/rv64-spill-vector-csr.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/rv64-spill-vector-csr.ll
@@ -25,7 +25,7 @@ define <vscale x 1 x double> @foo(<vscale x 1 x double> %a, <vscale x 1 x double
 ; SPILL-O0-NEXT:    addi a1, a1, 32
 ; SPILL-O0-NEXT:    vs1r.v v9, (a1) # Unknown-size Folded Spill
 ; SPILL-O0-NEXT:    # implicit-def: $v8
-; SPILL-O0-NEXT:    vsetvli zero, a0, e64, m1, ta, ma
+; SPILL-O0-NEXT:    vsetvli zero, a0, e64, m1, tu, ma
 ; SPILL-O0-NEXT:    vfadd.vv v8, v9, v10
 ; SPILL-O0-NEXT:    addi a0, sp, 32
 ; SPILL-O0-NEXT:    vs1r.v v8, (a0) # Unknown-size Folded Spill
@@ -41,7 +41,7 @@ define <vscale x 1 x double> @foo(<vscale x 1 x double> %a, <vscale x 1 x double
 ; SPILL-O0-NEXT:    # kill: def $x11 killed $x10
 ; SPILL-O0-NEXT:    ld a0, 16(sp) # 8-byte Folded Reload
 ; SPILL-O0-NEXT:    # implicit-def: $v8
-; SPILL-O0-NEXT:    vsetvli zero, a0, e64, m1, ta, ma
+; SPILL-O0-NEXT:    vsetvli zero, a0, e64, m1, tu, ma
 ; SPILL-O0-NEXT:    vfadd.vv v8, v9, v10
 ; SPILL-O0-NEXT:    csrr a0, vlenb
 ; SPILL-O0-NEXT:    slli a0, a0, 1
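
For context, the changed CHECK lines differ only in the vsetvli tail policy; an annotated reading (annotations mine, per the RVV spec's policy-bit semantics):

vsetvli zero, a0, e64, m1, tu, ma  # vl = a0, SEW = 64, LMUL = 1
                                   # tu: tail-undisturbed, destination tail
                                   #     elements must be preserved
                                   # ma: mask-agnostic
vfadd.vv v8, v9, v10               # passthru is v8 (an IMPLICIT_DEF here)

With the undef-specific pattern removed, selection goes through the tail-undisturbed pattern, and the -O0 vsetvli inserter conservatively keeps tu; per the description above, at higher optimization levels the pass should still recognize the undef passthru and relax this to ta.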

4vtomat (Member) commented Jun 5, 2024

Will you also fix the vsetvli insertion pass to recognize this undef pattern?

lukel97 (Contributor) commented Jun 5, 2024

> Will you also fix the vsetvli insertion pass to recognize this undef pattern?

In #93796 we were actually planning on removing LiveIntervals at O0, which will make the codegen even more sub-optimal, but hopefully in exchange for faster compilation times. So if we're committing to that, I don't think this regression matters.

4vtomat (Member) commented Jun 5, 2024

> > Will you also fix the vsetvli insertion pass to recognize this undef pattern?
>
> In #93796 we were actually planning on removing LiveIntervals at O0, which will make the codegen even more sub-optimal, but hopefully in exchange for faster compilation times. So if we're committing to that, I don't think this regression matters.

Thanks @lukel97, I agree that at O0 we can trade optimal codegen for faster compile time~

4vtomat (Member) left a comment

LGTM~

topperc merged commit 31ba25e into llvm:main Jun 5, 2024
7 of 8 checks passed
topperc deleted the pr/unneeded-ta branch June 5, 2024 18:11