Skip to content

[RISCV][CostModel] Correct the cost of some reductions #118072

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Dec 4, 2024

Conversation

LiqinWeng
Copy link
Contributor

Reductions include: and/or/max/min

@llvmbot
Copy link
Member

llvmbot commented Nov 29, 2024

@llvm/pr-subscribers-llvm-analysis

@llvm/pr-subscribers-backend-risc-v

Author: LiqinWeng (LiqinWeng)

Changes

Reductions include: and/or/max/min


Full diff: https://github.com/llvm/llvm-project/pull/118072.diff

1 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp (+13-6)
diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
index 8f0ef69258b165..b098dd0b7613e5 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
@@ -1470,27 +1470,27 @@ RISCVTTIImpl::getMinMaxReductionCost(Intrinsic::ID IID, VectorType *Ty,
     llvm_unreachable("Unsupported intrinsic");
   case Intrinsic::smax:
     SplitOp = RISCV::VMAX_VV;
-    Opcodes = {RISCV::VMV_S_X, RISCV::VREDMAX_VS, RISCV::VMV_X_S};
+    Opcodes = {RISCV::VREDMAX_VS, RISCV::VMV_X_S};
     break;
   case Intrinsic::smin:
     SplitOp = RISCV::VMIN_VV;
-    Opcodes = {RISCV::VMV_S_X, RISCV::VREDMIN_VS, RISCV::VMV_X_S};
+    Opcodes = {RISCV::VREDMIN_VS, RISCV::VMV_X_S};
     break;
   case Intrinsic::umax:
     SplitOp = RISCV::VMAXU_VV;
-    Opcodes = {RISCV::VMV_S_X, RISCV::VREDMAXU_VS, RISCV::VMV_X_S};
+    Opcodes = {RISCV::VREDMAXU_VS, RISCV::VMV_X_S};
     break;
   case Intrinsic::umin:
     SplitOp = RISCV::VMINU_VV;
-    Opcodes = {RISCV::VMV_S_X, RISCV::VREDMINU_VS, RISCV::VMV_X_S};
+    Opcodes = {RISCV::VREDMINU_VS, RISCV::VMV_X_S};
     break;
   case Intrinsic::maxnum:
     SplitOp = RISCV::VFMAX_VV;
-    Opcodes = {RISCV::VFMV_S_F, RISCV::VFREDMAX_VS, RISCV::VFMV_F_S};
+    Opcodes = {RISCV::VFREDMAX_VS, RISCV::VFMV_F_S};
     break;
   case Intrinsic::minnum:
     SplitOp = RISCV::VFMIN_VV;
-    Opcodes = {RISCV::VFMV_S_F, RISCV::VFREDMIN_VS, RISCV::VFMV_F_S};
+    Opcodes = {RISCV::VFREDMIN_VS, RISCV::VFMV_F_S};
     break;
   }
   // Add a cost for data larger than LMUL8
@@ -1534,6 +1534,13 @@ RISCVTTIImpl::getArithmeticReductionCost(unsigned Opcode, VectorType *Ty,
              getRISCVInstructionCost(Opcodes, LT.second, CostKind) +
              getCmpSelInstrCost(Instruction::ICmp, ElementTy, ElementTy,
                                 CmpInst::ICMP_EQ, CostKind);
+    } else if (ISD == ISD::XOR) {
+      // Example sequences:
+      //   vsetvli a0, zero, e8, mf8, ta, ma
+      //   vcpop.m a0, v0
+      //   andi a0, a0, 1
+      Opcodes = {RISCV::VCPOP_M};
+      return LT.first + getRISCVInstructionCost(Opcodes, LT.second, CostKind);
     } else {
       // Example sequences:
       //   vsetvli a0, zero, e8, mf8, ta, ma

@LiqinWeng LiqinWeng force-pushed the correct-reduction-cost branch from 429a864 to ffe5b35 Compare November 29, 2024 09:17
@LiqinWeng LiqinWeng force-pushed the correct-reduction-cost branch from ffe5b35 to 2a43729 Compare November 29, 2024 09:18
@llvmbot llvmbot added the llvm:analysis Includes value tracking, cost tables and constant folding label Nov 29, 2024
// vcpop.m a0, v0
// andi a0, a0, 1
Opcodes = {RISCV::VCPOP_M};
return LT.first + getRISCVInstructionCost(Opcodes, LT.second, CostKind);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be

Suggested change
return LT.first + getRISCVInstructionCost(Opcodes, LT.second, CostKind);
return LT.first * getRISCVInstructionCost(Opcodes, LT.second, CostKind);

With that said, I'm also not sure why the existing sequences are adding the legalization cost instead of multiplying it. We multiply it everywhere else in RISCVTargetTransformInfo. Maybe this is something to fix in a follow up patch

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deal with the type: ElementTy->isIntegerTy(1), should we real need use *???? I don't understand the case of ISD == ISD::AND, it uses LT.first - 1. I'm not sure why it subtracts 1.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mycost : LT.first - 1 + getRISCVInstructionCost + 1(1 is cost of the andi)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(LT.first - 1) first appeared in this commit.
It seems to be used to calculate the additional cost when the LMUL exceeds 8.
For example,

define zeroext i1 @vreduce_and_nxv1i1(<vscale x 128 x i1> %v) {
;    vmand.mm    v8, v0, v8 // addtional cost when LMUL exceeds 8.
;    vmnot.m v8, v8
;    vcpop.m a0, v8
;    seqz    a0, a0
  %red = call i1 @llvm.vector.reduce.and.nxv128i1(<vscale x 128 x i1> %v)
  ret i1 %red
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks~

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah LT.first is the cost of type legalization, i.e. splitting. But it's the number of times the operation will be split, so we need to multiply that by the cost of whatever node. I'll open up a PR to fix the existing cases, but for this PR we might as well do it the correct way i.e. by multiplying. (Although I don't think it will make a difference to the costs, since these mask instructions are always LMUL 1!)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've gone ahead and done this in df10f1c

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@LiqinWeng LiqinWeng force-pushed the correct-reduction-cost branch from 65d8998 to 1588e92 Compare December 1, 2024 05:24
Comment on lines 1537 to 1541
} else if (ISD == ISD::XOR) {
// Example sequences:
// vsetvli a0, zero, e8, mf8, ta, ma
// vcpop.m a0, v0
// andi a0, a0, 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't the else block already cover this case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, fixed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

XOR doesn't use a seqz though so it shouldn't use getCmpSelInstrCost?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lukel97 You're right. I initially considered replacing getCmpSelInstrCost with 1 to streamline the code. However, I'm fine if you'd prefer to handle the cost separately.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Copy link
Contributor

@arcbbb arcbbb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Please wait for one more approval.

Copy link
Contributor

@lukel97 lukel97 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks

@LiqinWeng LiqinWeng merged commit 46829e5 into llvm:main Dec 4, 2024
8 checks passed
@LiqinWeng LiqinWeng deleted the correct-reduction-cost branch December 10, 2024 03:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:RISC-V llvm:analysis Includes value tracking, cost tables and constant folding
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants