Commit ac6e1fd

[RISCV][TTI] Cost non-power-of-two size changing casts (#101047)
For a cast whose source and destination sizes differ, we were costing the cast as if it were being scalarized, when in fact we can often promote such cases to a wider legal type. Note that for casts with equal sizes (i.e. bitcast, some fp<->int conversions, and ptrtoint) the generic logic in BasicTTI already assumed promotion. It just doesn't handle the case where source and destination are both promoted to types of unequal size. This is analogous to d3fd28a, but with the same reasoning applied to casts.
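As a rough illustration of the promotion check described above, here is a minimal standalone sketch. It is not LLVM code: `VecTy`, `legalize`, `canCostAsPromoted`, and the `MaxBits` limit are hypothetical names invented for this example, and the legalizer is a toy that rounds the element count up to a power of two and halves it while the vector exceeds `MaxBits`. The real `getTypeLegalizationCost` is considerably more involved.

```cpp
#include <cassert>

// Hypothetical stand-in for a fixed-length vector type (not an LLVM class).
struct VecTy {
  unsigned NumElts;
  unsigned EltBits;
};

inline unsigned sizeInBits(VecTy T) { return T.NumElts * T.EltBits; }

// Smallest power of two that is >= N.
inline unsigned nextPow2(unsigned N) {
  assert(N > 0 && "element count must be positive");
  unsigned P = 1;
  while (P < N)
    P <<= 1;
  return P;
}

// Toy legalizer (assumption): widen to a power-of-two element count, then
// split in half while the vector is larger than MaxBits.
inline VecTy legalize(VecTy T, unsigned MaxBits) {
  VecTy L{nextPow2(T.NumElts), T.EltBits};
  while (L.NumElts > 1 && sizeInBits(L) > MaxBits)
    L.NumElts /= 2;
  return L;
}

// Mirrors the shape of the guard in this commit: the cast is costed as a
// promoted vector operation only if both element widths fit in ELen and
// each original type fits inside its legalized type (i.e. it was widened,
// not split).
inline bool canCostAsPromoted(VecTy Src, VecTy Dst, unsigned ELen,
                              unsigned MaxBits) {
  if (Src.EltBits > ELen || Dst.EltBits > ELen)
    return false;
  VecTy SrcLT = legalize(Src, MaxBits);
  VecTy DstLT = legalize(Dst, MaxBits);
  return sizeInBits(Src) <= sizeInBits(SrcLT) &&
         sizeInBits(Dst) <= sizeInBits(DstLT);
}
```

Under these assumptions, a `<3 x i8>` to `<3 x i16>` cast widens to 4 elements on both sides and passes the check, while a type that has to be split fails it and would fall back to the generic cost.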
1 parent 5ae9faa commit ac6e1fd

2 files changed: +32 -21 lines changed
llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp

Lines changed: 17 additions & 6 deletions
@@ -1030,17 +1030,28 @@ InstructionCost RISCVTTIImpl::getCastInstrCost(unsigned Opcode, Type *Dst,
   if (!IsVectorType)
     return BaseT::getCastInstrCost(Opcode, Dst, Src, CCH, CostKind, I);
 
-  bool IsTypeLegal = isTypeLegal(Src) && isTypeLegal(Dst) &&
-                     (Src->getScalarSizeInBits() <= ST->getELen()) &&
-                     (Dst->getScalarSizeInBits() <= ST->getELen());
-
-  // FIXME: Need to compute legalizing cost for illegal types.
-  if (!IsTypeLegal)
+  // FIXME: Need to compute legalizing cost for illegal types. The current
+  // code handles only legal types and those which can be trivially
+  // promoted to legal.
+  if (!ST->hasVInstructions() || Src->getScalarSizeInBits() > ST->getELen() ||
+      Dst->getScalarSizeInBits() > ST->getELen())
     return BaseT::getCastInstrCost(Opcode, Dst, Src, CCH, CostKind, I);
 
   std::pair<InstructionCost, MVT> SrcLT = getTypeLegalizationCost(Src);
   std::pair<InstructionCost, MVT> DstLT = getTypeLegalizationCost(Dst);
 
+  // Our actual lowering for the case where a wider legal type is available
+  // uses promotion to the wider type. This is reflected in the result of
+  // getTypeLegalizationCost, but BasicTTI assumes the widened cases are
+  // scalarized if the legalized Src and Dst are not equal sized.
+  const DataLayout &DL = this->getDataLayout();
+  if (!SrcLT.second.isVector() || !DstLT.second.isVector() ||
+      !TypeSize::isKnownLE(DL.getTypeSizeInBits(Src),
+                           SrcLT.second.getSizeInBits()) ||
+      !TypeSize::isKnownLE(DL.getTypeSizeInBits(Dst),
+                           DstLT.second.getSizeInBits()))
+    return BaseT::getCastInstrCost(Opcode, Dst, Src, CCH, CostKind, I);
+
   int ISD = TLI->InstructionOpcodeToISD(Opcode);
   assert(ISD && "Invalid opcode");

llvm/test/Analysis/CostModel/RISCV/cast.ll

Lines changed: 15 additions & 15 deletions
@@ -4317,24 +4317,24 @@ define void @uitofp() {
 
 define void @oddvec_sizes() {
 ; CHECK-LABEL: 'oddvec_sizes'
-; CHECK-NEXT: Cost Model: Found an estimated cost of 9 for instruction: %1 = sext <3 x i8> undef to <3 x i16>
-; CHECK-NEXT: Cost Model: Found an estimated cost of 21 for instruction: %2 = sext <7 x i8> undef to <7 x i32>
-; CHECK-NEXT: Cost Model: Found an estimated cost of 45 for instruction: %3 = sext <15 x i8> undef to <15 x i32>
-; CHECK-NEXT: Cost Model: Found an estimated cost of 9 for instruction: %4 = zext <3 x i8> undef to <3 x i16>
-; CHECK-NEXT: Cost Model: Found an estimated cost of 21 for instruction: %5 = zext <7 x i8> undef to <7 x i32>
-; CHECK-NEXT: Cost Model: Found an estimated cost of 45 for instruction: %6 = zext <15 x i8> undef to <15 x i32>
-; CHECK-NEXT: Cost Model: Found an estimated cost of 6 for instruction: %7 = trunc <3 x i32> undef to <3 x i8>
-; CHECK-NEXT: Cost Model: Found an estimated cost of 14 for instruction: %8 = trunc <7 x i32> undef to <7 x i8>
-; CHECK-NEXT: Cost Model: Found an estimated cost of 30 for instruction: %9 = trunc <15 x i32> undef to <15 x i8>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %1 = sext <3 x i8> undef to <3 x i16>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %2 = sext <7 x i8> undef to <7 x i32>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %3 = sext <15 x i8> undef to <15 x i32>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %4 = zext <3 x i8> undef to <3 x i16>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %5 = zext <7 x i8> undef to <7 x i32>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %6 = zext <15 x i8> undef to <15 x i32>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %7 = trunc <3 x i32> undef to <3 x i8>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %8 = trunc <7 x i32> undef to <7 x i8>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %9 = trunc <15 x i32> undef to <15 x i8>
 ; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %10 = bitcast <3 x i32> undef to <3 x float>
 ; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %11 = bitcast <7 x i32> undef to <7 x float>
 ; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %12 = bitcast <15 x i32> undef to <15 x float>
-; CHECK-NEXT: Cost Model: Found an estimated cost of 9 for instruction: %13 = sitofp <3 x i32> undef to <3 x float>
-; CHECK-NEXT: Cost Model: Found an estimated cost of 21 for instruction: %14 = sitofp <7 x i32> undef to <7 x float>
-; CHECK-NEXT: Cost Model: Found an estimated cost of 45 for instruction: %15 = sitofp <15 x i32> undef to <15 x float>
-; CHECK-NEXT: Cost Model: Found an estimated cost of 9 for instruction: %16 = uitofp <3 x i32> undef to <3 x float>
-; CHECK-NEXT: Cost Model: Found an estimated cost of 21 for instruction: %17 = uitofp <7 x i32> undef to <7 x float>
-; CHECK-NEXT: Cost Model: Found an estimated cost of 45 for instruction: %18 = uitofp <15 x i32> undef to <15 x float>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %13 = sitofp <3 x i32> undef to <3 x float>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %14 = sitofp <7 x i32> undef to <7 x float>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %15 = sitofp <15 x i32> undef to <15 x float>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %16 = uitofp <3 x i32> undef to <3 x float>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %17 = uitofp <7 x i32> undef to <7 x float>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %18 = uitofp <15 x i32> undef to <15 x float>
 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %19 = fptosi <3 x float> undef to <3 x i32>
 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %20 = fptosi <7 x float> undef to <7 x i32>
 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %21 = fptosi <15 x float> undef to <15 x i32>
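The old expectations in this test scale linearly with the element count, which is consistent with the scalarized costing the commit message describes. The sketch below simply tabulates the changed rows from the diff and checks that arithmetic. `CostRow`, `changedRows`, and `oldCostPerElement` are hypothetical helpers written for this illustration and are not part of LLVM; the per-element figures are derived from the test numbers only, not from the cost model source.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

// One changed CHECK line: opcode, element count, old and new cost.
struct CostRow {
  const char *Op;
  unsigned NumElts;
  unsigned OldCost;
  unsigned NewCost;
};

// The rows whose expected cost changed in cast.ll, copied from the diff.
inline std::vector<CostRow> changedRows() {
  return {
      {"sext", 3, 9, 1},    {"sext", 7, 21, 2},   {"sext", 15, 45, 4},
      {"zext", 3, 9, 1},    {"zext", 7, 21, 2},   {"zext", 15, 45, 4},
      {"trunc", 3, 6, 2},   {"trunc", 7, 14, 2},  {"trunc", 15, 30, 3},
      {"sitofp", 3, 9, 1},  {"sitofp", 7, 21, 1}, {"sitofp", 15, 45, 1},
      {"uitofp", 3, 9, 1},  {"uitofp", 7, 21, 1}, {"uitofp", 15, 45, 1},
  };
}

// Old cost divided by element count; asserts that the division is exact.
inline unsigned oldCostPerElement(const CostRow &R) {
  assert(R.OldCost % R.NumElts == 0 && "old cost is not linear in NumElts");
  return R.OldCost / R.NumElts;
}
```

With these rows, the old cost works out to 2 per element for `trunc` and 3 per element for the other opcodes, while every new cost is at most 4 regardless of element count.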
