Commit 9512db0

AMDGPU: Correct costs of saturating add/sub intrinsics
These are directly legal with fast instructions.
1 parent bbd9d3d commit 9512db0

3 files changed: +339 -337 lines changed

llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp

Lines changed: 3 additions & 1 deletion
@@ -753,7 +753,9 @@ GCNTTIImpl::getIntrinsicInstrCost(const IntrinsicCostAttributes &ICA,
   case Intrinsic::usub_sat:
   case Intrinsic::sadd_sat:
   case Intrinsic::ssub_sat: {
-    // TODO: Full rate for i32/i16
+    if (SLT == MVT::i16 || SLT == MVT::i32)
+      InstRate = getFullRateInstrCost();
+
     static const auto ValidSatTys = {MVT::v2i16, MVT::v4i16};
     if (any_of(ValidSatTys, [&LT](MVT M) { return M == LT.second; }))
       NElts = 1;
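
To show the effect of the hunk above in isolation, here is a minimal standalone sketch of the cost selection. It is not the LLVM code: `Ty`, `Legalized`, `satAddSubCost`, and the two cost constants are hypothetical stand-ins. The quarter-rate default and the final `Parts * NElts * InstRate` product are assumptions about code the diff does not show.

```cpp
// Hypothetical stand-in for the MVT values involved; not LLVM's real enum.
enum class Ty { i8, i16, i32, i64, v2i16, v4i16 };

// Assumed relative costs. The real values come from GCNTTIImpl helpers
// such as getFullRateInstrCost(); the numbers here are illustrative only.
constexpr unsigned FullRateCost = 1;
constexpr unsigned QuarterRateCost = 4;

// Hypothetical summary of a legalized type.
struct Legalized {
  unsigned Parts; // number of legal pieces (plays the role of LT.first)
  Ty LegalTy;     // legalized type (plays the role of LT.second)
  Ty ScalarTy;    // scalar element type (plays the role of SLT)
  unsigned NElts; // element count before any packed-type adjustment
};

// Cost of a saturating add/sub, following the shape of the patched case.
unsigned satAddSubCost(const Legalized &L) {
  unsigned InstRate = QuarterRateCost; // ASSUMPTION: default rate, not in the diff
  unsigned NElts = L.NElts;

  // Added by the commit: i16 and i32 saturating ops run at full rate.
  if (L.ScalarTy == Ty::i16 || L.ScalarTy == Ty::i32)
    InstRate = FullRateCost;

  // Unchanged context: packed 16-bit vectors are costed as a single element.
  if (L.LegalTy == Ty::v2i16 || L.LegalTy == Ty::v4i16)
    NElts = 1;

  // ASSUMPTION: how the pieces combine into a final cost.
  return L.Parts * NElts * InstRate;
}
```

Under these assumptions, a scalar `i32` saturating add costs `FullRateCost`, a `v2i16` one costs the same because it is counted as a single element, and an `i64` one still falls back to the assumed quarter-rate default.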
