[SLP][REVEC] Make getExtractWithExtendCost support FixedVectorType as Dst. #134822

Merged 3 commits on Apr 10, 2025

HanKuanChen
Contributor

No description provided.

@llvmbot
Member

llvmbot commented Apr 8, 2025

@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-vectorizers

Author: Han-Kuan Chen (HanKuanChen)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/134822.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp (+24-3)
  • (added) llvm/test/Transforms/SLPVectorizer/X86/revec-getExtractWithExtendCost.ll (+27)
diff --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
index e6559f26be8c2..6a96d6e40674c 100644
--- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -5399,6 +5399,25 @@ static InstructionCost getVectorInstrCost(
                                 ScalarUserAndIdx);
 }
 
+/// This is similar to TargetTransformInfo::getExtractWithExtendCost, but if Dst
+/// is a FixedVectorType, a vector will be extracted instead of a scalar.
+static InstructionCost getExtractWithExtendCost(const TargetTransformInfo &TTI,
+                                                unsigned Opcode, Type *Dst,
+                                                VectorType *VecTy,
+                                                unsigned Index) {
+  if (auto *ScalarTy = dyn_cast<FixedVectorType>(Dst)) {
+    assert(SLPReVec && "Only supported by REVEC.");
+    TTI::TargetCostKind CostKind = TTI::TCK_RecipThroughput;
+    auto *SubTp =
+        getWidenedType(VecTy->getElementType(), ScalarTy->getNumElements());
+    return getShuffleCost(TTI, TTI::SK_ExtractSubvector, VecTy, {}, CostKind,
+                          Index * ScalarTy->getNumElements(), SubTp) +
+           TTI.getCastInstrCost(Opcode, Dst, SubTp, TTI::CastContextHint::None,
+                                CostKind);
+  }
+  return TTI.getExtractWithExtendCost(Opcode, Dst, VecTy, Index);
+}
+
 /// Correctly creates insert_subvector, checking that the index is multiple of
 /// the subvectors length. Otherwise, generates shuffle using \p Generator or
 /// using default shuffle.
@@ -14088,13 +14107,15 @@ InstructionCost BoUpSLP::getTreeCost(ArrayRef<Value *> VectorizedVals,
     const TreeEntry *Entry = &EU.E;
     auto It = MinBWs.find(Entry);
     if (It != MinBWs.end()) {
-      auto *MinTy = IntegerType::get(F->getContext(), It->second.first);
+      Type *MinTy = IntegerType::get(F->getContext(), It->second.first);
+      if (auto *VecTy = dyn_cast<FixedVectorType>(ScalarTy))
+        MinTy = getWidenedType(MinTy, VecTy->getNumElements());
       unsigned Extend = isKnownNonNegative(EU.Scalar, SimplifyQuery(*DL))
                             ? Instruction::ZExt
                             : Instruction::SExt;
       VecTy = getWidenedType(MinTy, BundleWidth);
-      ExtraCost = TTI->getExtractWithExtendCost(Extend, EU.Scalar->getType(),
-                                                VecTy, EU.Lane);
+      ExtraCost =
+          getExtractWithExtendCost(*TTI, Extend, ScalarTy, VecTy, EU.Lane);
     } else {
       ExtraCost =
           getVectorInstrCost(*TTI, ScalarTy, Instruction::ExtractElement, VecTy,
diff --git a/llvm/test/Transforms/SLPVectorizer/X86/revec-getExtractWithExtendCost.ll b/llvm/test/Transforms/SLPVectorizer/X86/revec-getExtractWithExtendCost.ll
new file mode 100644
index 0000000000000..8d3d9f2979298
--- /dev/null
+++ b/llvm/test/Transforms/SLPVectorizer/X86/revec-getExtractWithExtendCost.ll
@@ -0,0 +1,27 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt -mtriple=x86_64-unknown-linux-gnu -mattr=+avx10.2-512 -passes=slp-vectorizer -S -slp-revec %s | FileCheck %s
+
+define void @test() {
+; CHECK-LABEL: @test(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[TMP0:%.*]] = sub <8 x i64> zeroinitializer, splat (i64 1)
+; CHECK-NEXT:    [[TMP1:%.*]] = sub <8 x i64> zeroinitializer, zeroinitializer
+; CHECK-NEXT:    [[TMP2:%.*]] = or <8 x i64> [[TMP0]], zeroinitializer
+; CHECK-NEXT:    [[TMP3:%.*]] = trunc <8 x i64> [[TMP0]] to <8 x i32>
+; CHECK-NEXT:    [[TMP4:%.*]] = trunc <8 x i64> [[TMP1]] to <8 x i32>
+; CHECK-NEXT:    [[TMP5:%.*]] = getelementptr i8, ptr null, i64 32
+; CHECK-NEXT:    store <8 x i32> [[TMP3]], ptr null, align 4
+; CHECK-NEXT:    store <8 x i32> [[TMP4]], ptr [[TMP5]], align 4
+; CHECK-NEXT:    ret void
+;
+entry:
+  %0 = sub <8 x i64> zeroinitializer, splat (i64 1)
+  %1 = sub <8 x i64> zeroinitializer, zeroinitializer
+  %2 = or <8 x i64> %0, zeroinitializer
+  %3 = trunc <8 x i64> %0 to <8 x i32>
+  %4 = trunc <8 x i64> %1 to <8 x i32>
+  %5 = getelementptr i8, ptr null, i64 32
+  store <8 x i32> %3, ptr null, align 4
+  store <8 x i32> %4, ptr %5, align 4
+  ret void
+}
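
For readers skimming the diff, the cost composition in the new static helper can be modeled outside LLVM. The sketch below is a toy, not the patch itself: `ToyVecTy`, `ToyCosts`, `widenedBundleType`, `subvectorStart`, and `extractWithExtendCost` are hypothetical names, the flat integer costs stand in for the real target queries, and it assumes (as the diff suggests) that widening a vector "scalar" multiplies the lane count. It shows only two things: the subvector starts at `Lane * LanesPerScalar`, and the REVEC path sums an extract-subvector cost and a cast cost, while a true scalar destination falls back to a single query.

```cpp
#include <cassert>

// Toy stand-in for a fixed vector type (hypothetical, not llvm::FixedVectorType).
struct ToyVecTy {
  unsigned ElemBits; // bit width of each element
  unsigned NumElts;  // total number of elements
};

// Hypothetical flat costs replacing the real target cost queries.
struct ToyCosts {
  int ExtractSubvector;  // stand-in for the extract-subvector shuffle cost
  int Cast;              // stand-in for the zext/sext cast cost
  int ExtractWithExtend; // stand-in for the scalar extract-with-extend query
};

// Type of the whole bundle when each of the BundleWidth "scalars" is itself a
// vector of LanesPerScalar elements narrowed to MinBits bits.
// Assumption: widening multiplies the lane counts.
ToyVecTy widenedBundleType(unsigned MinBits, unsigned LanesPerScalar,
                           unsigned BundleWidth) {
  return {MinBits, LanesPerScalar * BundleWidth};
}

// Index of the first element of the subvector that belongs to bundle lane Lane.
unsigned subvectorStart(unsigned Lane, unsigned LanesPerScalar) {
  return Lane * LanesPerScalar;
}

// LanesPerScalar == 0 models a true scalar destination; anything else models
// the REVEC case where the destination is a fixed vector.
int extractWithExtendCost(const ToyCosts &C, ToyVecTy Bundle,
                          unsigned LanesPerScalar, unsigned Lane) {
  if (LanesPerScalar == 0)
    return C.ExtractWithExtend; // scalar path: one combined query
  unsigned Start = subvectorStart(Lane, LanesPerScalar);
  assert(Start + LanesPerScalar <= Bundle.NumElts && "subvector out of range");
  // REVEC path: extract the narrow subvector, then extend it as a whole.
  return C.ExtractSubvector + C.Cast;
}
```

With illustrative numbers (two bundled values of 8 lanes each, narrowed to 32 bits), `widenedBundleType(32, 8, 2)` has 16 elements, and lane 1 starts at element 8. The real helper passes that start index to the shuffle cost query and adds the cast cost from the narrow subvector type to `Dst`.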

@HanKuanChen HanKuanChen merged commit d02a704 into llvm:main Apr 10, 2025
11 checks passed
@HanKuanChen HanKuanChen deleted the slp-revec-getExtractWithExtendCost branch April 10, 2025 10:54
var-const pushed a commit to ldionne/llvm-project that referenced this pull request Apr 17, 2025
qiaojbao pushed a commit to GPUOpen-Drivers/llvm-project that referenced this pull request Apr 29, 2025
Local branch origin/amd-gfx 6efc92e Merged main:e3f5a1bfc58b into origin/amd-gfx:6b6e30e6b6dc
Remote branch main d02a704 [SLP][REVEC] Make getExtractWithExtendCost support FixedVectorType as Dst. (llvm#134822)
3 participants