[TTI][AArch64] Add preferFixedIfEqualToScalable hook #95818

sjoerdmeijer · 2024-06-17T18:05:20Z

This adds a new hook to prefer fixed-width loop vectorization over scalable vectorization. The loop vectoriser will use it to generate more NEON code instead of SVE when the cost model assigns equal costs to the fixed and scalable versions of the vectorised loop.

This is used in #95819.

llvmbot · 2024-06-17T18:05:50Z

@llvm/pr-subscribers-llvm-analysis

@llvm/pr-subscribers-backend-aarch64

Author: Sjoerd Meijer (sjoerdmeijer)

Changes

This adds a new hook to prefer fixed width loop vectorization over scalable. This will be used in the loop vectoriser to generate more NEON code instead of SVE if cost-model assigns equal costs to fixed and scalable versions of the vectorised loop.

Full diff: https://github.com/llvm/llvm-project/pull/95818.diff

6 Files Affected:

(modified) llvm/include/llvm/Analysis/TargetTransformInfo.h (+6)
(modified) llvm/include/llvm/Analysis/TargetTransformInfoImpl.h (+4)
(modified) llvm/lib/Analysis/TargetTransformInfo.cpp (+4)
(modified) llvm/lib/Target/AArch64/AArch64Features.td (+3)
(modified) llvm/lib/Target/AArch64/AArch64Processors.td (+1)
(modified) llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h (+2)

diff --git a/llvm/include/llvm/Analysis/TargetTransformInfo.h b/llvm/include/llvm/Analysis/TargetTransformInfo.h
index f55f21c94a85a..84b811818beee 100644
--- a/llvm/include/llvm/Analysis/TargetTransformInfo.h
+++ b/llvm/include/llvm/Analysis/TargetTransformInfo.h
@@ -1674,6 +1674,8 @@ class TargetTransformInfo {
         false; ///< If op is an fp min/max, whether NaNs may be present.
   };
 
+  bool preferFixedIfEqualToScalable() const;
+
   /// \returns True if the target prefers reductions in loop.
   bool preferInLoopReduction(unsigned Opcode, Type *Ty,
                              ReductionFlags Flags) const;
@@ -2143,6 +2145,7 @@ class TargetTransformInfo::Concept {
   virtual unsigned getStoreVectorFactor(unsigned VF, unsigned StoreSize,
                                         unsigned ChainSizeInBytes,
                                         VectorType *VecTy) const = 0;
+  virtual bool preferFixedIfEqualToScalable() const = 0;
   virtual bool preferInLoopReduction(unsigned Opcode, Type *Ty,
                                      ReductionFlags) const = 0;
   virtual bool preferPredicatedReductionSelect(unsigned Opcode, Type *Ty,
@@ -2873,6 +2876,9 @@ class TargetTransformInfo::Model final : public TargetTransformInfo::Concept {
                                 VectorType *VecTy) const override {
     return Impl.getStoreVectorFactor(VF, StoreSize, ChainSizeInBytes, VecTy);
   }
+  bool preferFixedIfEqualToScalable() const override {
+    return Impl.preferFixedIfEqualToScalable();
+  }
   bool preferInLoopReduction(unsigned Opcode, Type *Ty,
                              ReductionFlags Flags) const override {
     return Impl.preferInLoopReduction(Opcode, Ty, Flags);
diff --git a/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h b/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
index 7828bdc1f1f43..9679c9ec6bc4e 100644
--- a/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
+++ b/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
@@ -913,6 +913,10 @@ class TargetTransformInfoImplBase {
     return VF;
   }
 
+  bool preferFixedIfEqualToScalable() const {
+    return false;
+  }
+
   bool preferInLoopReduction(unsigned Opcode, Type *Ty,
                              TTI::ReductionFlags Flags) const {
     return false;
diff --git a/llvm/lib/Analysis/TargetTransformInfo.cpp b/llvm/lib/Analysis/TargetTransformInfo.cpp
index 7e721cbc87f3f..27a7f5b32d3cf 100644
--- a/llvm/lib/Analysis/TargetTransformInfo.cpp
+++ b/llvm/lib/Analysis/TargetTransformInfo.cpp
@@ -1282,6 +1282,10 @@ unsigned TargetTransformInfo::getStoreVectorFactor(unsigned VF,
   return TTIImpl->getStoreVectorFactor(VF, StoreSize, ChainSizeInBytes, VecTy);
 }
 
+bool TargetTransformInfo::preferFixedIfEqualToScalable() const {
+  return TTIImpl->preferFixedIfEqualToScalable();
+}
+
 bool TargetTransformInfo::preferInLoopReduction(unsigned Opcode, Type *Ty,
                                                 ReductionFlags Flags) const {
   return TTIImpl->preferInLoopReduction(Opcode, Ty, Flags);
diff --git a/llvm/lib/Target/AArch64/AArch64Features.td b/llvm/lib/Target/AArch64/AArch64Features.td
index ffb899a301459..988630769afdb 100644
--- a/llvm/lib/Target/AArch64/AArch64Features.td
+++ b/llvm/lib/Target/AArch64/AArch64Features.td
@@ -244,6 +244,9 @@ def FeatureExperimentalZeroingPseudos
 def FeatureUseScalarIncVL : SubtargetFeature<"use-scalar-inc-vl",
   "UseScalarIncVL", "true", "Prefer inc/dec over add+cnt">;
 
+def FeatureUseFixedIfEqualToScalable : SubtargetFeature<"use-fixed-if-equal-to-scalable",
+  "UseFixedIfEqualToScalable", "true", "Prefer fixed width loop vectorization over scalable if cost-model assigns equal costs">;
+
 def FeatureBF16 : Extension<"bf16", "BF16",
     "Enable BFloat16 Extension (FEAT_BF16)", [],
     "FEAT_BF16", "+bf16", 280>;
diff --git a/llvm/lib/Target/AArch64/AArch64Processors.td b/llvm/lib/Target/AArch64/AArch64Processors.td
index cc33765307fb4..a0263f0164f4f 100644
--- a/llvm/lib/Target/AArch64/AArch64Processors.td
+++ b/llvm/lib/Target/AArch64/AArch64Processors.td
@@ -489,6 +489,7 @@ def TuneNeoverseV2 : SubtargetFeature<"neoversev2", "ARMProcFamily", "NeoverseV2
                                       FeatureALULSLFast,
                                       FeaturePostRAScheduler,
                                       FeatureEnableSelectOptimize,
+                                      FeatureUseFixedIfEqualToScalable,
                                       FeaturePredictableSelectIsExpensive]>;
 
 def TuneNeoverseV3 : SubtargetFeature<"neoversev3", "ARMProcFamily", "NeoverseV3",
diff --git a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h
index feec1a4289c3a..13e66f9ea1913 100644
--- a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h
+++ b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h
@@ -371,6 +371,8 @@ class AArch64TTIImpl : public BasicTTIImplBase<AArch64TTIImpl> {
     return TailFoldingStyle::DataWithoutLaneMask;
   }
 
+  bool preferFixedIfEqualToScalable() const { return ST->useFixedIfEqualToScalable(); }
+
   bool preferPredicateOverEpilogue(TailFoldingInfo *TFI);
 
   bool supportsScalableVectors() const { return ST->hasSVE(); }
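The diff threads the new query through the usual TTI layers: a public wrapper, an abstract `Concept`, a forwarding `Model`, a conservative default that returns `false`, and a target override. The following is a minimal standalone sketch of that forwarding pattern, not LLVM's actual classes. `TTISketch`, `DefaultImpl`, and `TunedImpl` are hypothetical names invented for illustration; only the method name `preferFixedIfEqualToScalable` comes from the patch.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-in for the base implementation: the conservative
// default is "no preference", matching the `return false` in the patch.
struct DefaultImpl {
  bool preferFixedIfEqualToScalable() const { return false; }
};

// Hypothetical stand-in for a target implementation that reads the answer
// from a subtarget flag (modelled here as a plain bool member).
struct TunedImpl {
  bool UseFixedIfEqualToScalable = true;
  bool preferFixedIfEqualToScalable() const {
    return UseFixedIfEqualToScalable;
  }
};

// Type-erased wrapper: callers query one class, and the call is forwarded
// to whichever implementation was supplied at construction time.
class TTISketch {
  struct Concept {
    virtual ~Concept() = default;
    virtual bool preferFixedIfEqualToScalable() const = 0;
  };

  template <typename T> struct Model final : Concept {
    T Impl;
    explicit Model(T I) : Impl(std::move(I)) {}
    bool preferFixedIfEqualToScalable() const override {
      return Impl.preferFixedIfEqualToScalable();
    }
  };

  std::unique_ptr<Concept> TTIImpl;

public:
  template <typename T>
  explicit TTISketch(T Impl)
      : TTIImpl(std::make_unique<Model<T>>(std::move(Impl))) {}

  bool preferFixedIfEqualToScalable() const {
    return TTIImpl->preferFixedIfEqualToScalable();
  }
};
```

In this sketch, a target that does not override the query inherits the `false` default, so adding the hook changes nothing for targets that do not opt in.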

github-actions · 2024-06-17T18:08:11Z

✅ With the latest revision this PR passed the C/C++ code formatter.

This adds a new hook to prefer fixed-width loop vectorization over scalable vectorization. The loop vectoriser will use it to generate more NEON code instead of SVE when the cost model assigns equal costs to the fixed and scalable versions of the vectorised loop. This is used in llvm#95819.

…erse V2) For the Neoverse V2, prefer fixed-width vectorisation if the cost model assigns an equal cost to fixed and scalable vectorisation. This improves 7 kernels from TSVC-2 by about 2x, and does not affect SPEC2017 INT and FP.

This tends to benefit small kernels, like the ones in TSVC, for a number of reasons: processing the predicates does not come entirely for free; NEON tends to generate slightly less code, which can have a big impact on these small kernels; and there are second-order effects because SVE codegen is slightly less optimal in some areas.

This strategy of generating more NEON is in line with GCC's codegen strategy, which is actually even more aggressive in generating NEON when no predication is required. We could be smarter and more aggressive too about generating more NEON (and improve performance), but this seems to be a good and straightforward first step.

This depends on llvm#95818.
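As a rough illustration of the tie-break described in this commit message, the sketch below compares two candidate vectorization plans and consults the preference only when their costs are equal. It is not the loop vectoriser's actual comparison: `Candidate`, `isMoreProfitable`, and the single pre-normalised `Cost` field are simplifying assumptions, and the "scalable wins a tie by default" branch is an assumption of this sketch as well.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical candidate plan: whether it uses scalable vectors, and an
// estimated cost that is assumed to be already normalised (lower is better).
struct Candidate {
  bool Scalable;
  uint64_t Cost;
};

// Returns true if A should be chosen over B.
// PreferFixedIfEqualCost stands in for the value returned by the new hook.
inline bool isMoreProfitable(const Candidate &A, const Candidate &B,
                             bool PreferFixedIfEqualCost) {
  // A strictly cheaper plan always wins; the hook is never consulted.
  if (A.Cost != B.Cost)
    return A.Cost < B.Cost;

  // Equal cost, one fixed and one scalable: apply the tie-break.
  if (A.Scalable != B.Scalable)
    return PreferFixedIfEqualCost ? !A.Scalable : A.Scalable;

  // Equal cost and same kind: keep the incumbent B.
  return false;
}
```

Because the preference only applies on an exact tie, a scalable plan that the cost model rates as strictly cheaper is still selected in this sketch.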

davemgreen · 2024-06-18T06:49:16Z

llvm/lib/Target/AArch64/AArch64Features.td

@@ -244,6 +244,9 @@ def FeatureExperimentalZeroingPseudos
 def FeatureUseScalarIncVL : SubtargetFeature<"use-scalar-inc-vl",
  "UseScalarIncVL", "true", "Prefer inc/dec over add+cnt">;

+def FeatureUseFixedIfEqualToScalable : SubtargetFeature<"use-fixed-if-equal-to-scalable",


This could probably do with the word "Cost" in there somewhere. Maybe FeatureUseFixedOverScalableIfEqualCost.

david-arm · 2024-06-18

Does this need to be a full-blown feature if you intend to just enable it by default for V2 anyway? When we added VScaleForTuning, we just used a boolean in AArch64Subtarget that gets set during initialisation for the particular CPU. See AArch64Subtarget::initializeProperties.

davemgreen · 2024-06-18

Subtarget features are considered the best way to add tuning features like this. There is even a comment about it:

void AArch64Subtarget::initializeProperties(bool HasMinSize) {
  // Initialize CPU specific properties. We should add a tablegen feature for
  // this in the future so we can specify it together with the subtarget
  // features.

david-arm · 2024-06-18

Ah ok. Fair enough!

david-arm · 2024-06-18

In that case maybe it's worth us revisiting VScaleForTuning as well and making that a feature for consistency.

sjoerdmeijer · 2024-06-18

Thanks for the naming suggestion. I was struggling with the name; will change it to FeatureUseFixedOverScalableIfEqualCost.
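One way to picture the trade-off discussed in this thread: a tuning feature is a named bit that a CPU definition lists and that can also be requested on its own, whereas a boolean set during CPU-specific initialisation is tied to the CPU. The sketch below models only the feature-bit side. `Cpu`, `FeatureSet`, `tuneFeaturesFor`, and `SubtargetSketch` are hypothetical names, not LLVM's subtarget classes; the enumerator borrows the name proposed in the review.

```cpp
#include <bitset>
#include <cassert>

// Hypothetical feature list with a single tuning bit.
enum Feature { FeatureUseFixedOverScalableIfEqualCost, NumFeatures };
enum class Cpu { Generic, NeoverseV2 };
using FeatureSet = std::bitset<NumFeatures>;

// Table-like mapping from CPU to its default tuning features, loosely
// mirroring a processor definition that lists the feature.
inline FeatureSet tuneFeaturesFor(Cpu C) {
  FeatureSet F;
  if (C == Cpu::NeoverseV2)
    F.set(FeatureUseFixedOverScalableIfEqualCost);
  return F;
}

// Hypothetical subtarget: CPU defaults merged with explicitly requested
// features, so the bit can be enabled without selecting that CPU.
struct SubtargetSketch {
  FeatureSet Features;

  explicit SubtargetSketch(Cpu C, FeatureSet Extra = {})
      : Features(tuneFeaturesFor(C) | Extra) {}

  bool useFixedOverScalableIfEqualCost() const {
    return Features.test(FeatureUseFixedOverScalableIfEqualCost);
  }
};
```

Under these assumptions the preference is on by default for the one CPU that lists it, and a test or another configuration can still opt in by passing the bit explicitly.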

davemgreen · 2024-06-18T16:26:09Z

Could you combine this into #95819? They look atomic together and we would probably want them either both in or both out together.

sjoerdmeijer · 2024-06-19T09:54:25Z

> Could you combine this into #95819? They look atomic together and we would probably want them either both in or both out together.

Yep, sure, will do. Makes my life a bit easier too. Cheers.

paulwalker-arm · 2024-09-05T15:34:32Z

@sjoerdmeijer Is this PR still relevant?

llvmbot added the backend:AArch64 and llvm:analysis labels Jun 17, 2024

sjoerdmeijer mentioned this pull request Jun 17, 2024

[LV][AArch64] Prefer Fixed over Scalable if cost-model is equal (Neoverse V2) #95819

Merged

sjoerdmeijer requested review from davemgreen, paulwalker-arm and david-arm June 17, 2024 18:10

sjoerdmeijer force-pushed the tti-prefer-fixed branch from 9e3fbe4 to 0a3352a Compare June 17, 2024 18:19

davemgreen reviewed Jun 18, 2024

david-arm removed their request for review October 9, 2024 09:20

paulwalker-arm removed their request for review October 11, 2024 09:37

sjoerdmeijer closed this Nov 14, 2024

sjoerdmeijer deleted the tti-prefer-fixed branch November 14, 2024 09:49
