Skip to content

[ARM]Adjust cost of muls in (U/S)MLAL and patterns #122713

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Mar 19, 2025
Merged

Conversation

nasherm
Copy link
Contributor

@nasherm nasherm commented Jan 13, 2025

PR #117350 made changes to the SLP vectorizer which introduced a regression on some ARM benchmarks. Investigation narrowed it down to suboptimal codegen for benchmarks that previously only used scalar SMLAL instructions. The linked change meant the SLPVectorizer thought that these could be vectorized. This change makes the cost of muls in (U/S)MLAL patterns slightly cheaper to make sure scalar instructions are preferred in these cases over SLP vectorization

@llvmbot llvmbot added backend:ARM vectorizers llvm:analysis Includes value tracking, cost tables and constant folding llvm:transforms labels Jan 13, 2025
@llvmbot
Copy link
Member

llvmbot commented Jan 13, 2025

@llvm/pr-subscribers-vectorizers
@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-llvm-analysis

Author: Nashe Mncube (nasherm)

Changes

PR #117350 made changes to the SLP vectorizer which introduced a regression on some ARM benchmarks. This was due to the changes assuming that SExt/ZExt vector instructions have constant cost. This behaviour is expected for RISCV but not on ARM where we take into account source and destination type of SExt/ZExt instructions when calculating vector cost.


Full diff: https://github.com/llvm/llvm-project/pull/122713.diff

6 Files Affected:

  • (modified) llvm/include/llvm/Analysis/TargetTransformInfo.h (+9)
  • (modified) llvm/include/llvm/Analysis/TargetTransformInfoImpl.h (+2)
  • (modified) llvm/include/llvm/CodeGen/BasicTTIImpl.h (+3)
  • (modified) llvm/lib/Analysis/TargetTransformInfo.cpp (+8)
  • (modified) llvm/lib/Target/ARM/ARMTargetTransformInfo.h (+5)
  • (modified) llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp (+2-1)
diff --git a/llvm/include/llvm/Analysis/TargetTransformInfo.h b/llvm/include/llvm/Analysis/TargetTransformInfo.h
index 752313ab15858c..bac6500ed7184d 100644
--- a/llvm/include/llvm/Analysis/TargetTransformInfo.h
+++ b/llvm/include/llvm/Analysis/TargetTransformInfo.h
@@ -1184,6 +1184,11 @@ class TargetTransformInfo {
   /// \return true if vscale is known to be a power of 2
   bool isVScaleKnownToBeAPowerOfTwo() const;
 
+  // \return true if vector implementations assume that SExt and ZExt
+  // instructions have a fixed cost.
+  bool isSExtCostConstant() const;
+  bool isZExtCostConstant() const;
+
   /// \return True if the vectorization factor should be chosen to
   /// make the vector of the smallest element type match the size of a
   /// vector register. For wider element types, this could result in
@@ -2065,6 +2070,8 @@ class TargetTransformInfo::Concept {
   virtual std::optional<unsigned> getMaxVScale() const = 0;
   virtual std::optional<unsigned> getVScaleForTuning() const = 0;
   virtual bool isVScaleKnownToBeAPowerOfTwo() const = 0;
+  virtual bool isSExtCostConstant() const = 0;
+  virtual bool isZExtCostConstant() const = 0;
   virtual bool
   shouldMaximizeVectorBandwidth(TargetTransformInfo::RegisterKind K) const = 0;
   virtual ElementCount getMinimumVF(unsigned ElemWidth,
@@ -2719,6 +2726,8 @@ class TargetTransformInfo::Model final : public TargetTransformInfo::Concept {
   bool isVScaleKnownToBeAPowerOfTwo() const override {
     return Impl.isVScaleKnownToBeAPowerOfTwo();
   }
+  bool isSExtCostConstant() const override { return Impl.isSExtCostConstant(); }
+  bool isZExtCostConstant() const override { return Impl.isZExtCostConstant(); }
   bool shouldMaximizeVectorBandwidth(
       TargetTransformInfo::RegisterKind K) const override {
     return Impl.shouldMaximizeVectorBandwidth(K);
diff --git a/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h b/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
index 9c74b2a0c31df1..3120e27139d358 100644
--- a/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
+++ b/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
@@ -530,6 +530,8 @@ class TargetTransformInfoImplBase {
   std::optional<unsigned> getMaxVScale() const { return std::nullopt; }
   std::optional<unsigned> getVScaleForTuning() const { return std::nullopt; }
   bool isVScaleKnownToBeAPowerOfTwo() const { return false; }
+  bool isSExtCostConstant() const { return true; }
+  bool isZExtCostConstant() const { return true; }
 
   bool
   shouldMaximizeVectorBandwidth(TargetTransformInfo::RegisterKind K) const {
diff --git a/llvm/include/llvm/CodeGen/BasicTTIImpl.h b/llvm/include/llvm/CodeGen/BasicTTIImpl.h
index c9f142d64ae9e4..5785aa8d74b61c 100644
--- a/llvm/include/llvm/CodeGen/BasicTTIImpl.h
+++ b/llvm/include/llvm/CodeGen/BasicTTIImpl.h
@@ -801,6 +801,9 @@ class BasicTTIImplBase : public TargetTransformInfoImplCRTPBase<T> {
   std::optional<unsigned> getVScaleForTuning() const { return std::nullopt; }
   bool isVScaleKnownToBeAPowerOfTwo() const { return false; }
 
+  bool isSExtCostConstant() const { return true; }
+  bool isZExtCostConstant() const { return true; }
+
   /// Estimate the overhead of scalarizing an instruction. Insert and Extract
   /// are set if the demanded result elements need to be inserted and/or
   /// extracted from vectors.
diff --git a/llvm/lib/Analysis/TargetTransformInfo.cpp b/llvm/lib/Analysis/TargetTransformInfo.cpp
index b32dffa9f0fe86..63baea591bac45 100644
--- a/llvm/lib/Analysis/TargetTransformInfo.cpp
+++ b/llvm/lib/Analysis/TargetTransformInfo.cpp
@@ -1405,6 +1405,14 @@ bool TargetTransformInfo::isProfitableToSinkOperands(
   return TTIImpl->isProfitableToSinkOperands(I, OpsToSink);
 }
 
+bool TargetTransformInfo::isSExtCostConstant() const {
+  return TTIImpl->isSExtCostConstant();
+}
+
+bool TargetTransformInfo::isZExtCostConstant() const {
+  return TTIImpl->isZExtCostConstant();
+}
+
 bool TargetTransformInfo::isVectorShiftByScalarCheap(Type *Ty) const {
   return TTIImpl->isVectorShiftByScalarCheap(Ty);
 }
diff --git a/llvm/lib/Target/ARM/ARMTargetTransformInfo.h b/llvm/lib/Target/ARM/ARMTargetTransformInfo.h
index 3a4f940088b2e3..c2522b53987a3b 100644
--- a/llvm/lib/Target/ARM/ARMTargetTransformInfo.h
+++ b/llvm/lib/Target/ARM/ARMTargetTransformInfo.h
@@ -109,6 +109,11 @@ class ARMTTIImpl : public BasicTTIImplBase<ARMTTIImpl> {
 
   bool enableInterleavedAccessVectorization() { return true; }
 
+  // Cost model for vector SExt and ZExt takes into account
+  // source and destination vector type for MVE and NEON.
+  bool isSExtCostConstant() const { return false; }
+  bool isZExtCostConstant() const { return false; }
+
   TTI::AddressingModeKind
     getPreferredAddressingMode(const Loop *L, ScalarEvolution *SE) const;
 
diff --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
index c4582df89213d8..7813c1248f9fcf 100644
--- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -11440,7 +11440,8 @@ BoUpSLP::getEntryCost(const TreeEntry *E, ArrayRef<Value *> VectorizedVals,
                                 I->getOpcode());
           });
       if (IsArithmeticExtendedReduction &&
-          (VecOpcode == Instruction::ZExt || VecOpcode == Instruction::SExt))
+          (VecOpcode == Instruction::ZExt || VecOpcode == Instruction::SExt) &&
+          (TTI->isZExtCostConstant() && TTI->isSExtCostConstant()))
         return CommonCost;
       return CommonCost +
              TTI->getCastInstrCost(VecOpcode, VecTy, SrcVecTy, CCH, CostKind,

@llvmbot
Copy link
Member

llvmbot commented Jan 13, 2025

@llvm/pr-subscribers-backend-arm

Author: Nashe Mncube (nasherm)

Changes

PR #117350 made changes to the SLP vectorizer which introduced a regression on some ARM benchmarks. This was due to the changes assuming that SExt/ZExt vector instructions have constant cost. This behaviour is expected for RISCV but not on ARM where we take into account source and destination type of SExt/ZExt instructions when calculating vector cost.


Full diff: https://github.com/llvm/llvm-project/pull/122713.diff

6 Files Affected:

  • (modified) llvm/include/llvm/Analysis/TargetTransformInfo.h (+9)
  • (modified) llvm/include/llvm/Analysis/TargetTransformInfoImpl.h (+2)
  • (modified) llvm/include/llvm/CodeGen/BasicTTIImpl.h (+3)
  • (modified) llvm/lib/Analysis/TargetTransformInfo.cpp (+8)
  • (modified) llvm/lib/Target/ARM/ARMTargetTransformInfo.h (+5)
  • (modified) llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp (+2-1)
diff --git a/llvm/include/llvm/Analysis/TargetTransformInfo.h b/llvm/include/llvm/Analysis/TargetTransformInfo.h
index 752313ab15858c..bac6500ed7184d 100644
--- a/llvm/include/llvm/Analysis/TargetTransformInfo.h
+++ b/llvm/include/llvm/Analysis/TargetTransformInfo.h
@@ -1184,6 +1184,11 @@ class TargetTransformInfo {
   /// \return true if vscale is known to be a power of 2
   bool isVScaleKnownToBeAPowerOfTwo() const;
 
+  // \return true if vector implementations assume that SExt and ZExt
+  // instructions have a fixed cost.
+  bool isSExtCostConstant() const;
+  bool isZExtCostConstant() const;
+
   /// \return True if the vectorization factor should be chosen to
   /// make the vector of the smallest element type match the size of a
   /// vector register. For wider element types, this could result in
@@ -2065,6 +2070,8 @@ class TargetTransformInfo::Concept {
   virtual std::optional<unsigned> getMaxVScale() const = 0;
   virtual std::optional<unsigned> getVScaleForTuning() const = 0;
   virtual bool isVScaleKnownToBeAPowerOfTwo() const = 0;
+  virtual bool isSExtCostConstant() const = 0;
+  virtual bool isZExtCostConstant() const = 0;
   virtual bool
   shouldMaximizeVectorBandwidth(TargetTransformInfo::RegisterKind K) const = 0;
   virtual ElementCount getMinimumVF(unsigned ElemWidth,
@@ -2719,6 +2726,8 @@ class TargetTransformInfo::Model final : public TargetTransformInfo::Concept {
   bool isVScaleKnownToBeAPowerOfTwo() const override {
     return Impl.isVScaleKnownToBeAPowerOfTwo();
   }
+  bool isSExtCostConstant() const override { return Impl.isSExtCostConstant(); }
+  bool isZExtCostConstant() const override { return Impl.isZExtCostConstant(); }
   bool shouldMaximizeVectorBandwidth(
       TargetTransformInfo::RegisterKind K) const override {
     return Impl.shouldMaximizeVectorBandwidth(K);
diff --git a/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h b/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
index 9c74b2a0c31df1..3120e27139d358 100644
--- a/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
+++ b/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
@@ -530,6 +530,8 @@ class TargetTransformInfoImplBase {
   std::optional<unsigned> getMaxVScale() const { return std::nullopt; }
   std::optional<unsigned> getVScaleForTuning() const { return std::nullopt; }
   bool isVScaleKnownToBeAPowerOfTwo() const { return false; }
+  bool isSExtCostConstant() const { return true; }
+  bool isZExtCostConstant() const { return true; }
 
   bool
   shouldMaximizeVectorBandwidth(TargetTransformInfo::RegisterKind K) const {
diff --git a/llvm/include/llvm/CodeGen/BasicTTIImpl.h b/llvm/include/llvm/CodeGen/BasicTTIImpl.h
index c9f142d64ae9e4..5785aa8d74b61c 100644
--- a/llvm/include/llvm/CodeGen/BasicTTIImpl.h
+++ b/llvm/include/llvm/CodeGen/BasicTTIImpl.h
@@ -801,6 +801,9 @@ class BasicTTIImplBase : public TargetTransformInfoImplCRTPBase<T> {
   std::optional<unsigned> getVScaleForTuning() const { return std::nullopt; }
   bool isVScaleKnownToBeAPowerOfTwo() const { return false; }
 
+  bool isSExtCostConstant() const { return true; }
+  bool isZExtCostConstant() const { return true; }
+
   /// Estimate the overhead of scalarizing an instruction. Insert and Extract
   /// are set if the demanded result elements need to be inserted and/or
   /// extracted from vectors.
diff --git a/llvm/lib/Analysis/TargetTransformInfo.cpp b/llvm/lib/Analysis/TargetTransformInfo.cpp
index b32dffa9f0fe86..63baea591bac45 100644
--- a/llvm/lib/Analysis/TargetTransformInfo.cpp
+++ b/llvm/lib/Analysis/TargetTransformInfo.cpp
@@ -1405,6 +1405,14 @@ bool TargetTransformInfo::isProfitableToSinkOperands(
   return TTIImpl->isProfitableToSinkOperands(I, OpsToSink);
 }
 
+bool TargetTransformInfo::isSExtCostConstant() const {
+  return TTIImpl->isSExtCostConstant();
+}
+
+bool TargetTransformInfo::isZExtCostConstant() const {
+  return TTIImpl->isZExtCostConstant();
+}
+
 bool TargetTransformInfo::isVectorShiftByScalarCheap(Type *Ty) const {
   return TTIImpl->isVectorShiftByScalarCheap(Ty);
 }
diff --git a/llvm/lib/Target/ARM/ARMTargetTransformInfo.h b/llvm/lib/Target/ARM/ARMTargetTransformInfo.h
index 3a4f940088b2e3..c2522b53987a3b 100644
--- a/llvm/lib/Target/ARM/ARMTargetTransformInfo.h
+++ b/llvm/lib/Target/ARM/ARMTargetTransformInfo.h
@@ -109,6 +109,11 @@ class ARMTTIImpl : public BasicTTIImplBase<ARMTTIImpl> {
 
   bool enableInterleavedAccessVectorization() { return true; }
 
+  // Cost model for vector SExt and ZExt takes into account
+  // source and destination vector type for MVE and NEON.
+  bool isSExtCostConstant() const { return false; }
+  bool isZExtCostConstant() const { return false; }
+
   TTI::AddressingModeKind
     getPreferredAddressingMode(const Loop *L, ScalarEvolution *SE) const;
 
diff --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
index c4582df89213d8..7813c1248f9fcf 100644
--- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -11440,7 +11440,8 @@ BoUpSLP::getEntryCost(const TreeEntry *E, ArrayRef<Value *> VectorizedVals,
                                 I->getOpcode());
           });
       if (IsArithmeticExtendedReduction &&
-          (VecOpcode == Instruction::ZExt || VecOpcode == Instruction::SExt))
+          (VecOpcode == Instruction::ZExt || VecOpcode == Instruction::SExt) &&
+          (TTI->isZExtCostConstant() && TTI->isSExtCostConstant()))
         return CommonCost;
       return CommonCost +
              TTI->getCastInstrCost(VecOpcode, VecTy, SrcVecTy, CCH, CostKind,

@alexey-bataev
Copy link
Member

You just need to modify TTI::getExtendedReductionCost function, no changes in SLP are required. Also, provide a test

@nasherm
Copy link
Contributor Author

nasherm commented Jan 20, 2025

You just need to modify TTI::getExtendedReductionCost function, no changes in SLP are required. Also, provide a test

I've updated the patch with this in mind

Copy link
Collaborator

@davemgreen davemgreen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The MVE instruction that getExtendedReductionCost is trying to model is vaddlv, which is a i64 vecreduce(v4i64 sext(v4i32)). So it would seem odd to add the cost of the extend again to it. BaseT::getExtendedReductionCost includes the cast cost too, so it would look like this is double-counting.

This is an example of MVE using the vaddlv instruction, which looks decent in this small example: https://llvm.godbolt.org/z/oTrMTcM4E

@nasherm nasherm force-pushed the onb-slp branch 3 times, most recently from bb1060c to 28b9d6a Compare January 21, 2025 13:09
@nasherm nasherm requested a review from davemgreen January 21, 2025 14:16
@nasherm nasherm marked this pull request as draft January 22, 2025 10:17
@nasherm
Copy link
Contributor Author

nasherm commented Jan 22, 2025

My most recent patch fixes the regression with a one line change to ARMTTI. From what I understand, MVE vectors greater than 128 bits aren't handled very well. It appears that the changes to SLP revealed a bug where we were allowing extensions of the type vadd v4i32 -> vadd v4i64.

@davemgreen does my fix make sense with this reasoning? I'm skeptical as I suspect I'm missing something which justifies the original code which allowed this extension.

@nasherm nasherm marked this pull request as ready for review January 22, 2025 13:04
Copy link
Collaborator

@davemgreen davemgreen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is usually the input type that matters for MVE, and a vadd v4i32 -> vadd v4i64 reduction is still a single vaddlv. The comment IIRC was talking about things like v8i32->i64 reductions, that would need to be split into two v4i32 reductions but is difficult under MVE due to the way predicates and extensions work.

There are some Loop-vectorize test that would need to be updated, but I believe we would still want to generate an extending reduction pattern. There might be something more complex going on where a mixture of MVE and scalar code doesn't perform as well as sticking with scalar. Or maybe the SLP vectorizer isn't handling multiple uses properly, or the scalar cost should be a little lower for SMLAL DSP instructions.

@nasherm nasherm changed the title [ARM][SLP] Fix cost function for SLP Vectorization of ZExt/SExt [ARM][SLP] Adjust cost returned for vadd reduction Jan 30, 2025
@nasherm
Copy link
Contributor Author

nasherm commented Jan 30, 2025

I've opted for adjusting the cost returned by getExtendedReductionCost for vadd reductions to be slightly more pessimistic. This returns performance numbers on regressed benchmarks while not changing SLPVectorizer and making minimal changes to the rest of the compiler.

@davemgreen
Copy link
Collaborator

There is a test failing, which I think says "we shouldn't do this". I would expect the cost of a VADDV and VADDLV to be about the same as other MVE instructions, unless we have a strong reason otherwise.

nasherm added 2 commits March 5, 2025 17:06
…SExt

PR llvm#117350 made changes to the SLP vectorizer which introduced
a regression on ARM vectorization benchmarks. This was due
to the changes assuming that SExt/ZExt vector instructions have
constant cost. This behaviour is expected for RISCV but not on ARM
where we take into account source and destination type of SExt/ZExt
instructions when calculating vector cost.

Change-Id: I6f995dcde26e5aaf62b779b63e52988fb333f941
Change-Id: Ie2795e0e5ebb0589146eaf07c752410e307a36e6
Changes to the SLPVectorizer caused a regression
in some benchmarks which targetted cores that support
both DSP and MVE instructions. The particular regression
has been reduced to a case where MUL instructions that
are part of a chain of instructions that can be replaced
with DSP SMLAL may also be vectorized. The generated
code ends up being an inefficient of both scalar
and vector ops rather than leaning to one or the other.
By reducing the cost of these MUL instructions in these
patterns we recover lost performance.

Change-Id: I302817cf4fcd18a11d40fba430c44e034a36448b
@nasherm nasherm changed the title [ARM][SLP] Adjust cost returned for vadd reduction [ARM]Adjust cost of muls in SMLAL patterns Mar 12, 2025
@nasherm
Copy link
Contributor Author

nasherm commented Mar 12, 2025

Update:
After revisiting this I narrowed down the cause of our regressions. Changes to the SLPVectorizer caused a regression in some benchmarks which targeted cores that support both DSP and MVE instructions. The particular regression has been reduced to a case where MUL instructions that are part of a chain of instructions that can be replaced with DSP SMLAL may also be vectorized. The generated code ends up being an inefficient combination of both scalar and vector ops rather than leaning to one or the other. By reducing the cost of these MUL instructions in these patterns we recover lost performance.

This means that my changes don't touch the SLPVectorizer and are reduced to just some changes to the cost model

@nasherm nasherm requested a review from davemgreen March 12, 2025 16:38
Copy link
Collaborator

@davemgreen davemgreen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good - it looks like this will work quite well.

return false;

auto IsSExtInst = [](const Value *V) -> bool {
return (dyn_cast<SExtInst>(V)) ? true : false;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return isa(V);

Is it worth adding ZExt at the same time for UMULL?

Copy link
Contributor Author

@nasherm nasherm Mar 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Surprisingly no. Vectorization for my reproducer using unsigned types is done quite well

Correction. I double checked and yes this may be beneficial

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So by adding ZExt we do avoid mixing vector and scalar instructions. The generated code doesn't take advantage of UMULL instructions however and just sticks with simple scalar ops. This was the behavior before the changes to the SLPVectorizer were made.

I do think that once this is merged there is further improvement to be done on SMLAL code gen which I intend to investigate i.e further code folding using DSP instructions. I could add UMULL codegen into my investigation as well.

Regardless adding ZExt support will just return previous behavior so I don't believe it to be a concern.

Change-Id: Ie8c40972ae07e568d767ace37b9dda0f77272569
Copy link
Collaborator

@davemgreen davemgreen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks - other than the inline comments this looks good to me.

Change-Id: I0c20291a8a1d2cbc895890d02aea963323ee48cb
@nasherm nasherm changed the title [ARM]Adjust cost of muls in SMLAL patterns [ARM]Adjust cost of muls in (U/S)MLAL and patterns Mar 18, 2025
Copy link
Collaborator

@davemgreen davemgreen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks.

Change-Id: I9ce62da8405a08ece5e36c7e1ea0bb93e6f0b6c4
@nasherm nasherm merged commit 4ddc8df into llvm:main Mar 19, 2025
11 checks passed
nasherm added a commit to nasherm/llvm-project that referenced this pull request Mar 19, 2025
Buildbot failures were caused by PR llvm#122713. This
was due to unused captures in a lambda function.

Change-Id: Ie0907396c2551190dd982bfa180934b4075271d8
auto IsExtInst = [](const Value *V) -> bool {
return isa<ZExtInst>(V) || isa<SExtInst>(V);
};
auto IsExtensionFromHalf = [&, IsExtInst](const Value *V) -> bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IsExtInst not used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Florian, yes. I saw it triggered some buildbot failures. I opened #132018 to fix this. Is it preferable to use this merged branch?

Comment on lines +1506 to +1509
for (auto *U : I->users())
if (!IsExtInst(U))
return false;
return true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this just be

return all_of(I->users(), IsExtInst)?

nasherm added a commit that referenced this pull request Mar 19, 2025
Buildbot failures were caused by PR #122713. This was due to unused captures in a lambda function.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:ARM llvm:analysis Includes value tracking, cost tables and constant folding llvm:transforms vectorizers
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants