Skip to content

[IR] Allow fast math flags on calls with homogeneous FP struct types #110506

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Oct 2, 2024

Conversation

MacDue
Copy link
Member

@MacDue MacDue commented Sep 30, 2024

This extends FPMathOperator to allow calls that return literal structs of homogeneous floating-point or vector-of-floating-point types.

The intended use case for this is to support FP intrinsics that return multiple values (such as llvm.sincos).

This extends FPMathOperator to allow calls that return literal structs
of homogeneous floating-point or vector-of-floating-point types.

The intended use case for this is to support FP intrinsics that return
multiple values (such as `llvm.sincos`).
@llvmbot
Copy link
Member

llvmbot commented Sep 30, 2024

@llvm/pr-subscribers-clang
@llvm/pr-subscribers-backend-x86

@llvm/pr-subscribers-llvm-ir

Author: Benjamin Maxwell (MacDue)

Changes

This extends FPMathOperator to allow calls that return literal structs of homogeneous floating-point or vector-of-floating-point types.

The intended use case for this is to support FP intrinsics that return multiple values (such as llvm.sincos).


Full diff: https://github.com/llvm/llvm-project/pull/110506.diff

6 Files Affected:

  • (modified) llvm/docs/LangRef.rst (+10-9)
  • (modified) llvm/include/llvm/IR/DerivedTypes.h (+4)
  • (modified) llvm/include/llvm/IR/Operator.h (+12-2)
  • (modified) llvm/lib/IR/Type.cpp (+7-6)
  • (modified) llvm/test/Bitcode/compatibility.ll (+20)
  • (modified) llvm/unittests/IR/InstructionsTest.cpp (+34-6)
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 3f39d58b322a4f..1eb2982385fda0 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -12472,9 +12472,8 @@ instruction's return value on the same edge).
 The optional ``fast-math-flags`` marker indicates that the phi has one
 or more :ref:`fast-math-flags <fastmath>`. These are optimization hints
 to enable otherwise unsafe floating-point optimizations. Fast-math-flags
-are only valid for phis that return a floating-point scalar or vector
-type, or an array (nested to any depth) of floating-point scalar or vector
-types.
+are only valid for phis that return a floating-point scalar or vector type,
+possibly within an array (nested to any depth), or a homogeneous struct literal.
 
 Semantics:
 """"""""""
@@ -12523,8 +12522,8 @@ class <t_firstclass>` type.
 #. The optional ``fast-math flags`` marker indicates that the select has one or more
    :ref:`fast-math flags <fastmath>`. These are optimization hints to enable
    otherwise unsafe floating-point optimizations. Fast-math flags are only valid
-   for selects that return a floating-point scalar or vector type, or an array
-   (nested to any depth) of floating-point scalar or vector types.
+   for selects that return a floating-point scalar or vector type, possibly
+   within an array (nested to any depth), or a homogeneous struct literal.
 
 Semantics:
 """"""""""
@@ -12762,8 +12761,8 @@ This instruction requires several arguments:
 #. The optional ``fast-math flags`` marker indicates that the call has one or more
    :ref:`fast-math flags <fastmath>`, which are optimization hints to enable
    otherwise unsafe floating-point optimizations. Fast-math flags are only valid
-   for calls that return a floating-point scalar or vector type, or an array
-   (nested to any depth) of floating-point scalar or vector types.
+   for calls that return a floating-point scalar or vector type, possibly within
+   an array (nested to any depth), or a homogeneous struct literal.
 
 #. The optional "cconv" marker indicates which :ref:`calling
    convention <callingconv>` the call should use. If none is
@@ -20528,7 +20527,8 @@ the explicit vector length.
    more :ref:`fast-math flags <fastmath>`. These are optimization hints to
    enable otherwise unsafe floating-point optimizations. Fast-math flags are
    only valid for selects that return a floating-point scalar or vector type,
-   or an array (nested to any depth) of floating-point scalar or vector types.
+   possibly within an array (nested to any depth), or a homogeneous struct
+   literal.
 
 Semantics:
 """"""""""
@@ -20586,7 +20586,8 @@ is the pivot.
    more :ref:`fast-math flags <fastmath>`. These are optimization hints to
    enable otherwise unsafe floating-point optimizations. Fast-math flags are
    only valid for merges that return a floating-point scalar or vector type,
-   or an array (nested to any depth) of floating-point scalar or vector types.
+   possibly within an array (nested to any depth), or a homogeneous struct
+   literal.
 
 Semantics:
 """"""""""
diff --git a/llvm/include/llvm/IR/DerivedTypes.h b/llvm/include/llvm/IR/DerivedTypes.h
index 975c142f1a4572..a24801d8bdf834 100644
--- a/llvm/include/llvm/IR/DerivedTypes.h
+++ b/llvm/include/llvm/IR/DerivedTypes.h
@@ -301,6 +301,10 @@ class StructType : public Type {
   ///  {<vscale x 2 x i32>, <vscale x 4 x i64>}}
   bool containsHomogeneousScalableVectorTypes() const;
 
+  /// Return true if this struct is non-empty and all element types are the
+  /// same.
+  bool containsHomogeneousTypes() const;
+
   /// Return true if this is a named struct that has a non-empty name.
   bool hasName() const { return SymbolTableEntry != nullptr; }
 
diff --git a/llvm/include/llvm/IR/Operator.h b/llvm/include/llvm/IR/Operator.h
index 88b9bfc0be4b15..22ffcc730e7b68 100644
--- a/llvm/include/llvm/IR/Operator.h
+++ b/llvm/include/llvm/IR/Operator.h
@@ -15,6 +15,7 @@
 #define LLVM_IR_OPERATOR_H
 
 #include "llvm/ADT/MapVector.h"
+#include "llvm/ADT/TypeSwitch.h"
 #include "llvm/IR/Constants.h"
 #include "llvm/IR/FMF.h"
 #include "llvm/IR/GEPNoWrapFlags.h"
@@ -351,8 +352,17 @@ class FPMathOperator : public Operator {
     case Instruction::Select:
     case Instruction::Call: {
       Type *Ty = V->getType();
-      while (ArrayType *ArrTy = dyn_cast<ArrayType>(Ty))
-        Ty = ArrTy->getElementType();
+      TypeSwitch<Type *>(Ty)
+          .Case([&](StructType *StructTy) {
+            if (!StructTy->isLiteral() || !StructTy->containsHomogeneousTypes())
+              return;
+            Ty = StructTy->elements().front();
+          })
+          .Case([&](ArrayType *ArrTy) {
+            do {
+              Ty = ArrTy->getElementType();
+            } while ((ArrTy = dyn_cast<ArrayType>(Ty)));
+          });
       return Ty->isFPOrFPVectorTy();
     }
     default:
diff --git a/llvm/lib/IR/Type.cpp b/llvm/lib/IR/Type.cpp
index 3784ad28d7219d..f618263f79c313 100644
--- a/llvm/lib/IR/Type.cpp
+++ b/llvm/lib/IR/Type.cpp
@@ -430,13 +430,14 @@ bool StructType::containsScalableVectorType(
 }
 
 bool StructType::containsHomogeneousScalableVectorTypes() const {
-  Type *FirstTy = getNumElements() > 0 ? elements()[0] : nullptr;
-  if (!FirstTy || !isa<ScalableVectorType>(FirstTy))
+  if (getNumElements() <= 0 || !isa<ScalableVectorType>(elements().front()))
     return false;
-  for (Type *Ty : elements())
-    if (Ty != FirstTy)
-      return false;
-  return true;
+  return containsHomogeneousTypes();
+}
+
+bool StructType::containsHomogeneousTypes() const {
+  ArrayRef<Type *> ElementTys = elements();
+  return !ElementTys.empty() && all_equal(ElementTys);
 }
 
 void StructType::setBody(ArrayRef<Type*> Elements, bool isPacked) {
diff --git a/llvm/test/Bitcode/compatibility.ll b/llvm/test/Bitcode/compatibility.ll
index ea29ff634a43bb..4fe9d9b11f8831 100644
--- a/llvm/test/Bitcode/compatibility.ll
+++ b/llvm/test/Bitcode/compatibility.ll
@@ -1122,6 +1122,26 @@ define void @fastMathFlagsForArrayCalls([2 x float] %f, [2 x double] %d1, [2 x <
   ret void
 }
 
+declare { float, float } @fmf_struct_f32()
+declare { double, double } @fmf_struct_f64()
+declare { <4 x double>, <4 x double> } @fmf_struct_v4f64()
+
+; CHECK-LABEL: fastMathFlagsForStructCalls(
+define void @fastMathFlagsForStructCalls({ float, float } %f, { double, double } %d1, { <4 x double>, <4 x double> } %d2) {
+  %call.fast = call fast { float, float } @fmf_struct_f32()
+  ; CHECK: %call.fast = call fast { float, float } @fmf_struct_f32()
+
+  ; Throw in some other attributes to make sure those stay in the right places.
+
+  %call.nsz.arcp = notail call nsz arcp { double, double } @fmf_struct_f64()
+  ; CHECK: %call.nsz.arcp = notail call nsz arcp { double, double } @fmf_struct_f64()
+
+  %call.nnan.ninf = tail call nnan ninf fastcc { <4 x double>, <4 x double> } @fmf_struct_v4f64()
+  ; CHECK: %call.nnan.ninf = tail call nnan ninf fastcc { <4 x double>, <4 x double> } @fmf_struct_v4f64()
+
+  ret void
+}
+
 ;; Type System
 %opaquety = type opaque
 define void @typesystem() {
diff --git a/llvm/unittests/IR/InstructionsTest.cpp b/llvm/unittests/IR/InstructionsTest.cpp
index 481fe96607e48e..9d8056a768af2f 100644
--- a/llvm/unittests/IR/InstructionsTest.cpp
+++ b/llvm/unittests/IR/InstructionsTest.cpp
@@ -1559,12 +1559,40 @@ TEST(InstructionsTest, FPCallIsFPMathOperator) {
       CallInst::Create(AVFFnTy, AVFCallee, {}, ""));
   EXPECT_TRUE(isa<FPMathOperator>(AVFCall));
 
-  Type *AAVFTy = ArrayType::get(AVFTy, 2);
-  FunctionType *AAVFFnTy = FunctionType::get(AAVFTy, {});
-  Value *AAVFCallee = Constant::getNullValue(PtrTy);
-  std::unique_ptr<CallInst> AAVFCall(
-      CallInst::Create(AAVFFnTy, AAVFCallee, {}, ""));
-  EXPECT_TRUE(isa<FPMathOperator>(AAVFCall));
+  Type *StructITy = StructType::get(ITy, ITy);
+  FunctionType *StructIFnTy = FunctionType::get(StructITy, {});
+  Value *StructICallee = Constant::getNullValue(PtrTy);
+  std::unique_ptr<CallInst> StructICall(
+      CallInst::Create(StructIFnTy, StructICallee, {}, ""));
+  EXPECT_FALSE(isa<FPMathOperator>(StructICall));
+
+  Type *NamedStructFTy = StructType::create({FTy, FTy}, "AStruct");
+  FunctionType *NamedStructFFnTy = FunctionType::get(NamedStructFTy, {});
+  Value *NamedStructFCallee = Constant::getNullValue(PtrTy);
+  std::unique_ptr<CallInst> NamedStructFCall(
+      CallInst::Create(NamedStructFFnTy, NamedStructFCallee, {}, ""));
+  EXPECT_FALSE(isa<FPMathOperator>(NamedStructFCall));
+
+  Type *MixedStructTy = StructType::get(FTy, ITy);
+  FunctionType *MixedStructFnTy = FunctionType::get(MixedStructTy, {});
+  Value *MixedStructCallee = Constant::getNullValue(PtrTy);
+  std::unique_ptr<CallInst> MixedStructCall(
+      CallInst::Create(MixedStructFnTy, MixedStructCallee, {}, ""));
+  EXPECT_FALSE(isa<FPMathOperator>(MixedStructCall));
+
+  Type *StructFTy = StructType::get(FTy, FTy);
+  FunctionType *StructFFnTy = FunctionType::get(StructFTy, {});
+  Value *StructFCallee = Constant::getNullValue(PtrTy);
+  std::unique_ptr<CallInst> StructFCall(
+      CallInst::Create(StructFFnTy, StructFCallee, {}, ""));
+  EXPECT_TRUE(isa<FPMathOperator>(StructFCall));
+
+  Type *StructVFTy = StructType::get(VFTy, VFTy);
+  FunctionType *StructVFFnTy = FunctionType::get(StructVFTy, {});
+  Value *StructVFCallee = Constant::getNullValue(PtrTy);
+  std::unique_ptr<CallInst> StructVFCall(
+      CallInst::Create(StructVFFnTy, StructVFCallee, {}, ""));
+  EXPECT_TRUE(isa<FPMathOperator>(StructVFCall));
 }
 
 TEST(InstructionsTest, FNegInstruction) {

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with this extension.

type, or an array (nested to any depth) of floating-point scalar or vector
types.
are only valid for phis that return a floating-point scalar or vector type,
possibly within an array (nested to any depth), or a homogeneous struct literal.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove "possibly", and elaborate on what a homogenous struct means (also it's not a literal?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be "homogeneous literal struct". "literal" here means "unnamed".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove "possibly"

I think removing that makes it sound like the float/vector type is required to be within an array or struct type.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just list the cases that work

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've listed the cases under the "Fast-Math Flags" section and updated the instructions to reference that. I think that's a little clearer than trying to squash call the cases into a single sentence (and makes future updates easier).

I've also updated one of the struct examples to point out it's a homogeneous struct.

@@ -1122,6 +1122,26 @@ define void @fastMathFlagsForArrayCalls([2 x float] %f, [2 x double] %d1, [2 x <
ret void
}

declare { float, float } @fmf_struct_f32()
declare { double, double } @fmf_struct_f64()
declare { <4 x double>, <4 x double> } @fmf_struct_v4f64()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test some cases with more than 2 entries

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And empty and an odd number of entries

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And complex nested structs

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nested structs and empty structs are not supported (so can't have the attributes in the IR without an error), so I'll instead add those tests to the InstructionsTest.cpp unittest


%call.nnan.ninf = tail call nnan ninf fastcc { <4 x double>, <4 x double> } @fmf_struct_v4f64()
; CHECK: %call.nnan.ninf = tail call nnan ninf fastcc { <4 x double>, <4 x double> } @fmf_struct_v4f64()

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about struct of array?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can add a test, but it's not a supported case (as I don't currently see a need for it).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also add a test with nofpclass attributes on the return / argument? The intent was it would be allowed for the same types as FPMathOperator

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nofpclass used a separate check, so I had to update it to support struct types (in the last commit). Not sure if it should be part of this PR, or moved to a later PR though?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either way

@llvmbot llvmbot added clang Clang issues not falling into any other category backend:X86 labels Sep 30, 2024
@MacDue MacDue force-pushed the fmf_struct_literals branch from ad12ad2 to 8d3353c Compare September 30, 2024 17:14
Copy link

github-actions bot commented Sep 30, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@MacDue MacDue force-pushed the fmf_struct_literals branch from 6e7bebc to 04b7d82 Compare September 30, 2024 18:55
Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the modified clang tests are failing on the pre-merge CI?

�_bk;t=1727733748123�Failed Tests (3):
�_bk;t=1727733748123�  Clang :: CodeGen/X86/cx-complex-range.c
�_bk;t=1727733748123�  Clang :: CodeGen/cx-complex-range.c
�_bk;t=1727733748123�  Clang :: CodeGen/nofpclass.c

@MacDue
Copy link
Member Author

MacDue commented Oct 1, 2024

It looks like the modified clang tests are failing on the pre-merge CI?

�_bk;t=1727733748123�Failed Tests (3):
�_bk;t=1727733748123�  Clang :: CodeGen/X86/cx-complex-range.c
�_bk;t=1727733748123�  Clang :: CodeGen/cx-complex-range.c
�_bk;t=1727733748123�  Clang :: CodeGen/nofpclass.c

Ah thanks, they need to be updated for the nofpclass changes too now 👍

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM from my side, but please wait on @arsenm's approval as well.

@MacDue MacDue force-pushed the fmf_struct_literals branch from b4ef422 to 4687a18 Compare October 1, 2024 15:34
@MacDue MacDue force-pushed the fmf_struct_literals branch from 4687a18 to 98bd284 Compare October 1, 2024 16:11
@MacDue MacDue merged commit 95f00a6 into llvm:main Oct 2, 2024
9 checks passed
@MacDue MacDue deleted the fmf_struct_literals branch October 2, 2024 09:05
Sterling-Augustine pushed a commit to Sterling-Augustine/llvm-project that referenced this pull request Oct 3, 2024
…lvm#110506)

This extends FPMathOperator to allow calls that return literal structs
of homogeneous floating-point or vector-of-floating-point types.

The intended use case for this is to support FP intrinsics that return
multiple values (such as `llvm.sincos`).
@mikaelholmen
Copy link
Collaborator

Hi,

I bisected a crash to this patch. I can't share the C reproducer but it's instcombine that crashes and a reduced reproducer for that is
opt -passes=instcombine bbi-99792.ll -o /dev/null

bbi-99792.ll.gz

@MacDue
Copy link
Member Author

MacDue commented Oct 4, 2024

Hi,

I bisected a crash to this patch. I can't share the C reproducer but it's instcombine that crashes and a reduced reproducer for that is
opt -passes=instcombine bbi-99792.ll -o /dev/null

bbi-99792.ll.gz

Thanks for reporting this, I'll have a look into the crash today 👍

@MacDue
Copy link
Member Author

MacDue commented Oct 4, 2024

It looks like this crash is not unique to struct types, I can reproduce it with array types too (which have been allowed for some time).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:X86 clang Clang issues not falling into any other category llvm:ir
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants