
[FMF] Set all bits if needed when setting individual flags. #131321


Merged: 3 commits into llvm:main on Mar 15, 2025

Conversation

@fhahn (Contributor) commented Mar 14, 2025

Currently isFast() won't return true if all flags are set individually via the setXXX setters, which is surprising. Update the setters to set all bits if needed, so that isFast() consistently returns the expected result.

@llvmbot (Member) commented Mar 14, 2025

@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-llvm-ir

Author: Florian Hahn (fhahn)



Full diff: https://github.com/llvm/llvm-project/pull/131321.diff

3 Files Affected:

  • (modified) llvm/include/llvm/IR/FMF.h (+13-4)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/widen-call-with-intrinsic-or-libfunc.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopVectorize/vplan-printing.ll (+1-1)
diff --git a/llvm/include/llvm/IR/FMF.h b/llvm/include/llvm/IR/FMF.h
index d204d0a0e9d9a..d6ecb8f983ae1 100644
--- a/llvm/include/llvm/IR/FMF.h
+++ b/llvm/include/llvm/IR/FMF.h
@@ -23,14 +23,16 @@ class FastMathFlags {
 
   unsigned Flags = 0;
 
-  FastMathFlags(unsigned F) {
+  FastMathFlags(unsigned F) : Flags(F) {
+    setAllBitsIfNeeded();
+  }
+
+  void setAllBitsIfNeeded() {
     // If all 7 bits are set, turn this into -1. If the number of bits grows,
     // this must be updated. This is intended to provide some forward binary
     // compatibility insurance for the meaning of 'fast' in case bits are added.
-    if (F == 0x7F) Flags = ~0U;
-    else Flags = F;
+    if (Flags == 0x7F) Flags = ~0U;
   }
-
 public:
   // This is how the bits are used in Value::SubclassOptionalData so they
   // should fit there too.
@@ -75,24 +77,31 @@ class FastMathFlags {
   /// Flag setters
   void setAllowReassoc(bool B = true) {
     Flags = (Flags & ~AllowReassoc) | B * AllowReassoc;
+    setAllBitsIfNeeded();
   }
   void setNoNaNs(bool B = true) {
     Flags = (Flags & ~NoNaNs) | B * NoNaNs;
+    setAllBitsIfNeeded();
   }
   void setNoInfs(bool B = true) {
     Flags = (Flags & ~NoInfs) | B * NoInfs;
+    setAllBitsIfNeeded();
   }
   void setNoSignedZeros(bool B = true) {
     Flags = (Flags & ~NoSignedZeros) | B * NoSignedZeros;
+    setAllBitsIfNeeded();
   }
   void setAllowReciprocal(bool B = true) {
     Flags = (Flags & ~AllowReciprocal) | B * AllowReciprocal;
+    setAllBitsIfNeeded();
   }
   void setAllowContract(bool B = true) {
     Flags = (Flags & ~AllowContract) | B * AllowContract;
+    setAllBitsIfNeeded();
   }
   void setApproxFunc(bool B = true) {
     Flags = (Flags & ~ApproxFunc) | B * ApproxFunc;
+    setAllBitsIfNeeded();
   }
   void setFast(bool B = true) { B ? set() : clear(); }
 
diff --git a/llvm/test/Transforms/LoopVectorize/AArch64/widen-call-with-intrinsic-or-libfunc.ll b/llvm/test/Transforms/LoopVectorize/AArch64/widen-call-with-intrinsic-or-libfunc.ll
index a119707bed120..01642a991f8b9 100644
--- a/llvm/test/Transforms/LoopVectorize/AArch64/widen-call-with-intrinsic-or-libfunc.ll
+++ b/llvm/test/Transforms/LoopVectorize/AArch64/widen-call-with-intrinsic-or-libfunc.ll
@@ -26,7 +26,7 @@ target triple = "arm64-apple-ios"
 ; CHECK-NEXT:     vp<[[VEC_PTR:%.+]]> = vector-pointer ir<%gep.src>
 ; CHECK-NEXT:     WIDEN ir<%l> = load vp<[[VEC_PTR]]>
 ; CHECK-NEXT:     WIDEN-CAST ir<%conv> = fpext ir<%l> to double
-; CHECK-NEXT:     WIDEN-CALL ir<%s> = call reassoc nnan ninf nsz arcp contract afn @llvm.sin.f64(ir<%conv>) (using library function: __simd_sin_v2f64)
+; CHECK-NEXT:     WIDEN-CALL ir<%s> = call fast @llvm.sin.f64(ir<%conv>) (using library function: __simd_sin_v2f64)
 ; CHECK-NEXT:     REPLICATE ir<%gep.dst> = getelementptr inbounds ir<%dst>, vp<[[STEPS]]>
 ; CHECK-NEXT:     REPLICATE store ir<%s>, ir<%gep.dst>
 ; CHECK-NEXT:     EMIT vp<[[CAN_IV_NEXT:%.+]]> = add nuw vp<[[CAN_IV]]>, vp<[[VFxUF]]>
diff --git a/llvm/test/Transforms/LoopVectorize/vplan-printing.ll b/llvm/test/Transforms/LoopVectorize/vplan-printing.ll
index 00d8de67a3b40..cbde30a29b44f 100644
--- a/llvm/test/Transforms/LoopVectorize/vplan-printing.ll
+++ b/llvm/test/Transforms/LoopVectorize/vplan-printing.ll
@@ -800,7 +800,7 @@ define void @print_fast_math_flags(i64 %n, ptr noalias %y, ptr noalias %x, ptr %
 ; CHECK-NEXT:   vp<[[VEC_PTR:%.+]]> = vector-pointer ir<%gep.y>
 ; CHECK-NEXT:   WIDEN ir<%lv> = load vp<[[VEC_PTR]]>
 ; CHECK-NEXT:   WIDEN ir<%add> = fadd nnan ir<%lv>, ir<1.000000e+00>
-; CHECK-NEXT:   WIDEN ir<%mul> = fmul reassoc nnan ninf nsz arcp contract afn ir<%add>, ir<2.000000e+00>
+; CHECK-NEXT:   WIDEN ir<%mul> = fmul fast ir<%add>, ir<2.000000e+00>
 ; CHECK-NEXT:   WIDEN ir<%div> = fdiv reassoc nsz contract ir<%mul>, ir<2.000000e+00>
 ; CHECK-NEXT:   CLONE ir<%gep.x> = getelementptr inbounds ir<%x>, vp<[[STEPS]]>
 ; CHECK-NEXT:   vp<[[VEC_PTR:%.+]]> = vector-pointer ir<%gep.x>

@@ -75,24 +77,31 @@ class FastMathFlags {
/// Flag setters
void setAllowReassoc(bool B = true) {
Flags = (Flags & ~AllowReassoc) | B * AllowReassoc;
setAllBitsIfNeeded();
Contributor commented on the hunk above:

Wouldn't it be easier to fix this by checking (Flags & FullMask) == FullMask?

github-actions bot commented Mar 14, 2025

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff 7bae61370d613c7fc0d2e315eecf19755b6f2a71 2304a18b4c5b1762b21ae3c4e26ae82280b8583e --extensions h -- llvm/include/llvm/IR/FMF.h
View the diff from clang-format here.
diff --git a/llvm/include/llvm/IR/FMF.h b/llvm/include/llvm/IR/FMF.h
index da274a1d6e..aa4bb50fd6 100644
--- a/llvm/include/llvm/IR/FMF.h
+++ b/llvm/include/llvm/IR/FMF.h
@@ -31,14 +31,14 @@ public:
   // WARNING: We're out of space. SubclassOptionalData only has 7 bits. New
   // functionality will require a change in how this information is stored.
   enum {
-    AllowReassoc    = (1 << 0),
-    NoNaNs          = (1 << 1),
-    NoInfs          = (1 << 2),
-    NoSignedZeros   = (1 << 3),
+    AllowReassoc = (1 << 0),
+    NoNaNs = (1 << 1),
+    NoInfs = (1 << 2),
+    NoSignedZeros = (1 << 3),
     AllowReciprocal = (1 << 4),
-    AllowContract   = (1 << 5),
-    ApproxFunc      = (1 << 6),
-    FlagEnd         = (1 << 7)
+    AllowContract = (1 << 5),
+    ApproxFunc = (1 << 6),
+    FlagEnd = (1 << 7)
   };
 
   constexpr static unsigned AllFlagsMask = FlagEnd - 1;

@nikic (Contributor) left a comment:

It looks like the current code tries to make a distinction between "fast" and "all bits set individually" from a bitcode upgrade perspective? I think if we're going to change that, it would be better to drop the special -1 value for fast and make it actually equivalent to "all bits set".

@fhahn (Contributor, Author) commented Mar 14, 2025

> It looks like the current code tries to make a distinction between "fast" and "all bits set individually" from a bitcode upgrade perspective? I think if we're going to change that, it would be better to drop the special -1 value for fast and make it actually equivalent to "all bits set".

Yes, the reason I went with setting all bits, instead of going the other way and checking that all mask bits are set, was the comment stating that all bits are intentionally set. I could also remove that special case if preferred:

    // If all 7 bits are set, turn this into -1. If the number of bits grows,
    // this must be updated. This is intended to provide some forward binary
    // compatibility insurance for the meaning of 'fast' in case bits are added.

@nikic (Contributor) commented Mar 14, 2025


Yes, I believe you should drop the special case. I just checked the bitcode implementation and it has this:

} else if (const auto *FPMO = dyn_cast<FPMathOperator>(V)) {
  if (FPMO->hasAllowReassoc())
    Flags |= bitc::AllowReassoc;
  if (FPMO->hasNoNaNs())
    Flags |= bitc::NoNaNs;
  if (FPMO->hasNoInfs())
    Flags |= bitc::NoInfs;
  if (FPMO->hasNoSignedZeros())
    Flags |= bitc::NoSignedZeros;
  if (FPMO->hasAllowReciprocal())
    Flags |= bitc::AllowReciprocal;
  if (FPMO->hasAllowContract())
    Flags |= bitc::AllowContract;
  if (FPMO->hasApproxFunc())
    Flags |= bitc::ApproxFunc;
So I don't think this "fast" special case actually does anything.

@fhahn (Contributor, Author) commented Mar 14, 2025

Updated to remove setting all bits

@nikic (Contributor) left a comment:

LGTM

@jcranmer-intel (Contributor) left a comment:

I've definitely run into this when doing my previous experiments on how to add more FMF bits if we want to add a new flag. Great to see it being fixed finally!

@fhahn fhahn merged commit 37a57ca into llvm:main Mar 15, 2025
10 of 11 checks passed
@fhahn fhahn deleted the fmf-set-all-bits-if-needed branch March 15, 2025 18:46
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Mar 15, 2025
…. (#131321)

Currently fast() won't return true if all flags are set via setXXX,
which is surprising. Update setters to set all bits if needed to make
sure isFast() consistently returns the expected result.

PR: llvm/llvm-project#131321
pawosm-arm pushed a commit to arm/arm-toolchain that referenced this pull request Apr 9, 2025
pawosm-arm pushed a commit to pawosm-arm/llvm-project that referenced this pull request Apr 10, 2025