-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[PartiallyInlineLibCalls] Emit missed- and passed-optimization remarks when partially inlining sqrt #123966
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
✅ With the latest revision this PR passed the C/C++ code formatter. |
e8d7b20
to
4bc7e6c
Compare
@llvm/pr-subscribers-llvm-transforms Author: None (TiborGY) ChangesPR 2/2 IntroductionThis build on PR #122654 and adds both passed- and missed-optimization remarks to the PartiallyInlineLibCalls pass. This pass is currently only responsible for the partial inlining of sqrt/sqrtf C library calls, on targets that have a fast HW instruction for square root but demand that About this PRBefore this PR, the pass was silent, which can be confusing if someone is looking at the asm output and finds the branch it is inserting. So I have added a passed-optimization remark. I have also added some further missed-optimization remarks that are emitted when a C library call cannot be optimized by the pass due to strict FP exception behavior or a tail call circumstance. TestingI have added some regression testing to the existing tests that I could find for this pass. I am not 100% sure they are good/sufficient tests, but they are passing. Let me know if more testing is required for new opt-remarks. Full diff: https://github.com/llvm/llvm-project/pull/123966.diff 4 Files Affected:
diff --git a/llvm/lib/Transforms/Scalar/PartiallyInlineLibCalls.cpp b/llvm/lib/Transforms/Scalar/PartiallyInlineLibCalls.cpp
index 2b50ccdc2eeb4f..e56b561ef1f591 100644
--- a/llvm/lib/Transforms/Scalar/PartiallyInlineLibCalls.cpp
+++ b/llvm/lib/Transforms/Scalar/PartiallyInlineLibCalls.cpp
@@ -56,6 +56,20 @@ static bool optimizeSQRT(CallInst *Call, Function *CalledFunc,
// dst = phi(v0, v1)
//
+ ORE->emit([&]() {
+ return OptimizationRemark(DEBUG_TYPE, "SqrtPartiallyInlined",
+ Call->getDebugLoc(), &CurrBB)
+ << "Partially inlined call to sqrt function despite having to use "
+ "errno for error handling: target has fast sqrt instruction";
+ });
+ ORE->emit([&]() {
+ return OptimizationRemarkMissed(DEBUG_TYPE, "BranchInserted",
+ Call->getDebugLoc(), &CurrBB)
+ << "Branch to library sqrt fn had to be inserted to satisfy the "
+ "current target's requirement for math functions to set errno on "
+ "invalid inputs";
+ });
+
Type *Ty = Call->getType();
IRBuilder<> Builder(Call->getNextNode());
@@ -125,11 +139,29 @@ static bool runPartiallyInlineLibCalls(Function &F, TargetLibraryInfo *TLI,
if (!Call || !(CalledFunc = Call->getCalledFunction()))
continue;
- if (Call->isNoBuiltin() || Call->isStrictFP())
+ if (Call->isNoBuiltin())
continue;
-
- if (Call->isMustTailCall())
+ if (Call->isStrictFP()) {
+ ORE->emit([&]() {
+ return OptimizationRemarkMissed(DEBUG_TYPE, "StrictFloat",
+ Call->getDebugLoc(), &*CurrBB)
+ << "Could not consider library function for partial inlining:"
+ " strict FP exception behavior is active";
+ });
continue;
+ }
+ // Partially inlining a libcall that has the musttail attribute leads to
+ // broken LLVM IR, triggering an assertion in the IR verifier.
+ // Work around that by forgoing this optimization for musttail calls.
+ if (Call->isMustTailCall()) {
+ ORE->emit([&]() {
+ return OptimizationRemarkMissed(DEBUG_TYPE, "MustTailCall",
+ Call->getDebugLoc(), &*CurrBB)
+ << "Could not consider library function for partial inlining:"
+ " must tail call";
+ });
+ continue;
+ }
// Skip if function either has local linkage or is not a known library
// function.
diff --git a/llvm/test/Transforms/PartiallyInlineLibCalls/X86/good-prototype.ll b/llvm/test/Transforms/PartiallyInlineLibCalls/X86/good-prototype.ll
index e6c2a7e629a5d6..49a34d9d4a0a6f 100644
--- a/llvm/test/Transforms/PartiallyInlineLibCalls/X86/good-prototype.ll
+++ b/llvm/test/Transforms/PartiallyInlineLibCalls/X86/good-prototype.ll
@@ -1,8 +1,12 @@
; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
-; RUN: opt -S -passes=partially-inline-libcalls -mtriple=x86_64-unknown-linux-gnu < %s | FileCheck %s
+; RUN: opt -S -passes=partially-inline-libcalls -mtriple=x86_64-unknown-linux-gnu -pass-remarks=partially-inline-libcalls \
+; RUN: -pass-remarks-missed=partially-inline-libcalls 2>%t < %s | FileCheck %s
+; RUN: cat %t | FileCheck %s -check-prefix=CHECK-REMARK
define float @f(float %val) {
; CHECK-LABEL: @f(
+; CHECK-REMARK: Partially inlined call to sqrt function despite having to use errno for error handling: target has fast sqrt instruction
+; CHECK-REMARK: Branch to library sqrt fn had to be inserted to satisfy the current target's requirement for math functions to set errno on invalid inputs
; CHECK-NEXT: entry:
; CHECK-NEXT: [[RES:%.*]] = tail call float @sqrtf(float [[VAL:%.*]]) #[[READNONE:.*]]
; CHECK-NEXT: [[TMP0:%.*]] = fcmp oge float [[VAL]], 0.000000e+00
diff --git a/llvm/test/Transforms/PartiallyInlineLibCalls/X86/musttail.ll b/llvm/test/Transforms/PartiallyInlineLibCalls/X86/musttail.ll
index 65dd616b43ea69..c877990df309a0 100644
--- a/llvm/test/Transforms/PartiallyInlineLibCalls/X86/musttail.ll
+++ b/llvm/test/Transforms/PartiallyInlineLibCalls/X86/musttail.ll
@@ -1,8 +1,10 @@
; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
-; RUN: opt -S -passes=partially-inline-libcalls -mtriple=x86_64-unknown-linux-gnu < %s | FileCheck %s
+; RUN: opt -S -passes=partially-inline-libcalls -mtriple=x86_64-unknown-linux-gnu -pass-remarks-missed=partially-inline-libcalls 2>%t < %s | FileCheck %s
+; RUN: cat %t | FileCheck %s -check-prefix=CHECK-REMARK
define double @foo(double %x) {
; CHECK-LABEL: @foo(
+; CHECK-REMARK: Could not consider library function for partial inlining: must tail call
; CHECK-NEXT: [[R:%.*]] = musttail call double @sqrt(double [[X:%.*]])
; CHECK-NEXT: ret double [[R]]
;
diff --git a/llvm/test/Transforms/PartiallyInlineLibCalls/strictfp.ll b/llvm/test/Transforms/PartiallyInlineLibCalls/strictfp.ll
index 8d18d1969b04d3..6a58de096b84a5 100644
--- a/llvm/test/Transforms/PartiallyInlineLibCalls/strictfp.ll
+++ b/llvm/test/Transforms/PartiallyInlineLibCalls/strictfp.ll
@@ -1,7 +1,9 @@
-; RUN: opt -S -passes=partially-inline-libcalls -mtriple=x86_64-unknown-linux-gnu < %s | FileCheck %s
+; RUN: opt -S -passes=partially-inline-libcalls -mtriple=x86_64-unknown-linux-gnu -pass-remarks-missed=partially-inline-libcalls 2>%t < %s | FileCheck %s
+; RUN: cat %t | FileCheck %s -check-prefix=CHECK-REMARK
define float @f(float %val) strictfp {
; CHECK-LABEL: @f
+; CHECK-REMARK: Could not consider library function for partial inlining: strict FP exception behavior is active
; CHECK: call{{.*}}@sqrtf
; CHECK-NOT: call{{.*}}@sqrtf
%res = tail call float @sqrtf(float %val) strictfp
|
Tagging @arsenm since they have reviewed the first PR in the series. |
Also tagging @kazutakahirata since they were suggested as a reviewer for the previous PR in the series, and are listed as an inliner maintainer. |
Ping |
return OptimizationRemarkMissed(DEBUG_TYPE, "BranchInserted", | ||
Call->getDebugLoc(), &CurrBB) | ||
<< "Branch to library sqrt fn had to be inserted to satisfy the " | ||
"current target's requirement for math functions to set errno on " | ||
"invalid inputs"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if the OptimizationRemarkMissed
is the best category for this. Maybe include it in the remark above, as it isn't really a missed optimization BranchInserted
only happens ifSqrtPartiallyInlined
also happens ?
Or maybe just use OptimizationRemark
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I can see how this could be a matter of interpretation. I have felt the need to separate the two messages as they convey two different aspects of the same event.
First, the OptimizationRemark
signals that a transformation which is expected to be profitable has been done.
Second, the OptimizationRemarkMissed
signals that that there is a condition inhibiting this optimization from being even better.
Splitting these enables users to have more relevant information with filtering, eg. if they are only interested in spots where LLVM is struggling to optimize something, they can look at OptimizationRemarkMissed
only.
Combining the two messages or turning the second one into a second OptimizationRemark
would defeat this.
Perhaps my mental model is wrong, but to me OptimizationRemark
means "LLVM did something it believes to be useful" and OptimizationRemarkMissed
means "LLVM cannot or will not do something that (hypothetically) could have been useful". Hence the separation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If my explanation is satisfactory, and there are no further concerns to be discussed, may I request a merge?
I do not have write access to the LLVM repo.
ping |
return OptimizationRemark(DEBUG_TYPE, "SqrtPartiallyInlined", | ||
Call->getDebugLoc(), &CurrBB) | ||
<< "Partially inlined call to sqrt function despite having to use " | ||
"errno for error handling: target has fast sqrt instruction"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't know if the target has a "fast sqrt instruction" and this shouldn't be considering it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But doesn't optimizeSQRT
only get called if TTI->haveFastSqrt(Call->getType()) == true
?
This is its only call site:
if (TTI->haveFastSqrt(Call->getType()) &&
optimizeSQRT(Call, CalledFunc, *CurrBB, BB, TTI,
DTU ? &*DTU : nullptr, ORE))
break;
If the target does not have fast sqrt, the &&
gets short-circuited to false and optimizeSQRT
is never called.
Therefore optimizeSQRT
implicitly knows the HW sqrt is fast, in the context of the current codebase.
But you are right, this is quite brittle, so I have moved the remark after the condition that may call optimizeSQRT
.
FWIW, I have taken the "fast sqrt instruction" language from here:
https://github.com/llvm/llvm-project/blob/9c60431b673c478606e63ff1e47860eb4eb6af09/llvm/include/llvm/Analysis/TargetTransformInfo.h#L1048C1-L1049C37
return OptimizationRemarkMissed(DEBUG_TYPE, "BranchInserted", | ||
Call->getDebugLoc(), &CurrBB) | ||
<< "Branch to library sqrt fn had to be inserted to satisfy the " | ||
"current target's requirement for math functions to set errno on " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't a property of the target. I also don't see why these two remarks are emitted under the same exact conditions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, target may not be the best word for it indeed, but the docs are using a similar language:
On some targets, math library functions never set errno, and so -fno-math-errno is the default. This includes most BSD-derived systems, including Darwin.
In guess I could say target triplet instead of target?
PS. I have created two remarks so I can split them into an optimization passed/missed pair. In my mental model passed means "LLVM did some profitable transformation" and missed means "something is inhibiting an optimization that could have been profitable" or "LLVM thinks something is not profitable". In this case, applying the partial inlining with the branch is simultaneously both: profitable but could have been better if setting errno wasn't a requirement.
Separating the two aspects enables the user to filter the remark output and e.g. look at only the missed optimization remarks. I often use Compiler Explorer to look at what LLVM says about a piece of code, and I find the ability to filter by remark type useful.
PR 2/2, closes #78785
Introduction
This builds on PR #122654 and adds both passed- and missed-optimization remarks to the PartiallyInlineLibCalls pass. This pass is currently only responsible for the partial inlining of sqrt/sqrtf C library calls, on targets that have a fast HW instruction for square root but demand that
errno
is set on invalid input. One such target is x86_64-unknown-linux-gnu.About this PR
Before this PR, the pass was silent, which can be confusing if someone is looking at the asm output and finds the branch it is inserting. So I have added a passed-optimization remark.
At the same time this branch may also impede vectorization in some cases, the sqrt is often found in scientific codes with nested loops, etc. Therefore I have also added a missed-optimization remark alerting the user to the fact that the
errno
requirement is having an effect on optimizations.I have also added some further missed-optimization remarks that are emitted when a C library call cannot be optimized by the pass due to strict FP exception behavior or a tail call circumstance.
Testing
I have added some regression testing to the existing tests that I could find for this pass. I am not 100% sure they are good/sufficient tests, but they are passing. Let me know if more testing is required for new opt-remarks.