[FuncSpec] Consider literal constants of recursive functions #111162
Conversation
Enable specialization of literal constants by default for recursive functions.
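To make the effect concrete, here is a small hypothetical example (the function names and the constant are invented for this sketch, not taken from the patch or its tests). Because the callee is recursive, the literal argument at the entry call now becomes a specialization candidate even when literal-constant specialization (the `-funcspec-for-literal-constant` option) is not enabled globally:

```cpp
// Hypothetical source-level illustration (names invented for this sketch).
// `walk` is recursive, and `run` calls it with the literal constant 8.
// With this patch, FuncSpec considers cloning `walk` for n == 8 even when
// literal-constant specialization is off by default, because the callee
// is recursive.
int walk(const int *a, int n) {
  if (n < 1)
    return 0;
  return a[n - 1] + walk(a, n - 1); // recursive call
}

int run(const int *a) {
  return walk(a, 8); // literal constant argument at the entry call
}
```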
@llvm/pr-subscribers-function-specialization @llvm/pr-subscribers-llvm-transforms

Author: Hari Limaye (hazzlim)

Changes: Enable specialization of literal constants by default for recursive functions.

Full diff: https://github.com/llvm/llvm-project/pull/111162.diff

4 Files Affected:
diff --git a/llvm/include/llvm/Transforms/IPO/FunctionSpecialization.h b/llvm/include/llvm/Transforms/IPO/FunctionSpecialization.h
index b001771951e0fe..329491f2a0fe97 100644
--- a/llvm/include/llvm/Transforms/IPO/FunctionSpecialization.h
+++ b/llvm/include/llvm/Transforms/IPO/FunctionSpecialization.h
@@ -311,9 +311,11 @@ class FunctionSpecializer {
/// @param FuncSize Cost of specializing a function.
/// @param AllSpecs A vector to add potential specializations to.
/// @param SM A map for a function's specialisation range
+ /// @param ConsiderLiterals Whether to specialize on literal constants
/// @return True, if any potential specializations were found
bool findSpecializations(Function *F, unsigned FuncSize,
- SmallVectorImpl<Spec> &AllSpecs, SpecMap &SM);
+ SmallVectorImpl<Spec> &AllSpecs, SpecMap &SM,
+ bool ConsiderLiterals);
/// Compute the inlining bonus for replacing argument \p A with constant \p C.
unsigned getInliningBonus(Argument *A, Constant *C);
@@ -328,7 +330,7 @@ class FunctionSpecializer {
/// Determine if it is possible to specialise the function for constant values
/// of the formal parameter \p A.
- bool isArgumentInteresting(Argument *A);
+ bool isArgumentInteresting(Argument *A, bool ConsiderLiterals);
/// Check if the value \p V (an actual argument) is a constant or can only
/// have a constant value. Return that constant.
diff --git a/llvm/lib/Transforms/IPO/FunctionSpecialization.cpp b/llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
index 548335d750e33d..db0f06ec8ac17d 100644
--- a/llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
+++ b/llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
@@ -663,7 +663,8 @@ bool FunctionSpecializer::run() {
if (Inserted && Metrics.isRecursive)
promoteConstantStackValues(&F);
- if (!findSpecializations(&F, FuncSize, AllSpecs, SM)) {
+ bool ConsiderLiterals = SpecializeLiteralConstant || Metrics.isRecursive;
+ if (!findSpecializations(&F, FuncSize, AllSpecs, SM, ConsiderLiterals)) {
LLVM_DEBUG(
dbgs() << "FnSpecialization: No possible specializations found for "
<< F.getName() << "\n");
@@ -803,7 +804,8 @@ static Function *cloneCandidateFunction(Function *F, unsigned NSpecs) {
bool FunctionSpecializer::findSpecializations(Function *F, unsigned FuncSize,
SmallVectorImpl<Spec> &AllSpecs,
- SpecMap &SM) {
+ SpecMap &SM,
+ bool ConsiderLiterals) {
// A mapping from a specialisation signature to the index of the respective
// entry in the all specialisation array. Used to ensure uniqueness of
// specialisations.
@@ -812,7 +814,7 @@ bool FunctionSpecializer::findSpecializations(Function *F, unsigned FuncSize,
// Get a list of interesting arguments.
SmallVector<Argument *> Args;
for (Argument &Arg : F->args())
- if (isArgumentInteresting(&Arg))
+ if (isArgumentInteresting(&Arg, ConsiderLiterals))
Args.push_back(&Arg);
if (Args.empty())
@@ -1032,14 +1034,16 @@ unsigned FunctionSpecializer::getInliningBonus(Argument *A, Constant *C) {
/// Determine if it is possible to specialise the function for constant values
/// of the formal parameter \p A.
-bool FunctionSpecializer::isArgumentInteresting(Argument *A) {
+bool FunctionSpecializer::isArgumentInteresting(Argument *A,
+ bool ConsiderLiterals) {
// No point in specialization if the argument is unused.
if (A->user_empty())
return false;
Type *Ty = A->getType();
- if (!Ty->isPointerTy() && (!SpecializeLiteralConstant ||
- (!Ty->isIntegerTy() && !Ty->isFloatingPointTy() && !Ty->isStructTy())))
+ if (!Ty->isPointerTy() &&
+ (!ConsiderLiterals ||
+ (!Ty->isIntegerTy() && !Ty->isFloatingPointTy() && !Ty->isStructTy())))
return false;
// SCCP solver does not record an argument that will be constructed on
diff --git a/llvm/test/Transforms/FunctionSpecialization/function-specialization2.ll b/llvm/test/Transforms/FunctionSpecialization/function-specialization2.ll
index ef830a0e9a4a9e..00319f1424d766 100644
--- a/llvm/test/Transforms/FunctionSpecialization/function-specialization2.ll
+++ b/llvm/test/Transforms/FunctionSpecialization/function-specialization2.ll
@@ -64,27 +64,27 @@ define i32 @main(ptr %0, i32 %1) {
; CHECK-LABEL: define i32 @main(
; CHECK-SAME: ptr [[TMP0:%.*]], i32 [[TMP1:%.*]]) {
; CHECK-NEXT: call void @func.specialized.2(ptr [[TMP0]], i32 [[TMP1]])
-; CHECK-NEXT: call void @func.specialized.1(ptr [[TMP0]], i32 0)
+; CHECK-NEXT: call void @func.specialized.1(ptr [[TMP0]])
; CHECK-NEXT: ret i32 0
;
;
; CHECK-LABEL: define internal void @func.specialized.1(
-; CHECK-SAME: ptr [[TMP0:%.*]], i32 [[TMP1:%.*]]) {
+; CHECK-SAME: ptr [[TMP0:%.*]]) {
; CHECK-NEXT: [[TMP2:%.*]] = alloca i32, align 4
-; CHECK-NEXT: store i32 [[TMP1]], ptr [[TMP2]], align 4
+; CHECK-NEXT: store i32 0, ptr [[TMP2]], align 4
; CHECK-NEXT: [[TMP3:%.*]] = load i32, ptr [[TMP2]], align 4
; CHECK-NEXT: [[TMP4:%.*]] = icmp slt i32 [[TMP3]], 1
-; CHECK-NEXT: br i1 [[TMP4]], label %[[BB12:.*]], label %[[BB6:.*]]
-; CHECK: [[BB6]]:
+; CHECK-NEXT: br i1 [[TMP4]], label %[[BB11:.*]], label %[[BB5:.*]]
+; CHECK: [[BB5]]:
; CHECK-NEXT: [[TMP6:%.*]] = load i32, ptr [[TMP2]], align 4
; CHECK-NEXT: [[TMP7:%.*]] = sext i32 [[TMP6]] to i64
; CHECK-NEXT: [[TMP8:%.*]] = getelementptr inbounds i32, ptr [[TMP0]], i64 [[TMP7]]
; CHECK-NEXT: call void @decrement(ptr [[TMP8]])
; CHECK-NEXT: [[TMP9:%.*]] = load i32, ptr [[TMP2]], align 4
; CHECK-NEXT: [[TMP10:%.*]] = add nsw i32 [[TMP9]], -1
-; CHECK-NEXT: call void @func.specialized.1(ptr [[TMP0]], i32 [[TMP10]])
-; CHECK-NEXT: br label %[[BB12]]
-; CHECK: [[BB12]]:
+; CHECK-NEXT: call void @func.specialized.3(ptr [[TMP0]], i32 [[TMP10]])
+; CHECK-NEXT: br label %[[BB11]]
+; CHECK: [[BB11]]:
; CHECK-NEXT: ret void
;
;
@@ -108,6 +108,46 @@ define i32 @main(ptr %0, i32 %1) {
; CHECK-NEXT: ret void
;
;
+; CHECK-LABEL: define internal void @func.specialized.3(
+; CHECK-SAME: ptr [[TMP0:%.*]], i32 [[TMP1:%.*]]) {
+; CHECK-NEXT: [[TMP3:%.*]] = alloca i32, align 4
+; CHECK-NEXT: store i32 [[TMP1]], ptr [[TMP3]], align 4
+; CHECK-NEXT: [[TMP4:%.*]] = load i32, ptr [[TMP3]], align 4
+; CHECK-NEXT: [[TMP5:%.*]] = icmp slt i32 [[TMP4]], 1
+; CHECK-NEXT: br i1 [[TMP5]], label %[[BB12:.*]], label %[[BB6:.*]]
+; CHECK: [[BB6]]:
+; CHECK-NEXT: [[TMP7:%.*]] = load i32, ptr [[TMP3]], align 4
+; CHECK-NEXT: [[TMP8:%.*]] = sext i32 [[TMP7]] to i64
+; CHECK-NEXT: [[TMP9:%.*]] = getelementptr inbounds i32, ptr [[TMP0]], i64 [[TMP8]]
+; CHECK-NEXT: call void @decrement(ptr [[TMP9]])
+; CHECK-NEXT: [[TMP10:%.*]] = load i32, ptr [[TMP3]], align 4
+; CHECK-NEXT: [[TMP11:%.*]] = add nsw i32 [[TMP10]], -1
+; CHECK-NEXT: call void @func.specialized.3(ptr [[TMP0]], i32 [[TMP11]])
+; CHECK-NEXT: br label %[[BB12]]
+; CHECK: [[BB12]]:
+; CHECK-NEXT: ret void
+;
+;
+; ONE-ITER-LABEL: define internal void @func(
+; ONE-ITER-SAME: ptr [[TMP0:%.*]], i32 [[TMP1:%.*]], ptr nocapture [[TMP2:%.*]]) {
+; ONE-ITER-NEXT: [[TMP4:%.*]] = alloca i32, align 4
+; ONE-ITER-NEXT: store i32 [[TMP1]], ptr [[TMP4]], align 4
+; ONE-ITER-NEXT: [[TMP5:%.*]] = load i32, ptr [[TMP4]], align 4
+; ONE-ITER-NEXT: [[TMP6:%.*]] = icmp slt i32 [[TMP5]], 1
+; ONE-ITER-NEXT: br i1 [[TMP6]], label %[[BB13:.*]], label %[[BB7:.*]]
+; ONE-ITER: [[BB7]]:
+; ONE-ITER-NEXT: [[TMP8:%.*]] = load i32, ptr [[TMP4]], align 4
+; ONE-ITER-NEXT: [[TMP9:%.*]] = sext i32 [[TMP8]] to i64
+; ONE-ITER-NEXT: [[TMP10:%.*]] = getelementptr inbounds i32, ptr [[TMP0]], i64 [[TMP9]]
+; ONE-ITER-NEXT: call void [[TMP2]](ptr [[TMP10]])
+; ONE-ITER-NEXT: [[TMP11:%.*]] = load i32, ptr [[TMP4]], align 4
+; ONE-ITER-NEXT: [[TMP12:%.*]] = add nsw i32 [[TMP11]], -1
+; ONE-ITER-NEXT: call void @func(ptr [[TMP0]], i32 [[TMP12]], ptr [[TMP2]])
+; ONE-ITER-NEXT: br label %[[BB13]]
+; ONE-ITER: [[BB13]]:
+; ONE-ITER-NEXT: ret void
+;
+;
; ONE-ITER-LABEL: define internal void @increment(
; ONE-ITER-SAME: ptr nocapture [[TMP0:%.*]]) {
; ONE-ITER-NEXT: [[TMP2:%.*]] = load i32, ptr [[TMP0]], align 4
@@ -127,27 +167,27 @@ define i32 @main(ptr %0, i32 %1) {
; ONE-ITER-LABEL: define i32 @main(
; ONE-ITER-SAME: ptr [[TMP0:%.*]], i32 [[TMP1:%.*]]) {
; ONE-ITER-NEXT: call void @func.specialized.2(ptr [[TMP0]], i32 [[TMP1]])
-; ONE-ITER-NEXT: call void @func.specialized.1(ptr [[TMP0]], i32 0)
+; ONE-ITER-NEXT: call void @func.specialized.1(ptr [[TMP0]])
; ONE-ITER-NEXT: ret i32 0
;
;
; ONE-ITER-LABEL: define internal void @func.specialized.1(
-; ONE-ITER-SAME: ptr [[TMP0:%.*]], i32 [[TMP1:%.*]]) {
+; ONE-ITER-SAME: ptr [[TMP0:%.*]]) {
; ONE-ITER-NEXT: [[TMP2:%.*]] = alloca i32, align 4
-; ONE-ITER-NEXT: store i32 [[TMP1]], ptr [[TMP2]], align 4
+; ONE-ITER-NEXT: store i32 0, ptr [[TMP2]], align 4
; ONE-ITER-NEXT: [[TMP3:%.*]] = load i32, ptr [[TMP2]], align 4
; ONE-ITER-NEXT: [[TMP4:%.*]] = icmp slt i32 [[TMP3]], 1
-; ONE-ITER-NEXT: br i1 [[TMP4]], label %[[BB12:.*]], label %[[BB6:.*]]
-; ONE-ITER: [[BB6]]:
+; ONE-ITER-NEXT: br i1 [[TMP4]], label %[[BB11:.*]], label %[[BB5:.*]]
+; ONE-ITER: [[BB5]]:
; ONE-ITER-NEXT: [[TMP6:%.*]] = load i32, ptr [[TMP2]], align 4
; ONE-ITER-NEXT: [[TMP7:%.*]] = sext i32 [[TMP6]] to i64
; ONE-ITER-NEXT: [[TMP8:%.*]] = getelementptr inbounds i32, ptr [[TMP0]], i64 [[TMP7]]
; ONE-ITER-NEXT: call void @decrement(ptr [[TMP8]])
; ONE-ITER-NEXT: [[TMP9:%.*]] = load i32, ptr [[TMP2]], align 4
; ONE-ITER-NEXT: [[TMP10:%.*]] = add nsw i32 [[TMP9]], -1
-; ONE-ITER-NEXT: call void @func.specialized.1(ptr [[TMP0]], i32 [[TMP10]])
-; ONE-ITER-NEXT: br label %[[BB12]]
-; ONE-ITER: [[BB12]]:
+; ONE-ITER-NEXT: call void @func(ptr [[TMP0]], i32 [[TMP10]], ptr @decrement)
+; ONE-ITER-NEXT: br label %[[BB11]]
+; ONE-ITER: [[BB11]]:
; ONE-ITER-NEXT: ret void
;
;
diff --git a/llvm/test/Transforms/FunctionSpecialization/identical-specializations.ll b/llvm/test/Transforms/FunctionSpecialization/identical-specializations.ll
index 930ed6627f7f1e..e7e2fffac5d6ec 100644
--- a/llvm/test/Transforms/FunctionSpecialization/identical-specializations.ll
+++ b/llvm/test/Transforms/FunctionSpecialization/identical-specializations.ll
@@ -51,7 +51,7 @@ entry:
; CHECK-NEXT: [[ENTRY:.*:]]
; CHECK-NEXT: br i1 [[FLAG]], label %[[PLUS:.*]], label %[[MINUS:.*]]
; CHECK: [[PLUS]]:
-; CHECK-NEXT: [[CMP0:%.*]] = call i64 @compute.specialized.2(i64 [[X]], i64 [[Y]], ptr @plus, ptr @minus)
+; CHECK-NEXT: [[CMP0:%.*]] = call i64 @compute.specialized.1(i64 [[X]], i64 [[Y]], ptr @plus, ptr @minus)
; CHECK-NEXT: br label %[[MERGE:.*]]
; CHECK: [[MINUS]]:
; CHECK-NEXT: [[CMP1:%.*]] = call i64 @compute.specialized.3(i64 [[X]], i64 [[Y]], ptr @minus, ptr @plus)
@@ -79,10 +79,10 @@ entry:
; CHECK-LABEL: define internal i64 @compute.specialized.1(
; CHECK-SAME: i64 [[X:%.*]], i64 [[Y:%.*]], ptr [[BINOP1:%.*]], ptr [[BINOP2:%.*]]) {
; CHECK-NEXT: [[ENTRY:.*:]]
-; CHECK-NEXT: [[OP1:%.*]] = call i64 [[BINOP1]](i64 [[X]], i64 [[Y]])
; CHECK-NEXT: [[OP0:%.*]] = call i64 @plus(i64 [[X]], i64 [[Y]])
-; CHECK-NEXT: [[OP2:%.*]] = call i64 @compute.specialized.1(i64 [[X]], i64 [[Y]], ptr [[BINOP1]], ptr @plus)
-; CHECK-NEXT: [[ADD0:%.*]] = add i64 [[OP1]], [[OP0]]
+; CHECK-NEXT: [[OP1:%.*]] = call i64 @minus(i64 [[X]], i64 [[Y]])
+; CHECK-NEXT: [[OP2:%.*]] = call i64 @compute.specialized.5(i64 [[X]], i64 [[Y]], ptr @plus, ptr @plus)
+; CHECK-NEXT: [[ADD0:%.*]] = add i64 [[OP0]], [[OP1]]
; CHECK-NEXT: [[ADD1:%.*]] = add i64 [[ADD0]], [[OP2]]
; CHECK-NEXT: [[DIV:%.*]] = sdiv i64 [[ADD1]], [[X]]
; CHECK-NEXT: [[SUB:%.*]] = sub i64 [[DIV]], [[Y]]
@@ -93,13 +93,13 @@ entry:
; CHECK-LABEL: define internal i64 @compute.specialized.2(
; CHECK-SAME: i64 [[X:%.*]], i64 [[Y:%.*]], ptr [[BINOP1:%.*]], ptr [[BINOP2:%.*]]) {
; CHECK-NEXT: [[ENTRY:.*:]]
-; CHECK-NEXT: [[OP0:%.*]] = call i64 @plus(i64 [[X]], i64 [[Y]])
-; CHECK-NEXT: [[OP1:%.*]] = call i64 @minus(i64 [[X]], i64 [[Y]])
-; CHECK-NEXT: [[OP2:%.*]] = call i64 @compute.specialized.1(i64 [[X]], i64 [[Y]], ptr @plus, ptr @plus)
+; CHECK-NEXT: [[OP0:%.*]] = call i64 @plus(i64 [[X]], i64 42)
+; CHECK-NEXT: [[OP1:%.*]] = call i64 @minus(i64 [[X]], i64 42)
+; CHECK-NEXT: [[OP2:%.*]] = call i64 @compute.specialized.4(i64 [[X]], i64 42, ptr @plus, ptr @plus)
; CHECK-NEXT: [[ADD0:%.*]] = add i64 [[OP0]], [[OP1]]
; CHECK-NEXT: [[ADD1:%.*]] = add i64 [[ADD0]], [[OP2]]
; CHECK-NEXT: [[DIV:%.*]] = sdiv i64 [[ADD1]], [[X]]
-; CHECK-NEXT: [[SUB:%.*]] = sub i64 [[DIV]], [[Y]]
+; CHECK-NEXT: [[SUB:%.*]] = sub i64 [[DIV]], 42
; CHECK-NEXT: [[MUL:%.*]] = mul i64 [[SUB]], 2
; CHECK-NEXT: ret i64 [[MUL]]
;
@@ -117,3 +117,45 @@ entry:
; CHECK-NEXT: [[MUL:%.*]] = mul i64 [[SUB]], 2
; CHECK-NEXT: ret i64 [[MUL]]
;
+;
+; CHECK-LABEL: define internal i64 @compute.specialized.4(
+; CHECK-SAME: i64 [[X:%.*]], i64 [[Y:%.*]], ptr [[BINOP1:%.*]], ptr [[BINOP2:%.*]]) {
+; CHECK-NEXT: [[ENTRY:.*:]]
+; CHECK-NEXT: [[OP0:%.*]] = call i64 @plus(i64 [[X]], i64 42)
+; CHECK-NEXT: [[OP1:%.*]] = call i64 @plus(i64 [[X]], i64 42)
+; CHECK-NEXT: [[OP2:%.*]] = call i64 @compute.specialized.4(i64 [[X]], i64 42, ptr @plus, ptr @plus)
+; CHECK-NEXT: [[ADD0:%.*]] = add i64 [[OP0]], [[OP1]]
+; CHECK-NEXT: [[ADD1:%.*]] = add i64 [[ADD0]], [[OP2]]
+; CHECK-NEXT: [[DIV:%.*]] = sdiv i64 [[ADD1]], [[X]]
+; CHECK-NEXT: [[SUB:%.*]] = sub i64 [[DIV]], 42
+; CHECK-NEXT: [[MUL:%.*]] = mul i64 [[SUB]], 2
+; CHECK-NEXT: ret i64 [[MUL]]
+;
+;
+; CHECK-LABEL: define internal i64 @compute.specialized.5(
+; CHECK-SAME: i64 [[X:%.*]], i64 [[Y:%.*]], ptr [[BINOP1:%.*]], ptr [[BINOP2:%.*]]) {
+; CHECK-NEXT: [[ENTRY:.*:]]
+; CHECK-NEXT: [[OP0:%.*]] = call i64 @plus(i64 [[X]], i64 [[Y]])
+; CHECK-NEXT: [[OP1:%.*]] = call i64 @plus(i64 [[X]], i64 [[Y]])
+; CHECK-NEXT: [[OP2:%.*]] = call i64 @compute.specialized.5(i64 [[X]], i64 [[Y]], ptr @plus, ptr @plus)
+; CHECK-NEXT: [[ADD0:%.*]] = add i64 [[OP0]], [[OP1]]
+; CHECK-NEXT: [[ADD1:%.*]] = add i64 [[ADD0]], [[OP2]]
+; CHECK-NEXT: [[DIV:%.*]] = sdiv i64 [[ADD1]], [[X]]
+; CHECK-NEXT: [[SUB:%.*]] = sub i64 [[DIV]], [[Y]]
+; CHECK-NEXT: [[MUL:%.*]] = mul i64 [[SUB]], 2
+; CHECK-NEXT: ret i64 [[MUL]]
+;
+;
+; CHECK-LABEL: define internal i64 @compute.specialized.6(
+; CHECK-SAME: i64 [[X:%.*]], i64 [[Y:%.*]], ptr [[BINOP1:%.*]], ptr [[BINOP2:%.*]]) {
+; CHECK-NEXT: [[ENTRY:.*:]]
+; CHECK-NEXT: [[OP0:%.*]] = call i64 [[BINOP1]](i64 [[X]], i64 [[Y]])
+; CHECK-NEXT: [[OP1:%.*]] = call i64 @plus(i64 [[X]], i64 [[Y]])
+; CHECK-NEXT: [[OP2:%.*]] = call i64 @compute.specialized.6(i64 [[X]], i64 [[Y]], ptr [[BINOP1]], ptr @plus)
+; CHECK-NEXT: [[ADD0:%.*]] = add i64 [[OP0]], [[OP1]]
+; CHECK-NEXT: [[ADD1:%.*]] = add i64 [[ADD0]], [[OP2]]
+; CHECK-NEXT: [[DIV:%.*]] = sdiv i64 [[ADD1]], [[X]]
+; CHECK-NEXT: [[SUB:%.*]] = sub i64 [[DIV]], [[Y]]
+; CHECK-NEXT: [[MUL:%.*]] = mul i64 [[SUB]], 2
+; CHECK-NEXT: ret i64 [[MUL]]
+;
This change is intended to prevent regressions in code that would otherwise be negatively affected by #111163. It has negligible impact on CTMark compile times (geomean 0.00%) for both the LTO and non-LTO pipelines.
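Schematically, the regression being guarded against looks roughly like this (an illustrative C++ sketch based on the description of #111163 in the review discussion below; all names are invented and this is not code from either patch):

```cpp
// Before #111163: a constant is passed by reference (the pattern Fortran
// lowering produces), so the callee sees a pointer argument, which FuncSpec
// already considers for specialization.
void work(const int *n);

void caller() {
  static const int N = 8;
  work(&N); // constant passed by reference
}

// After #111163 promotes the reference to an actual constant argument, the
// callee effectively receives a literal instead:
void work_promoted(int n);

void caller_promoted() {
  work_promoted(8); // literal constant; without this patch it is only
                    // considered for specialization when
                    // -funcspec-for-literal-constant is enabled
}
```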
Can we remove promoteConstantStackValues()? Do you see any regressions doing so? As far as I remember, that was a hack for exchange2_r, since Fortran passes arguments by reference.
I think that, provided both #111163 and this PR get in, we would be able to remove promoteConstantStackValues().
Hi @hazzlim, I do understand why you want to make this change, because it can lead to a significant gain in the bwaves benchmark (in combination with PR #111163). However, it feels like this patch is adding a fairly arbitrary heuristic, i.e. that only Fortran routines that are recursive are able to benefit from PR #111163.

If I understand correctly, PR #111163 will promote references to constants to actual constant arguments, which unlocks optimisations later in the pipeline, in particular IPSCCP. I thought the fundamental reason we are careful about specialising on a constant literal is that we want to avoid a potential explosion in specialisations, which increases not only compile time but also binary size? This patch could likewise lead to an explosion in compile times for certain benchmarks if there are lots of different calls to the recursive function that pass in different literals. It just so happens that for bwaves this isn't a problem, if I understand correctly?

So I suppose what this patch implies is that if a function is recursive, there will be at least two callers passing in that literal value, i.e. the original call and the recursive call. That makes me wonder if there is a better heuristic we could use that would benefit non-recursive Fortran functions too. For example, what if we decided to specialise functions on literal constants only if we can prove it's worth it based on the number or percentage of callers using that same literal? Would such a heuristic also work for bwaves?
Or we could also apply a weight or hotness to calls, where recursive calls are essentially hotter? Just for the record, I'm not expecting you to solve all the problems at once in this initial patch, but we might be able to sketch out an initial algorithm as part of this work, even if it is only limited to recursive functions for now.
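As a very rough sketch of that idea (entirely hypothetical; none of these types, names, or thresholds exist in the pass), the decision could compare the weighted share of call sites passing the same literal, counting recursive call sites as hotter:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical model of one call site's binding of a literal to a parameter.
struct CallSiteInfo {
  uint64_t Literal;  // the literal value passed for the argument
  bool IsRecursive;  // true if the call site is inside the callee itself
};

// Sketch of the heuristic discussed above: specialize on \p Literal only if
// enough (weighted) callers pass that same value. Recursive call sites get
// double weight, standing in for "hotness"; the 50% threshold is invented.
bool worthSpecializingOnLiteral(const std::vector<CallSiteInfo> &Calls,
                                uint64_t Literal) {
  unsigned Matching = 0, Total = 0;
  for (const CallSiteInfo &CS : Calls) {
    unsigned W = CS.IsRecursive ? 2 : 1;
    Total += W;
    if (CS.Literal == Literal)
      Matching += W;
  }
  return Total > 0 && 2 * Matching >= Total; // at least half, weighted
}
```

A percentage-based rule like this would extend to non-recursive callees as well, at the cost of a threshold that has to be tuned against compile time and code size.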
; ONE-ITER-NEXT:    br label %[[BB13]]
; ONE-ITER:       [[BB13]]:
; ONE-ITER-NEXT:    ret void
;
Any new tests should be moved to a new file named according to what the test is trying to check; in this case, something like recursive-literal-const-args.ll or similar.
Oh, I've realised you're not adding tests but changing the codegen of existing ones. That doesn't seem right; I'd change the RUN lines of this file to prevent specialization on literal constants.
; CHECK-NEXT:    [[DIV:%.*]] = sdiv i64 [[ADD1]], [[X]]
; CHECK-NEXT:    [[SUB:%.*]] = sub i64 [[DIV]], [[Y]]
; CHECK-NEXT:    [[MUL:%.*]] = mul i64 [[SUB]], 2
; CHECK-NEXT:    ret i64 [[MUL]]
Same here
Closing this in favor of #113442.