[InstCombine] Re-queue users of phi when nsw/nuw flags of add are inferred #113933
      if (auto *PHI = dyn_cast<PHINode>(U))
        Worklist.pushUsersToWorkList(*PHI);
    }
  }
This obviously has a positive effect, but I don't quite understand why this isn't already handled by pushUsersToWorkList in the normal InstCombine loop.
> I don't quite understand why this isn't already handled by pushUsersToWorkList in the normal InstCombine loop

Take the above case as an example. If the add instruction is updated, %iv and %phi will be re-queued. But %cmp cannot be re-queued since we do not change %iv.
Oh yeah duh
8dbe0d8 to cbcc1dd
@llvm/pr-subscribers-llvm-transforms
Author: Yingwei Zheng (dtcxzyw)
Changes
This patch re-queues the users of a phi node when one of its incoming add instructions is updated. If an add instruction is updated, the analysis results for the phi may improve, so we may be able to fold some users of this phi node further. For example: %iv u< 100 -> infer nsw/nuw for %iv.next = add i32 %iv, 1 -> %iv is non-negative -> infer samesign for %cmp = icmp ult i32 %iv, 100.
However, it is weird that this patch doesn't handle new add instructions with nsw/nuw flags. I tried to do this in InstructionWorklist::pushUsersToWorkList, but the compile-time impact is huge.
If this patch makes sense, I will add some comments and more tests.
Address review comment #112642 (comment). This patch also fixes some non-fixpoint issues in tests.
Full diff: https://github.com/llvm/llvm-project/pull/113933.diff
4 Files Affected:
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp b/llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
index 21588aca512758..cde3c4f3a630db 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
@@ -1868,6 +1868,13 @@ Instruction *InstCombinerImpl::visitAdd(BinaryOperator &I) {
   if (Instruction *Res = foldBinOpOfSelectAndCastOfSelectCondition(I))
     return Res;
+  if (Changed) {
+    for (User *U : I.users()) {
+      if (auto *PHI = dyn_cast<PHINode>(U))
+        Worklist.pushUsersToWorkList(*PHI);
+    }
+  }
+
   return Changed ? &I : nullptr;
 }
diff --git a/llvm/test/Transforms/InstCombine/2007-10-31-RangeCrash.ll b/llvm/test/Transforms/InstCombine/2007-10-31-RangeCrash.ll
index b5ae08e1daa3af..3936d027de599d 100644
--- a/llvm/test/Transforms/InstCombine/2007-10-31-RangeCrash.ll
+++ b/llvm/test/Transforms/InstCombine/2007-10-31-RangeCrash.ll
@@ -1,25 +1,18 @@
; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 4
; RUN: opt < %s -S -passes=instcombine | FileCheck %s
-; We do not reach a fixpoint, because we first have to infer nsw on the IV add,
-; and could eliminate the icmp slt afterwards, but don't revisit it.
-
target datalayout = "E-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-f128:64:128"
-define i32 @test() "instcombine-no-verify-fixpoint" {
-; CHECK-LABEL: define i32 @test(
-; CHECK-SAME: ) #[[ATTR0:[0-9]+]] {
+define i32 @test() {
+; CHECK-LABEL: define i32 @test() {
; CHECK-NEXT: entry:
; CHECK-NEXT: br i1 true, label [[BB_I:%.*]], label [[CALCULATECOLORSPECIFICBLACKLEVEL_EXIT:%.*]]
; CHECK: bb.i:
; CHECK-NEXT: br label [[BB51_I_I:%.*]]
; CHECK: bb27.i.i:
-; CHECK-NEXT: [[TMP50_I_I:%.*]] = add nsw i32 [[X_0_I_I:%.*]], 2
; CHECK-NEXT: br label [[BB51_I_I]]
; CHECK: bb51.i.i:
-; CHECK-NEXT: [[X_0_I_I]] = phi i32 [ [[TMP50_I_I]], [[BB27_I_I:%.*]] ], [ 0, [[BB_I]] ]
-; CHECK-NEXT: [[TMP54_I_I:%.*]] = icmp slt i32 [[X_0_I_I]], 0
-; CHECK-NEXT: br i1 [[TMP54_I_I]], label [[BB27_I_I]], label [[BB57_I_I:%.*]]
+; CHECK-NEXT: br i1 false, label [[BB27_I_I:%.*]], label [[BB57_I_I:%.*]]
; CHECK: bb57.i.i:
; CHECK-NEXT: ret i32 0
; CHECK: calculateColorSpecificBlackLevel.exit:
diff --git a/llvm/test/Transforms/InstCombine/cast_phi.ll b/llvm/test/Transforms/InstCombine/cast_phi.ll
index 6b05edc31deb87..a457e520d30cd6 100644
--- a/llvm/test/Transforms/InstCombine/cast_phi.ll
+++ b/llvm/test/Transforms/InstCombine/cast_phi.ll
@@ -316,7 +316,7 @@ define i8 @trunc_in_loop_exit_block() "instcombine-no-verify-fixpoint" {
; CHECK: loop:
; CHECK-NEXT: [[IV:%.*]] = phi i32 [ 0, [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP_LATCH:%.*]] ]
; CHECK-NEXT: [[PHI:%.*]] = phi i32 [ 1, [[ENTRY]] ], [ [[IV_NEXT]], [[LOOP_LATCH]] ]
-; CHECK-NEXT: [[CMP:%.*]] = icmp ult i32 [[IV]], 100
+; CHECK-NEXT: [[CMP:%.*]] = icmp samesign ult i32 [[IV]], 100
; CHECK-NEXT: br i1 [[CMP]], label [[LOOP_LATCH]], label [[EXIT:%.*]]
; CHECK: loop.latch:
; CHECK-NEXT: [[IV_NEXT]] = add nuw nsw i32 [[IV]], 1
diff --git a/llvm/test/Transforms/SimpleLoopUnswitch/2007-08-01-LCSSA.ll b/llvm/test/Transforms/SimpleLoopUnswitch/2007-08-01-LCSSA.ll
index fb342322b2da7a..1ad57ce936bb9b 100644
--- a/llvm/test/Transforms/SimpleLoopUnswitch/2007-08-01-LCSSA.ll
+++ b/llvm/test/Transforms/SimpleLoopUnswitch/2007-08-01-LCSSA.ll
@@ -1,40 +1,29 @@
; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 4
; RUN: opt < %s -S -passes='loop(simple-loop-unswitch),instcombine' -verify-memoryssa | FileCheck %s
-; We do not reach a fixpoint, because we first have to infer nsw on the IV add,
-; and could eliminate the icmp slt afterwards, but don't revisit it.
-
@.str9 = external constant [1 x i8]
declare i32 @strcmp(ptr, ptr)
-define i32 @_ZN9Generator6strregEPKc(ptr %this, ptr %s) "instcombine-no-verify-fixpoint" {
+define i32 @_ZN9Generator6strregEPKc(ptr %this, ptr %s) {
; CHECK-LABEL: define i32 @_ZN9Generator6strregEPKc(
-; CHECK-SAME: ptr [[THIS:%.*]], ptr [[S:%.*]]) #[[ATTR0:[0-9]+]] {
+; CHECK-SAME: ptr [[THIS:%.*]], ptr [[S:%.*]]) {
; CHECK-NEXT: entry:
-; CHECK-NEXT: [[TMP122:%.*]] = icmp eq ptr [[S]], null
; CHECK-NEXT: br label [[BB184:%.*]]
; CHECK: bb55:
; CHECK-NEXT: ret i32 0
; CHECK: bb88:
-; CHECK-NEXT: br i1 [[TMP122]], label [[BB154:%.*]], label [[BB128:%.*]]
+; CHECK-NEXT: br i1 poison, label [[BB154:%.*]], label [[BB128:%.*]]
; CHECK: bb128:
-; CHECK-NEXT: [[TMP138:%.*]] = call i32 @strcmp(ptr noundef nonnull dereferenceable(1) null, ptr noundef nonnull dereferenceable(1) [[S]])
-; CHECK-NEXT: [[IFTMP_37_0_IN4:%.*]] = icmp eq i32 [[TMP138]], 0
-; CHECK-NEXT: br i1 [[IFTMP_37_0_IN4]], label [[BB250:%.*]], label [[BB166:%.*]]
+; CHECK-NEXT: br i1 poison, label [[BB250:%.*]], label [[BB166:%.*]]
; CHECK: bb154:
; CHECK-NEXT: br i1 false, label [[BB250]], label [[BB166]]
; CHECK: bb166:
-; CHECK-NEXT: [[TMP175:%.*]] = add i32 [[IDX_0:%.*]], 1
-; CHECK-NEXT: [[TMP183:%.*]] = add nsw i32 [[I33_0:%.*]], 1
; CHECK-NEXT: br label [[BB184]]
; CHECK: bb184:
-; CHECK-NEXT: [[I33_0]] = phi i32 [ 0, [[ENTRY:%.*]] ], [ [[TMP183]], [[BB166]] ]
-; CHECK-NEXT: [[IDX_0]] = phi i32 [ 0, [[ENTRY]] ], [ [[TMP175]], [[BB166]] ]
-; CHECK-NEXT: [[TMP49:%.*]] = icmp slt i32 [[I33_0]], 0
-; CHECK-NEXT: br i1 [[TMP49]], label [[BB88:%.*]], label [[BB55:%.*]]
+; CHECK-NEXT: br i1 false, label [[BB88:%.*]], label [[BB55:%.*]]
; CHECK: bb250:
-; CHECK-NEXT: ret i32 [[IDX_0]]
+; CHECK-NEXT: ret i32 poison
;
entry:
%s_addr.0 = select i1 false, ptr @.str9, ptr %s
@@ -1868,6 +1868,13 @@ Instruction *InstCombinerImpl::visitAdd(BinaryOperator &I) {
  if (Instruction *Res = foldBinOpOfSelectAndCastOfSelectCondition(I))
    return Res;

  if (Changed) {
We're only interested in recurrence phis here, right? So can we check first if an add operand is a phi, and then whether it is in the user list? That way we can avoid the full users scan in most cases.
Sorry for not mentioning it earlier, but I think the cleanest way to do this is to use matchSimpleRecurrence to get the PHINode. If it's not a simple recurrence we're unlikely to get anything useful out of it anyway.
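To illustrate the suggestion above, here is a minimal self-contained sketch of the "check the add's operands first" idea. It is a toy model, not LLVM code: `Node` and `findRecurrencePhi` are hypothetical names invented for this example, and the real `matchSimpleRecurrence` helper is not reproduced here. The sketch only mirrors the shape check: the add has a phi operand, and that phi takes the add back as an incoming value.

```cpp
#include <cassert>
#include <vector>

// Hypothetical miniature IR node (an assumption for illustration,
// not an LLVM class). For a phi, `operands` holds its incoming values.
struct Node {
  enum Kind { Phi, Add, Other } kind;
  std::vector<const Node *> operands;
};

// Returns the phi when `add` has a phi operand that receives `add`
// as one of its incoming values; returns nullptr otherwise.
// Only the add's operands are inspected, so no scan over all users
// of the add is needed.
const Node *findRecurrencePhi(const Node &add) {
  if (add.kind != Node::Add)
    return nullptr;
  for (const Node *op : add.operands) {
    if (!op || op->kind != Node::Phi)
      continue;
    for (const Node *incoming : op->operands)
      if (incoming == &add)
        return op;
  }
  return nullptr;
}
```

Because an add has only two operands, this check is cheap regardless of how many users the add has, which is the point of the review comment.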
Done.
cbcc1dd to d2a5040
LGTM
This patch re-queues the users of a phi node when one of its incoming add instructions is updated. If an add instruction is updated, the analysis results for the phi may improve, so we may be able to fold some users of this phi node further.
See the following case:
%iv u< 100 -> infer nsw/nuw for %iv.next = add i32 %iv, 1 -> %iv is non-negative -> infer samesign for %cmp = icmp ult i32 %iv, 100.
Without re-queuing users of phi nodes, we cannot improve %cmp in one iteration.
However, it is weird that this patch doesn't handle new add instructions with nsw/nuw flags. I tried to do this in InstructionWorklist::pushUsersToWorkList. Unfortunately the compile-time impact is huge: http://llvm-compile-time-tracker.com/compare.php?from=2d26ef09fc87472cd42ea219c8f9267599872958&to=c92bfb8e4b98790c444031dda95d78772d9e1ea4&stat=instructions%3Au
If this patch makes sense, I will add some comments and more tests.
Address review comment #112642 (comment). This patch also fixes some non-fixpoint issues in tests.
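The re-queuing argument can be sketched with a small self-contained model. This is a hypothetical simplification, not LLVM's InstructionWorklist: `UseGraph`, `buildLoopGraph`, `requeueDirectUsers`, and `requeueWithPhiUsers` are names made up for this example, and the graph only loosely follows the %iv / %iv.next / %phi / %cmp names from the cast_phi.ll test.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy def-use graph: maps a value name to the names of its direct users.
// An assumption for illustration only, not an LLVM data structure.
using UseGraph = std::map<std::string, std::vector<std::string>>;

// Simplified loop: the add %iv.next feeds both phis, and the
// recurrence phi %iv feeds the compare and the add.
UseGraph buildLoopGraph() {
  return {
      {"iv.next", {"iv", "phi"}},
      {"iv", {"cmp", "iv.next"}},
      {"phi", {}},
      {"cmp", {}},
  };
}

// Baseline behaviour: after `changed` is updated, only its direct
// users are put back on the worklist.
std::set<std::string> requeueDirectUsers(const UseGraph &g,
                                         const std::string &changed) {
  const std::vector<std::string> &users = g.at(changed);
  return std::set<std::string>(users.begin(), users.end());
}

// Idea of the patch: also re-queue the users of every phi that uses
// the updated instruction.
std::set<std::string> requeueWithPhiUsers(const UseGraph &g,
                                          const std::string &changed,
                                          const std::set<std::string> &phis) {
  std::set<std::string> queued = requeueDirectUsers(g, changed);
  for (const std::string &user : g.at(changed)) {
    if (!phis.count(user))
      continue;
    for (const std::string &phiUser : g.at(user))
      queued.insert(phiUser);
  }
  return queued;
}
```

In this model, updating `iv.next` re-queues only `iv` and `phi` in the baseline, so `cmp` is never revisited; with the extra step, `cmp` is reached through `iv`.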