-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[SimplifyCFG] Handle trunc condition in foldBranchToCommonDest. #135490
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-llvm-transforms Author: Andreas Jonson (andjo403) ChangesNoticed regressions due to this missing fold when trying to add the fold proof: https://alive2.llvm.org/ce/z/v32Aof Full diff: https://github.com/llvm/llvm-project/pull/135490.diff 2 Files Affected:
diff --git a/llvm/lib/Transforms/Utils/SimplifyCFG.cpp b/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
index eac7e7c209c95..7f53aa7d4f73d 100644
--- a/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
+++ b/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
@@ -4087,9 +4087,7 @@ bool llvm::foldBranchToCommonDest(BranchInst *BI, DomTreeUpdater *DTU,
Instruction *Cond = dyn_cast<Instruction>(BI->getCondition());
- if (!Cond ||
- (!isa<CmpInst>(Cond) && !isa<BinaryOperator>(Cond) &&
- !isa<SelectInst>(Cond)) ||
+ if (!Cond || !isa<CmpInst, BinaryOperator, SelectInst, TruncInst>(Cond) ||
Cond->getParent() != BB || !Cond->hasOneUse())
return false;
diff --git a/llvm/test/Transforms/SimplifyCFG/fold-branch-to-common-dest.ll b/llvm/test/Transforms/SimplifyCFG/fold-branch-to-common-dest.ll
index 37f48a9a7e03a..7b88ec338cf5e 100644
--- a/llvm/test/Transforms/SimplifyCFG/fold-branch-to-common-dest.ll
+++ b/llvm/test/Transforms/SimplifyCFG/fold-branch-to-common-dest.ll
@@ -88,10 +88,9 @@ define void @one_pred_trunc_cond(i8 %v0, i8 %v1) {
; CHECK-LABEL: @one_pred_trunc_cond(
; CHECK-NEXT: pred:
; CHECK-NEXT: [[C0:%.*]] = icmp eq i8 [[V0:%.*]], 0
-; CHECK-NEXT: br i1 [[C0]], label [[DISPATCH:%.*]], label [[FINAL_RIGHT:%.*]]
-; CHECK: dispatch:
; CHECK-NEXT: [[C1:%.*]] = trunc i8 [[V1:%.*]] to i1
-; CHECK-NEXT: br i1 [[C1]], label [[FINAL_LEFT:%.*]], label [[FINAL_RIGHT]]
+; CHECK-NEXT: [[OR_COND:%.*]] = select i1 [[C0]], i1 [[C1]], i1 false
+; CHECK-NEXT: br i1 [[OR_COND]], label [[FINAL_LEFT:%.*]], label [[FINAL_RIGHT:%.*]]
; CHECK: common.ret:
; CHECK-NEXT: ret void
; CHECK: final_left:
|
There are test failures. |
f2bf8e2
to
ca29cef
Compare
sorry about that the tests are fixed now |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LG
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/55/builds/9813 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/94/builds/6162 Here is the relevant piece of the build log for the reference
|
Looks like this has negative impact on clang stage2: https://llvm-compile-time-tracker.com/compare.php?from=0f607f3df54e22896b484510f0c1ccfb718de67a&to=ed43207306f7351f2b4f8284710b028df973d74e&stat=instructions:u |
So shall I revert it? do I understand correct that it is not the optimization that is slow but it generates a slower binary |
Yes, it generates a slower binary, at least in terms of instructions retired. I think part of the problem here may be that a lot of targets seem to treat all truncates as free? https://llvm.godbolt.org/z/49Tq8qEqe So they don't count towards the bonus inst threshold. A trunc i1 without nuw should have the same cost as an and 1, I think. |
I thought this patch would reduce the number of branches. Perhaps we can adjust |
The condition is not included in the cost comparison see: llvm-project/llvm/lib/Transforms/Utils/SimplifyCFG.cpp Lines 4150 to 4152 in da17ced
so I do not know if there will be some change to execution time except that less branches will be removed. |
Noticed regressions due to this missing fold when trying to add the fold
icmp ne (and X, 1), 0 --> trunc X to i1
proof: https://alive2.llvm.org/ce/z/v32Aof