-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[llvm][InstCombine] Fold select to cmp for weak and inverted inequalities #143445
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-llvm-transforms Author: Yash Solanki (yashnator) ChangesCurrently: int signum(int x) {
if (x < 0) return -1;
if (x > 0) return +1;
return 0;
} is not identified as This patch adds folding for the edge case: Alive2 proof (taken from issue) Resolves #143259 Full diff: https://github.com/llvm/llvm-project/pull/143445.diff 2 Files Affected:
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp b/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
index 8f46ae304353d..ec0cda18a6492 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
@@ -3603,6 +3603,18 @@ Instruction *InstCombinerImpl::foldSelectToCmp(SelectInst &SI) {
ICmpInst::getSwappedPredicate(ExtendedCmpPredicate) == Pred))
Replace = true;
+ // Handle the edge case (x > -1) ? zext(x != 0), -1
+ if (IsSigned && ICmpInst::isGT(Pred) && match(FV, m_AllOnes()) &&
+ match(TV, m_ZExt(m_c_ICmp(ExtendedCmpPredicate, m_Specific(LHS),
+ m_Zero()))) &&
+ (ExtendedCmpPredicate == ICmpInst::ICMP_NE ||
+ ICmpInst::getSwappedPredicate(ExtendedCmpPredicate) == Pred)) {
+ Value *Zero = ConstantInt::get(LHS->getType(), 0);
+ return replaceInstUsesWith(
+ SI,
+ Builder.CreateIntrinsic(SI.getType(), Intrinsic::scmp, {LHS, Zero}));
+ }
+
// (x == y) ? 0 : (x > y ? 1 : -1)
CmpPredicate FalseBranchSelectPredicate;
const APInt *InnerTV, *InnerFV;
diff --git a/llvm/test/Transforms/InstCombine/scmp.ll b/llvm/test/Transforms/InstCombine/scmp.ll
index 2140a59de3fa9..b685ca20998bd 100644
--- a/llvm/test/Transforms/InstCombine/scmp.ll
+++ b/llvm/test/Transforms/InstCombine/scmp.ll
@@ -473,3 +473,17 @@ define i8 @scmp_from_select_eq_and_gt_neg3(i32 %x, i32 %y) {
%r = select i1 %eq, i8 0, i8 %sel1
ret i8 %r
}
+
+; Fold (x > -1) ? zext(x != 0), -1 to scmp(x, 0)
+define i32 @scmp_x_0_from_gt_minus_1(i32 noundef %0) local_unnamed_addr #0 {
+; CHECK-LABEL: define i32 @scmp_x_0_from_gt_minus_1(
+; CHECK-SAME: i32 noundef [[TMP0:%.*]]) local_unnamed_addr {
+; CHECK-NEXT: [[TMP2:%.*]] = call i32 @llvm.scmp.i32.i32(i32 [[TMP0]], i32 0)
+; CHECK-NEXT: ret i32 [[TMP2]]
+;
+ %2 = icmp ne i32 %0, 0
+ %3 = zext i1 %2 to i32
+ %4 = icmp sgt i32 %0, -1
+ %5 = select i1 %4, i32 %3, i32 -1
+ ret i32 %5
+}
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any test for the change?
@el-ev I had added a test in scmp.ll- |
Could you please add additional test cases and pre-commit the tests? Please check https://llvm.org/docs/InstCombineContributorGuide.html |
Irrelevant to this PR in itself, but I did notice when running both src in tgt in llc, but why does llc lower the src in the alive2 posted better than the intrinsic? |
I'm guessing you mean on X86. It looks neutral on RISC-V and scmp looks better for AArch64. |
1d1ad2b
to
a9e0877
Compare
@el-ev I have added more tests. I also added changes to handle more general predicates |
Yeah, might have to work on that. I mean human written asm for this in particular is:
but I do not know if I can get the compiler to lower that to this. |
|
Why? |
Nevermind, I think its ok, but there's one partial register write and one false dependency in that code. |
partial register write, yes, deliberate |
if you're looking at I don't know if LLVM is aware of special cases like that though. Or if it's worth teaching LLVM that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have a getFlippedStrictnessPredicateAndConstant() helper for cases like this. I think we can handle this generically by doing a swap of the select arms to get the constant into TV, inverting the predicate, and then using getFlippedStrictnessPredicateAndConstant() to convert the now non-strict predicate into a strict predicate.
For a long time it was only special cased on AMD CPUs. I don't if Intel finally fixed it or not. There's a |
to cmp with constants in non-canonical inequalities
…ualities For constant y, inequalities aren't always canonical so we need to check the conditions such as - (x > y - 1) ? zext(x != y) : -1 - (x > y - 1) ? zext(x > y) : -1 - (x < y + 1) ? sext(x != y) : 1 - (x < y + 1) ? sext(x < y) : 1 and similary non-strict equalities. Fold select into scmp/ucmp based on signedness of the comparison predicate. Resolves llvm#143259
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
I don't have commit access, can you help merge? |
Currently:
is not identified as
scmp(x,0)
.This patch adds folding for the edge case:
(x > -1) ? zext(x != 0) : -1
, which is generated for the above signum and is equivalent toscmp(x, 0)
. For other constants/variables, the fold is already optimised.Alive2 proof (taken from issue)
Resolves #143259