Skip to content

[InstCombine] Teach foldSelectOpOp about samesign #122723

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jan 14, 2025

Conversation

artagnon
Copy link
Contributor

Follow up on 4a0d53a (PatternMatch: migrate to CmpPredicate) to get rid of one of the FIXMEs it introduced by replacing a predicate comparison with CmpPredicate::getMatching.

Follow up on 4a0d53a (PatternMatch: migrate to CmpPredicate) to get rid
of one of the FIXMEs it introduced by replacing a predicate comparison
with CmpPredicate::getMatching.
@artagnon artagnon requested a review from dtcxzyw January 13, 2025 15:19
@artagnon artagnon requested a review from nikic as a code owner January 13, 2025 15:19
@llvmbot llvmbot added llvm:instcombine Covers the InstCombine, InstSimplify and AggressiveInstCombine passes llvm:transforms labels Jan 13, 2025
@llvmbot
Copy link
Member

llvmbot commented Jan 13, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Ramkumar Ramachandra (artagnon)

Changes

Follow up on 4a0d53a (PatternMatch: migrate to CmpPredicate) to get rid of one of the FIXMEs it introduced by replacing a predicate comparison with CmpPredicate::getMatching.


Full diff: https://github.com/llvm/llvm-project/pull/122723.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp (+4-4)
  • (modified) llvm/test/Transforms/InstCombine/select-cmp.ll (+96)
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp b/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
index f66a976ccb47fe..d5d9a829c3068a 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
@@ -428,10 +428,10 @@ Instruction *InstCombinerImpl::foldSelectOpOp(SelectInst &SI, Instruction *TI,
     CmpPredicate TPred, FPred;
     if (match(TI, m_ICmp(TPred, m_Value(), m_Value())) &&
         match(FI, m_ICmp(FPred, m_Value(), m_Value()))) {
-      // FIXME: Use CmpPredicate::getMatching here.
-      CmpInst::Predicate T = TPred, F = FPred;
-      if (T == F || T == ICmpInst::getSwappedCmpPredicate(F)) {
-        bool Swapped = T != F;
+      bool Swapped = ICmpInst::isRelational(FPred) &&
+                     CmpPredicate::getMatching(
+                         TPred, ICmpInst::getSwappedCmpPredicate(FPred));
+      if (CmpPredicate::getMatching(TPred, FPred) || Swapped) {
         if (Value *MatchOp =
                 getCommonOp(TI, FI, ICmpInst::isEquality(TPred), Swapped)) {
           Value *NewSel = Builder.CreateSelect(Cond, OtherOpT, OtherOpF,
diff --git a/llvm/test/Transforms/InstCombine/select-cmp.ll b/llvm/test/Transforms/InstCombine/select-cmp.ll
index f7505bd85f89eb..7e5d5821d9f6a7 100644
--- a/llvm/test/Transforms/InstCombine/select-cmp.ll
+++ b/llvm/test/Transforms/InstCombine/select-cmp.ll
@@ -23,6 +23,18 @@ define i1 @icmp_ne_common_op00(i1 %c, i6 %x, i6 %y, i6 %z) {
   ret i1 %r
 }
 
+define i1 @icmp_ne_samesign_common(i1 %c, i6 %x, i6 %y, i6 %z) {
+; CHECK-LABEL: @icmp_ne_samesign_common(
+; CHECK-NEXT:    [[R_V:%.*]] = select i1 [[C:%.*]], i6 [[Y:%.*]], i6 [[Z:%.*]]
+; CHECK-NEXT:    [[R:%.*]] = icmp ne i6 [[X:%.*]], [[R_V]]
+; CHECK-NEXT:    ret i1 [[R]]
+;
+  %cmp1 = icmp samesign ne i6 %x, %y
+  %cmp2 = icmp ne i6 %x, %z
+  %r = select i1 %c, i1 %cmp1, i1 %cmp2
+  ret i1 %r
+}
+
 define i1 @icmp_ne_common_op01(i1 %c, i3 %x, i3 %y, i3 %z) {
 ; CHECK-LABEL: @icmp_ne_common_op01(
 ; CHECK-NEXT:    [[R_V:%.*]] = select i1 [[C:%.*]], i3 [[Y:%.*]], i3 [[Z:%.*]]
@@ -71,6 +83,18 @@ define i1 @icmp_eq_common_op00(i1 %c, i5 %x, i5 %y, i5 %z) {
   ret i1 %r
 }
 
+define i1 @icmp_eq_samesign_common(i1 %c, i5 %x, i5 %y, i5 %z) {
+; CHECK-LABEL: @icmp_eq_samesign_common(
+; CHECK-NEXT:    [[R_V:%.*]] = select i1 [[C:%.*]], i5 [[Y:%.*]], i5 [[Z:%.*]]
+; CHECK-NEXT:    [[R:%.*]] = icmp eq i5 [[X:%.*]], [[R_V]]
+; CHECK-NEXT:    ret i1 [[R]]
+;
+  %cmp1 = icmp eq i5 %x, %y
+  %cmp2 = icmp samesign eq i5 %x, %z
+  %r = select i1 %c, i1 %cmp1, i1 %cmp2
+  ret i1 %r
+}
+
 define <5 x i1> @icmp_eq_common_op01(<5 x i1> %c, <5 x i7> %x, <5 x i7> %y, <5 x i7> %z) {
 ; CHECK-LABEL: @icmp_eq_common_op01(
 ; CHECK-NEXT:    [[R_V:%.*]] = select <5 x i1> [[C:%.*]], <5 x i7> [[Y:%.*]], <5 x i7> [[Z:%.*]]
@@ -134,6 +158,18 @@ define i1 @icmp_slt_common(i1 %c, i6 %x, i6 %y, i6 %z) {
   ret i1 %r
 }
 
+define i1 @icmp_slt_samesign_common(i1 %c, i6 %x, i6 %y, i6 %z) {
+; CHECK-LABEL: @icmp_slt_samesign_common(
+; CHECK-NEXT:    [[R_V:%.*]] = select i1 [[C:%.*]], i6 [[Y:%.*]], i6 [[Z:%.*]]
+; CHECK-NEXT:    [[R:%.*]] = icmp ult i6 [[X:%.*]], [[R_V]]
+; CHECK-NEXT:    ret i1 [[R]]
+;
+  %cmp1 = icmp samesign ult i6 %x, %y
+  %cmp2 = icmp slt i6 %x, %z
+  %r = select i1 %c, i1 %cmp1, i1 %cmp2
+  ret i1 %r
+}
+
 define i1 @icmp_sgt_common(i1 %c, i6 %x, i6 %y, i6 %z) {
 ; CHECK-LABEL: @icmp_sgt_common(
 ; CHECK-NEXT:    [[R_V:%.*]] = select i1 [[C:%.*]], i6 [[Y:%.*]], i6 [[Z:%.*]]
@@ -146,6 +182,18 @@ define i1 @icmp_sgt_common(i1 %c, i6 %x, i6 %y, i6 %z) {
   ret i1 %r
 }
 
+define i1 @icmp_sgt_samesign_common(i1 %c, i6 %x, i6 %y, i6 %z) {
+; CHECK-LABEL: @icmp_sgt_samesign_common(
+; CHECK-NEXT:    [[R_V:%.*]] = select i1 [[C:%.*]], i6 [[Y:%.*]], i6 [[Z:%.*]]
+; CHECK-NEXT:    [[R:%.*]] = icmp ugt i6 [[X:%.*]], [[R_V]]
+; CHECK-NEXT:    ret i1 [[R]]
+;
+  %cmp1 = icmp samesign ugt i6 %x, %y
+  %cmp2 = icmp sgt i6 %x, %z
+  %r = select i1 %c, i1 %cmp1, i1 %cmp2
+  ret i1 %r
+}
+
 define i1 @icmp_sle_common(i1 %c, i6 %x, i6 %y, i6 %z) {
 ; CHECK-LABEL: @icmp_sle_common(
 ; CHECK-NEXT:    [[R_V:%.*]] = select i1 [[C:%.*]], i6 [[Y:%.*]], i6 [[Z:%.*]]
@@ -158,6 +206,18 @@ define i1 @icmp_sle_common(i1 %c, i6 %x, i6 %y, i6 %z) {
   ret i1 %r
 }
 
+define i1 @icmp_sle_samesign_common(i1 %c, i6 %x, i6 %y, i6 %z) {
+; CHECK-LABEL: @icmp_sle_samesign_common(
+; CHECK-NEXT:    [[R_V:%.*]] = select i1 [[C:%.*]], i6 [[Y:%.*]], i6 [[Z:%.*]]
+; CHECK-NEXT:    [[R:%.*]] = icmp sge i6 [[X:%.*]], [[R_V]]
+; CHECK-NEXT:    ret i1 [[R]]
+;
+  %cmp1 = icmp sle i6 %y, %x
+  %cmp2 = icmp samesign ule i6 %z, %x
+  %r = select i1 %c, i1 %cmp1, i1 %cmp2
+  ret i1 %r
+}
+
 define i1 @icmp_sge_common(i1 %c, i6 %x, i6 %y, i6 %z) {
 ; CHECK-LABEL: @icmp_sge_common(
 ; CHECK-NEXT:    [[R_V:%.*]] = select i1 [[C:%.*]], i6 [[Y:%.*]], i6 [[Z:%.*]]
@@ -170,6 +230,18 @@ define i1 @icmp_sge_common(i1 %c, i6 %x, i6 %y, i6 %z) {
   ret i1 %r
 }
 
+define i1 @icmp_sge_samesign_common(i1 %c, i6 %x, i6 %y, i6 %z) {
+; CHECK-LABEL: @icmp_sge_samesign_common(
+; CHECK-NEXT:    [[R_V:%.*]] = select i1 [[C:%.*]], i6 [[Y:%.*]], i6 [[Z:%.*]]
+; CHECK-NEXT:    [[R:%.*]] = icmp sle i6 [[X:%.*]], [[R_V]]
+; CHECK-NEXT:    ret i1 [[R]]
+;
+  %cmp1 = icmp sge i6 %y, %x
+  %cmp2 = icmp samesign uge i6 %z, %x
+  %r = select i1 %c, i1 %cmp1, i1 %cmp2
+  ret i1 %r
+}
+
 define i1 @icmp_slt_sgt_common(i1 %c, i6 %x, i6 %y, i6 %z) {
 ; CHECK-LABEL: @icmp_slt_sgt_common(
 ; CHECK-NEXT:    [[R_V:%.*]] = select i1 [[C:%.*]], i6 [[Y:%.*]], i6 [[Z:%.*]]
@@ -182,6 +254,18 @@ define i1 @icmp_slt_sgt_common(i1 %c, i6 %x, i6 %y, i6 %z) {
   ret i1 %r
 }
 
+define i1 @icmp_slt_sgt_samesign_common(i1 %c, i6 %x, i6 %y, i6 %z) {
+; CHECK-LABEL: @icmp_slt_sgt_samesign_common(
+; CHECK-NEXT:    [[R_V:%.*]] = select i1 [[C:%.*]], i6 [[Y:%.*]], i6 [[Z:%.*]]
+; CHECK-NEXT:    [[R:%.*]] = icmp ult i6 [[X:%.*]], [[R_V]]
+; CHECK-NEXT:    ret i1 [[R]]
+;
+  %cmp1 = icmp samesign ult i6 %x, %y
+  %cmp2 = icmp sgt i6 %z, %x
+  %r = select i1 %c, i1 %cmp1, i1 %cmp2
+  ret i1 %r
+}
+
 define i1 @icmp_sle_sge_common(i1 %c, i6 %x, i6 %y, i6 %z) {
 ; CHECK-LABEL: @icmp_sle_sge_common(
 ; CHECK-NEXT:    [[R_V:%.*]] = select i1 [[C:%.*]], i6 [[Y:%.*]], i6 [[Z:%.*]]
@@ -194,6 +278,18 @@ define i1 @icmp_sle_sge_common(i1 %c, i6 %x, i6 %y, i6 %z) {
   ret i1 %r
 }
 
+define i1 @icmp_sle_sge_samesign_common(i1 %c, i6 %x, i6 %y, i6 %z) {
+; CHECK-LABEL: @icmp_sle_sge_samesign_common(
+; CHECK-NEXT:    [[R_V:%.*]] = select i1 [[C:%.*]], i6 [[Y:%.*]], i6 [[Z:%.*]]
+; CHECK-NEXT:    [[R:%.*]] = icmp sge i6 [[X:%.*]], [[R_V]]
+; CHECK-NEXT:    ret i1 [[R]]
+;
+  %cmp1 = icmp sle i6 %y, %x
+  %cmp2 = icmp samesign uge i6 %x, %z
+  %r = select i1 %c, i1 %cmp1, i1 %cmp2
+  ret i1 %r
+}
+
 define i1 @icmp_ult_common(i1 %c, i6 %x, i6 %y, i6 %z) {
 ; CHECK-LABEL: @icmp_ult_common(
 ; CHECK-NEXT:    [[R_V:%.*]] = select i1 [[C:%.*]], i6 [[Y:%.*]], i6 [[Z:%.*]]

@nikic nikic changed the title IC: teach foldSelectOpOp about samesign [InstCombine] Teach foldSelectOpOp about samesign Jan 13, 2025
Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@artagnon artagnon merged commit 48757e0 into llvm:main Jan 14, 2025
9 of 11 checks passed
@artagnon artagnon deleted the ic-samesign-getmatching branch January 14, 2025 19:58
@alexfh
Copy link
Contributor

alexfh commented Jan 20, 2025

Hi @artagnon, is this change expected to change floating point evaluation precision? We've found a case where this leads to different results.

@asmok-g
Copy link

asmok-g commented Jan 20, 2025

A bit of an incomplete hint; I think the most interesting diff in IR before and after this change is:
there are two missing adds before icmp

-  %111 = add nsw i32 %110, -131072
-  %112 = icmp ult i32 %111, -262144
-  %113 = select i1 %112, i32 0, i32 %110
+  %111 = icmp ugt i32 %110, 131071
+  %112 = select i1 %111, i32 0, i32 %110

and

-  %195 = ashr exact i32 %194, 13
-  %196 = add nsw i32 %195, -131072
-  %197 = icmp ult i32 %196, -262144
...
+  %194 = ashr exact i32 %193, 13
+  %195 = icmp ugt i32 %194, 131071

@artagnon
Copy link
Contributor Author

Hi @artagnon, is this change expected to change floating point evaluation precision? We've found a case where this leads to different results.

Seems a bit strange because the change did not touch FP compares. Could you kindly post an IR diff?

A bit of an incomplete hint; I think the most interesting diff in IR before and after this change is: there are two missing adds before icmp

-  %111 = add nsw i32 %110, -131072
-  %112 = icmp ult i32 %111, -262144
-  %113 = select i1 %112, i32 0, i32 %110
+  %111 = icmp ugt i32 %110, 131071
+  %112 = select i1 %111, i32 0, i32 %110

and

-  %195 = ashr exact i32 %194, 13
-  %196 = add nsw i32 %195, -131072
-  %197 = icmp ult i32 %196, -262144
...
+  %194 = ashr exact i32 %193, 13
+  %195 = icmp ugt i32 %194, 131071

A strange diff, because the patch was only supposed to affect icmp samesign.

@alexfh
Copy link
Contributor

alexfh commented Jan 21, 2025

This diff is a result of the whole -O3 optimization pipeline. It could be that the change here may affect further passes. We couldn't use -opt-bisec-limit to get to a specific optimization pass, since there's a large range of -opt-bisect-limit values where Clang just crashes for the input file in question.

@alexfh
Copy link
Contributor

alexfh commented Jan 23, 2025

A reduced test case, which, I believe, demonstrates a miscompile:

$ cat reduced.ll
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

%class.ac_fixed.365.82211.87406.89902.90318.91358.93022.97780.98869.98881.98884.98887.98899.98905.98965.98974.98983.98989.98995.98998.99004.99010.99016.99022 = type { %"class.ac_private::iv.16.82007.87359.89855.90271.91311.92975.97767.98868.98880.98883.98886.98898.98904.98964.98973.98982.98988.98994.98997.99003.99009.99015.99021" }
%"class.ac_private::iv.16.82007.87359.89855.90271.91311.92975.97767.98868.98880.98883.98886.98898.98904.98964.98973.98982.98988.98994.98997.99003.99009.99015.99021" = type { [1 x i32] }
%class.ac_fixed.626.82218.87557.90053.90469.91509.93173.97931.98870.98882.98885.98888.98900.98906.98966.98975.98984.98990.98996.98999.99005.99011.99017.99023 = type { %"class.ac_private::iv.16.82007.87359.89855.90271.91311.92975.97767.98868.98880.98883.98886.98898.98904.98964.98973.98982.98988.98994.98997.99003.99009.99015.99021" }

; Function Attrs: nocallback nofree nosync nounwind willreturn memory(argmem: readwrite)
declare void @llvm.lifetime.start.p0(i64 immarg, ptr nocapture) #0

define void @_f9() {
  %1 = alloca [0 x [0 x [0 x [0 x %class.ac_fixed.365.82211.87406.89902.90318.91358.93022.97780.98869.98881.98884.98887.98899.98905.98965.98974.98983.98989.98995.98998.99004.99010.99016.99022]]]], i32 0, align 4
  call void @_f11(ptr %1)
  %2 = call i1 @_f12(ptr %1, double 5.000000e-01)
  br i1 %2, label %5, label %3

3:                                                ; preds = %0
  %4 = call i1 @_f1(ptr %1, double -5.000000e-01)
  br i1 %4, label %5, label %6

5:                                                ; preds = %3, %0
  call void @llvm.lifetime.start.p0(i64 0, ptr null)
  br label %6

6:                                                ; preds = %5, %3
  ret void
}

define void @_f11(ptr %0) {
  %2 = load volatile i1, ptr %0, align 1
  call void @_f10(ptr %0)
  ret void
}

define i1 @_f12(ptr %0, double %1) {
  %3 = call i1 @_f1(ptr %0, double %1)
  %4 = xor i1 %3, true
  ret i1 %4
}

define i1 @_f1(ptr %0, double %1) {
  %3 = alloca [0 x [0 x [0 x [0 x %class.ac_fixed.626.82218.87557.90053.90469.91509.93173.97931.98870.98882.98885.98888.98900.98906.98966.98975.98984.98990.98996.98999.99005.99011.99017.99023]]]], i32 0, align 4
  %4 = call i1 @_f2(ptr %0)
  %5 = fcmp uge double %1, 0.000000e+00
  %.not = xor i1 %4, %5
  br i1 %.not, label %8, label %6

6:                                                ; preds = %2
  %7 = call i1 @_f2(ptr %0)
  br label %11

8:                                                ; preds = %2
  %9 = call double @_f3(double %1)
  call void @_f4(ptr %3, double %9)
  %10 = call i1 @_f5(ptr %0, ptr %3)
  br label %11

11:                                               ; preds = %8, %6
  %.0 = phi i1 [ %7, %6 ], [ %10, %8 ]
  ret i1 %.0
}

define i1 @_f6(ptr %0, ptr %1) {
  %3 = call i1 @_f7(ptr %0, ptr %1)
  ret i1 %3
}

define i1 @_f7(ptr %0, ptr %1) {
  %3 = load i32, ptr %0, align 4
  %4 = load i32, ptr %1, align 4
  %5 = icmp slt i32 %3, %4
  ret i1 %5
}

define void @_f4(ptr %0, double %1) {
  call void @_f8(double %1, ptr %0)
  ret void
}

define void @_f8(double %0, ptr %1) {
  %3 = fcmp olt double %0, 0.000000e+00
  %4 = fmul double %0, 0x41F0000000000000
  %5 = select i1 %3, double 0.000000e+00, double %4
  %6 = fptoui double %5 to i32
  %7 = sext i1 %3 to i32
  %8 = xor i32 %6, %7
  store i32 %8, ptr %1, align 4
  ret void
}

define void @_f10(ptr %0) {
  %2 = load i32, ptr %0, align 4
  %3 = ashr i32 %2, 1
  store i32 %3, ptr %0, align 4
  ret void
}

define i1 @_f2(ptr %0) {
  %2 = load i32, ptr %0, align 4
  %3 = icmp slt i32 %2, 0
  ret i1 %3
}

define double @_f3(double %0) {
  %2 = fmul double %0, 0x3F10000000000000
  ret double %2
}

define i1 @_f5(ptr %0, ptr %1) {
  %3 = call i1 @_f6(ptr %0, ptr %1)
  ret i1 %3
}

; uselistorder directives
uselistorder ptr @_f1, { 1, 0 }
uselistorder ptr @_f2, { 1, 0 }

attributes #0 = { nocallback nofree nosync nounwind willreturn memory(argmem: readwrite) }
$ diff -u <(./clang-before -O3 reduced.ll -emit-llvm -S -o -) <(./clang-after -O3  reduced.ll -emit-llvm -S -o -)
--- /dev/fd/63  2025-01-23 04:25:41.674046441 +0100
+++ /dev/fd/62  2025-01-23 04:25:41.682046526 +0100
@@ -15,16 +15,14 @@
   %0 = alloca [0 x [0 x [0 x [0 x [0 x %class.ac_fixed.365.82211.87406.89902.90318.91358.93022.97780.98869.98881.98884.98887.98899.98905.98965.98974.98983.98989.98995.98998.99004.99010.99016.99022]]]]], align 4
   %1 = load volatile i1, ptr %0, align 4
   %2 = load i32, ptr %0, align 4
-  %3 = ashr i32 %2, 1
-  %4 = add nsw i32 %3, -131072
-  %or.cond2 = icmp ult i32 %4, -131073
-  br i1 %or.cond2, label %5, label %_f1.exit.thread
+  %spec.select3 = icmp ugt i32 %2, 262143
+  br i1 %spec.select3, label %3, label %_f1.exit.thread

-5:                                                ; preds = %_f12.exit
+3:                                                ; preds = %_f12.exit
   tail call void @llvm.lifetime.start.p0(i64 0, ptr null)
   br label %_f1.exit.thread

-_f1.exit.thread:                                  ; preds = %_f12.exit, %5
+_f1.exit.thread:                                  ; preds = %_f12.exit, %3
   ret void
 }

If I read LLVM IR correctly, the branch condition in the "before" version of the diff corresponds to (%2 >> 1) - 131072 < -131073, in the "after" it is %2 > 262143, which is not equivalent. clang-before is built from 0e7b754, clang-after - from 48757e0.

If my analysis is correct, please fix or revert soon.

@alexfh
Copy link
Contributor

alexfh commented Jan 23, 2025

@nikic @dtcxzyw could you check my reasoning in the previous comment? Am I missing something or is it indeed a miscompile?

@alexfh
Copy link
Contributor

alexfh commented Jan 23, 2025

Created #124123

@artagnon
Copy link
Contributor Author

Thanks for the test case @alexfh. May I request that you run the IR through metarenamer, create a new issue, and assign it to me, so we don't lose track of it? In the meantime, I think a revert is okay, and have left a comment on #124123.

@goldsteinn
Copy link
Contributor

@nikic @dtcxzyw could you check my reasoning in the previous comment? Am I missing something or is it indeed a miscompile?

Unless there is a contraint on %2 I think its a miscompile: https://alive2.llvm.org/ce/z/B67KTZ

@nikic
Copy link
Contributor

nikic commented Jan 23, 2025

Single-function reproducer for -passes=instcombine (https://alive2.llvm.org/ce/z/dMgLMj):

define i1 @test(i32 %3) {
entry:
  %4 = icmp slt i32 %3, 0
  %.not.i.i = xor i1 true, %4
  %5 = icmp samesign ult i32 %3, 131072
  %spec.select = select i1 %.not.i.i, i1 %5, i1 %4
  %6 = xor i1 %spec.select, true
  %7 = icmp samesign ult i32 %3, -1
  %or.cond = select i1 %4, i1 %7, i1 false
  %or.cond2 = select i1 %6, i1 true, i1 %or.cond
  ret i1 %or.cond2
}

@nikic
Copy link
Contributor

nikic commented Jan 23, 2025

Reduced:

define i1 @src(i1 %c, i32 %arg) {
  %cmp1 = icmp samesign ult i32 %arg, 131072
  %cmp2 = icmp slt i32 %arg, 0
  %select = select i1 %c, i1 %cmp1, i1 %cmp2
  ret i1 %select
}

define i1 @tgt(i1 %c, i32 %arg) {
  %select.v = select i1 %c, i32 131072, i32 0
  %select = icmp ult i32 %arg, %select.v
  ret i1 %select
}

The predicate in tgt should be slt, not ult.

bool Swapped = ICmpInst::isRelational(FPred) &&
CmpPredicate::getMatching(
TPred, ICmpInst::getSwappedCmpPredicate(FPred));
if (CmpPredicate::getMatching(TPred, FPred) || Swapped) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use TPred below, instead of the matching predicate. Sorry for missing this obvious bug :(

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for the investigation and test case: I was sick over the last few days, and couldn't do it myself :(
Will post a re-land shortly.

define i1 @icmp_slt_samesign_common(i1 %c, i6 %x, i6 %y, i6 %z) {
; CHECK-LABEL: @icmp_slt_samesign_common(
; CHECK-NEXT: [[R_V:%.*]] = select i1 [[C:%.*]], i6 [[Y:%.*]], i6 [[Z:%.*]]
; CHECK-NEXT: [[R:%.*]] = icmp ult i6 [[X:%.*]], [[R_V]]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bug is also visible in this test case.

alexfh added a commit that referenced this pull request Jan 24, 2025
Reverts #122723 due to a miscompilation

See
#122723 (comment)
for details and the test case.
@alexfh
Copy link
Contributor

alexfh commented Jan 24, 2025

May I request that you run the IR through metarenamer, create a new issue, and assign it to me, so we don't lose track of it? In the meantime, I think a revert is okay, and have left a comment on #124123.

Sure, #124213

github-actions bot pushed a commit to arm/arm-toolchain that referenced this pull request Jan 24, 2025
artagnon added a commit to artagnon/llvm-project that referenced this pull request Jan 24, 2025
Changes: There was a serious bug in the previous patch, leading to a
miscompile. See llvm#122723 for the miscompile report from Alexander, and
the follow-up investigation by Nikita. The patch has since been
reworked, and now includes the testcase from the miscompile.

Follow up on 4a0d53a (PatternMatch: migrate to CmpPredicate) to get rid
of one of the FIXMEs it introduced by replacing a predicate comparison
with CmpPredicate::getMatching.

Co-authored-by: Nikita Popov <[email protected]>
artagnon added a commit to artagnon/llvm-project that referenced this pull request Jan 28, 2025
Changes: There was a serious bug in the previous patch, leading to a
miscompile. See llvm#122723 for the miscompile report from Alexander, and
the follow-up investigation by Nikita. The patch has since been
reworked, and now includes the testcase from the miscompile.

Follow up on 4a0d53a (PatternMatch: migrate to CmpPredicate) to get rid
of one of the FIXMEs it introduced by replacing a predicate comparison
with CmpPredicate::getMatching.

Co-authored-by: Nikita Popov <[email protected]>
artagnon added a commit that referenced this pull request Jan 28, 2025
Changes: There was a serious bug in the previous patch, leading to a
miscompile. See #122723 for the miscompile report from Alexander, and
the follow-up investigation by Nikita. The patch has since been
reworked, and now includes the testcase from the miscompile.

Follow up on 4a0d53a (PatternMatch: migrate to CmpPredicate) to get rid
of one of the FIXMEs it introduced by replacing a predicate comparison
with CmpPredicate::getMatching.

Co-authored-by: Nikita Popov <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
llvm:instcombine Covers the InstCombine, InstSimplify and AggressiveInstCombine passes llvm:transforms
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants