SimplifyIndVar: teach widenLoopCompare about samesign #125764
@llvm/pr-subscribers-llvm-transforms
Author: Ramkumar Ramachandra (artagnon)
Changes
There is still some way to go to optimize optimally with samesign.
Patch is 26.91 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/125764.diff
3 Files Affected:
diff --git a/llvm/lib/Transforms/Utils/SimplifyIndVar.cpp b/llvm/lib/Transforms/Utils/SimplifyIndVar.cpp
index e41a1adadfcc5bd..7b9c5c77cbe986e 100644
--- a/llvm/lib/Transforms/Utils/SimplifyIndVar.cpp
+++ b/llvm/lib/Transforms/Utils/SimplifyIndVar.cpp
@@ -1614,7 +1614,8 @@ bool WidenIV::widenLoopCompare(WidenIV::NarrowIVDefUse DU) {
// (A) == icmp slt i32 sext(%narrow), sext(%val)
// == icmp slt i32 zext(%narrow), sext(%val)
bool IsSigned = getExtendKind(DU.NarrowDef) == ExtendKind::Sign;
- if (!(DU.NeverNegative || IsSigned == Cmp->isSigned()))
+ bool CmpPreferredSign = Cmp->hasSameSign() ? IsSigned : Cmp->isSigned();
+ if (!DU.NeverNegative && IsSigned != CmpPreferredSign)
return false;
Value *Op = Cmp->getOperand(Cmp->getOperand(0) == DU.NarrowDef ? 1 : 0);
@@ -1627,7 +1628,7 @@ bool WidenIV::widenLoopCompare(WidenIV::NarrowIVDefUse DU) {
// Widen the other operand of the compare, if necessary.
if (CastWidth < IVWidth) {
- Value *ExtOp = createExtendInst(Op, WideType, Cmp->isSigned(), Cmp);
+ Value *ExtOp = createExtendInst(Op, WideType, CmpPreferredSign, Cmp);
DU.NarrowUse->replaceUsesOfWith(Op, ExtOp);
}
return true;
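A minimal sketch of the condition change in the hunk above, written as plain C++ rather than against the LLVM API. The struct `CompareFacts` and the functions `oldCanWiden`, `cmpPreferredSign`, and `newCanWiden` are hypothetical names introduced only for illustration; each field stands in for the query named in its comment. It models only this one early-exit check, not the rest of `widenLoopCompare`.

```cpp
#include <cassert>

// Hypothetical stand-ins for the facts the check consults.
struct CompareFacts {
  bool NeverNegative;  // DU.NeverNegative
  bool IVIsSigned;     // getExtendKind(DU.NarrowDef) == ExtendKind::Sign
  bool CmpIsSigned;    // Cmp->isSigned()
  bool CmpHasSameSign; // Cmp->hasSameSign()
};

// Check before the patch: the IV's extension kind must match the
// signedness of the compare predicate, unless the IV is never negative.
bool oldCanWiden(const CompareFacts &F) {
  return F.NeverNegative || F.IVIsSigned == F.CmpIsSigned;
}

// With samesign, the predicate's signedness does not matter, so the
// compare adopts whatever extension kind the IV already uses.
bool cmpPreferredSign(const CompareFacts &F) {
  return F.CmpHasSameSign ? F.IVIsSigned : F.CmpIsSigned;
}

// Check after the patch: same shape, but against the preferred sign.
bool newCanWiden(const CompareFacts &F) {
  return F.NeverNegative || F.IVIsSigned == cmpPreferredSign(F);
}
```

For a sign-extended IV used by an unsigned compare, the old check rejects the compare while the new one accepts it only when `samesign` is present. The same preferred sign is then passed to `createExtendInst` for the other operand.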
diff --git a/llvm/test/Transforms/IndVarSimplify/iv-ext-samesign.ll b/llvm/test/Transforms/IndVarSimplify/iv-ext-samesign.ll
new file mode 100644
index 000000000000000..2b7cbeb97fb1a76
--- /dev/null
+++ b/llvm/test/Transforms/IndVarSimplify/iv-ext-samesign.ll
@@ -0,0 +1,484 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt < %s -passes=indvars -S | FileCheck %s
+
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
+
+define i32 @iv_zext_zext_sgt_slt(i32 %iter.count, ptr %ptr) {
+; CHECK-LABEL: define i32 @iv_zext_zext_sgt_slt(
+; CHECK-SAME: i32 [[ITER_COUNT:%.*]], ptr [[PTR:%.*]]) {
+; CHECK-NEXT: [[ENTRY:.*]]:
+; CHECK-NEXT: [[TMP0:%.*]] = add nsw i32 [[ITER_COUNT]], -1
+; CHECK-NEXT: br label %[[OUTER_LOOP:.*]]
+; CHECK: [[PH_LOOPEXIT:.*]]:
+; CHECK-NEXT: br label %[[PH:.*]]
+; CHECK: [[PH]]:
+; CHECK-NEXT: [[INDVARS_IV_NEXT2:%.*]] = add i32 [[INDVARS_IV1:%.*]], -1
+; CHECK-NEXT: br label %[[OUTER_LOOP]]
+; CHECK: [[OUTER_LOOP]]:
+; CHECK-NEXT: [[INDVARS_IV1]] = phi i32 [ [[INDVARS_IV_NEXT2]], %[[PH]] ], [ [[TMP0]], %[[ENTRY]] ]
+; CHECK-NEXT: [[IV_OUTER:%.*]] = phi i32 [ [[IV_OUTER_1:%.*]], %[[PH]] ], [ [[ITER_COUNT]], %[[ENTRY]] ]
+; CHECK-NEXT: [[IV_OUTER_1]] = add nsw i32 [[IV_OUTER]], -1
+; CHECK-NEXT: [[EXT_OUTER:%.*]] = zext nneg i32 [[IV_OUTER_1]] to i64
+; CHECK-NEXT: [[GEP_OUTER:%.*]] = getelementptr double, ptr [[PTR]], i64 [[EXT_OUTER]]
+; CHECK-NEXT: store double poison, ptr [[GEP_OUTER]], align 8
+; CHECK-NEXT: [[EXIT_COND_OUTER:%.*]] = icmp sgt i32 [[IV_OUTER]], 1
+; CHECK-NEXT: br i1 [[EXIT_COND_OUTER]], label %[[INNER_LOOP_PREHEADER:.*]], label %[[PH]]
+; CHECK: [[INNER_LOOP_PREHEADER]]:
+; CHECK-NEXT: [[WIDE_TRIP_COUNT:%.*]] = zext i32 [[INDVARS_IV1]] to i64
+; CHECK-NEXT: br label %[[INNER_LOOP:.*]]
+; CHECK: [[INNER_LOOP]]:
+; CHECK-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ 0, %[[INNER_LOOP_PREHEADER]] ], [ [[INDVARS_IV_NEXT:%.*]], %[[INNER_LOOP]] ]
+; CHECK-NEXT: [[GEP_INNER:%.*]] = getelementptr double, ptr [[PTR]], i64 [[INDVARS_IV]]
+; CHECK-NEXT: store double poison, ptr [[GEP_INNER]], align 8
+; CHECK-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
+; CHECK-NEXT: [[EXITCOND:%.*]] = icmp ne i64 [[INDVARS_IV_NEXT]], [[WIDE_TRIP_COUNT]]
+; CHECK-NEXT: br i1 [[EXITCOND]], label %[[INNER_LOOP]], label %[[PH_LOOPEXIT]]
+; CHECK: [[EXIT:.*:]]
+; CHECK-NEXT: ret i32 0
+;
+entry:
+ br label %outer.loop
+
+ph:
+ br label %outer.loop
+
+outer.loop:
+ %iv.outer = phi i32 [ %iv.outer.1, %ph ], [ %iter.count, %entry ]
+ %iv.outer.1 = add nsw i32 %iv.outer, -1
+ %ext.outer = zext nneg i32 %iv.outer.1 to i64
+ %gep.outer = getelementptr double, ptr %ptr, i64 %ext.outer
+ store double poison, ptr %gep.outer
+ %exit.cond.outer = icmp sgt i32 %iv.outer, 1
+ br i1 %exit.cond.outer, label %inner.loop, label %ph
+
+inner.loop:
+ %iv.inner = phi i32 [ %iv.next, %inner.loop ], [ 0, %outer.loop ]
+ %ext.inner = zext nneg i32 %iv.inner to i64
+ %gep.inner = getelementptr double, ptr %ptr, i64 %ext.inner
+ store double poison, ptr %gep.inner
+ %iv.next = add nuw nsw i32 %iv.inner, 1
+ %exit.cond.inner = icmp slt i32 %iv.next, %iv.outer.1
+ br i1 %exit.cond.inner, label %inner.loop, label %ph
+
+exit:
+ ret i32 0
+}
+
+define i32 @iv_zext_zext_gt_slt(i32 %iter.count, ptr %ptr) {
+; CHECK-LABEL: define i32 @iv_zext_zext_gt_slt(
+; CHECK-SAME: i32 [[ITER_COUNT:%.*]], ptr [[PTR:%.*]]) {
+; CHECK-NEXT: [[ENTRY:.*]]:
+; CHECK-NEXT: [[TMP0:%.*]] = sext i32 [[ITER_COUNT]] to i64
+; CHECK-NEXT: br label %[[OUTER_LOOP:.*]]
+; CHECK: [[PH_LOOPEXIT:.*]]:
+; CHECK-NEXT: br label %[[PH:.*]]
+; CHECK: [[PH]]:
+; CHECK-NEXT: br label %[[OUTER_LOOP]]
+; CHECK: [[OUTER_LOOP]]:
+; CHECK-NEXT: [[INDVARS_IV1:%.*]] = phi i64 [ [[INDVARS_IV_NEXT2:%.*]], %[[PH]] ], [ [[TMP0]], %[[ENTRY]] ]
+; CHECK-NEXT: [[INDVARS_IV_NEXT2]] = add nsw i64 [[INDVARS_IV1]], -1
+; CHECK-NEXT: [[GEP_OUTER:%.*]] = getelementptr double, ptr [[PTR]], i64 [[INDVARS_IV_NEXT2]]
+; CHECK-NEXT: store double poison, ptr [[GEP_OUTER]], align 8
+; CHECK-NEXT: [[EXIT_COND_OUTER:%.*]] = icmp samesign ugt i64 [[INDVARS_IV1]], 1
+; CHECK-NEXT: br i1 [[EXIT_COND_OUTER]], label %[[INNER_LOOP_PREHEADER:.*]], label %[[PH]]
+; CHECK: [[INNER_LOOP_PREHEADER]]:
+; CHECK-NEXT: br label %[[INNER_LOOP:.*]]
+; CHECK: [[INNER_LOOP]]:
+; CHECK-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ 0, %[[INNER_LOOP_PREHEADER]] ], [ [[INDVARS_IV_NEXT:%.*]], %[[INNER_LOOP]] ]
+; CHECK-NEXT: [[GEP_INNER:%.*]] = getelementptr double, ptr [[PTR]], i64 [[INDVARS_IV]]
+; CHECK-NEXT: store double poison, ptr [[GEP_INNER]], align 8
+; CHECK-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
+; CHECK-NEXT: [[EXIT_COND_INNER:%.*]] = icmp slt i64 [[INDVARS_IV_NEXT]], [[INDVARS_IV_NEXT2]]
+; CHECK-NEXT: br i1 [[EXIT_COND_INNER]], label %[[INNER_LOOP]], label %[[PH_LOOPEXIT]]
+; CHECK: [[EXIT:.*:]]
+; CHECK-NEXT: ret i32 0
+;
+entry:
+ br label %outer.loop
+
+ph:
+ br label %outer.loop
+
+outer.loop:
+ %iv.outer = phi i32 [ %iv.outer.1, %ph ], [ %iter.count, %entry ]
+ %iv.outer.1 = add nsw i32 %iv.outer, -1
+ %ext.outer = zext nneg i32 %iv.outer.1 to i64
+ %gep.outer = getelementptr double, ptr %ptr, i64 %ext.outer
+ store double poison, ptr %gep.outer
+ %exit.cond.outer = icmp samesign ugt i32 %iv.outer, 1
+ br i1 %exit.cond.outer, label %inner.loop, label %ph
+
+inner.loop:
+ %iv.inner = phi i32 [ %iv.next, %inner.loop ], [ 0, %outer.loop ]
+ %ext.inner = zext nneg i32 %iv.inner to i64
+ %gep.inner = getelementptr double, ptr %ptr, i64 %ext.inner
+ store double poison, ptr %gep.inner
+ %iv.next = add nuw nsw i32 %iv.inner, 1
+ %exit.cond.inner = icmp slt i32 %iv.next, %iv.outer.1
+ br i1 %exit.cond.inner, label %inner.loop, label %ph
+
+exit:
+ ret i32 0
+}
+
+define i32 @iv_zext_zext_sgt_lt(i32 %iter.count, ptr %ptr) {
+; CHECK-LABEL: define i32 @iv_zext_zext_sgt_lt(
+; CHECK-SAME: i32 [[ITER_COUNT:%.*]], ptr [[PTR:%.*]]) {
+; CHECK-NEXT: [[ENTRY:.*]]:
+; CHECK-NEXT: [[TMP0:%.*]] = add nsw i32 [[ITER_COUNT]], -1
+; CHECK-NEXT: br label %[[OUTER_LOOP:.*]]
+; CHECK: [[PH_LOOPEXIT:.*]]:
+; CHECK-NEXT: br label %[[PH:.*]]
+; CHECK: [[PH]]:
+; CHECK-NEXT: [[INDVARS_IV_NEXT2:%.*]] = add i32 [[INDVARS_IV1:%.*]], -1
+; CHECK-NEXT: br label %[[OUTER_LOOP]]
+; CHECK: [[OUTER_LOOP]]:
+; CHECK-NEXT: [[INDVARS_IV1]] = phi i32 [ [[INDVARS_IV_NEXT2]], %[[PH]] ], [ [[TMP0]], %[[ENTRY]] ]
+; CHECK-NEXT: [[IV_OUTER:%.*]] = phi i32 [ [[IV_OUTER_1:%.*]], %[[PH]] ], [ [[ITER_COUNT]], %[[ENTRY]] ]
+; CHECK-NEXT: [[IV_OUTER_1]] = add nsw i32 [[IV_OUTER]], -1
+; CHECK-NEXT: [[EXT_OUTER:%.*]] = zext nneg i32 [[IV_OUTER_1]] to i64
+; CHECK-NEXT: [[GEP_OUTER:%.*]] = getelementptr double, ptr [[PTR]], i64 [[EXT_OUTER]]
+; CHECK-NEXT: store double poison, ptr [[GEP_OUTER]], align 8
+; CHECK-NEXT: [[EXIT_COND_OUTER:%.*]] = icmp sgt i32 [[IV_OUTER]], 1
+; CHECK-NEXT: br i1 [[EXIT_COND_OUTER]], label %[[INNER_LOOP_PREHEADER:.*]], label %[[PH]]
+; CHECK: [[INNER_LOOP_PREHEADER]]:
+; CHECK-NEXT: [[WIDE_TRIP_COUNT:%.*]] = zext i32 [[INDVARS_IV1]] to i64
+; CHECK-NEXT: br label %[[INNER_LOOP:.*]]
+; CHECK: [[INNER_LOOP]]:
+; CHECK-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ 0, %[[INNER_LOOP_PREHEADER]] ], [ [[INDVARS_IV_NEXT:%.*]], %[[INNER_LOOP]] ]
+; CHECK-NEXT: [[GEP_INNER:%.*]] = getelementptr double, ptr [[PTR]], i64 [[INDVARS_IV]]
+; CHECK-NEXT: store double poison, ptr [[GEP_INNER]], align 8
+; CHECK-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
+; CHECK-NEXT: [[EXITCOND:%.*]] = icmp ne i64 [[INDVARS_IV_NEXT]], [[WIDE_TRIP_COUNT]]
+; CHECK-NEXT: br i1 [[EXITCOND]], label %[[INNER_LOOP]], label %[[PH_LOOPEXIT]]
+; CHECK: [[EXIT:.*:]]
+; CHECK-NEXT: ret i32 0
+;
+entry:
+ br label %outer.loop
+
+ph:
+ br label %outer.loop
+
+outer.loop:
+ %iv.outer = phi i32 [ %iv.outer.1, %ph ], [ %iter.count, %entry ]
+ %iv.outer.1 = add nsw i32 %iv.outer, -1
+ %ext.outer = zext nneg i32 %iv.outer.1 to i64
+ %gep.outer = getelementptr double, ptr %ptr, i64 %ext.outer
+ store double poison, ptr %gep.outer
+ %exit.cond.outer = icmp sgt i32 %iv.outer, 1
+ br i1 %exit.cond.outer, label %inner.loop, label %ph
+
+inner.loop:
+ %iv.inner = phi i32 [ %iv.next, %inner.loop ], [ 0, %outer.loop ]
+ %ext.inner = zext nneg i32 %iv.inner to i64
+ %gep.inner = getelementptr double, ptr %ptr, i64 %ext.inner
+ store double poison, ptr %gep.inner
+ %iv.next = add nuw nsw i32 %iv.inner, 1
+ %exit.cond.inner = icmp samesign ult i32 %iv.next, %iv.outer.1
+ br i1 %exit.cond.inner, label %inner.loop, label %ph
+
+exit:
+ ret i32 0
+}
+
+define i32 @iv_zext_zext_gt_lt(i32 %iter.count, ptr %ptr) {
+; CHECK-LABEL: define i32 @iv_zext_zext_gt_lt(
+; CHECK-SAME: i32 [[ITER_COUNT:%.*]], ptr [[PTR:%.*]]) {
+; CHECK-NEXT: [[ENTRY:.*]]:
+; CHECK-NEXT: [[TMP0:%.*]] = add nsw i32 [[ITER_COUNT]], -1
+; CHECK-NEXT: br label %[[OUTER_LOOP:.*]]
+; CHECK: [[PH_LOOPEXIT:.*]]:
+; CHECK-NEXT: br label %[[PH:.*]]
+; CHECK: [[PH]]:
+; CHECK-NEXT: [[INDVARS_IV_NEXT2:%.*]] = add i32 [[INDVARS_IV1:%.*]], -1
+; CHECK-NEXT: br label %[[OUTER_LOOP]]
+; CHECK: [[OUTER_LOOP]]:
+; CHECK-NEXT: [[INDVARS_IV1]] = phi i32 [ [[INDVARS_IV_NEXT2]], %[[PH]] ], [ [[TMP0]], %[[ENTRY]] ]
+; CHECK-NEXT: [[IV_OUTER:%.*]] = phi i32 [ [[IV_OUTER_1:%.*]], %[[PH]] ], [ [[ITER_COUNT]], %[[ENTRY]] ]
+; CHECK-NEXT: [[IV_OUTER_1]] = add nsw i32 [[IV_OUTER]], -1
+; CHECK-NEXT: [[EXT_OUTER:%.*]] = zext nneg i32 [[IV_OUTER_1]] to i64
+; CHECK-NEXT: [[GEP_OUTER:%.*]] = getelementptr double, ptr [[PTR]], i64 [[EXT_OUTER]]
+; CHECK-NEXT: store double poison, ptr [[GEP_OUTER]], align 8
+; CHECK-NEXT: [[EXIT_COND_OUTER:%.*]] = icmp samesign ugt i32 [[IV_OUTER]], 1
+; CHECK-NEXT: br i1 [[EXIT_COND_OUTER]], label %[[INNER_LOOP_PREHEADER:.*]], label %[[PH]]
+; CHECK: [[INNER_LOOP_PREHEADER]]:
+; CHECK-NEXT: [[WIDE_TRIP_COUNT:%.*]] = zext i32 [[INDVARS_IV1]] to i64
+; CHECK-NEXT: br label %[[INNER_LOOP:.*]]
+; CHECK: [[INNER_LOOP]]:
+; CHECK-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ 0, %[[INNER_LOOP_PREHEADER]] ], [ [[INDVARS_IV_NEXT:%.*]], %[[INNER_LOOP]] ]
+; CHECK-NEXT: [[GEP_INNER:%.*]] = getelementptr double, ptr [[PTR]], i64 [[INDVARS_IV]]
+; CHECK-NEXT: store double poison, ptr [[GEP_INNER]], align 8
+; CHECK-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
+; CHECK-NEXT: [[EXITCOND:%.*]] = icmp ne i64 [[INDVARS_IV_NEXT]], [[WIDE_TRIP_COUNT]]
+; CHECK-NEXT: br i1 [[EXITCOND]], label %[[INNER_LOOP]], label %[[PH_LOOPEXIT]]
+; CHECK: [[EXIT:.*:]]
+; CHECK-NEXT: ret i32 0
+;
+entry:
+ br label %outer.loop
+
+ph:
+ br label %outer.loop
+
+outer.loop:
+ %iv.outer = phi i32 [ %iv.outer.1, %ph ], [ %iter.count, %entry ]
+ %iv.outer.1 = add nsw i32 %iv.outer, -1
+ %ext.outer = zext nneg i32 %iv.outer.1 to i64
+ %gep.outer = getelementptr double, ptr %ptr, i64 %ext.outer
+ store double poison, ptr %gep.outer
+ %exit.cond.outer = icmp samesign ugt i32 %iv.outer, 1
+ br i1 %exit.cond.outer, label %inner.loop, label %ph
+
+inner.loop:
+ %iv.inner = phi i32 [ %iv.next, %inner.loop ], [ 0, %outer.loop ]
+ %ext.inner = zext nneg i32 %iv.inner to i64
+ %gep.inner = getelementptr double, ptr %ptr, i64 %ext.inner
+ store double poison, ptr %gep.inner
+ %iv.next = add nuw nsw i32 %iv.inner, 1
+ %exit.cond.inner = icmp samesign ult i32 %iv.next, %iv.outer.1
+ br i1 %exit.cond.inner, label %inner.loop, label %ph
+
+exit:
+ ret i32 0
+}
+
+define i32 @iv_sext_sext_sgt_slt(i32 %iter.count, ptr %ptr) {
+; CHECK-LABEL: define i32 @iv_sext_sext_sgt_slt(
+; CHECK-SAME: i32 [[ITER_COUNT:%.*]], ptr [[PTR:%.*]]) {
+; CHECK-NEXT: [[ENTRY:.*]]:
+; CHECK-NEXT: [[TMP0:%.*]] = add nsw i32 [[ITER_COUNT]], -1
+; CHECK-NEXT: [[TMP1:%.*]] = sext i32 [[ITER_COUNT]] to i64
+; CHECK-NEXT: br label %[[OUTER_LOOP:.*]]
+; CHECK: [[PH_LOOPEXIT:.*]]:
+; CHECK-NEXT: br label %[[PH:.*]]
+; CHECK: [[PH]]:
+; CHECK-NEXT: [[INDVARS_IV_NEXT2:%.*]] = add i32 [[INDVARS_IV1:%.*]], -1
+; CHECK-NEXT: br label %[[OUTER_LOOP]]
+; CHECK: [[OUTER_LOOP]]:
+; CHECK-NEXT: [[INDVARS_IV3:%.*]] = phi i64 [ [[INDVARS_IV_NEXT4:%.*]], %[[PH]] ], [ [[TMP1]], %[[ENTRY]] ]
+; CHECK-NEXT: [[INDVARS_IV1]] = phi i32 [ [[INDVARS_IV_NEXT2]], %[[PH]] ], [ [[TMP0]], %[[ENTRY]] ]
+; CHECK-NEXT: [[INDVARS_IV_NEXT4]] = add nsw i64 [[INDVARS_IV3]], -1
+; CHECK-NEXT: [[GEP_OUTER:%.*]] = getelementptr double, ptr [[PTR]], i64 [[INDVARS_IV_NEXT4]]
+; CHECK-NEXT: store double poison, ptr [[GEP_OUTER]], align 8
+; CHECK-NEXT: [[EXIT_COND_OUTER:%.*]] = icmp sgt i64 [[INDVARS_IV3]], 1
+; CHECK-NEXT: br i1 [[EXIT_COND_OUTER]], label %[[INNER_LOOP_PREHEADER:.*]], label %[[PH]]
+; CHECK: [[INNER_LOOP_PREHEADER]]:
+; CHECK-NEXT: [[WIDE_TRIP_COUNT:%.*]] = zext i32 [[INDVARS_IV1]] to i64
+; CHECK-NEXT: br label %[[INNER_LOOP:.*]]
+; CHECK: [[INNER_LOOP]]:
+; CHECK-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ 0, %[[INNER_LOOP_PREHEADER]] ], [ [[INDVARS_IV_NEXT:%.*]], %[[INNER_LOOP]] ]
+; CHECK-NEXT: [[GEP_INNER:%.*]] = getelementptr double, ptr [[PTR]], i64 [[INDVARS_IV]]
+; CHECK-NEXT: store double poison, ptr [[GEP_INNER]], align 8
+; CHECK-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
+; CHECK-NEXT: [[EXITCOND:%.*]] = icmp ne i64 [[INDVARS_IV_NEXT]], [[WIDE_TRIP_COUNT]]
+; CHECK-NEXT: br i1 [[EXITCOND]], label %[[INNER_LOOP]], label %[[PH_LOOPEXIT]]
+; CHECK: [[EXIT:.*:]]
+; CHECK-NEXT: ret i32 0
+;
+entry:
+ br label %outer.loop
+
+ph:
+ br label %outer.loop
+
+outer.loop:
+ %iv.outer = phi i32 [ %iv.outer.1, %ph ], [ %iter.count, %entry ]
+ %iv.outer.1 = add nsw i32 %iv.outer, -1
+ %ext.outer = sext i32 %iv.outer.1 to i64
+ %gep.outer = getelementptr double, ptr %ptr, i64 %ext.outer
+ store double poison, ptr %gep.outer
+ %exit.cond.outer = icmp sgt i32 %iv.outer, 1
+ br i1 %exit.cond.outer, label %inner.loop, label %ph
+
+inner.loop:
+ %iv.inner = phi i32 [ %iv.next, %inner.loop ], [ 0, %outer.loop ]
+ %ext.inner = sext i32 %iv.inner to i64
+ %gep.inner = getelementptr double, ptr %ptr, i64 %ext.inner
+ store double poison, ptr %gep.inner
+ %iv.next = add nuw nsw i32 %iv.inner, 1
+ %exit.cond.inner = icmp slt i32 %iv.next, %iv.outer.1
+ br i1 %exit.cond.inner, label %inner.loop, label %ph
+
+exit:
+ ret i32 0
+}
+
+define i32 @iv_sext_sext_gt_slt(i32 %iter.count, ptr %ptr) {
+; CHECK-LABEL: define i32 @iv_sext_sext_gt_slt(
+; CHECK-SAME: i32 [[ITER_COUNT:%.*]], ptr [[PTR:%.*]]) {
+; CHECK-NEXT: [[ENTRY:.*]]:
+; CHECK-NEXT: [[TMP0:%.*]] = sext i32 [[ITER_COUNT]] to i64
+; CHECK-NEXT: br label %[[OUTER_LOOP:.*]]
+; CHECK: [[PH_LOOPEXIT:.*]]:
+; CHECK-NEXT: br label %[[PH:.*]]
+; CHECK: [[PH]]:
+; CHECK-NEXT: br label %[[OUTER_LOOP]]
+; CHECK: [[OUTER_LOOP]]:
+; CHECK-NEXT: [[INDVARS_IV1:%.*]] = phi i64 [ [[INDVARS_IV_NEXT2:%.*]], %[[PH]] ], [ [[TMP0]], %[[ENTRY]] ]
+; CHECK-NEXT: [[INDVARS_IV_NEXT2]] = add nsw i64 [[INDVARS_IV1]], -1
+; CHECK-NEXT: [[GEP_OUTER:%.*]] = getelementptr double, ptr [[PTR]], i64 [[INDVARS_IV_NEXT2]]
+; CHECK-NEXT: store double poison, ptr [[GEP_OUTER]], align 8
+; CHECK-NEXT: [[EXIT_COND_OUTER:%.*]] = icmp samesign ugt i64 [[INDVARS_IV1]], 1
+; CHECK-NEXT: br i1 [[EXIT_COND_OUTER]], label %[[INNER_LOOP_PREHEADER:.*]], label %[[PH]]
+; CHECK: [[INNER_LOOP_PREHEADER]]:
+; CHECK-NEXT: br label %[[INNER_LOOP:.*]]
+; CHECK: [[INNER_LOOP]]:
+; CHECK-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ 0, %[[INNER_LOOP_PREHEADER]] ], [ [[INDVARS_IV_NEXT:%.*]], %[[INNER_LOOP]] ]
+; CHECK-NEXT: [[GEP_INNER:%.*]] = getelementptr double, ptr [[PTR]], i64 [[INDVARS_IV]]
+; CHECK-NEXT: store double poison, ptr [[GEP_INNER]], align 8
+; CHECK-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
+; CHECK-NEXT: [[EXIT_COND_INNER:%.*]] = icmp slt i64 [[INDVARS_IV_NEXT]], [[INDVARS_IV_NEXT2]]
+; CHECK-NEXT: br i1 [[EXIT_COND_INNER]], label %[[INNER_LOOP]], label %[[PH_LOOPEXIT]]
+; CHECK: [[EXIT:.*:]]
+; CHECK-NEXT: ret i32 0
+;
+entry:
+ br label %outer.loop
+
+ph:
+ br label %outer.loop
+
+outer.loop:
+ %iv.outer = phi i32 [ %iv.outer.1, %ph ], [ %iter.count, %entry ]
+ %iv.outer.1 = add nsw i32 %iv.outer, -1
+ %ext.outer = sext i32 %iv.outer.1 to i64
+ %gep.outer = getelementptr double, ptr %ptr, i64 %ext.outer
+ store double poison, ptr %gep.outer
+ %exit.cond.outer = icmp samesign ugt i32 %iv.outer, 1
+ br i1 %exit.cond.outer, label %inner.loop, label %ph
+
+inner.loop:
+ %iv.inner = phi i32 [ %iv.next, %inner.loop ], [ 0, %outer.loop ]
+ %ext.inner = sext i32 %iv.inner to i64
+ %gep.inner = getelementptr double, ptr %ptr, i64 %ext.inner
+ store double poison, ptr %gep.inner
+ %iv.next = add nuw nsw i32 %iv.inner, 1
+ %exit.cond.inner = icmp slt i32 %iv.next, %iv.outer.1
+ br i1 %exit.cond.inner, label %inner.loop, label %ph
+
+exit:
+ ret i32 0
+}
+
+define i32 @iv_sext_sext_sgt_lt(i32 %iter.count, ptr %ptr) {
+; CHECK-LABEL: define i32 @iv_sext_sext_sgt_lt(
+; CHECK-SAME: i32 [[ITER_COUNT:%.*]], ptr [[PTR:%.*]]) {
+; CHECK-NEXT: [[ENTRY:.*]]:
+; CHECK-NEXT: [[TMP0:%.*]] = add nsw i32 [[ITER_COUNT]], -1
+; CHECK-NEXT: [[TMP1:%.*]] = sext i32 [[ITER_COUNT]] to i64
+; CHECK-NEXT: br label %[[OUTER_LOOP:.*]]
+; CHECK: [[PH_LOOPEXIT:.*]]:
+; CHECK-NEXT: br label %[[PH:.*]]
+; CHECK: [[PH]]:
+; CHECK-NEXT: [[INDVARS_IV_NEXT2:%.*]] = add i32 [[INDVARS_IV1:%.*]], -1
+; CHECK-NEXT: br label %[[OUTER_LOOP]]
+; CHECK: [[OUTER_LOOP]]:
+; CHECK-NEXT: [[INDVARS_IV3:%.*]] = phi i64 [ [[INDVARS_IV_NEXT4:%.*]], %[[PH]] ], [ [[TMP1]], %[[ENTRY]] ]
+; CHECK-NEXT: [[INDVARS_IV1]] = phi i32 [ [[INDVARS_IV_NEXT2]], %[[PH]] ], [ [[TMP0]], %[[ENTRY]] ]
+; CHECK-NEXT: [[INDVARS_IV_NEXT4]] = add nsw i64 [[INDVARS_IV3]], -1
+; CHECK-NEXT: [[GEP_OUTER:%.*]] = getelementptr double, ptr [[PTR]], i64 [[INDVARS_IV_NEXT4]]
+; CHECK-NEXT: store double poison, ptr [[GEP_OUTER]], align 8
+; CHECK-NEXT: [[EXIT_COND_OUTER:%.*]] = icmp sgt i64 [[INDVARS_IV3]], 1
+; CHECK-NEXT: br i1 [[EXIT_COND_OUTER]], label %[[INNER_LOOP_PREHEADER:.*]], label %[[PH]]
+; CHECK: [[INNER_LOOP_PREHEADER]]:
+; CHECK-NEXT: [[WIDE_TRIP_COUNT:%.*]] = zext i32 [[INDVARS_IV1]] to i64
+; CHECK-NEXT: br label %[[INNER_LOOP:.*]]
+; CHECK: [[INNER_LOOP]]:
+; CHECK-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ 0, %[[INNER_LOOP_PREHEADER]] ], [ [[INDVARS_IV_NEXT:%.*]], %[[INNER_LOOP]] ]
+; CHECK-NEXT: [[GEP_INNER:%.*]] = getelementptr double, ptr [[PTR]], i64 [[INDVARS_IV]]
+; CHECK-NEXT: store double poison, ptr [[GEP_INNER]], align 8
+; CHECK-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
+; CHECK-NEXT: [[EXITCOND:%.*]] = icmp ne i64 [[INDVARS_IV_NEXT]], [[WIDE_TRIP_COUNT]]
+; CHECK-NEXT: br i1 [[EXITCOND]], label %[[INNER_LOOP]], label %[[...
[truncated]
LGTM. Thank you!
Proof: https://alive2.llvm.org/ce/z/NVXaeo
I think something the tests don't cover is that the RHS is sign extended rather than zero extended. (They use constant 1 where it makes no difference.)
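To illustrate why the extension kind of the RHS matters, here is a small exhaustive model that widens an 8-bit compare to 16 bits. It is a sketch under stated assumptions, not LLVM code: the helper names (`sameSign`, `narrowULT`, `wideSextULT`, `wideMixedULT`, `countMismatches`) are hypothetical, and the 8-to-16-bit widths are chosen only so that every operand pair can be enumerated.

```cpp
#include <cassert>
#include <cstdint>

// samesign promises that both operands have the same sign bit.
static bool sameSign(int8_t a, int8_t b) { return (a < 0) == (b < 0); }

// icmp ult on the narrow (8-bit) type.
static bool narrowULT(int8_t a, int8_t b) {
  return static_cast<uint8_t>(a) < static_cast<uint8_t>(b);
}

// Widened icmp ult with both operands sign-extended to 16 bits.
static bool wideSextULT(int8_t a, int8_t b) {
  auto wa = static_cast<uint16_t>(static_cast<int16_t>(a)); // sext
  auto wb = static_cast<uint16_t>(static_cast<int16_t>(b)); // sext
  return wa < wb;
}

// Widened icmp ult with the IV sign-extended but the RHS zero-extended.
static bool wideMixedULT(int8_t a, int8_t b) {
  auto wa = static_cast<uint16_t>(static_cast<int16_t>(a)); // sext
  auto wb = static_cast<uint16_t>(static_cast<uint8_t>(b)); // zext
  return wa < wb;
}

// Count same-sign operand pairs on which the widened compare disagrees
// with the narrow compare.
template <typename Wide> static int countMismatches(Wide wide) {
  int mismatches = 0;
  for (int a = -128; a <= 127; ++a)
    for (int b = -128; b <= 127; ++b) {
      auto na = static_cast<int8_t>(a);
      auto nb = static_cast<int8_t>(b);
      if (sameSign(na, nb) && narrowULT(na, nb) != wide(na, nb))
        ++mismatches;
    }
  return mismatches;
}
```

In this model, `countMismatches(wideSextULT)` is zero, while `countMismatches(wideMixedULT)` is nonzero because pairs of negative operands are ordered differently once only one side is sign-extended. A non-negative RHS such as the constant 1 extends to the same value either way, which is why tests using it cannot tell the two apart.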
0202f3b to 1348f3a
I didn't want to change the gold tests, so I've added some additional tests. Let me know if this is fine.
1348f3a to a6e387e
LGTM, though I'd suggest replacing the "store double poison" with zero stores. There doesn't seem to be a need to use poison here.
I'm also wondering why the nested loops are needed for these tests. Shouldn't it be possible to show the same behavior without them and simplify the tests?
For this patch itself, yes. I merely re-used the tests that Yingwei had provided from the original regression: for that, I'm not able to reproduce the behavior without nested loops.
Would it make sense to only keep the original test case once as motivation, but test the details of the change itself with simple tests?
Variations of the original test case are needed anyway to show that the SCEV patch doesn't regress: if you recall, the regression was reported on samesign, and sext/zext/signed/unsigned variations are needed to compare outputs. Not sure if we want a reduced test just for this patch with all the variations duplicated. The regression isn't fully squashed yet.
Okay, I don't mind landing the patch as-is.
GitHub seems to have hung on this PR. Will try opening a duplicate and merging that.
I encountered the same problem recently. Be patient and retry later :)
It is still stuck after 24h, so I don't think the issue is going to be resolved. Merging #125851, and closing this one.