-
Notifications
You must be signed in to change notification settings - Fork 14.3k
LAA: strip unit-stride check in getPtrStride #122879
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Strip an unnecessary unit-stride check in getPtrStride, when checking whether null pointer dereference is undefined. This patch is similar in spirit to a353e25 (Don't require Stride == 1/-1 for inbounds pointer AddRecs nowrap.), and has the effect of removing redundant SCEV checks.
@llvm/pr-subscribers-llvm-transforms Author: Ramkumar Ramachandra (artagnon) ChangesStrip an unnecessary unit-stride check in getPtrStride, when checking whether null pointer dereference is undefined. This patch is similar in spirit to a353e25 (Don't require Stride == 1/-1 for inbounds pointer AddRecs nowrap.), and has the effect of removing redundant SCEV checks. -- 8< -- Patch is 64.81 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/122879.diff 7 Files Affected:
diff --git a/llvm/lib/Analysis/LoopAccessAnalysis.cpp b/llvm/lib/Analysis/LoopAccessAnalysis.cpp
index 2a68979add666d..4122afcc1db3c8 100644
--- a/llvm/lib/Analysis/LoopAccessAnalysis.cpp
+++ b/llvm/lib/Analysis/LoopAccessAnalysis.cpp
@@ -1520,12 +1520,11 @@ llvm::getPtrStride(PredicatedScalarEvolution &PSE, Type *AccessTy, Value *Ptr,
GEP && GEP->hasNoUnsignedSignedWrap())
return Stride;
- // If the null pointer is undefined, then a access sequence which would
- // otherwise access it can be assumed not to unsigned wrap. Note that this
- // assumes the object in memory is aligned to the natural alignment.
+ // If the null pointer dereference is undefined, then a access sequence which
+ // would otherwise access it can be assumed not to unsigned wrap. Note that
+ // this assumes the object in memory is aligned to the natural alignment.
unsigned AddrSpace = Ty->getPointerAddressSpace();
- if (!NullPointerIsDefined(Lp->getHeader()->getParent(), AddrSpace) &&
- (Stride == 1 || Stride == -1))
+ if (!NullPointerIsDefined(Lp->getHeader()->getParent(), AddrSpace))
return Stride;
if (Assume) {
diff --git a/llvm/test/Analysis/LoopAccessAnalysis/nusw-predicates.ll b/llvm/test/Analysis/LoopAccessAnalysis/nusw-predicates.ll
index 5234d8f107271a..f2c1fb3e22b071 100644
--- a/llvm/test/Analysis/LoopAccessAnalysis/nusw-predicates.ll
+++ b/llvm/test/Analysis/LoopAccessAnalysis/nusw-predicates.ll
@@ -19,7 +19,6 @@ define void @int_and_pointer_predicate(ptr %v, i32 %N) {
; CHECK-NEXT: Non vectorizable stores to invariant address were not found in loop.
; CHECK-NEXT: SCEV assumptions:
; CHECK-NEXT: {0,+,1}<%loop> Added Flags: <nusw>
-; CHECK-NEXT: {%v,+,4}<%loop> Added Flags: <nusw>
; CHECK-EMPTY:
; CHECK-NEXT: Expressions re-written:
;
@@ -93,8 +92,6 @@ define void @int_and_multiple_pointer_predicates(ptr %v, ptr %w, i32 %N) {
; CHECK-NEXT: Non vectorizable stores to invariant address were not found in loop.
; CHECK-NEXT: SCEV assumptions:
; CHECK-NEXT: {0,+,1}<%loop> Added Flags: <nusw>
-; CHECK-NEXT: {%v,+,4}<%loop> Added Flags: <nusw>
-; CHECK-NEXT: {%w,+,4}<%loop> Added Flags: <nusw>
; CHECK-EMPTY:
; CHECK-NEXT: Expressions re-written:
;
diff --git a/llvm/test/Analysis/LoopAccessAnalysis/wrapping-pointer-versioning.ll b/llvm/test/Analysis/LoopAccessAnalysis/wrapping-pointer-versioning.ll
index 71c20bc2b2a824..457c85d2d7f916 100644
--- a/llvm/test/Analysis/LoopAccessAnalysis/wrapping-pointer-versioning.ll
+++ b/llvm/test/Analysis/LoopAccessAnalysis/wrapping-pointer-versioning.ll
@@ -39,7 +39,6 @@ define void @f1(ptr noalias %a, ptr noalias %b, i64 %N) {
; CHECK-NEXT: Non vectorizable stores to invariant address were not found in loop.
; CHECK-NEXT: SCEV assumptions:
; CHECK-NEXT: {0,+,2}<%for.body> Added Flags: <nusw>
-; CHECK-NEXT: {%a,+,4}<%for.body> Added Flags: <nusw>
; CHECK-EMPTY:
; CHECK-NEXT: Expressions re-written:
; CHECK-NEXT: [PSE] %arrayidxA = getelementptr i16, ptr %a, i64 %mul_ext:
@@ -115,7 +114,6 @@ define void @f2(ptr noalias %a, ptr noalias %b, i64 %N) {
; CHECK-NEXT: Non vectorizable stores to invariant address were not found in loop.
; CHECK-NEXT: SCEV assumptions:
; CHECK-NEXT: {(2 * (trunc i64 %N to i32)),+,-2}<%for.body> Added Flags: <nusw>
-; CHECK-NEXT: {((4 * (zext i31 (trunc i64 %N to i31) to i64))<nuw><nsw> + %a),+,-4}<%for.body> Added Flags: <nusw>
; CHECK-EMPTY:
; CHECK-NEXT: Expressions re-written:
; CHECK-NEXT: [PSE] %arrayidxA = getelementptr i16, ptr %a, i64 %mul_ext:
@@ -176,7 +174,6 @@ define void @f3(ptr noalias %a, ptr noalias %b, i64 %N) {
; CHECK-NEXT: Non vectorizable stores to invariant address were not found in loop.
; CHECK-NEXT: SCEV assumptions:
; CHECK-NEXT: {0,+,2}<%for.body> Added Flags: <nssw>
-; CHECK-NEXT: {%a,+,4}<%for.body> Added Flags: <nusw>
; CHECK-EMPTY:
; CHECK-NEXT: Expressions re-written:
; CHECK-NEXT: [PSE] %arrayidxA = getelementptr i16, ptr %a, i64 %mul_ext:
@@ -233,7 +230,6 @@ define void @f4(ptr noalias %a, ptr noalias %b, i64 %N) {
; CHECK-NEXT: Non vectorizable stores to invariant address were not found in loop.
; CHECK-NEXT: SCEV assumptions:
; CHECK-NEXT: {(2 * (trunc i64 %N to i32)),+,-2}<%for.body> Added Flags: <nssw>
-; CHECK-NEXT: {((2 * (sext i32 (2 * (trunc i64 %N to i32)) to i64))<nsw> + %a),+,-4}<%for.body> Added Flags: <nusw>
; CHECK-EMPTY:
; CHECK-NEXT: Expressions re-written:
; CHECK-NEXT: [PSE] %arrayidxA = getelementptr i16, ptr %a, i64 %mul_ext:
diff --git a/llvm/test/Transforms/LoopVectorize/AArch64/conditional-branches-cost.ll b/llvm/test/Transforms/LoopVectorize/AArch64/conditional-branches-cost.ll
index caa98d766a8c34..e6f1099f12b39b 100644
--- a/llvm/test/Transforms/LoopVectorize/AArch64/conditional-branches-cost.ll
+++ b/llvm/test/Transforms/LoopVectorize/AArch64/conditional-branches-cost.ll
@@ -1120,35 +1120,8 @@ define void @test_conditional_interleave_group (ptr noalias %src.1, ptr noalias
; DEFAULT-SAME: ptr noalias [[SRC_1:%.*]], ptr noalias [[SRC_2:%.*]], ptr noalias [[SRC_3:%.*]], ptr noalias [[SRC_4:%.*]], ptr noalias [[DST:%.*]], i64 [[N:%.*]]) #[[ATTR3:[0-9]+]] {
; DEFAULT-NEXT: entry:
; DEFAULT-NEXT: [[TMP0:%.*]] = add i64 [[N]], 1
-; DEFAULT-NEXT: [[MIN_ITERS_CHECK:%.*]] = icmp ult i64 [[TMP0]], 32
-; DEFAULT-NEXT: br i1 [[MIN_ITERS_CHECK]], label [[SCALAR_PH:%.*]], label [[VECTOR_SCEVCHECK:%.*]]
-; DEFAULT: vector.scevcheck:
-; DEFAULT-NEXT: [[SCEVGEP:%.*]] = getelementptr i8, ptr [[DST]], i64 4
-; DEFAULT-NEXT: [[MUL:%.*]] = call { i64, i1 } @llvm.umul.with.overflow.i64(i64 16, i64 [[N]])
-; DEFAULT-NEXT: [[MUL_RESULT:%.*]] = extractvalue { i64, i1 } [[MUL]], 0
-; DEFAULT-NEXT: [[MUL_OVERFLOW:%.*]] = extractvalue { i64, i1 } [[MUL]], 1
-; DEFAULT-NEXT: [[TMP1:%.*]] = sub i64 0, [[MUL_RESULT]]
-; DEFAULT-NEXT: [[TMP2:%.*]] = getelementptr i8, ptr [[SCEVGEP]], i64 [[MUL_RESULT]]
-; DEFAULT-NEXT: [[TMP3:%.*]] = icmp ult ptr [[TMP2]], [[SCEVGEP]]
-; DEFAULT-NEXT: [[TMP4:%.*]] = or i1 [[TMP3]], [[MUL_OVERFLOW]]
-; DEFAULT-NEXT: [[SCEVGEP1:%.*]] = getelementptr i8, ptr [[DST]], i64 8
-; DEFAULT-NEXT: [[MUL2:%.*]] = call { i64, i1 } @llvm.umul.with.overflow.i64(i64 16, i64 [[N]])
-; DEFAULT-NEXT: [[MUL_RESULT3:%.*]] = extractvalue { i64, i1 } [[MUL2]], 0
-; DEFAULT-NEXT: [[MUL_OVERFLOW4:%.*]] = extractvalue { i64, i1 } [[MUL2]], 1
-; DEFAULT-NEXT: [[TMP5:%.*]] = sub i64 0, [[MUL_RESULT3]]
-; DEFAULT-NEXT: [[TMP6:%.*]] = getelementptr i8, ptr [[SCEVGEP1]], i64 [[MUL_RESULT3]]
-; DEFAULT-NEXT: [[TMP7:%.*]] = icmp ult ptr [[TMP6]], [[SCEVGEP1]]
-; DEFAULT-NEXT: [[TMP8:%.*]] = or i1 [[TMP7]], [[MUL_OVERFLOW4]]
-; DEFAULT-NEXT: [[MUL5:%.*]] = call { i64, i1 } @llvm.umul.with.overflow.i64(i64 16, i64 [[N]])
-; DEFAULT-NEXT: [[MUL_RESULT6:%.*]] = extractvalue { i64, i1 } [[MUL5]], 0
-; DEFAULT-NEXT: [[MUL_OVERFLOW7:%.*]] = extractvalue { i64, i1 } [[MUL5]], 1
-; DEFAULT-NEXT: [[TMP9:%.*]] = sub i64 0, [[MUL_RESULT6]]
-; DEFAULT-NEXT: [[TMP10:%.*]] = getelementptr i8, ptr [[DST]], i64 [[MUL_RESULT6]]
-; DEFAULT-NEXT: [[TMP11:%.*]] = icmp ult ptr [[TMP10]], [[DST]]
-; DEFAULT-NEXT: [[TMP12:%.*]] = or i1 [[TMP11]], [[MUL_OVERFLOW7]]
-; DEFAULT-NEXT: [[TMP13:%.*]] = or i1 [[TMP4]], [[TMP8]]
-; DEFAULT-NEXT: [[TMP14:%.*]] = or i1 [[TMP13]], [[TMP12]]
-; DEFAULT-NEXT: br i1 [[TMP14]], label [[SCALAR_PH]], label [[VECTOR_PH:%.*]]
+; DEFAULT-NEXT: [[TMP14:%.*]] = icmp ult i64 [[TMP0]], 8
+; DEFAULT-NEXT: br i1 [[TMP14]], label [[SCALAR_PH:%.*]], label [[VECTOR_PH:%.*]]
; DEFAULT: vector.ph:
; DEFAULT-NEXT: [[N_MOD_VF:%.*]] = urem i64 [[TMP0]], 8
; DEFAULT-NEXT: [[N_VEC:%.*]] = sub i64 [[TMP0]], [[N_MOD_VF]]
@@ -1190,7 +1163,7 @@ define void @test_conditional_interleave_group (ptr noalias %src.1, ptr noalias
; DEFAULT: pred.store.continue:
; DEFAULT-NEXT: [[TMP31:%.*]] = extractelement <8 x i1> [[TMP22]], i32 1
; DEFAULT-NEXT: br i1 [[TMP31]], label [[PRED_STORE_IF14:%.*]], label [[PRED_STORE_CONTINUE15:%.*]]
-; DEFAULT: pred.store.if14:
+; DEFAULT: pred.store.if7:
; DEFAULT-NEXT: [[TMP32:%.*]] = extractelement <8 x ptr> [[TMP23]], i32 1
; DEFAULT-NEXT: store float 0.000000e+00, ptr [[TMP32]], align 4
; DEFAULT-NEXT: [[TMP33:%.*]] = extractelement <8 x ptr> [[TMP23]], i32 1
@@ -1202,10 +1175,10 @@ define void @test_conditional_interleave_group (ptr noalias %src.1, ptr noalias
; DEFAULT-NEXT: [[TMP37:%.*]] = extractelement <8 x ptr> [[TMP23]], i32 1
; DEFAULT-NEXT: store float 0.000000e+00, ptr [[TMP37]], align 4
; DEFAULT-NEXT: br label [[PRED_STORE_CONTINUE15]]
-; DEFAULT: pred.store.continue15:
+; DEFAULT: pred.store.continue8:
; DEFAULT-NEXT: [[TMP38:%.*]] = extractelement <8 x i1> [[TMP22]], i32 2
; DEFAULT-NEXT: br i1 [[TMP38]], label [[PRED_STORE_IF16:%.*]], label [[PRED_STORE_CONTINUE17:%.*]]
-; DEFAULT: pred.store.if16:
+; DEFAULT: pred.store.if9:
; DEFAULT-NEXT: [[TMP39:%.*]] = extractelement <8 x ptr> [[TMP23]], i32 2
; DEFAULT-NEXT: store float 0.000000e+00, ptr [[TMP39]], align 4
; DEFAULT-NEXT: [[TMP40:%.*]] = extractelement <8 x ptr> [[TMP23]], i32 2
@@ -1217,10 +1190,10 @@ define void @test_conditional_interleave_group (ptr noalias %src.1, ptr noalias
; DEFAULT-NEXT: [[TMP44:%.*]] = extractelement <8 x ptr> [[TMP23]], i32 2
; DEFAULT-NEXT: store float 0.000000e+00, ptr [[TMP44]], align 4
; DEFAULT-NEXT: br label [[PRED_STORE_CONTINUE17]]
-; DEFAULT: pred.store.continue17:
+; DEFAULT: pred.store.continue10:
; DEFAULT-NEXT: [[TMP45:%.*]] = extractelement <8 x i1> [[TMP22]], i32 3
; DEFAULT-NEXT: br i1 [[TMP45]], label [[PRED_STORE_IF18:%.*]], label [[PRED_STORE_CONTINUE19:%.*]]
-; DEFAULT: pred.store.if18:
+; DEFAULT: pred.store.if11:
; DEFAULT-NEXT: [[TMP46:%.*]] = extractelement <8 x ptr> [[TMP23]], i32 3
; DEFAULT-NEXT: store float 0.000000e+00, ptr [[TMP46]], align 4
; DEFAULT-NEXT: [[TMP47:%.*]] = extractelement <8 x ptr> [[TMP23]], i32 3
@@ -1232,10 +1205,10 @@ define void @test_conditional_interleave_group (ptr noalias %src.1, ptr noalias
; DEFAULT-NEXT: [[TMP51:%.*]] = extractelement <8 x ptr> [[TMP23]], i32 3
; DEFAULT-NEXT: store float 0.000000e+00, ptr [[TMP51]], align 4
; DEFAULT-NEXT: br label [[PRED_STORE_CONTINUE19]]
-; DEFAULT: pred.store.continue19:
+; DEFAULT: pred.store.continue12:
; DEFAULT-NEXT: [[TMP52:%.*]] = extractelement <8 x i1> [[TMP22]], i32 4
; DEFAULT-NEXT: br i1 [[TMP52]], label [[PRED_STORE_IF20:%.*]], label [[PRED_STORE_CONTINUE21:%.*]]
-; DEFAULT: pred.store.if20:
+; DEFAULT: pred.store.if13:
; DEFAULT-NEXT: [[TMP53:%.*]] = extractelement <8 x ptr> [[TMP23]], i32 4
; DEFAULT-NEXT: store float 0.000000e+00, ptr [[TMP53]], align 4
; DEFAULT-NEXT: [[TMP54:%.*]] = extractelement <8 x ptr> [[TMP23]], i32 4
@@ -1247,10 +1220,10 @@ define void @test_conditional_interleave_group (ptr noalias %src.1, ptr noalias
; DEFAULT-NEXT: [[TMP58:%.*]] = extractelement <8 x ptr> [[TMP23]], i32 4
; DEFAULT-NEXT: store float 0.000000e+00, ptr [[TMP58]], align 4
; DEFAULT-NEXT: br label [[PRED_STORE_CONTINUE21]]
-; DEFAULT: pred.store.continue21:
+; DEFAULT: pred.store.continue14:
; DEFAULT-NEXT: [[TMP59:%.*]] = extractelement <8 x i1> [[TMP22]], i32 5
; DEFAULT-NEXT: br i1 [[TMP59]], label [[PRED_STORE_IF22:%.*]], label [[PRED_STORE_CONTINUE23:%.*]]
-; DEFAULT: pred.store.if22:
+; DEFAULT: pred.store.if15:
; DEFAULT-NEXT: [[TMP60:%.*]] = extractelement <8 x ptr> [[TMP23]], i32 5
; DEFAULT-NEXT: store float 0.000000e+00, ptr [[TMP60]], align 4
; DEFAULT-NEXT: [[TMP61:%.*]] = extractelement <8 x ptr> [[TMP23]], i32 5
@@ -1262,10 +1235,10 @@ define void @test_conditional_interleave_group (ptr noalias %src.1, ptr noalias
; DEFAULT-NEXT: [[TMP65:%.*]] = extractelement <8 x ptr> [[TMP23]], i32 5
; DEFAULT-NEXT: store float 0.000000e+00, ptr [[TMP65]], align 4
; DEFAULT-NEXT: br label [[PRED_STORE_CONTINUE23]]
-; DEFAULT: pred.store.continue23:
+; DEFAULT: pred.store.continue16:
; DEFAULT-NEXT: [[TMP66:%.*]] = extractelement <8 x i1> [[TMP22]], i32 6
; DEFAULT-NEXT: br i1 [[TMP66]], label [[PRED_STORE_IF24:%.*]], label [[PRED_STORE_CONTINUE25:%.*]]
-; DEFAULT: pred.store.if24:
+; DEFAULT: pred.store.if17:
; DEFAULT-NEXT: [[TMP67:%.*]] = extractelement <8 x ptr> [[TMP23]], i32 6
; DEFAULT-NEXT: store float 0.000000e+00, ptr [[TMP67]], align 4
; DEFAULT-NEXT: [[TMP68:%.*]] = extractelement <8 x ptr> [[TMP23]], i32 6
@@ -1277,10 +1250,10 @@ define void @test_conditional_interleave_group (ptr noalias %src.1, ptr noalias
; DEFAULT-NEXT: [[TMP72:%.*]] = extractelement <8 x ptr> [[TMP23]], i32 6
; DEFAULT-NEXT: store float 0.000000e+00, ptr [[TMP72]], align 4
; DEFAULT-NEXT: br label [[PRED_STORE_CONTINUE25]]
-; DEFAULT: pred.store.continue25:
+; DEFAULT: pred.store.continue18:
; DEFAULT-NEXT: [[TMP73:%.*]] = extractelement <8 x i1> [[TMP22]], i32 7
; DEFAULT-NEXT: br i1 [[TMP73]], label [[PRED_STORE_IF26:%.*]], label [[PRED_STORE_CONTINUE27]]
-; DEFAULT: pred.store.if26:
+; DEFAULT: pred.store.if19:
; DEFAULT-NEXT: [[TMP74:%.*]] = extractelement <8 x ptr> [[TMP23]], i32 7
; DEFAULT-NEXT: store float 0.000000e+00, ptr [[TMP74]], align 4
; DEFAULT-NEXT: [[TMP75:%.*]] = extractelement <8 x ptr> [[TMP23]], i32 7
@@ -1292,7 +1265,7 @@ define void @test_conditional_interleave_group (ptr noalias %src.1, ptr noalias
; DEFAULT-NEXT: [[TMP79:%.*]] = extractelement <8 x ptr> [[TMP23]], i32 7
; DEFAULT-NEXT: store float 0.000000e+00, ptr [[TMP79]], align 4
; DEFAULT-NEXT: br label [[PRED_STORE_CONTINUE27]]
-; DEFAULT: pred.store.continue27:
+; DEFAULT: pred.store.continue20:
; DEFAULT-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 8
; DEFAULT-NEXT: [[VEC_IND_NEXT]] = add <8 x i64> [[VEC_IND]], splat (i64 8)
; DEFAULT-NEXT: [[TMP80:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
@@ -1301,7 +1274,7 @@ define void @test_conditional_interleave_group (ptr noalias %src.1, ptr noalias
; DEFAULT-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[TMP0]], [[N_VEC]]
; DEFAULT-NEXT: br i1 [[CMP_N]], label [[EXIT:%.*]], label [[SCALAR_PH]]
; DEFAULT: scalar.ph:
-; DEFAULT-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[VECTOR_SCEVCHECK]] ], [ 0, [[ENTRY:%.*]] ]
+; DEFAULT-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ]
; DEFAULT-NEXT: br label [[LOOP_HEADER:%.*]]
; DEFAULT: loop.header:
; DEFAULT-NEXT: [[IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[IV_NEXT:%.*]], [[LOOP_LATCH:%.*]] ]
@@ -1336,33 +1309,6 @@ define void @test_conditional_interleave_group (ptr noalias %src.1, ptr noalias
; PRED-NEXT: entry:
; PRED-NEXT: [[TMP0:%.*]] = add i64 [[N]], 1
; PRED-NEXT: br i1 false, label [[SCALAR_PH:%.*]], label [[VECTOR_SCEVCHECK:%.*]]
-; PRED: vector.scevcheck:
-; PRED-NEXT: [[SCEVGEP:%.*]] = getelementptr i8, ptr [[DST]], i64 4
-; PRED-NEXT: [[MUL:%.*]] = call { i64, i1 } @llvm.umul.with.overflow.i64(i64 16, i64 [[N]])
-; PRED-NEXT: [[MUL_RESULT:%.*]] = extractvalue { i64, i1 } [[MUL]], 0
-; PRED-NEXT: [[MUL_OVERFLOW:%.*]] = extractvalue { i64, i1 } [[MUL]], 1
-; PRED-NEXT: [[TMP1:%.*]] = sub i64 0, [[MUL_RESULT]]
-; PRED-NEXT: [[TMP2:%.*]] = getelementptr i8, ptr [[SCEVGEP]], i64 [[MUL_RESULT]]
-; PRED-NEXT: [[TMP3:%.*]] = icmp ult ptr [[TMP2]], [[SCEVGEP]]
-; PRED-NEXT: [[TMP4:%.*]] = or i1 [[TMP3]], [[MUL_OVERFLOW]]
-; PRED-NEXT: [[SCEVGEP1:%.*]] = getelementptr i8, ptr [[DST]], i64 8
-; PRED-NEXT: [[MUL2:%.*]] = call { i64, i1 } @llvm.umul.with.overflow.i64(i64 16, i64 [[N]])
-; PRED-NEXT: [[MUL_RESULT3:%.*]] = extractvalue { i64, i1 } [[MUL2]], 0
-; PRED-NEXT: [[MUL_OVERFLOW4:%.*]] = extractvalue { i64, i1 } [[MUL2]], 1
-; PRED-NEXT: [[TMP5:%.*]] = sub i64 0, [[MUL_RESULT3]]
-; PRED-NEXT: [[TMP6:%.*]] = getelementptr i8, ptr [[SCEVGEP1]], i64 [[MUL_RESULT3]]
-; PRED-NEXT: [[TMP7:%.*]] = icmp ult ptr [[TMP6]], [[SCEVGEP1]]
-; PRED-NEXT: [[TMP8:%.*]] = or i1 [[TMP7]], [[MUL_OVERFLOW4]]
-; PRED-NEXT: [[MUL5:%.*]] = call { i64, i1 } @llvm.umul.with.overflow.i64(i64 16, i64 [[N]])
-; PRED-NEXT: [[MUL_RESULT6:%.*]] = extractvalue { i64, i1 } [[MUL5]], 0
-; PRED-NEXT: [[MUL_OVERFLOW7:%.*]] = extractvalue { i64, i1 } [[MUL5]], 1
-; PRED-NEXT: [[TMP9:%.*]] = sub i64 0, [[MUL_RESULT6]]
-; PRED-NEXT: [[TMP10:%.*]] = getelementptr i8, ptr [[DST]], i64 [[MUL_RESULT6]]
-; PRED-NEXT: [[TMP11:%.*]] = icmp ult ptr [[TMP10]], [[DST]]
-; PRED-NEXT: [[TMP12:%.*]] = or i1 [[TMP11]], [[MUL_OVERFLOW7]]
-; PRED-NEXT: [[TMP13:%.*]] = or i1 [[TMP4]], [[TMP8]]
-; PRED-NEXT: [[TMP14:%.*]] = or i1 [[TMP13]], [[TMP12]]
-; PRED-NEXT: br i1 [[TMP14]], label [[SCALAR_PH]], label [[VECTOR_PH:%.*]]
; PRED: vector.ph:
; PRED-NEXT: [[N_RND_UP:%.*]] = add i64 [[TMP0]], 7
; PRED-NEXT: [[N_MOD_VF:%.*]] = urem i64 [[N_RND_UP]], 8
@@ -1373,9 +1319,9 @@ define void @test_conditional_interleave_group (ptr noalias %src.1, ptr noalias
; PRED-NEXT: [[ACTIVE_LANE_MASK_ENTRY:%.*]] = call <8 x i1> @llvm.get.active.lane.mask.v8i1.i64(i64 0, i64 [[TMP0]])
; PRED-NEXT: br label [[VECTOR_BODY:%.*]]
; PRED: vector.body:
-; PRED-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[PRED_STORE_CONTINUE27:%.*]] ]
-; PRED-NEXT: [[ACTIVE_LANE_MASK:%.*]] = phi <8 x i1> [ [[ACTIVE_LANE_MASK_ENTRY]], [[VECTOR_PH]] ], [ [[ACTIVE_LANE_MASK_NEXT:%.*]], [[PRED_STORE_CONTINUE27]] ]
-; PRED-NEXT: [[VEC_IND:%.*]] = phi <8 x i64> [ <i64 0, i64 1, i64 2, i64 3, i64 4, i64 5, i64 6, i64 7>, [[VECTOR_PH]] ], [ [[VEC_IND_NEXT:%.*]], [[PRED_STORE_CONTINUE27]] ]
+; PRED-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_SCEVCHECK]] ], [ [[INDEX_NEXT:%.*]], [[PRED_STORE_CONTINUE27:%.*]] ]
+; PRED-NEXT: [[ACTIVE_LANE_MASK:%.*]] = phi <8 x i1> [ [[ACTIVE_LANE_MASK_ENTRY]], [[VECTOR_SCEVCHECK]] ], [ [[ACTIVE_LANE_MASK_NEXT:%.*]], [[PRED_STORE_CONTINUE27]] ]
+; PRED-NEXT: [[VEC_IND:%.*]] = phi <8 x i64> [ <i64 0, i64 1, i64 2, i64 3, i64 4, i64 5, i64 6, i64 7>, [[VECTOR_SCEVCHECK]] ], [ [[VEC_IND_NEXT:%.*]], [[PRED_STORE_CONTINUE27]] ]
; PRED-NEXT: [[TMP18:%.*]] = load float, ptr [[SRC_1]], align 4
; PRED-NEXT: [[BROADCAST_SPLATINSERT8:%.*]] = insertelement <8 x float> poison, float [[TMP18]], i64 0
; PRED-NEXT: [[BROADCAST_SPLAT9:%.*]] = shufflevector <8 x float> [[BROADCAST_SPLATINSERT8]], <8 x float> poison, <8 x i32> zeroinitializer
@@ -1411,7 +1357,7 @@ define void @test_conditional_interleave_group (ptr noalias %src.1, ptr noalias
; PRED: pred.store.continue:
; PRED-NEXT: [[TMP35:%.*]] = extractelement <8 x i1> [[TMP26]], i32 1
; PRED-NEXT: br i1 [[TMP35]], label [[PRED_STORE_IF14:%.*]], label [[PRED_STORE_CONTINUE15:%.*]]
-; PRED: pred.store.if14:
+; PRED: pred.store.if7:
; PRED-NEXT: [[TMP36:%.*]] = extractelement <8 x ptr> [[TMP27]], i32 1
; PRED-NEXT: store float 0.000000e+00, ptr [[TMP36]], align 4
; PRED-NEXT: [[TMP37:%.*]] = extractelement <8 x ptr> [[TMP27]], i32 1
@@ -1423,10 +1369,10 @@ define void @test_conditional_interleave_group (ptr noalias %src.1, ptr noalias
; PRED-NEXT: [[TMP41:%.*]] = extractelement <8 x ptr> [[TMP27]], i32 1
; PRED-NEXT: store float 0.000000e+00, ptr [[TMP41]], align 4
; PRED-NEXT: br label [[PRED_STORE_CONTINUE15]]
-; PRED: pred.store.continue15:
+; PRED: pred.store.continue8:
; PRED-NEXT: [[TMP42:%.*]] = extractelement <8 x i1> [[TMP26]], i32 2
; PRED-NEXT: br i1 [[TMP42]], label [[PRED_STORE_IF16:%.*]], label [[PRED_STORE_CONTINUE17:%.*]]
-; PRED: pred.store.if16:
+; PRED: pred.store.if9:
; PRED-NEXT: [[TMP43:%.*]] = extractelement <8 x ptr> [[TMP27]], i32 2
; PRED-NEXT: store float 0.000000e+00, ptr [[TMP43]], align 4
; PRED-NEXT: [[TMP44:%.*]] = extractelement <8 x ptr> [[TMP27]], i32 2
@@ -1438,10 +1384,10 @@ define void @test_conditional_interleave_group (ptr noalias %src.1, ptr noalias
; PRED-NEXT: [[TMP48:%.*]] = extractelem...
[truncated]
|
@llvm/pr-subscribers-llvm-analysis Author: Ramkumar Ramachandra (artagnon) ChangesStrip an unnecessary unit-stride check in getPtrStride, when checking whether null pointer dereference is undefined. This patch is similar in spirit to a353e25 (Don't require Stride == 1/-1 for inbounds pointer AddRecs nowrap.), and has the effect of removing redundant SCEV checks. -- 8< -- Patch is 64.81 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/122879.diff 7 Files Affected:
diff --git a/llvm/lib/Analysis/LoopAccessAnalysis.cpp b/llvm/lib/Analysis/LoopAccessAnalysis.cpp
index 2a68979add666d..4122afcc1db3c8 100644
--- a/llvm/lib/Analysis/LoopAccessAnalysis.cpp
+++ b/llvm/lib/Analysis/LoopAccessAnalysis.cpp
@@ -1520,12 +1520,11 @@ llvm::getPtrStride(PredicatedScalarEvolution &PSE, Type *AccessTy, Value *Ptr,
GEP && GEP->hasNoUnsignedSignedWrap())
return Stride;
- // If the null pointer is undefined, then a access sequence which would
- // otherwise access it can be assumed not to unsigned wrap. Note that this
- // assumes the object in memory is aligned to the natural alignment.
+ // If the null pointer dereference is undefined, then a access sequence which
+ // would otherwise access it can be assumed not to unsigned wrap. Note that
+ // this assumes the object in memory is aligned to the natural alignment.
unsigned AddrSpace = Ty->getPointerAddressSpace();
- if (!NullPointerIsDefined(Lp->getHeader()->getParent(), AddrSpace) &&
- (Stride == 1 || Stride == -1))
+ if (!NullPointerIsDefined(Lp->getHeader()->getParent(), AddrSpace))
return Stride;
if (Assume) {
diff --git a/llvm/test/Analysis/LoopAccessAnalysis/nusw-predicates.ll b/llvm/test/Analysis/LoopAccessAnalysis/nusw-predicates.ll
index 5234d8f107271a..f2c1fb3e22b071 100644
--- a/llvm/test/Analysis/LoopAccessAnalysis/nusw-predicates.ll
+++ b/llvm/test/Analysis/LoopAccessAnalysis/nusw-predicates.ll
@@ -19,7 +19,6 @@ define void @int_and_pointer_predicate(ptr %v, i32 %N) {
; CHECK-NEXT: Non vectorizable stores to invariant address were not found in loop.
; CHECK-NEXT: SCEV assumptions:
; CHECK-NEXT: {0,+,1}<%loop> Added Flags: <nusw>
-; CHECK-NEXT: {%v,+,4}<%loop> Added Flags: <nusw>
; CHECK-EMPTY:
; CHECK-NEXT: Expressions re-written:
;
@@ -93,8 +92,6 @@ define void @int_and_multiple_pointer_predicates(ptr %v, ptr %w, i32 %N) {
; CHECK-NEXT: Non vectorizable stores to invariant address were not found in loop.
; CHECK-NEXT: SCEV assumptions:
; CHECK-NEXT: {0,+,1}<%loop> Added Flags: <nusw>
-; CHECK-NEXT: {%v,+,4}<%loop> Added Flags: <nusw>
-; CHECK-NEXT: {%w,+,4}<%loop> Added Flags: <nusw>
; CHECK-EMPTY:
; CHECK-NEXT: Expressions re-written:
;
diff --git a/llvm/test/Analysis/LoopAccessAnalysis/wrapping-pointer-versioning.ll b/llvm/test/Analysis/LoopAccessAnalysis/wrapping-pointer-versioning.ll
index 71c20bc2b2a824..457c85d2d7f916 100644
--- a/llvm/test/Analysis/LoopAccessAnalysis/wrapping-pointer-versioning.ll
+++ b/llvm/test/Analysis/LoopAccessAnalysis/wrapping-pointer-versioning.ll
@@ -39,7 +39,6 @@ define void @f1(ptr noalias %a, ptr noalias %b, i64 %N) {
; CHECK-NEXT: Non vectorizable stores to invariant address were not found in loop.
; CHECK-NEXT: SCEV assumptions:
; CHECK-NEXT: {0,+,2}<%for.body> Added Flags: <nusw>
-; CHECK-NEXT: {%a,+,4}<%for.body> Added Flags: <nusw>
; CHECK-EMPTY:
; CHECK-NEXT: Expressions re-written:
; CHECK-NEXT: [PSE] %arrayidxA = getelementptr i16, ptr %a, i64 %mul_ext:
@@ -115,7 +114,6 @@ define void @f2(ptr noalias %a, ptr noalias %b, i64 %N) {
; CHECK-NEXT: Non vectorizable stores to invariant address were not found in loop.
; CHECK-NEXT: SCEV assumptions:
; CHECK-NEXT: {(2 * (trunc i64 %N to i32)),+,-2}<%for.body> Added Flags: <nusw>
-; CHECK-NEXT: {((4 * (zext i31 (trunc i64 %N to i31) to i64))<nuw><nsw> + %a),+,-4}<%for.body> Added Flags: <nusw>
; CHECK-EMPTY:
; CHECK-NEXT: Expressions re-written:
; CHECK-NEXT: [PSE] %arrayidxA = getelementptr i16, ptr %a, i64 %mul_ext:
@@ -176,7 +174,6 @@ define void @f3(ptr noalias %a, ptr noalias %b, i64 %N) {
; CHECK-NEXT: Non vectorizable stores to invariant address were not found in loop.
; CHECK-NEXT: SCEV assumptions:
; CHECK-NEXT: {0,+,2}<%for.body> Added Flags: <nssw>
-; CHECK-NEXT: {%a,+,4}<%for.body> Added Flags: <nusw>
; CHECK-EMPTY:
; CHECK-NEXT: Expressions re-written:
; CHECK-NEXT: [PSE] %arrayidxA = getelementptr i16, ptr %a, i64 %mul_ext:
@@ -233,7 +230,6 @@ define void @f4(ptr noalias %a, ptr noalias %b, i64 %N) {
; CHECK-NEXT: Non vectorizable stores to invariant address were not found in loop.
; CHECK-NEXT: SCEV assumptions:
; CHECK-NEXT: {(2 * (trunc i64 %N to i32)),+,-2}<%for.body> Added Flags: <nssw>
-; CHECK-NEXT: {((2 * (sext i32 (2 * (trunc i64 %N to i32)) to i64))<nsw> + %a),+,-4}<%for.body> Added Flags: <nusw>
; CHECK-EMPTY:
; CHECK-NEXT: Expressions re-written:
; CHECK-NEXT: [PSE] %arrayidxA = getelementptr i16, ptr %a, i64 %mul_ext:
diff --git a/llvm/test/Transforms/LoopVectorize/AArch64/conditional-branches-cost.ll b/llvm/test/Transforms/LoopVectorize/AArch64/conditional-branches-cost.ll
index caa98d766a8c34..e6f1099f12b39b 100644
--- a/llvm/test/Transforms/LoopVectorize/AArch64/conditional-branches-cost.ll
+++ b/llvm/test/Transforms/LoopVectorize/AArch64/conditional-branches-cost.ll
@@ -1120,35 +1120,8 @@ define void @test_conditional_interleave_group (ptr noalias %src.1, ptr noalias
; DEFAULT-SAME: ptr noalias [[SRC_1:%.*]], ptr noalias [[SRC_2:%.*]], ptr noalias [[SRC_3:%.*]], ptr noalias [[SRC_4:%.*]], ptr noalias [[DST:%.*]], i64 [[N:%.*]]) #[[ATTR3:[0-9]+]] {
; DEFAULT-NEXT: entry:
; DEFAULT-NEXT: [[TMP0:%.*]] = add i64 [[N]], 1
-; DEFAULT-NEXT: [[MIN_ITERS_CHECK:%.*]] = icmp ult i64 [[TMP0]], 32
-; DEFAULT-NEXT: br i1 [[MIN_ITERS_CHECK]], label [[SCALAR_PH:%.*]], label [[VECTOR_SCEVCHECK:%.*]]
-; DEFAULT: vector.scevcheck:
-; DEFAULT-NEXT: [[SCEVGEP:%.*]] = getelementptr i8, ptr [[DST]], i64 4
-; DEFAULT-NEXT: [[MUL:%.*]] = call { i64, i1 } @llvm.umul.with.overflow.i64(i64 16, i64 [[N]])
-; DEFAULT-NEXT: [[MUL_RESULT:%.*]] = extractvalue { i64, i1 } [[MUL]], 0
-; DEFAULT-NEXT: [[MUL_OVERFLOW:%.*]] = extractvalue { i64, i1 } [[MUL]], 1
-; DEFAULT-NEXT: [[TMP1:%.*]] = sub i64 0, [[MUL_RESULT]]
-; DEFAULT-NEXT: [[TMP2:%.*]] = getelementptr i8, ptr [[SCEVGEP]], i64 [[MUL_RESULT]]
-; DEFAULT-NEXT: [[TMP3:%.*]] = icmp ult ptr [[TMP2]], [[SCEVGEP]]
-; DEFAULT-NEXT: [[TMP4:%.*]] = or i1 [[TMP3]], [[MUL_OVERFLOW]]
-; DEFAULT-NEXT: [[SCEVGEP1:%.*]] = getelementptr i8, ptr [[DST]], i64 8
-; DEFAULT-NEXT: [[MUL2:%.*]] = call { i64, i1 } @llvm.umul.with.overflow.i64(i64 16, i64 [[N]])
-; DEFAULT-NEXT: [[MUL_RESULT3:%.*]] = extractvalue { i64, i1 } [[MUL2]], 0
-; DEFAULT-NEXT: [[MUL_OVERFLOW4:%.*]] = extractvalue { i64, i1 } [[MUL2]], 1
-; DEFAULT-NEXT: [[TMP5:%.*]] = sub i64 0, [[MUL_RESULT3]]
-; DEFAULT-NEXT: [[TMP6:%.*]] = getelementptr i8, ptr [[SCEVGEP1]], i64 [[MUL_RESULT3]]
-; DEFAULT-NEXT: [[TMP7:%.*]] = icmp ult ptr [[TMP6]], [[SCEVGEP1]]
-; DEFAULT-NEXT: [[TMP8:%.*]] = or i1 [[TMP7]], [[MUL_OVERFLOW4]]
-; DEFAULT-NEXT: [[MUL5:%.*]] = call { i64, i1 } @llvm.umul.with.overflow.i64(i64 16, i64 [[N]])
-; DEFAULT-NEXT: [[MUL_RESULT6:%.*]] = extractvalue { i64, i1 } [[MUL5]], 0
-; DEFAULT-NEXT: [[MUL_OVERFLOW7:%.*]] = extractvalue { i64, i1 } [[MUL5]], 1
-; DEFAULT-NEXT: [[TMP9:%.*]] = sub i64 0, [[MUL_RESULT6]]
-; DEFAULT-NEXT: [[TMP10:%.*]] = getelementptr i8, ptr [[DST]], i64 [[MUL_RESULT6]]
-; DEFAULT-NEXT: [[TMP11:%.*]] = icmp ult ptr [[TMP10]], [[DST]]
-; DEFAULT-NEXT: [[TMP12:%.*]] = or i1 [[TMP11]], [[MUL_OVERFLOW7]]
-; DEFAULT-NEXT: [[TMP13:%.*]] = or i1 [[TMP4]], [[TMP8]]
-; DEFAULT-NEXT: [[TMP14:%.*]] = or i1 [[TMP13]], [[TMP12]]
-; DEFAULT-NEXT: br i1 [[TMP14]], label [[SCALAR_PH]], label [[VECTOR_PH:%.*]]
+; DEFAULT-NEXT: [[TMP14:%.*]] = icmp ult i64 [[TMP0]], 8
+; DEFAULT-NEXT: br i1 [[TMP14]], label [[SCALAR_PH:%.*]], label [[VECTOR_PH:%.*]]
; DEFAULT: vector.ph:
; DEFAULT-NEXT: [[N_MOD_VF:%.*]] = urem i64 [[TMP0]], 8
; DEFAULT-NEXT: [[N_VEC:%.*]] = sub i64 [[TMP0]], [[N_MOD_VF]]
@@ -1190,7 +1163,7 @@ define void @test_conditional_interleave_group (ptr noalias %src.1, ptr noalias
; DEFAULT: pred.store.continue:
; DEFAULT-NEXT: [[TMP31:%.*]] = extractelement <8 x i1> [[TMP22]], i32 1
; DEFAULT-NEXT: br i1 [[TMP31]], label [[PRED_STORE_IF14:%.*]], label [[PRED_STORE_CONTINUE15:%.*]]
-; DEFAULT: pred.store.if14:
+; DEFAULT: pred.store.if7:
; DEFAULT-NEXT: [[TMP32:%.*]] = extractelement <8 x ptr> [[TMP23]], i32 1
; DEFAULT-NEXT: store float 0.000000e+00, ptr [[TMP32]], align 4
; DEFAULT-NEXT: [[TMP33:%.*]] = extractelement <8 x ptr> [[TMP23]], i32 1
@@ -1202,10 +1175,10 @@ define void @test_conditional_interleave_group (ptr noalias %src.1, ptr noalias
; DEFAULT-NEXT: [[TMP37:%.*]] = extractelement <8 x ptr> [[TMP23]], i32 1
; DEFAULT-NEXT: store float 0.000000e+00, ptr [[TMP37]], align 4
; DEFAULT-NEXT: br label [[PRED_STORE_CONTINUE15]]
-; DEFAULT: pred.store.continue15:
+; DEFAULT: pred.store.continue8:
; DEFAULT-NEXT: [[TMP38:%.*]] = extractelement <8 x i1> [[TMP22]], i32 2
; DEFAULT-NEXT: br i1 [[TMP38]], label [[PRED_STORE_IF16:%.*]], label [[PRED_STORE_CONTINUE17:%.*]]
-; DEFAULT: pred.store.if16:
+; DEFAULT: pred.store.if9:
; DEFAULT-NEXT: [[TMP39:%.*]] = extractelement <8 x ptr> [[TMP23]], i32 2
; DEFAULT-NEXT: store float 0.000000e+00, ptr [[TMP39]], align 4
; DEFAULT-NEXT: [[TMP40:%.*]] = extractelement <8 x ptr> [[TMP23]], i32 2
@@ -1217,10 +1190,10 @@ define void @test_conditional_interleave_group (ptr noalias %src.1, ptr noalias
; DEFAULT-NEXT: [[TMP44:%.*]] = extractelement <8 x ptr> [[TMP23]], i32 2
; DEFAULT-NEXT: store float 0.000000e+00, ptr [[TMP44]], align 4
; DEFAULT-NEXT: br label [[PRED_STORE_CONTINUE17]]
-; DEFAULT: pred.store.continue17:
+; DEFAULT: pred.store.continue10:
; DEFAULT-NEXT: [[TMP45:%.*]] = extractelement <8 x i1> [[TMP22]], i32 3
; DEFAULT-NEXT: br i1 [[TMP45]], label [[PRED_STORE_IF18:%.*]], label [[PRED_STORE_CONTINUE19:%.*]]
-; DEFAULT: pred.store.if18:
+; DEFAULT: pred.store.if11:
; DEFAULT-NEXT: [[TMP46:%.*]] = extractelement <8 x ptr> [[TMP23]], i32 3
; DEFAULT-NEXT: store float 0.000000e+00, ptr [[TMP46]], align 4
; DEFAULT-NEXT: [[TMP47:%.*]] = extractelement <8 x ptr> [[TMP23]], i32 3
@@ -1232,10 +1205,10 @@ define void @test_conditional_interleave_group (ptr noalias %src.1, ptr noalias
; DEFAULT-NEXT: [[TMP51:%.*]] = extractelement <8 x ptr> [[TMP23]], i32 3
; DEFAULT-NEXT: store float 0.000000e+00, ptr [[TMP51]], align 4
; DEFAULT-NEXT: br label [[PRED_STORE_CONTINUE19]]
-; DEFAULT: pred.store.continue19:
+; DEFAULT: pred.store.continue12:
; DEFAULT-NEXT: [[TMP52:%.*]] = extractelement <8 x i1> [[TMP22]], i32 4
; DEFAULT-NEXT: br i1 [[TMP52]], label [[PRED_STORE_IF20:%.*]], label [[PRED_STORE_CONTINUE21:%.*]]
-; DEFAULT: pred.store.if20:
+; DEFAULT: pred.store.if13:
; DEFAULT-NEXT: [[TMP53:%.*]] = extractelement <8 x ptr> [[TMP23]], i32 4
; DEFAULT-NEXT: store float 0.000000e+00, ptr [[TMP53]], align 4
; DEFAULT-NEXT: [[TMP54:%.*]] = extractelement <8 x ptr> [[TMP23]], i32 4
@@ -1247,10 +1220,10 @@ define void @test_conditional_interleave_group (ptr noalias %src.1, ptr noalias
; DEFAULT-NEXT: [[TMP58:%.*]] = extractelement <8 x ptr> [[TMP23]], i32 4
; DEFAULT-NEXT: store float 0.000000e+00, ptr [[TMP58]], align 4
; DEFAULT-NEXT: br label [[PRED_STORE_CONTINUE21]]
-; DEFAULT: pred.store.continue21:
+; DEFAULT: pred.store.continue14:
; DEFAULT-NEXT: [[TMP59:%.*]] = extractelement <8 x i1> [[TMP22]], i32 5
; DEFAULT-NEXT: br i1 [[TMP59]], label [[PRED_STORE_IF22:%.*]], label [[PRED_STORE_CONTINUE23:%.*]]
-; DEFAULT: pred.store.if22:
+; DEFAULT: pred.store.if15:
; DEFAULT-NEXT: [[TMP60:%.*]] = extractelement <8 x ptr> [[TMP23]], i32 5
; DEFAULT-NEXT: store float 0.000000e+00, ptr [[TMP60]], align 4
; DEFAULT-NEXT: [[TMP61:%.*]] = extractelement <8 x ptr> [[TMP23]], i32 5
@@ -1262,10 +1235,10 @@ define void @test_conditional_interleave_group (ptr noalias %src.1, ptr noalias
; DEFAULT-NEXT: [[TMP65:%.*]] = extractelement <8 x ptr> [[TMP23]], i32 5
; DEFAULT-NEXT: store float 0.000000e+00, ptr [[TMP65]], align 4
; DEFAULT-NEXT: br label [[PRED_STORE_CONTINUE23]]
-; DEFAULT: pred.store.continue23:
+; DEFAULT: pred.store.continue16:
; DEFAULT-NEXT: [[TMP66:%.*]] = extractelement <8 x i1> [[TMP22]], i32 6
; DEFAULT-NEXT: br i1 [[TMP66]], label [[PRED_STORE_IF24:%.*]], label [[PRED_STORE_CONTINUE25:%.*]]
-; DEFAULT: pred.store.if24:
+; DEFAULT: pred.store.if17:
; DEFAULT-NEXT: [[TMP67:%.*]] = extractelement <8 x ptr> [[TMP23]], i32 6
; DEFAULT-NEXT: store float 0.000000e+00, ptr [[TMP67]], align 4
; DEFAULT-NEXT: [[TMP68:%.*]] = extractelement <8 x ptr> [[TMP23]], i32 6
@@ -1277,10 +1250,10 @@ define void @test_conditional_interleave_group (ptr noalias %src.1, ptr noalias
; DEFAULT-NEXT: [[TMP72:%.*]] = extractelement <8 x ptr> [[TMP23]], i32 6
; DEFAULT-NEXT: store float 0.000000e+00, ptr [[TMP72]], align 4
; DEFAULT-NEXT: br label [[PRED_STORE_CONTINUE25]]
-; DEFAULT: pred.store.continue25:
+; DEFAULT: pred.store.continue18:
; DEFAULT-NEXT: [[TMP73:%.*]] = extractelement <8 x i1> [[TMP22]], i32 7
; DEFAULT-NEXT: br i1 [[TMP73]], label [[PRED_STORE_IF26:%.*]], label [[PRED_STORE_CONTINUE27]]
-; DEFAULT: pred.store.if26:
+; DEFAULT: pred.store.if19:
; DEFAULT-NEXT: [[TMP74:%.*]] = extractelement <8 x ptr> [[TMP23]], i32 7
; DEFAULT-NEXT: store float 0.000000e+00, ptr [[TMP74]], align 4
; DEFAULT-NEXT: [[TMP75:%.*]] = extractelement <8 x ptr> [[TMP23]], i32 7
@@ -1292,7 +1265,7 @@ define void @test_conditional_interleave_group (ptr noalias %src.1, ptr noalias
; DEFAULT-NEXT: [[TMP79:%.*]] = extractelement <8 x ptr> [[TMP23]], i32 7
; DEFAULT-NEXT: store float 0.000000e+00, ptr [[TMP79]], align 4
; DEFAULT-NEXT: br label [[PRED_STORE_CONTINUE27]]
-; DEFAULT: pred.store.continue27:
+; DEFAULT: pred.store.continue20:
; DEFAULT-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 8
; DEFAULT-NEXT: [[VEC_IND_NEXT]] = add <8 x i64> [[VEC_IND]], splat (i64 8)
; DEFAULT-NEXT: [[TMP80:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
@@ -1301,7 +1274,7 @@ define void @test_conditional_interleave_group (ptr noalias %src.1, ptr noalias
; DEFAULT-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[TMP0]], [[N_VEC]]
; DEFAULT-NEXT: br i1 [[CMP_N]], label [[EXIT:%.*]], label [[SCALAR_PH]]
; DEFAULT: scalar.ph:
-; DEFAULT-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[VECTOR_SCEVCHECK]] ], [ 0, [[ENTRY:%.*]] ]
+; DEFAULT-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ]
; DEFAULT-NEXT: br label [[LOOP_HEADER:%.*]]
; DEFAULT: loop.header:
; DEFAULT-NEXT: [[IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[IV_NEXT:%.*]], [[LOOP_LATCH:%.*]] ]
@@ -1336,33 +1309,6 @@ define void @test_conditional_interleave_group (ptr noalias %src.1, ptr noalias
; PRED-NEXT: entry:
; PRED-NEXT: [[TMP0:%.*]] = add i64 [[N]], 1
; PRED-NEXT: br i1 false, label [[SCALAR_PH:%.*]], label [[VECTOR_SCEVCHECK:%.*]]
-; PRED: vector.scevcheck:
-; PRED-NEXT: [[SCEVGEP:%.*]] = getelementptr i8, ptr [[DST]], i64 4
-; PRED-NEXT: [[MUL:%.*]] = call { i64, i1 } @llvm.umul.with.overflow.i64(i64 16, i64 [[N]])
-; PRED-NEXT: [[MUL_RESULT:%.*]] = extractvalue { i64, i1 } [[MUL]], 0
-; PRED-NEXT: [[MUL_OVERFLOW:%.*]] = extractvalue { i64, i1 } [[MUL]], 1
-; PRED-NEXT: [[TMP1:%.*]] = sub i64 0, [[MUL_RESULT]]
-; PRED-NEXT: [[TMP2:%.*]] = getelementptr i8, ptr [[SCEVGEP]], i64 [[MUL_RESULT]]
-; PRED-NEXT: [[TMP3:%.*]] = icmp ult ptr [[TMP2]], [[SCEVGEP]]
-; PRED-NEXT: [[TMP4:%.*]] = or i1 [[TMP3]], [[MUL_OVERFLOW]]
-; PRED-NEXT: [[SCEVGEP1:%.*]] = getelementptr i8, ptr [[DST]], i64 8
-; PRED-NEXT: [[MUL2:%.*]] = call { i64, i1 } @llvm.umul.with.overflow.i64(i64 16, i64 [[N]])
-; PRED-NEXT: [[MUL_RESULT3:%.*]] = extractvalue { i64, i1 } [[MUL2]], 0
-; PRED-NEXT: [[MUL_OVERFLOW4:%.*]] = extractvalue { i64, i1 } [[MUL2]], 1
-; PRED-NEXT: [[TMP5:%.*]] = sub i64 0, [[MUL_RESULT3]]
-; PRED-NEXT: [[TMP6:%.*]] = getelementptr i8, ptr [[SCEVGEP1]], i64 [[MUL_RESULT3]]
-; PRED-NEXT: [[TMP7:%.*]] = icmp ult ptr [[TMP6]], [[SCEVGEP1]]
-; PRED-NEXT: [[TMP8:%.*]] = or i1 [[TMP7]], [[MUL_OVERFLOW4]]
-; PRED-NEXT: [[MUL5:%.*]] = call { i64, i1 } @llvm.umul.with.overflow.i64(i64 16, i64 [[N]])
-; PRED-NEXT: [[MUL_RESULT6:%.*]] = extractvalue { i64, i1 } [[MUL5]], 0
-; PRED-NEXT: [[MUL_OVERFLOW7:%.*]] = extractvalue { i64, i1 } [[MUL5]], 1
-; PRED-NEXT: [[TMP9:%.*]] = sub i64 0, [[MUL_RESULT6]]
-; PRED-NEXT: [[TMP10:%.*]] = getelementptr i8, ptr [[DST]], i64 [[MUL_RESULT6]]
-; PRED-NEXT: [[TMP11:%.*]] = icmp ult ptr [[TMP10]], [[DST]]
-; PRED-NEXT: [[TMP12:%.*]] = or i1 [[TMP11]], [[MUL_OVERFLOW7]]
-; PRED-NEXT: [[TMP13:%.*]] = or i1 [[TMP4]], [[TMP8]]
-; PRED-NEXT: [[TMP14:%.*]] = or i1 [[TMP13]], [[TMP12]]
-; PRED-NEXT: br i1 [[TMP14]], label [[SCALAR_PH]], label [[VECTOR_PH:%.*]]
; PRED: vector.ph:
; PRED-NEXT: [[N_RND_UP:%.*]] = add i64 [[TMP0]], 7
; PRED-NEXT: [[N_MOD_VF:%.*]] = urem i64 [[N_RND_UP]], 8
@@ -1373,9 +1319,9 @@ define void @test_conditional_interleave_group (ptr noalias %src.1, ptr noalias
; PRED-NEXT: [[ACTIVE_LANE_MASK_ENTRY:%.*]] = call <8 x i1> @llvm.get.active.lane.mask.v8i1.i64(i64 0, i64 [[TMP0]])
; PRED-NEXT: br label [[VECTOR_BODY:%.*]]
; PRED: vector.body:
-; PRED-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[PRED_STORE_CONTINUE27:%.*]] ]
-; PRED-NEXT: [[ACTIVE_LANE_MASK:%.*]] = phi <8 x i1> [ [[ACTIVE_LANE_MASK_ENTRY]], [[VECTOR_PH]] ], [ [[ACTIVE_LANE_MASK_NEXT:%.*]], [[PRED_STORE_CONTINUE27]] ]
-; PRED-NEXT: [[VEC_IND:%.*]] = phi <8 x i64> [ <i64 0, i64 1, i64 2, i64 3, i64 4, i64 5, i64 6, i64 7>, [[VECTOR_PH]] ], [ [[VEC_IND_NEXT:%.*]], [[PRED_STORE_CONTINUE27]] ]
+; PRED-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_SCEVCHECK]] ], [ [[INDEX_NEXT:%.*]], [[PRED_STORE_CONTINUE27:%.*]] ]
+; PRED-NEXT: [[ACTIVE_LANE_MASK:%.*]] = phi <8 x i1> [ [[ACTIVE_LANE_MASK_ENTRY]], [[VECTOR_SCEVCHECK]] ], [ [[ACTIVE_LANE_MASK_NEXT:%.*]], [[PRED_STORE_CONTINUE27]] ]
+; PRED-NEXT: [[VEC_IND:%.*]] = phi <8 x i64> [ <i64 0, i64 1, i64 2, i64 3, i64 4, i64 5, i64 6, i64 7>, [[VECTOR_SCEVCHECK]] ], [ [[VEC_IND_NEXT:%.*]], [[PRED_STORE_CONTINUE27]] ]
; PRED-NEXT: [[TMP18:%.*]] = load float, ptr [[SRC_1]], align 4
; PRED-NEXT: [[BROADCAST_SPLATINSERT8:%.*]] = insertelement <8 x float> poison, float [[TMP18]], i64 0
; PRED-NEXT: [[BROADCAST_SPLAT9:%.*]] = shufflevector <8 x float> [[BROADCAST_SPLATINSERT8]], <8 x float> poison, <8 x i32> zeroinitializer
@@ -1411,7 +1357,7 @@ define void @test_conditional_interleave_group (ptr noalias %src.1, ptr noalias
; PRED: pred.store.continue:
; PRED-NEXT: [[TMP35:%.*]] = extractelement <8 x i1> [[TMP26]], i32 1
; PRED-NEXT: br i1 [[TMP35]], label [[PRED_STORE_IF14:%.*]], label [[PRED_STORE_CONTINUE15:%.*]]
-; PRED: pred.store.if14:
+; PRED: pred.store.if7:
; PRED-NEXT: [[TMP36:%.*]] = extractelement <8 x ptr> [[TMP27]], i32 1
; PRED-NEXT: store float 0.000000e+00, ptr [[TMP36]], align 4
; PRED-NEXT: [[TMP37:%.*]] = extractelement <8 x ptr> [[TMP27]], i32 1
@@ -1423,10 +1369,10 @@ define void @test_conditional_interleave_group (ptr noalias %src.1, ptr noalias
; PRED-NEXT: [[TMP41:%.*]] = extractelement <8 x ptr> [[TMP27]], i32 1
; PRED-NEXT: store float 0.000000e+00, ptr [[TMP41]], align 4
; PRED-NEXT: br label [[PRED_STORE_CONTINUE15]]
-; PRED: pred.store.continue15:
+; PRED: pred.store.continue8:
; PRED-NEXT: [[TMP42:%.*]] = extractelement <8 x i1> [[TMP26]], i32 2
; PRED-NEXT: br i1 [[TMP42]], label [[PRED_STORE_IF16:%.*]], label [[PRED_STORE_CONTINUE17:%.*]]
-; PRED: pred.store.if16:
+; PRED: pred.store.if9:
; PRED-NEXT: [[TMP43:%.*]] = extractelement <8 x ptr> [[TMP27]], i32 2
; PRED-NEXT: store float 0.000000e+00, ptr [[TMP43]], align 4
; PRED-NEXT: [[TMP44:%.*]] = extractelement <8 x ptr> [[TMP27]], i32 2
@@ -1438,10 +1384,10 @@ define void @test_conditional_interleave_group (ptr noalias %src.1, ptr noalias
; PRED-NEXT: [[TMP48:%.*]] = extractelem...
[truncated]
|
You can test this locally with the following command:git diff -U0 --pickaxe-regex -S '([^a-zA-Z0-9#_-]undef[^a-zA-Z0-9_-]|UndefValue::get)' a94f08174c0312bca0ff6405640eb8a3ff986084 f9f1af11e76355c43afba50ca2b296d48fd9d820 llvm/lib/Analysis/LoopAccessAnalysis.cpp llvm/test/Analysis/LoopAccessAnalysis/nusw-predicates.ll llvm/test/Analysis/LoopAccessAnalysis/wrapping-pointer-versioning.ll llvm/test/Transforms/LoopVectorize/AArch64/conditional-branches-cost.ll llvm/test/Transforms/LoopVectorize/X86/interleave-cost.ll llvm/test/Transforms/LoopVersioning/incorrect-phi.ll llvm/test/Transforms/LoopVersioning/wrapping-pointer-versioning.ll The following files introduce new uses of undef:
Undef is now deprecated and should only be used in the rare cases where no replacement is possible. For example, a load of uninitialized memory yields In tests, avoid using For example, this is considered a bad practice: define void @fn() {
...
br i1 undef, ...
} Please use the following instead: define void @fn(i1 %cond) {
...
br i1 %cond, ...
} Please refer to the Undefined Behavior Manual for more information. |
unsigned AddrSpace = Ty->getPointerAddressSpace(); | ||
if (!NullPointerIsDefined(Lp->getHeader()->getParent(), AddrSpace) && | ||
(Stride == 1 || Stride == -1)) | ||
if (!NullPointerIsDefined(Lp->getHeader()->getParent(), AddrSpace)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure we can apply the same reasoning as for a353e25 here.
If the stride is != 1,-1, we could wrap but never reach null I think
Strip an unnecessary unit-stride check in getPtrStride, when checking whether null pointer dereference is undefined. This patch is similar in spirit to a353e25 (Don't require Stride == 1/-1 for inbounds pointer AddRecs nowrap.), and has the effect of removing redundant SCEV checks.
-- 8< --
Requires rebase after #122876 is landed.