[InstCombine] Infer zext nneg flag #71534
@llvm/pr-subscribers-clang @llvm/pr-subscribers-llvm-transforms
Author: Nikita Popov (nikic)
Changes
Use KnownBits to infer the nneg flag on zext instructions.
Currently we only set nneg when converting sext -> zext, but don't set it when we have a zext in the first place. If we want to use it in optimizations, we should make sure the flag inference is consistent.
Compile-time impact exists but is pretty small: http://llvm-compile-time-tracker.com/compare.php?from=8f76522a61d01cf7d70debd39418259e969bb8d6&to=20a4612687e6f2d75097e67987c1c592359b3b96&stat=instructions:u
I don't see any obvious way to avoid it (I don't think we have any existing KnownBits calculations we can reuse here.)
Patch is 88.24 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/71534.diff
47 Files Affected:
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp b/llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp
index efd18b44657e5da..08ead599d525a71 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp
@@ -1219,6 +1219,11 @@ Instruction *InstCombinerImpl::visitZExt(ZExtInst &Zext) {
     }
   }
+  if (!Zext.hasNonNeg() && isKnownNonNegative(Src, DL, 0, &AC, &Zext, &DT)) {
+    Zext.setNonNeg();
+    return &Zext;
+  }
+
   return nullptr;
 }
diff --git a/llvm/test/Transforms/InstCombine/2010-11-01-lshr-mask.ll b/llvm/test/Transforms/InstCombine/2010-11-01-lshr-mask.ll
index eda4053cf0f6988..3081baa2db281e4 100644
--- a/llvm/test/Transforms/InstCombine/2010-11-01-lshr-mask.ll
+++ b/llvm/test/Transforms/InstCombine/2010-11-01-lshr-mask.ll
@@ -8,7 +8,7 @@ define i32 @main(i32 %argc) {
; CHECK-NEXT: [[T3163:%.*]] = xor i8 [[T3151]], -1
; CHECK-NEXT: [[TMP1:%.*]] = shl i8 [[T3163]], 5
; CHECK-NEXT: [[T4127:%.*]] = and i8 [[TMP1]], 64
-; CHECK-NEXT: [[T4086:%.*]] = zext i8 [[T4127]] to i32
+; CHECK-NEXT: [[T4086:%.*]] = zext nneg i8 [[T4127]] to i32
; CHECK-NEXT: ret i32 [[T4086]]
;
%t3151 = trunc i32 %argc to i8
diff --git a/llvm/test/Transforms/InstCombine/X86/x86-vector-shifts-inseltpoison.ll b/llvm/test/Transforms/InstCombine/X86/x86-vector-shifts-inseltpoison.ll
index 8c2ba9701e72a5b..21d5723cbb82d63 100644
--- a/llvm/test/Transforms/InstCombine/X86/x86-vector-shifts-inseltpoison.ll
+++ b/llvm/test/Transforms/InstCombine/X86/x86-vector-shifts-inseltpoison.ll
@@ -2816,7 +2816,7 @@ define <8 x i32> @avx2_psrai_d_256_masked(<8 x i32> %v, i32 %a) {
define <8 x i64> @avx512_psrai_q_512_masked(<8 x i64> %v, i32 %a) {
; CHECK-LABEL: @avx512_psrai_q_512_masked(
; CHECK-NEXT: [[TMP1:%.*]] = and i32 [[A:%.*]], 63
-; CHECK-NEXT: [[TMP2:%.*]] = zext i32 [[TMP1]] to i64
+; CHECK-NEXT: [[TMP2:%.*]] = zext nneg i32 [[TMP1]] to i64
; CHECK-NEXT: [[DOTSPLATINSERT:%.*]] = insertelement <8 x i64> poison, i64 [[TMP2]], i64 0
; CHECK-NEXT: [[DOTSPLAT:%.*]] = shufflevector <8 x i64> [[DOTSPLATINSERT]], <8 x i64> poison, <8 x i32> zeroinitializer
; CHECK-NEXT: [[TMP3:%.*]] = ashr <8 x i64> [[V:%.*]], [[DOTSPLAT]]
@@ -2843,7 +2843,7 @@ define <4 x i32> @sse2_psrli_d_128_masked(<4 x i32> %v, i32 %a) {
define <4 x i64> @avx2_psrli_q_256_masked(<4 x i64> %v, i32 %a) {
; CHECK-LABEL: @avx2_psrli_q_256_masked(
; CHECK-NEXT: [[TMP1:%.*]] = and i32 [[A:%.*]], 63
-; CHECK-NEXT: [[TMP2:%.*]] = zext i32 [[TMP1]] to i64
+; CHECK-NEXT: [[TMP2:%.*]] = zext nneg i32 [[TMP1]] to i64
; CHECK-NEXT: [[DOTSPLATINSERT:%.*]] = insertelement <4 x i64> poison, i64 [[TMP2]], i64 0
; CHECK-NEXT: [[DOTSPLAT:%.*]] = shufflevector <4 x i64> [[DOTSPLATINSERT]], <4 x i64> poison, <4 x i32> zeroinitializer
; CHECK-NEXT: [[TMP3:%.*]] = lshr <4 x i64> [[V:%.*]], [[DOTSPLAT]]
@@ -2871,7 +2871,7 @@ define <32 x i16> @avx512_psrli_w_512_masked(<32 x i16> %v, i32 %a) {
define <2 x i64> @sse2_pslli_q_128_masked(<2 x i64> %v, i32 %a) {
; CHECK-LABEL: @sse2_pslli_q_128_masked(
; CHECK-NEXT: [[TMP1:%.*]] = and i32 [[A:%.*]], 63
-; CHECK-NEXT: [[TMP2:%.*]] = zext i32 [[TMP1]] to i64
+; CHECK-NEXT: [[TMP2:%.*]] = zext nneg i32 [[TMP1]] to i64
; CHECK-NEXT: [[DOTSPLATINSERT:%.*]] = insertelement <2 x i64> poison, i64 [[TMP2]], i64 0
; CHECK-NEXT: [[DOTSPLAT:%.*]] = shufflevector <2 x i64> [[DOTSPLATINSERT]], <2 x i64> poison, <2 x i32> zeroinitializer
; CHECK-NEXT: [[TMP3:%.*]] = shl <2 x i64> [[V:%.*]], [[DOTSPLAT]]
diff --git a/llvm/test/Transforms/InstCombine/X86/x86-vector-shifts.ll b/llvm/test/Transforms/InstCombine/X86/x86-vector-shifts.ll
index 63e44fda81552e6..a3b14ef2b1c1bee 100644
--- a/llvm/test/Transforms/InstCombine/X86/x86-vector-shifts.ll
+++ b/llvm/test/Transforms/InstCombine/X86/x86-vector-shifts.ll
@@ -2772,8 +2772,8 @@ define <2 x i64> @sse2_psll_q_128_masked_bitcast(<2 x i64> %v, <2 x i64> %a) {
; CHECK-NEXT: [[I:%.*]] = insertelement <4 x i32> [[M]], i32 0, i64 1
; CHECK-NEXT: [[SHAMT:%.*]] = bitcast <4 x i32> [[I]] to <2 x i64>
; CHECK-NEXT: [[TMP1:%.*]] = shufflevector <2 x i64> [[SHAMT]], <2 x i64> poison, <2 x i32> zeroinitializer
-; CHECK-NEXT: [[TMP2:%.*]] = shl <2 x i64> [[V:%.*]], [[TMP1]]
-; CHECK-NEXT: ret <2 x i64> [[TMP2]]
+; CHECK-NEXT: [[R:%.*]] = shl <2 x i64> [[V:%.*]], [[TMP1]]
+; CHECK-NEXT: ret <2 x i64> [[R]]
;
%b = bitcast <2 x i64> %a to <4 x i32>
%m = and <4 x i32> %b, <i32 31, i32 poison, i32 poison, i32 poison>
@@ -2856,7 +2856,7 @@ define <8 x i32> @avx2_psrai_d_256_masked(<8 x i32> %v, i32 %a) {
define <8 x i64> @avx512_psrai_q_512_masked(<8 x i64> %v, i32 %a) {
; CHECK-LABEL: @avx512_psrai_q_512_masked(
; CHECK-NEXT: [[TMP1:%.*]] = and i32 [[A:%.*]], 63
-; CHECK-NEXT: [[TMP2:%.*]] = zext i32 [[TMP1]] to i64
+; CHECK-NEXT: [[TMP2:%.*]] = zext nneg i32 [[TMP1]] to i64
; CHECK-NEXT: [[DOTSPLATINSERT:%.*]] = insertelement <8 x i64> poison, i64 [[TMP2]], i64 0
; CHECK-NEXT: [[DOTSPLAT:%.*]] = shufflevector <8 x i64> [[DOTSPLATINSERT]], <8 x i64> poison, <8 x i32> zeroinitializer
; CHECK-NEXT: [[TMP3:%.*]] = ashr <8 x i64> [[V:%.*]], [[DOTSPLAT]]
@@ -2883,7 +2883,7 @@ define <4 x i32> @sse2_psrli_d_128_masked(<4 x i32> %v, i32 %a) {
define <4 x i64> @avx2_psrli_q_256_masked(<4 x i64> %v, i32 %a) {
; CHECK-LABEL: @avx2_psrli_q_256_masked(
; CHECK-NEXT: [[TMP1:%.*]] = and i32 [[A:%.*]], 63
-; CHECK-NEXT: [[TMP2:%.*]] = zext i32 [[TMP1]] to i64
+; CHECK-NEXT: [[TMP2:%.*]] = zext nneg i32 [[TMP1]] to i64
; CHECK-NEXT: [[DOTSPLATINSERT:%.*]] = insertelement <4 x i64> poison, i64 [[TMP2]], i64 0
; CHECK-NEXT: [[DOTSPLAT:%.*]] = shufflevector <4 x i64> [[DOTSPLATINSERT]], <4 x i64> poison, <4 x i32> zeroinitializer
; CHECK-NEXT: [[TMP3:%.*]] = lshr <4 x i64> [[V:%.*]], [[DOTSPLAT]]
@@ -2911,7 +2911,7 @@ define <32 x i16> @avx512_psrli_w_512_masked(<32 x i16> %v, i32 %a) {
define <2 x i64> @sse2_pslli_q_128_masked(<2 x i64> %v, i32 %a) {
; CHECK-LABEL: @sse2_pslli_q_128_masked(
; CHECK-NEXT: [[TMP1:%.*]] = and i32 [[A:%.*]], 63
-; CHECK-NEXT: [[TMP2:%.*]] = zext i32 [[TMP1]] to i64
+; CHECK-NEXT: [[TMP2:%.*]] = zext nneg i32 [[TMP1]] to i64
; CHECK-NEXT: [[DOTSPLATINSERT:%.*]] = insertelement <2 x i64> poison, i64 [[TMP2]], i64 0
; CHECK-NEXT: [[DOTSPLAT:%.*]] = shufflevector <2 x i64> [[DOTSPLATINSERT]], <2 x i64> poison, <2 x i32> zeroinitializer
; CHECK-NEXT: [[TMP3:%.*]] = shl <2 x i64> [[V:%.*]], [[DOTSPLAT]]
diff --git a/llvm/test/Transforms/InstCombine/adjust-for-minmax.ll b/llvm/test/Transforms/InstCombine/adjust-for-minmax.ll
index dced55944505370..76fc7a07be6bd61 100644
--- a/llvm/test/Transforms/InstCombine/adjust-for-minmax.ll
+++ b/llvm/test/Transforms/InstCombine/adjust-for-minmax.ll
@@ -414,7 +414,7 @@ define <2 x i64> @umax_zext_vec(<2 x i32> %a) {
define i64 @umin_zext(i32 %a) {
; CHECK-LABEL: @umin_zext(
; CHECK-NEXT: [[NARROW:%.*]] = call i32 @llvm.umin.i32(i32 [[A:%.*]], i32 2)
-; CHECK-NEXT: [[MIN:%.*]] = zext i32 [[NARROW]] to i64
+; CHECK-NEXT: [[MIN:%.*]] = zext nneg i32 [[NARROW]] to i64
; CHECK-NEXT: ret i64 [[MIN]]
;
%a_ext = zext i32 %a to i64
@@ -426,7 +426,7 @@ define i64 @umin_zext(i32 %a) {
define <2 x i64> @umin_zext_vec(<2 x i32> %a) {
; CHECK-LABEL: @umin_zext_vec(
; CHECK-NEXT: [[NARROW:%.*]] = call <2 x i32> @llvm.umin.v2i32(<2 x i32> [[A:%.*]], <2 x i32> <i32 2, i32 2>)
-; CHECK-NEXT: [[MIN:%.*]] = zext <2 x i32> [[NARROW]] to <2 x i64>
+; CHECK-NEXT: [[MIN:%.*]] = zext nneg <2 x i32> [[NARROW]] to <2 x i64>
; CHECK-NEXT: ret <2 x i64> [[MIN]]
;
%a_ext = zext <2 x i32> %a to <2 x i64>
diff --git a/llvm/test/Transforms/InstCombine/and-narrow.ll b/llvm/test/Transforms/InstCombine/and-narrow.ll
index 92894090ef66d71..c8c720f5fbc5534 100644
--- a/llvm/test/Transforms/InstCombine/and-narrow.ll
+++ b/llvm/test/Transforms/InstCombine/and-narrow.ll
@@ -47,7 +47,7 @@ define i16 @zext_lshr(i8 %x) {
; CHECK-LABEL: @zext_lshr(
; CHECK-NEXT: [[TMP1:%.*]] = lshr i8 [[X:%.*]], 4
; CHECK-NEXT: [[TMP2:%.*]] = and i8 [[TMP1]], [[X]]
-; CHECK-NEXT: [[R:%.*]] = zext i8 [[TMP2]] to i16
+; CHECK-NEXT: [[R:%.*]] = zext nneg i8 [[TMP2]] to i16
; CHECK-NEXT: ret i16 [[R]]
;
%z = zext i8 %x to i16
@@ -60,7 +60,7 @@ define i16 @zext_ashr(i8 %x) {
; CHECK-LABEL: @zext_ashr(
; CHECK-NEXT: [[TMP1:%.*]] = lshr i8 [[X:%.*]], 2
; CHECK-NEXT: [[TMP2:%.*]] = and i8 [[TMP1]], [[X]]
-; CHECK-NEXT: [[R:%.*]] = zext i8 [[TMP2]] to i16
+; CHECK-NEXT: [[R:%.*]] = zext nneg i8 [[TMP2]] to i16
; CHECK-NEXT: ret i16 [[R]]
;
%z = zext i8 %x to i16
@@ -125,7 +125,7 @@ define <2 x i16> @zext_lshr_vec(<2 x i8> %x) {
; CHECK-LABEL: @zext_lshr_vec(
; CHECK-NEXT: [[TMP1:%.*]] = lshr <2 x i8> [[X:%.*]], <i8 4, i8 2>
; CHECK-NEXT: [[TMP2:%.*]] = and <2 x i8> [[TMP1]], [[X]]
-; CHECK-NEXT: [[R:%.*]] = zext <2 x i8> [[TMP2]] to <2 x i16>
+; CHECK-NEXT: [[R:%.*]] = zext nneg <2 x i8> [[TMP2]] to <2 x i16>
; CHECK-NEXT: ret <2 x i16> [[R]]
;
%z = zext <2 x i8> %x to <2 x i16>
@@ -138,7 +138,7 @@ define <2 x i16> @zext_ashr_vec(<2 x i8> %x) {
; CHECK-LABEL: @zext_ashr_vec(
; CHECK-NEXT: [[TMP1:%.*]] = lshr <2 x i8> [[X:%.*]], <i8 2, i8 3>
; CHECK-NEXT: [[TMP2:%.*]] = and <2 x i8> [[TMP1]], [[X]]
-; CHECK-NEXT: [[R:%.*]] = zext <2 x i8> [[TMP2]] to <2 x i16>
+; CHECK-NEXT: [[R:%.*]] = zext nneg <2 x i8> [[TMP2]] to <2 x i16>
; CHECK-NEXT: ret <2 x i16> [[R]]
;
%z = zext <2 x i8> %x to <2 x i16>
diff --git a/llvm/test/Transforms/InstCombine/and-xor-or.ll b/llvm/test/Transforms/InstCombine/and-xor-or.ll
index 741fc1eca65d1e9..69a7890bee22f80 100644
--- a/llvm/test/Transforms/InstCombine/and-xor-or.ll
+++ b/llvm/test/Transforms/InstCombine/and-xor-or.ll
@@ -4207,7 +4207,7 @@ define i16 @and_zext_zext(i8 %x, i4 %y) {
; CHECK-SAME: (i8 [[X:%.*]], i4 [[Y:%.*]]) {
; CHECK-NEXT: [[TMP1:%.*]] = zext i4 [[Y]] to i8
; CHECK-NEXT: [[TMP2:%.*]] = and i8 [[TMP1]], [[X]]
-; CHECK-NEXT: [[R:%.*]] = zext i8 [[TMP2]] to i16
+; CHECK-NEXT: [[R:%.*]] = zext nneg i8 [[TMP2]] to i16
; CHECK-NEXT: ret i16 [[R]]
;
%zx = zext i8 %x to i16
diff --git a/llvm/test/Transforms/InstCombine/and.ll b/llvm/test/Transforms/InstCombine/and.ll
index 95b1b0e73ea5c7a..386ee3807050140 100644
--- a/llvm/test/Transforms/InstCombine/and.ll
+++ b/llvm/test/Transforms/InstCombine/and.ll
@@ -525,7 +525,7 @@ define <2 x i32> @and_demanded_bits_splat_vec(<2 x i32> %x) {
define i32 @and_zext_demanded(i16 %x, i32 %y) {
; CHECK-LABEL: @and_zext_demanded(
; CHECK-NEXT: [[S:%.*]] = lshr i16 [[X:%.*]], 8
-; CHECK-NEXT: [[Z:%.*]] = zext i16 [[S]] to i32
+; CHECK-NEXT: [[Z:%.*]] = zext nneg i16 [[S]] to i32
; CHECK-NEXT: ret i32 [[Z]]
;
%s = lshr i16 %x, 8
@@ -618,7 +618,7 @@ define i64 @test35(i32 %X) {
; CHECK-LABEL: @test35(
; CHECK-NEXT: [[TMP1:%.*]] = sub i32 0, [[X:%.*]]
; CHECK-NEXT: [[TMP2:%.*]] = and i32 [[TMP1]], 240
-; CHECK-NEXT: [[RES:%.*]] = zext i32 [[TMP2]] to i64
+; CHECK-NEXT: [[RES:%.*]] = zext nneg i32 [[TMP2]] to i64
; CHECK-NEXT: ret i64 [[RES]]
;
%zext = zext i32 %X to i64
@@ -631,7 +631,7 @@ define <2 x i64> @test35_uniform(<2 x i32> %X) {
; CHECK-LABEL: @test35_uniform(
; CHECK-NEXT: [[TMP1:%.*]] = sub <2 x i32> zeroinitializer, [[X:%.*]]
; CHECK-NEXT: [[TMP2:%.*]] = and <2 x i32> [[TMP1]], <i32 240, i32 240>
-; CHECK-NEXT: [[RES:%.*]] = zext <2 x i32> [[TMP2]] to <2 x i64>
+; CHECK-NEXT: [[RES:%.*]] = zext nneg <2 x i32> [[TMP2]] to <2 x i64>
; CHECK-NEXT: ret <2 x i64> [[RES]]
;
%zext = zext <2 x i32> %X to <2 x i64>
@@ -644,7 +644,7 @@ define i64 @test36(i32 %X) {
; CHECK-LABEL: @test36(
; CHECK-NEXT: [[TMP1:%.*]] = add i32 [[X:%.*]], 7
; CHECK-NEXT: [[TMP2:%.*]] = and i32 [[TMP1]], 240
-; CHECK-NEXT: [[RES:%.*]] = zext i32 [[TMP2]] to i64
+; CHECK-NEXT: [[RES:%.*]] = zext nneg i32 [[TMP2]] to i64
; CHECK-NEXT: ret i64 [[RES]]
;
%zext = zext i32 %X to i64
@@ -657,7 +657,7 @@ define <2 x i64> @test36_uniform(<2 x i32> %X) {
; CHECK-LABEL: @test36_uniform(
; CHECK-NEXT: [[TMP1:%.*]] = add <2 x i32> [[X:%.*]], <i32 7, i32 7>
; CHECK-NEXT: [[TMP2:%.*]] = and <2 x i32> [[TMP1]], <i32 240, i32 240>
-; CHECK-NEXT: [[RES:%.*]] = zext <2 x i32> [[TMP2]] to <2 x i64>
+; CHECK-NEXT: [[RES:%.*]] = zext nneg <2 x i32> [[TMP2]] to <2 x i64>
; CHECK-NEXT: ret <2 x i64> [[RES]]
;
%zext = zext <2 x i32> %X to <2 x i64>
@@ -683,7 +683,7 @@ define i64 @test37(i32 %X) {
; CHECK-LABEL: @test37(
; CHECK-NEXT: [[TMP1:%.*]] = mul i32 [[X:%.*]], 7
; CHECK-NEXT: [[TMP2:%.*]] = and i32 [[TMP1]], 240
-; CHECK-NEXT: [[RES:%.*]] = zext i32 [[TMP2]] to i64
+; CHECK-NEXT: [[RES:%.*]] = zext nneg i32 [[TMP2]] to i64
; CHECK-NEXT: ret i64 [[RES]]
;
%zext = zext i32 %X to i64
@@ -696,7 +696,7 @@ define <2 x i64> @test37_uniform(<2 x i32> %X) {
; CHECK-LABEL: @test37_uniform(
; CHECK-NEXT: [[TMP1:%.*]] = mul <2 x i32> [[X:%.*]], <i32 7, i32 7>
; CHECK-NEXT: [[TMP2:%.*]] = and <2 x i32> [[TMP1]], <i32 240, i32 240>
-; CHECK-NEXT: [[RES:%.*]] = zext <2 x i32> [[TMP2]] to <2 x i64>
+; CHECK-NEXT: [[RES:%.*]] = zext nneg <2 x i32> [[TMP2]] to <2 x i64>
; CHECK-NEXT: ret <2 x i64> [[RES]]
;
%zext = zext <2 x i32> %X to <2 x i64>
@@ -721,7 +721,7 @@ define <2 x i64> @test37_nonuniform(<2 x i32> %X) {
define i64 @test38(i32 %X) {
; CHECK-LABEL: @test38(
; CHECK-NEXT: [[TMP1:%.*]] = and i32 [[X:%.*]], 240
-; CHECK-NEXT: [[RES:%.*]] = zext i32 [[TMP1]] to i64
+; CHECK-NEXT: [[RES:%.*]] = zext nneg i32 [[TMP1]] to i64
; CHECK-NEXT: ret i64 [[RES]]
;
%zext = zext i32 %X to i64
@@ -733,7 +733,7 @@ define i64 @test38(i32 %X) {
define i64 @test39(i32 %X) {
; CHECK-LABEL: @test39(
; CHECK-NEXT: [[TMP1:%.*]] = and i32 [[X:%.*]], 240
-; CHECK-NEXT: [[RES:%.*]] = zext i32 [[TMP1]] to i64
+; CHECK-NEXT: [[RES:%.*]] = zext nneg i32 [[TMP1]] to i64
; CHECK-NEXT: ret i64 [[RES]]
;
%zext = zext i32 %X to i64
diff --git a/llvm/test/Transforms/InstCombine/assoc-cast-assoc.ll b/llvm/test/Transforms/InstCombine/assoc-cast-assoc.ll
index 04b530647d0a26e..a3485978471dc05 100644
--- a/llvm/test/Transforms/InstCombine/assoc-cast-assoc.ll
+++ b/llvm/test/Transforms/InstCombine/assoc-cast-assoc.ll
@@ -54,7 +54,7 @@ define <2 x i32> @OrZextOrVec(<2 x i2> %a) {
define i5 @AndZextAnd(i3 %a) {
; CHECK-LABEL: @AndZextAnd(
; CHECK-NEXT: [[TMP1:%.*]] = and i3 [[A:%.*]], 2
-; CHECK-NEXT: [[OP2:%.*]] = zext i3 [[TMP1]] to i5
+; CHECK-NEXT: [[OP2:%.*]] = zext nneg i3 [[TMP1]] to i5
; CHECK-NEXT: ret i5 [[OP2]]
;
%op1 = and i3 %a, 3
@@ -66,7 +66,7 @@ define i5 @AndZextAnd(i3 %a) {
define <2 x i32> @AndZextAndVec(<2 x i8> %a) {
; CHECK-LABEL: @AndZextAndVec(
; CHECK-NEXT: [[TMP1:%.*]] = and <2 x i8> [[A:%.*]], <i8 5, i8 0>
-; CHECK-NEXT: [[OP2:%.*]] = zext <2 x i8> [[TMP1]] to <2 x i32>
+; CHECK-NEXT: [[OP2:%.*]] = zext nneg <2 x i8> [[TMP1]] to <2 x i32>
; CHECK-NEXT: ret <2 x i32> [[OP2]]
;
%op1 = and <2 x i8> %a, <i8 7, i8 0>
diff --git a/llvm/test/Transforms/InstCombine/binop-cast.ll b/llvm/test/Transforms/InstCombine/binop-cast.ll
index 20d5814c05d3aa7..e3345194d0b3284 100644
--- a/llvm/test/Transforms/InstCombine/binop-cast.ll
+++ b/llvm/test/Transforms/InstCombine/binop-cast.ll
@@ -276,7 +276,7 @@ define i64 @PR63321(ptr %ptr, i64 %c) {
define i64 @and_add_non_bool(ptr %ptr, i64 %c) {
; CHECK-LABEL: @and_add_non_bool(
; CHECK-NEXT: [[VAL:%.*]] = load i8, ptr [[PTR:%.*]], align 1, !range [[RNG1:![0-9]+]]
-; CHECK-NEXT: [[RHS:%.*]] = zext i8 [[VAL]] to i64
+; CHECK-NEXT: [[RHS:%.*]] = zext nneg i8 [[VAL]] to i64
; CHECK-NEXT: [[MASK:%.*]] = add nsw i64 [[RHS]], -1
; CHECK-NEXT: [[RES:%.*]] = and i64 [[MASK]], [[C:%.*]]
; CHECK-NEXT: ret i64 [[RES]]
diff --git a/llvm/test/Transforms/InstCombine/cast-mul-select.ll b/llvm/test/Transforms/InstCombine/cast-mul-select.ll
index 23e934de0baeb7e..454522b85a1e843 100644
--- a/llvm/test/Transforms/InstCombine/cast-mul-select.ll
+++ b/llvm/test/Transforms/InstCombine/cast-mul-select.ll
@@ -119,7 +119,7 @@ define i32 @eval_zext_multi_use_in_one_inst(i32 %x) {
; CHECK-NEXT: [[T:%.*]] = trunc i32 [[X:%.*]] to i16
; CHECK-NEXT: [[A:%.*]] = and i16 [[T]], 5
; CHECK-NEXT: [[M:%.*]] = mul nuw nsw i16 [[A]], [[A]]
-; CHECK-NEXT: [[R:%.*]] = zext i16 [[M]] to i32
+; CHECK-NEXT: [[R:%.*]] = zext nneg i16 [[M]] to i32
; CHECK-NEXT: ret i32 [[R]]
;
; DBGINFO-LABEL: @eval_zext_multi_use_in_one_inst(
@@ -129,7 +129,7 @@ define i32 @eval_zext_multi_use_in_one_inst(i32 %x) {
; DBGINFO-NEXT: call void @llvm.dbg.value(metadata i16 [[A]], metadata [[META66:![0-9]+]], metadata !DIExpression()), !dbg [[DBG70]]
; DBGINFO-NEXT: [[M:%.*]] = mul nuw nsw i16 [[A]], [[A]], !dbg [[DBG71:![0-9]+]]
; DBGINFO-NEXT: call void @llvm.dbg.value(metadata i16 [[M]], metadata [[META67:![0-9]+]], metadata !DIExpression()), !dbg [[DBG71]]
-; DBGINFO-NEXT: [[R:%.*]] = zext i16 [[M]] to i32, !dbg [[DBG72:![0-9]+]]
+; DBGINFO-NEXT: [[R:%.*]] = zext nneg i16 [[M]] to i32, !dbg [[DBG72:![0-9]+]]
; DBGINFO-NEXT: call void @llvm.dbg.value(metadata i32 [[R]], metadata [[META68:![0-9]+]], metadata !DIExpression()), !dbg [[DBG72]]
; DBGINFO-NEXT: ret i32 [[R]], !dbg [[DBG73:![0-9]+]]
;
@@ -183,13 +183,13 @@ define void @PR36225(i32 %a, i32 %b, i1 %c1, i3 %v1, i3 %v2) {
; CHECK-NEXT: [[TOBOOL:%.*]] = icmp eq i32 [[B:%.*]], 0
; CHECK-NEXT: [[SPEC_SELECT:%.*]] = select i1 [[TOBOOL]], i8 0, i8 4
; CHECK-NEXT: switch i3 [[V1:%.*]], label [[EXIT:%.*]] [
-; CHECK-NEXT: i3 0, label [[FOR_END:%.*]]
-; CHECK-NEXT: i3 -1, label [[FOR_END]]
+; CHECK-NEXT: i3 0, label [[FOR_END:%.*]]
+; CHECK-NEXT: i3 -1, label [[FOR_END]]
; CHECK-NEXT: ]
; CHECK: for.body3:
; CHECK-NEXT: switch i3 [[V2:%.*]], label [[EXIT]] [
-; CHECK-NEXT: i3 0, label [[FOR_END]]
-; CHECK-NEXT: i3 -1, label [[FOR_END]]
+; CHECK-NEXT: i3 0, label [[FOR_END]]
+; CHECK-NEXT: i3 -1, label [[FOR_END]]
; CHECK-NEXT: ]
; CHECK: for.end:
; CHECK-NEXT: [[H:%.*]] = phi i8 [ [[SPEC_SELECT]], [[FOR_BODY3_US]] ], [ [[SPEC_SELECT]], [[FOR_BODY3_US]] ], [ 0, [[FOR_BODY3]] ], [ 0, [[FOR_BODY3]] ]
@@ -213,13 +213,13 @@ define void @PR36225(i32 %a, i32 %b, i1 %c1, i3 %v1, i3 %v2) {
; DBGINFO-NEXT: [[SPEC_SELECT:%.*]] = select i1 [[TOBOOL]], i8 0, i8 4, !dbg [[DBG97:![0-9]+]]
; DBGINFO-NEXT: call void @llvm.dbg.value(metadata i8 [[SPEC_SELECT]], metadata [[META90:![0-9]+]], metadata !DIExpression()), !dbg [[DBG97]]
; DBGINFO-NEXT: switch i3 [[V1:%.*]], label [[EXIT:%.*]] [
-; DBGINFO-NEXT: i3 0, label [[FOR_END:%.*]]
-; DBGINFO-NEXT: i3 -1, label [[FOR_END]]
+; DBGINFO-NEXT: i3 0, label [[FOR_END:%.*]]
+; DBGINFO-NEXT: i3 -1, label [[FOR_END]]
; DBGINFO-NEXT: ], !dbg [[DBG98:![0-9]+]]
; DBGINFO: for.body3:
; DBGINFO-NEXT: switch i3 [[V2:%.*]], label [[EXIT]] [
-; DBGINFO-NEXT: i3 0, label [[FOR_END]]
-; DBGINFO-NEXT: i3 -1, label [[FOR_END]]
+; DBGINFO-NEXT: i3 0, label [[FOR_END]]
+; DBGINFO-NEXT: i3 -1, label [[FOR_END]]
; DBGINFO-NEXT: ], !dbg [[DBG99:![0-9]+]]
; DBGINFO: for.end:
; DBGINFO-NEXT: [[H:%.*]] = phi i8 [ [[SPEC_SELECT]], [[FOR_BODY3_US]] ], [ [[SPEC_SELECT]], [[FOR_BODY3_US]] ], [ 0, [[FOR_BODY3]] ], [ 0, [[FOR_BODY3]] ], !dbg [[DBG100:![0-9]+]]
diff --git a/llvm/test/Transforms/InstCombine/cast.ll b/llvm/test/Transforms/InstCombine/cast.ll
index 59e488f3f23d52a..afa7ac45e96dcb4 100644
--- a/llvm/test/Transforms/InstCombine/cast.ll
+++ b/llvm/test/Transforms/InstCombine/cast.ll
@@ -124,12 +124,12 @@ define void @test_invoke_vararg_cast(ptr %a, ptr %b) personality ptr @__gxx_pers
; ALL-LABEL: @test_invoke_vararg_cast(
; ALL-NEXT: entry:
; ALL-NEXT: invoke void (i32, ...) @varargs(i32 1, ptr [[B:%.*]], ptr [[A:%.*]])
-; ALL-NEXT: to label [[INVOKE_CONT:%.*]] unwind label [[LPAD:%.*]]
+; ALL-NEXT: to label [[INVOKE_CONT:%.*]] unwind label [[LPAD:%.*]]
; ALL: invoke.cont:
; ALL-NEXT: ret void
; ALL: lpad:
; ALL-NEXT: [[TMP0:%.*]] = landingpad { ptr, i32 }
-; ALL-NEXT: cleanup
+; ALL-NEXT: cleanup
; ALL-NEXT: ret void
;
entry:
@@ -619,7 +619,7 @@ define ...
[truncated]
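For readers skimming the test churn above: nearly every updated CHECK line follows an `and` with a small mask or a right shift, which leaves the sign bit of the zext source known to be zero. A minimal standalone sketch of that reasoning follows. `ToyKnownBits`, `knownBitsOfAndWithConstant`, and `shouldSetNonNeg` are hypothetical names for illustration only; this is not LLVM's `KnownBits` class or the actual `isKnownNonNegative` implementation.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for a known-bits record, fixed at 32 bits (hypothetical type).
struct ToyKnownBits {
  uint32_t Zero = 0; // bits known to be 0
  uint32_t One = 0;  // bits known to be 1

  // The value is non-negative when its sign bit is known to be 0.
  bool isNonNegative() const { return (Zero & 0x80000000u) != 0; }
};

// Known bits of `and X, Mask` when nothing is known about X:
// every bit cleared in Mask is known zero in the result.
ToyKnownBits knownBitsOfAndWithConstant(uint32_t Mask) {
  ToyKnownBits Known;
  Known.Zero = ~Mask;
  return Known;
}

// Mirrors the shape of the check in the patch: report a change only when the
// flag is not already set and the source is provably non-negative.
bool shouldSetNonNeg(bool AlreadyHasNonNeg, const ToyKnownBits &SrcKnown) {
  return !AlreadyHasNonNeg && SrcKnown.isNonNegative();
}

// Small self-check of the cases described above.
void checkToyInference() {
  // `and i32 %a, 63` clears the sign bit, so the zext can take nneg.
  assert(shouldSetNonNeg(false, knownBitsOfAndWithConstant(63)));
  // Flag already present: nothing to do, so no change is reported.
  assert(!shouldSetNonNeg(true, knownBitsOfAndWithConstant(63)));
  // An all-ones mask tells us nothing about the sign bit.
  assert(!shouldSetNonNeg(false, knownBitsOfAndWithConstant(0xFFFFFFFFu)));
}
```

The `!AlreadyHasNonNeg` guard matters in the sketch for the same reason `!Zext.hasNonNeg()` appears in the patch: without it the visitor would keep reporting a change for an instruction it has already updated.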
Use KnownBits to infer the nneg flag on zext instructions. (9b66984 to e965141)
LGTM. Thanks!
We can improve the compile time by adding the `nneg` flag during zext creation (e.g., `cttz(zext(x))` -> `zext nneg(cttz(x))`).
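To illustrate why a fold like this could set the flag up front, here is a small exhaustive check for the i8 -> i32 case. It is only a sketch: `countTrailingZeros` is a hand-written hypothetical helper, zero inputs are skipped because the narrow and wide counts differ there, and nothing here reflects how InstCombine actually implements the fold.

```cpp
#include <cassert>
#include <cstdint>

// Count trailing zeros of a value treated as Width bits wide; returns Width
// when no bit is set (hypothetical helper, written by hand to avoid
// depending on any particular library).
unsigned countTrailingZeros(uint32_t V, unsigned Width) {
  assert(Width <= 32);
  unsigned N = 0;
  while (N < Width && ((V >> N) & 1u) == 0)
    ++N;
  return N;
}

// For every non-zero i8 input, check that counting on the widened value
// agrees with counting on the narrow value, and that the narrow count has its
// i8 sign bit clear, so a zext of it could carry nneg from the start.
bool cttzOfZextMatchesZextOfCttz() {
  for (unsigned X = 1; X < 256; ++X) {
    unsigned Wide = countTrailingZeros(X, 32);  // cttz(zext(x))
    unsigned Narrow = countTrailingZeros(X, 8); // cttz(x), to be zero-extended
    if (Wide != Narrow || (Narrow & 0x80u) != 0)
      return false;
  }
  return true;
}
```

Because the narrow count never exceeds the bit width, the sign bit is clear by construction, so setting the flag at creation would avoid a later known-bits query for that zext.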
I think this patch causes miscompiles. Reproduce with
into
I.e., the `and` with 32767 (0x7fff) is gone and instead the zext got "nneg"?
We also have a couple of internal tests that seem to be failing after this commit. Consider the following code:
char print_tmp[1];
void print(char *, void *data, unsigned size) {
unsigned char *bytes = (unsigned char *)data;
for (unsigned i = 0; i != size; ++i)
sprintf(print_tmp + i * 2, "%02x", bytes[size - 1 - i]);
printf(print_tmp);
}
#define PRINT(VAR) print(#VAR, &VAR, sizeof(VAR))
struct {
long b : 17;
} test141_struct_id29534;
struct test141_struct_id29574_ {
test141_struct_id29574_() { INIT(172, *this); }
unsigned a : 15;
} test141_struct_id29574;
int main() {
long id29692 = test141_struct_id29534.b = test141_struct_id29574.a;
PRINT(id29692);
}
When compiled without optimizations (and before this change with optimization) it would print out the value
You can see the difference at https://godbolt.org/z/vjPvGT5G9.
Reduced test case: https://godbolt.org/z/d4ETPhbno
It looks like simplifyAssocCastAssoc() is the problematic transform. It modifies a zext in-place without clearing poison flags.
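A simplified model of the failure mode may help. The exact fold is not shown in this thread, so the shape below (an outer `and` of a `zext nneg` whose operand is itself an `and` with 0x7fff, with the inner mask merged into the outer one) is an assumption pieced together from the reports above. `zext16`, `before`, and `after` are hypothetical names, and `std::nullopt` stands in for poison.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Simplified model of `zext i16 -> i32`; std::nullopt stands in for poison.
// With NonNeg set, a source whose sign bit is set yields poison.
std::optional<uint32_t> zext16(uint16_t Src, bool NonNeg) {
  if (NonNeg && (Src & 0x8000u) != 0)
    return std::nullopt;
  return static_cast<uint32_t>(Src);
}

// Assumed original shape: and (zext nneg (and X, 0x7fff)), Mask.
// The inner mask clears the sign bit, so nneg is valid here.
std::optional<uint32_t> before(uint16_t X, uint32_t Mask) {
  auto Z = zext16(static_cast<uint16_t>(X & 0x7fffu), /*NonNeg=*/true);
  if (!Z)
    return std::nullopt;
  return *Z & Mask;
}

// After the fold re-points the existing zext at X and merges the masks.
// KeepFlag=true models reusing the instruction without clearing nneg.
std::optional<uint32_t> after(uint16_t X, uint32_t Mask, bool KeepFlag) {
  auto Z = zext16(X, /*NonNeg=*/KeepFlag);
  if (!Z)
    return std::nullopt;
  return *Z & (Mask & 0x7fffu);
}
```

In this model, `before(0x8001, 0xFF)` is well defined, `after(0x8001, 0xFF, true)` is poison, and `after(0x8001, 0xFF, false)` matches the original result. That is the general hazard with editing an instruction in place: flags that were justified by the old operand have to be dropped when the operand changes.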
Exposed by #71534 and reported there.
Should be fixed by 1b1c817.
This patch infers `nneg` flags for existing zext instructions in CVP. After #71534 and this patch, we can drop `zext -> zext nneg` transform in `RISCVCodeGenPrepare`: https://github.com/llvm/llvm-project/blob/40671bbdefb6ff83e2685576a3cb041b62f25bbe/llvm/lib/Target/RISCV/RISCVCodeGenPrepare.cpp#L74-L83 This is an alternative to #72049.
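As a rough illustration of how a range-based pass could reach the same conclusion without known bits, the sketch below assumes the pass reasons from value ranges implied by dominating conditions. That assumption, and the names `ToyRange`, `rangeIfUnsignedLessThan`, and `fullRange`, are hypothetical and not taken from the CVP implementation.

```cpp
#include <cassert>
#include <cstdint>

// Toy inclusive signed range for an i8 value (hypothetical type; a real
// range representation would also need to handle wrapping).
struct ToyRange {
  int Lo;
  int Hi;
  // Every value in the range has its sign bit clear.
  bool allNonNegative() const { return Lo >= 0; }
};

// Range of X on the path where `icmp ult i8 X, Bound` holds, for Bound in
// [1, 128]: as an unsigned value X lies in [0, Bound - 1], which is also its
// signed range because Bound - 1 <= 127.
ToyRange rangeIfUnsignedLessThan(unsigned Bound) {
  assert(Bound >= 1 && Bound <= 128);
  return ToyRange{0, static_cast<int>(Bound) - 1};
}

// With no dominating condition, an i8 may be any value in [-128, 127].
ToyRange fullRange() { return ToyRange{-128, 127}; }
```

Under these assumptions, a zext of `X` guarded by `X <u 100` could take `nneg` even though no bit of `X` is individually known, which is the kind of case a known-bits query alone would miss.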
…lvm#72053) After llvm#71534 and llvm#72052, the transform `zext -> zext nneg` in `RISCVCodeGenPrepare` is redundant.