[InstCombine] Infer zext nneg flag #71534

nikic · 2023-11-07T13:28:43Z

Use KnownBits to infer the nneg flag on zext instructions.

Currently we only set nneg when converting sext -> zext, but don't set it when we have a zext in the first place. If we want to use it in optimizations, we should make sure the flag inference is consistent.

There is a compile-time impact, but it is pretty small: http://llvm-compile-time-tracker.com/compare.php?from=8f76522a61d01cf7d70debd39418259e969bb8d6&to=20a4612687e6f2d75097e67987c1c592359b3b96&stat=instructions:u

I don't see any obvious way to avoid it (I don't think we have any existing KnownBits calculations we can reuse here).
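As a rough illustration of the idea (this is not the InstCombine code, just a self-contained toy model), the sketch below tracks known-zero and known-one bits for a 16-bit value and reports whether the sign bit is known to be zero. That is the condition under which a zext of the value could carry nneg. The names `ToyKnownBits16`, `knownBitsOfAnd`, `signBitKnownZero` and `canInferNNeg` are hypothetical and exist only for this example.

```cpp
#include <cstdint>

// Toy known-bits record for a 16-bit value (hypothetical, not llvm::KnownBits).
// A set bit in `zero` means that bit is known to be 0; a set bit in `one`
// means that bit is known to be 1.
struct ToyKnownBits16 {
  uint16_t zero;
  uint16_t one;
};

// Known bits of `x & mask` when nothing is known about x: every bit that is
// cleared in the constant mask is known to be zero in the result.
ToyKnownBits16 knownBitsOfAnd(uint16_t mask) {
  return ToyKnownBits16{static_cast<uint16_t>(~mask), 0};
}

// The value is known non-negative when its sign bit (bit 15) is known zero.
bool signBitKnownZero(const ToyKnownBits16 &k) {
  return (k.zero & 0x8000u) != 0;
}

// Would `zext (and x, mask)` qualify for nneg in this toy model?
bool canInferNNeg(uint16_t mask) {
  return signBitKnownZero(knownBitsOfAnd(mask));
}
```

In this model, masks that clear the top bit, such as 63, 240 or 0x7fff, qualify, while a mask that keeps the top bit (for example 0xffff) does not.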

llvmbot · 2023-11-07T13:29:18Z

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-llvm-transforms

Author: Nikita Popov (nikic)

Patch is 88.24 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/71534.diff

47 Files Affected:

(modified) llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp (+5)
(modified) llvm/test/Transforms/InstCombine/2010-11-01-lshr-mask.ll (+1-1)
(modified) llvm/test/Transforms/InstCombine/X86/x86-vector-shifts-inseltpoison.ll (+3-3)
(modified) llvm/test/Transforms/InstCombine/X86/x86-vector-shifts.ll (+5-5)
(modified) llvm/test/Transforms/InstCombine/adjust-for-minmax.ll (+2-2)
(modified) llvm/test/Transforms/InstCombine/and-narrow.ll (+4-4)
(modified) llvm/test/Transforms/InstCombine/and-xor-or.ll (+1-1)
(modified) llvm/test/Transforms/InstCombine/and.ll (+9-9)
(modified) llvm/test/Transforms/InstCombine/assoc-cast-assoc.ll (+2-2)
(modified) llvm/test/Transforms/InstCombine/binop-cast.ll (+1-1)
(modified) llvm/test/Transforms/InstCombine/cast-mul-select.ll (+10-10)
(modified) llvm/test/Transforms/InstCombine/cast.ll (+9-9)
(modified) llvm/test/Transforms/InstCombine/ctpop.ll (+2-2)
(modified) llvm/test/Transforms/InstCombine/cttz.ll (+4-4)
(modified) llvm/test/Transforms/InstCombine/fmul.ll (+1-1)
(modified) llvm/test/Transforms/InstCombine/freeze.ll (+13-13)
(modified) llvm/test/Transforms/InstCombine/load-bitcast-select.ll (+1-1)
(modified) llvm/test/Transforms/InstCombine/lshr.ll (+5-5)
(modified) llvm/test/Transforms/InstCombine/minmax-fold.ll (+1-1)
(modified) llvm/test/Transforms/InstCombine/minmax-intrinsics.ll (+1-1)
(modified) llvm/test/Transforms/InstCombine/narrow-math.ll (+5-5)
(modified) llvm/test/Transforms/InstCombine/negated-bitmask.ll (+2-2)
(modified) llvm/test/Transforms/InstCombine/overflow-mul.ll (+1-1)
(modified) llvm/test/Transforms/InstCombine/reduction-add-sext-zext-i1.ll (+3-3)
(modified) llvm/test/Transforms/InstCombine/reduction-xor-sext-zext-i1.ll (+2-2)
(modified) llvm/test/Transforms/InstCombine/rem.ll (+1-1)
(modified) llvm/test/Transforms/InstCombine/select-bitext-bitwise-ops.ll (+4-4)
(modified) llvm/test/Transforms/InstCombine/select-bitext.ll (+2-2)
(modified) llvm/test/Transforms/InstCombine/select-cmp-cttz-ctlz.ll (+9-9)
(modified) llvm/test/Transforms/InstCombine/select-ctlz-to-cttz.ll (+2-2)
(modified) llvm/test/Transforms/InstCombine/select-obo-peo-ops.ll (+8-8)
(modified) llvm/test/Transforms/InstCombine/select-with-bitwise-ops.ll (+4-4)
(modified) llvm/test/Transforms/InstCombine/shift-amount-reassociation-in-bittest-with-truncation-shl.ll (+4-4)
(modified) llvm/test/Transforms/InstCombine/shift.ll (+1-1)
(modified) llvm/test/Transforms/InstCombine/trunc-inseltpoison.ll (+10-10)
(modified) llvm/test/Transforms/InstCombine/trunc.ll (+10-10)
(modified) llvm/test/Transforms/InstCombine/udiv-simplify.ll (+1-1)
(modified) llvm/test/Transforms/InstCombine/udivrem-change-width.ll (+11-11)
(modified) llvm/test/Transforms/InstCombine/vector-casts-inseltpoison.ll (+1-1)
(modified) llvm/test/Transforms/InstCombine/vector-casts.ll (+1-1)
(modified) llvm/test/Transforms/InstCombine/wcslen-1.ll (+1-1)
(modified) llvm/test/Transforms/InstCombine/wcslen-3.ll (+1-1)
(modified) llvm/test/Transforms/InstCombine/zeroext-and-reduce.ll (+1-1)
(modified) llvm/test/Transforms/InstCombine/zext-or-icmp.ll (+5-5)
(modified) llvm/test/Transforms/InstCombine/zext.ll (+1-1)
(modified) llvm/test/Transforms/LoopVectorize/ARM/mve-reductions.ll (+2-2)
(modified) llvm/test/Transforms/LoopVectorize/reduction-inloop.ll (+5-5)

diff --git a/llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp b/llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp
index efd18b44657e5da..08ead599d525a71 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp
@@ -1219,6 +1219,11 @@ Instruction *InstCombinerImpl::visitZExt(ZExtInst &Zext) {
     }
   }
 
+  if (!Zext.hasNonNeg() && isKnownNonNegative(Src, DL, 0, &AC, &Zext, &DT)) {
+    Zext.setNonNeg();
+    return &Zext;
+  }
+
   return nullptr;
 }
 
diff --git a/llvm/test/Transforms/InstCombine/2010-11-01-lshr-mask.ll b/llvm/test/Transforms/InstCombine/2010-11-01-lshr-mask.ll
index eda4053cf0f6988..3081baa2db281e4 100644
--- a/llvm/test/Transforms/InstCombine/2010-11-01-lshr-mask.ll
+++ b/llvm/test/Transforms/InstCombine/2010-11-01-lshr-mask.ll
@@ -8,7 +8,7 @@ define i32 @main(i32 %argc) {
 ; CHECK-NEXT:    [[T3163:%.*]] = xor i8 [[T3151]], -1
 ; CHECK-NEXT:    [[TMP1:%.*]] = shl i8 [[T3163]], 5
 ; CHECK-NEXT:    [[T4127:%.*]] = and i8 [[TMP1]], 64
-; CHECK-NEXT:    [[T4086:%.*]] = zext i8 [[T4127]] to i32
+; CHECK-NEXT:    [[T4086:%.*]] = zext nneg i8 [[T4127]] to i32
 ; CHECK-NEXT:    ret i32 [[T4086]]
 ;
   %t3151 = trunc i32 %argc to i8
diff --git a/llvm/test/Transforms/InstCombine/X86/x86-vector-shifts-inseltpoison.ll b/llvm/test/Transforms/InstCombine/X86/x86-vector-shifts-inseltpoison.ll
index 8c2ba9701e72a5b..21d5723cbb82d63 100644
--- a/llvm/test/Transforms/InstCombine/X86/x86-vector-shifts-inseltpoison.ll
+++ b/llvm/test/Transforms/InstCombine/X86/x86-vector-shifts-inseltpoison.ll
@@ -2816,7 +2816,7 @@ define <8 x i32> @avx2_psrai_d_256_masked(<8 x i32> %v, i32 %a) {
 define <8 x i64> @avx512_psrai_q_512_masked(<8 x i64> %v, i32 %a) {
 ; CHECK-LABEL: @avx512_psrai_q_512_masked(
 ; CHECK-NEXT:    [[TMP1:%.*]] = and i32 [[A:%.*]], 63
-; CHECK-NEXT:    [[TMP2:%.*]] = zext i32 [[TMP1]] to i64
+; CHECK-NEXT:    [[TMP2:%.*]] = zext nneg i32 [[TMP1]] to i64
 ; CHECK-NEXT:    [[DOTSPLATINSERT:%.*]] = insertelement <8 x i64> poison, i64 [[TMP2]], i64 0
 ; CHECK-NEXT:    [[DOTSPLAT:%.*]] = shufflevector <8 x i64> [[DOTSPLATINSERT]], <8 x i64> poison, <8 x i32> zeroinitializer
 ; CHECK-NEXT:    [[TMP3:%.*]] = ashr <8 x i64> [[V:%.*]], [[DOTSPLAT]]
@@ -2843,7 +2843,7 @@ define <4 x i32> @sse2_psrli_d_128_masked(<4 x i32> %v, i32 %a) {
 define <4 x i64> @avx2_psrli_q_256_masked(<4 x i64> %v, i32 %a) {
 ; CHECK-LABEL: @avx2_psrli_q_256_masked(
 ; CHECK-NEXT:    [[TMP1:%.*]] = and i32 [[A:%.*]], 63
-; CHECK-NEXT:    [[TMP2:%.*]] = zext i32 [[TMP1]] to i64
+; CHECK-NEXT:    [[TMP2:%.*]] = zext nneg i32 [[TMP1]] to i64
 ; CHECK-NEXT:    [[DOTSPLATINSERT:%.*]] = insertelement <4 x i64> poison, i64 [[TMP2]], i64 0
 ; CHECK-NEXT:    [[DOTSPLAT:%.*]] = shufflevector <4 x i64> [[DOTSPLATINSERT]], <4 x i64> poison, <4 x i32> zeroinitializer
 ; CHECK-NEXT:    [[TMP3:%.*]] = lshr <4 x i64> [[V:%.*]], [[DOTSPLAT]]
@@ -2871,7 +2871,7 @@ define <32 x i16> @avx512_psrli_w_512_masked(<32 x i16> %v, i32 %a) {
 define <2 x i64> @sse2_pslli_q_128_masked(<2 x i64> %v, i32 %a) {
 ; CHECK-LABEL: @sse2_pslli_q_128_masked(
 ; CHECK-NEXT:    [[TMP1:%.*]] = and i32 [[A:%.*]], 63
-; CHECK-NEXT:    [[TMP2:%.*]] = zext i32 [[TMP1]] to i64
+; CHECK-NEXT:    [[TMP2:%.*]] = zext nneg i32 [[TMP1]] to i64
 ; CHECK-NEXT:    [[DOTSPLATINSERT:%.*]] = insertelement <2 x i64> poison, i64 [[TMP2]], i64 0
 ; CHECK-NEXT:    [[DOTSPLAT:%.*]] = shufflevector <2 x i64> [[DOTSPLATINSERT]], <2 x i64> poison, <2 x i32> zeroinitializer
 ; CHECK-NEXT:    [[TMP3:%.*]] = shl <2 x i64> [[V:%.*]], [[DOTSPLAT]]
diff --git a/llvm/test/Transforms/InstCombine/X86/x86-vector-shifts.ll b/llvm/test/Transforms/InstCombine/X86/x86-vector-shifts.ll
index 63e44fda81552e6..a3b14ef2b1c1bee 100644
--- a/llvm/test/Transforms/InstCombine/X86/x86-vector-shifts.ll
+++ b/llvm/test/Transforms/InstCombine/X86/x86-vector-shifts.ll
@@ -2772,8 +2772,8 @@ define <2 x i64> @sse2_psll_q_128_masked_bitcast(<2 x i64> %v, <2 x i64> %a) {
 ; CHECK-NEXT:    [[I:%.*]] = insertelement <4 x i32> [[M]], i32 0, i64 1
 ; CHECK-NEXT:    [[SHAMT:%.*]] = bitcast <4 x i32> [[I]] to <2 x i64>
 ; CHECK-NEXT:    [[TMP1:%.*]] = shufflevector <2 x i64> [[SHAMT]], <2 x i64> poison, <2 x i32> zeroinitializer
-; CHECK-NEXT:    [[TMP2:%.*]] = shl <2 x i64> [[V:%.*]], [[TMP1]]
-; CHECK-NEXT:    ret <2 x i64> [[TMP2]]
+; CHECK-NEXT:    [[R:%.*]] = shl <2 x i64> [[V:%.*]], [[TMP1]]
+; CHECK-NEXT:    ret <2 x i64> [[R]]
 ;
   %b = bitcast <2 x i64> %a to <4 x i32>
   %m = and <4 x i32> %b, <i32 31, i32 poison, i32 poison, i32 poison>
@@ -2856,7 +2856,7 @@ define <8 x i32> @avx2_psrai_d_256_masked(<8 x i32> %v, i32 %a) {
 define <8 x i64> @avx512_psrai_q_512_masked(<8 x i64> %v, i32 %a) {
 ; CHECK-LABEL: @avx512_psrai_q_512_masked(
 ; CHECK-NEXT:    [[TMP1:%.*]] = and i32 [[A:%.*]], 63
-; CHECK-NEXT:    [[TMP2:%.*]] = zext i32 [[TMP1]] to i64
+; CHECK-NEXT:    [[TMP2:%.*]] = zext nneg i32 [[TMP1]] to i64
 ; CHECK-NEXT:    [[DOTSPLATINSERT:%.*]] = insertelement <8 x i64> poison, i64 [[TMP2]], i64 0
 ; CHECK-NEXT:    [[DOTSPLAT:%.*]] = shufflevector <8 x i64> [[DOTSPLATINSERT]], <8 x i64> poison, <8 x i32> zeroinitializer
 ; CHECK-NEXT:    [[TMP3:%.*]] = ashr <8 x i64> [[V:%.*]], [[DOTSPLAT]]
@@ -2883,7 +2883,7 @@ define <4 x i32> @sse2_psrli_d_128_masked(<4 x i32> %v, i32 %a) {
 define <4 x i64> @avx2_psrli_q_256_masked(<4 x i64> %v, i32 %a) {
 ; CHECK-LABEL: @avx2_psrli_q_256_masked(
 ; CHECK-NEXT:    [[TMP1:%.*]] = and i32 [[A:%.*]], 63
-; CHECK-NEXT:    [[TMP2:%.*]] = zext i32 [[TMP1]] to i64
+; CHECK-NEXT:    [[TMP2:%.*]] = zext nneg i32 [[TMP1]] to i64
 ; CHECK-NEXT:    [[DOTSPLATINSERT:%.*]] = insertelement <4 x i64> poison, i64 [[TMP2]], i64 0
 ; CHECK-NEXT:    [[DOTSPLAT:%.*]] = shufflevector <4 x i64> [[DOTSPLATINSERT]], <4 x i64> poison, <4 x i32> zeroinitializer
 ; CHECK-NEXT:    [[TMP3:%.*]] = lshr <4 x i64> [[V:%.*]], [[DOTSPLAT]]
@@ -2911,7 +2911,7 @@ define <32 x i16> @avx512_psrli_w_512_masked(<32 x i16> %v, i32 %a) {
 define <2 x i64> @sse2_pslli_q_128_masked(<2 x i64> %v, i32 %a) {
 ; CHECK-LABEL: @sse2_pslli_q_128_masked(
 ; CHECK-NEXT:    [[TMP1:%.*]] = and i32 [[A:%.*]], 63
-; CHECK-NEXT:    [[TMP2:%.*]] = zext i32 [[TMP1]] to i64
+; CHECK-NEXT:    [[TMP2:%.*]] = zext nneg i32 [[TMP1]] to i64
 ; CHECK-NEXT:    [[DOTSPLATINSERT:%.*]] = insertelement <2 x i64> poison, i64 [[TMP2]], i64 0
 ; CHECK-NEXT:    [[DOTSPLAT:%.*]] = shufflevector <2 x i64> [[DOTSPLATINSERT]], <2 x i64> poison, <2 x i32> zeroinitializer
 ; CHECK-NEXT:    [[TMP3:%.*]] = shl <2 x i64> [[V:%.*]], [[DOTSPLAT]]
diff --git a/llvm/test/Transforms/InstCombine/adjust-for-minmax.ll b/llvm/test/Transforms/InstCombine/adjust-for-minmax.ll
index dced55944505370..76fc7a07be6bd61 100644
--- a/llvm/test/Transforms/InstCombine/adjust-for-minmax.ll
+++ b/llvm/test/Transforms/InstCombine/adjust-for-minmax.ll
@@ -414,7 +414,7 @@ define <2 x i64> @umax_zext_vec(<2 x i32> %a) {
 define i64 @umin_zext(i32 %a) {
 ; CHECK-LABEL: @umin_zext(
 ; CHECK-NEXT:    [[NARROW:%.*]] = call i32 @llvm.umin.i32(i32 [[A:%.*]], i32 2)
-; CHECK-NEXT:    [[MIN:%.*]] = zext i32 [[NARROW]] to i64
+; CHECK-NEXT:    [[MIN:%.*]] = zext nneg i32 [[NARROW]] to i64
 ; CHECK-NEXT:    ret i64 [[MIN]]
 ;
   %a_ext = zext i32 %a to i64
@@ -426,7 +426,7 @@ define i64 @umin_zext(i32 %a) {
 define <2 x i64> @umin_zext_vec(<2 x i32> %a) {
 ; CHECK-LABEL: @umin_zext_vec(
 ; CHECK-NEXT:    [[NARROW:%.*]] = call <2 x i32> @llvm.umin.v2i32(<2 x i32> [[A:%.*]], <2 x i32> <i32 2, i32 2>)
-; CHECK-NEXT:    [[MIN:%.*]] = zext <2 x i32> [[NARROW]] to <2 x i64>
+; CHECK-NEXT:    [[MIN:%.*]] = zext nneg <2 x i32> [[NARROW]] to <2 x i64>
 ; CHECK-NEXT:    ret <2 x i64> [[MIN]]
 ;
   %a_ext = zext <2 x i32> %a to <2 x i64>
diff --git a/llvm/test/Transforms/InstCombine/and-narrow.ll b/llvm/test/Transforms/InstCombine/and-narrow.ll
index 92894090ef66d71..c8c720f5fbc5534 100644
--- a/llvm/test/Transforms/InstCombine/and-narrow.ll
+++ b/llvm/test/Transforms/InstCombine/and-narrow.ll
@@ -47,7 +47,7 @@ define i16 @zext_lshr(i8 %x) {
 ; CHECK-LABEL: @zext_lshr(
 ; CHECK-NEXT:    [[TMP1:%.*]] = lshr i8 [[X:%.*]], 4
 ; CHECK-NEXT:    [[TMP2:%.*]] = and i8 [[TMP1]], [[X]]
-; CHECK-NEXT:    [[R:%.*]] = zext i8 [[TMP2]] to i16
+; CHECK-NEXT:    [[R:%.*]] = zext nneg i8 [[TMP2]] to i16
 ; CHECK-NEXT:    ret i16 [[R]]
 ;
   %z = zext i8 %x to i16
@@ -60,7 +60,7 @@ define i16 @zext_ashr(i8 %x) {
 ; CHECK-LABEL: @zext_ashr(
 ; CHECK-NEXT:    [[TMP1:%.*]] = lshr i8 [[X:%.*]], 2
 ; CHECK-NEXT:    [[TMP2:%.*]] = and i8 [[TMP1]], [[X]]
-; CHECK-NEXT:    [[R:%.*]] = zext i8 [[TMP2]] to i16
+; CHECK-NEXT:    [[R:%.*]] = zext nneg i8 [[TMP2]] to i16
 ; CHECK-NEXT:    ret i16 [[R]]
 ;
   %z = zext i8 %x to i16
@@ -125,7 +125,7 @@ define <2 x i16> @zext_lshr_vec(<2 x i8> %x) {
 ; CHECK-LABEL: @zext_lshr_vec(
 ; CHECK-NEXT:    [[TMP1:%.*]] = lshr <2 x i8> [[X:%.*]], <i8 4, i8 2>
 ; CHECK-NEXT:    [[TMP2:%.*]] = and <2 x i8> [[TMP1]], [[X]]
-; CHECK-NEXT:    [[R:%.*]] = zext <2 x i8> [[TMP2]] to <2 x i16>
+; CHECK-NEXT:    [[R:%.*]] = zext nneg <2 x i8> [[TMP2]] to <2 x i16>
 ; CHECK-NEXT:    ret <2 x i16> [[R]]
 ;
   %z = zext <2 x i8> %x to <2 x i16>
@@ -138,7 +138,7 @@ define <2 x i16> @zext_ashr_vec(<2 x i8> %x) {
 ; CHECK-LABEL: @zext_ashr_vec(
 ; CHECK-NEXT:    [[TMP1:%.*]] = lshr <2 x i8> [[X:%.*]], <i8 2, i8 3>
 ; CHECK-NEXT:    [[TMP2:%.*]] = and <2 x i8> [[TMP1]], [[X]]
-; CHECK-NEXT:    [[R:%.*]] = zext <2 x i8> [[TMP2]] to <2 x i16>
+; CHECK-NEXT:    [[R:%.*]] = zext nneg <2 x i8> [[TMP2]] to <2 x i16>
 ; CHECK-NEXT:    ret <2 x i16> [[R]]
 ;
   %z = zext <2 x i8> %x to <2 x i16>
diff --git a/llvm/test/Transforms/InstCombine/and-xor-or.ll b/llvm/test/Transforms/InstCombine/and-xor-or.ll
index 741fc1eca65d1e9..69a7890bee22f80 100644
--- a/llvm/test/Transforms/InstCombine/and-xor-or.ll
+++ b/llvm/test/Transforms/InstCombine/and-xor-or.ll
@@ -4207,7 +4207,7 @@ define i16 @and_zext_zext(i8 %x, i4 %y) {
 ; CHECK-SAME: (i8 [[X:%.*]], i4 [[Y:%.*]]) {
 ; CHECK-NEXT:    [[TMP1:%.*]] = zext i4 [[Y]] to i8
 ; CHECK-NEXT:    [[TMP2:%.*]] = and i8 [[TMP1]], [[X]]
-; CHECK-NEXT:    [[R:%.*]] = zext i8 [[TMP2]] to i16
+; CHECK-NEXT:    [[R:%.*]] = zext nneg i8 [[TMP2]] to i16
 ; CHECK-NEXT:    ret i16 [[R]]
 ;
   %zx = zext i8 %x to i16
diff --git a/llvm/test/Transforms/InstCombine/and.ll b/llvm/test/Transforms/InstCombine/and.ll
index 95b1b0e73ea5c7a..386ee3807050140 100644
--- a/llvm/test/Transforms/InstCombine/and.ll
+++ b/llvm/test/Transforms/InstCombine/and.ll
@@ -525,7 +525,7 @@ define <2 x i32> @and_demanded_bits_splat_vec(<2 x i32> %x) {
 define i32 @and_zext_demanded(i16 %x, i32 %y) {
 ; CHECK-LABEL: @and_zext_demanded(
 ; CHECK-NEXT:    [[S:%.*]] = lshr i16 [[X:%.*]], 8
-; CHECK-NEXT:    [[Z:%.*]] = zext i16 [[S]] to i32
+; CHECK-NEXT:    [[Z:%.*]] = zext nneg i16 [[S]] to i32
 ; CHECK-NEXT:    ret i32 [[Z]]
 ;
   %s = lshr i16 %x, 8
@@ -618,7 +618,7 @@ define i64 @test35(i32 %X) {
 ; CHECK-LABEL: @test35(
 ; CHECK-NEXT:    [[TMP1:%.*]] = sub i32 0, [[X:%.*]]
 ; CHECK-NEXT:    [[TMP2:%.*]] = and i32 [[TMP1]], 240
-; CHECK-NEXT:    [[RES:%.*]] = zext i32 [[TMP2]] to i64
+; CHECK-NEXT:    [[RES:%.*]] = zext nneg i32 [[TMP2]] to i64
 ; CHECK-NEXT:    ret i64 [[RES]]
 ;
   %zext = zext i32 %X to i64
@@ -631,7 +631,7 @@ define <2 x i64> @test35_uniform(<2 x i32> %X) {
 ; CHECK-LABEL: @test35_uniform(
 ; CHECK-NEXT:    [[TMP1:%.*]] = sub <2 x i32> zeroinitializer, [[X:%.*]]
 ; CHECK-NEXT:    [[TMP2:%.*]] = and <2 x i32> [[TMP1]], <i32 240, i32 240>
-; CHECK-NEXT:    [[RES:%.*]] = zext <2 x i32> [[TMP2]] to <2 x i64>
+; CHECK-NEXT:    [[RES:%.*]] = zext nneg <2 x i32> [[TMP2]] to <2 x i64>
 ; CHECK-NEXT:    ret <2 x i64> [[RES]]
 ;
   %zext = zext <2 x i32> %X to <2 x i64>
@@ -644,7 +644,7 @@ define i64 @test36(i32 %X) {
 ; CHECK-LABEL: @test36(
 ; CHECK-NEXT:    [[TMP1:%.*]] = add i32 [[X:%.*]], 7
 ; CHECK-NEXT:    [[TMP2:%.*]] = and i32 [[TMP1]], 240
-; CHECK-NEXT:    [[RES:%.*]] = zext i32 [[TMP2]] to i64
+; CHECK-NEXT:    [[RES:%.*]] = zext nneg i32 [[TMP2]] to i64
 ; CHECK-NEXT:    ret i64 [[RES]]
 ;
   %zext = zext i32 %X to i64
@@ -657,7 +657,7 @@ define <2 x i64> @test36_uniform(<2 x i32> %X) {
 ; CHECK-LABEL: @test36_uniform(
 ; CHECK-NEXT:    [[TMP1:%.*]] = add <2 x i32> [[X:%.*]], <i32 7, i32 7>
 ; CHECK-NEXT:    [[TMP2:%.*]] = and <2 x i32> [[TMP1]], <i32 240, i32 240>
-; CHECK-NEXT:    [[RES:%.*]] = zext <2 x i32> [[TMP2]] to <2 x i64>
+; CHECK-NEXT:    [[RES:%.*]] = zext nneg <2 x i32> [[TMP2]] to <2 x i64>
 ; CHECK-NEXT:    ret <2 x i64> [[RES]]
 ;
   %zext = zext <2 x i32> %X to <2 x i64>
@@ -683,7 +683,7 @@ define i64 @test37(i32 %X) {
 ; CHECK-LABEL: @test37(
 ; CHECK-NEXT:    [[TMP1:%.*]] = mul i32 [[X:%.*]], 7
 ; CHECK-NEXT:    [[TMP2:%.*]] = and i32 [[TMP1]], 240
-; CHECK-NEXT:    [[RES:%.*]] = zext i32 [[TMP2]] to i64
+; CHECK-NEXT:    [[RES:%.*]] = zext nneg i32 [[TMP2]] to i64
 ; CHECK-NEXT:    ret i64 [[RES]]
 ;
   %zext = zext i32 %X to i64
@@ -696,7 +696,7 @@ define <2 x i64> @test37_uniform(<2 x i32> %X) {
 ; CHECK-LABEL: @test37_uniform(
 ; CHECK-NEXT:    [[TMP1:%.*]] = mul <2 x i32> [[X:%.*]], <i32 7, i32 7>
 ; CHECK-NEXT:    [[TMP2:%.*]] = and <2 x i32> [[TMP1]], <i32 240, i32 240>
-; CHECK-NEXT:    [[RES:%.*]] = zext <2 x i32> [[TMP2]] to <2 x i64>
+; CHECK-NEXT:    [[RES:%.*]] = zext nneg <2 x i32> [[TMP2]] to <2 x i64>
 ; CHECK-NEXT:    ret <2 x i64> [[RES]]
 ;
   %zext = zext <2 x i32> %X to <2 x i64>
@@ -721,7 +721,7 @@ define <2 x i64> @test37_nonuniform(<2 x i32> %X) {
 define i64 @test38(i32 %X) {
 ; CHECK-LABEL: @test38(
 ; CHECK-NEXT:    [[TMP1:%.*]] = and i32 [[X:%.*]], 240
-; CHECK-NEXT:    [[RES:%.*]] = zext i32 [[TMP1]] to i64
+; CHECK-NEXT:    [[RES:%.*]] = zext nneg i32 [[TMP1]] to i64
 ; CHECK-NEXT:    ret i64 [[RES]]
 ;
   %zext = zext i32 %X to i64
@@ -733,7 +733,7 @@ define i64 @test38(i32 %X) {
 define i64 @test39(i32 %X) {
 ; CHECK-LABEL: @test39(
 ; CHECK-NEXT:    [[TMP1:%.*]] = and i32 [[X:%.*]], 240
-; CHECK-NEXT:    [[RES:%.*]] = zext i32 [[TMP1]] to i64
+; CHECK-NEXT:    [[RES:%.*]] = zext nneg i32 [[TMP1]] to i64
 ; CHECK-NEXT:    ret i64 [[RES]]
 ;
   %zext = zext i32 %X to i64
diff --git a/llvm/test/Transforms/InstCombine/assoc-cast-assoc.ll b/llvm/test/Transforms/InstCombine/assoc-cast-assoc.ll
index 04b530647d0a26e..a3485978471dc05 100644
--- a/llvm/test/Transforms/InstCombine/assoc-cast-assoc.ll
+++ b/llvm/test/Transforms/InstCombine/assoc-cast-assoc.ll
@@ -54,7 +54,7 @@ define <2 x i32> @OrZextOrVec(<2 x i2> %a) {
 define i5 @AndZextAnd(i3 %a) {
 ; CHECK-LABEL: @AndZextAnd(
 ; CHECK-NEXT:    [[TMP1:%.*]] = and i3 [[A:%.*]], 2
-; CHECK-NEXT:    [[OP2:%.*]] = zext i3 [[TMP1]] to i5
+; CHECK-NEXT:    [[OP2:%.*]] = zext nneg i3 [[TMP1]] to i5
 ; CHECK-NEXT:    ret i5 [[OP2]]
 ;
   %op1 = and i3 %a, 3
@@ -66,7 +66,7 @@ define i5 @AndZextAnd(i3 %a) {
 define <2 x i32> @AndZextAndVec(<2 x i8> %a) {
 ; CHECK-LABEL: @AndZextAndVec(
 ; CHECK-NEXT:    [[TMP1:%.*]] = and <2 x i8> [[A:%.*]], <i8 5, i8 0>
-; CHECK-NEXT:    [[OP2:%.*]] = zext <2 x i8> [[TMP1]] to <2 x i32>
+; CHECK-NEXT:    [[OP2:%.*]] = zext nneg <2 x i8> [[TMP1]] to <2 x i32>
 ; CHECK-NEXT:    ret <2 x i32> [[OP2]]
 ;
   %op1 = and <2 x i8> %a, <i8 7, i8 0>
diff --git a/llvm/test/Transforms/InstCombine/binop-cast.ll b/llvm/test/Transforms/InstCombine/binop-cast.ll
index 20d5814c05d3aa7..e3345194d0b3284 100644
--- a/llvm/test/Transforms/InstCombine/binop-cast.ll
+++ b/llvm/test/Transforms/InstCombine/binop-cast.ll
@@ -276,7 +276,7 @@ define i64 @PR63321(ptr %ptr, i64 %c) {
 define i64 @and_add_non_bool(ptr %ptr, i64 %c) {
 ; CHECK-LABEL: @and_add_non_bool(
 ; CHECK-NEXT:    [[VAL:%.*]] = load i8, ptr [[PTR:%.*]], align 1, !range [[RNG1:![0-9]+]]
-; CHECK-NEXT:    [[RHS:%.*]] = zext i8 [[VAL]] to i64
+; CHECK-NEXT:    [[RHS:%.*]] = zext nneg i8 [[VAL]] to i64
 ; CHECK-NEXT:    [[MASK:%.*]] = add nsw i64 [[RHS]], -1
 ; CHECK-NEXT:    [[RES:%.*]] = and i64 [[MASK]], [[C:%.*]]
 ; CHECK-NEXT:    ret i64 [[RES]]
diff --git a/llvm/test/Transforms/InstCombine/cast-mul-select.ll b/llvm/test/Transforms/InstCombine/cast-mul-select.ll
index 23e934de0baeb7e..454522b85a1e843 100644
--- a/llvm/test/Transforms/InstCombine/cast-mul-select.ll
+++ b/llvm/test/Transforms/InstCombine/cast-mul-select.ll
@@ -119,7 +119,7 @@ define i32 @eval_zext_multi_use_in_one_inst(i32 %x) {
 ; CHECK-NEXT:    [[T:%.*]] = trunc i32 [[X:%.*]] to i16
 ; CHECK-NEXT:    [[A:%.*]] = and i16 [[T]], 5
 ; CHECK-NEXT:    [[M:%.*]] = mul nuw nsw i16 [[A]], [[A]]
-; CHECK-NEXT:    [[R:%.*]] = zext i16 [[M]] to i32
+; CHECK-NEXT:    [[R:%.*]] = zext nneg i16 [[M]] to i32
 ; CHECK-NEXT:    ret i32 [[R]]
 ;
 ; DBGINFO-LABEL: @eval_zext_multi_use_in_one_inst(
@@ -129,7 +129,7 @@ define i32 @eval_zext_multi_use_in_one_inst(i32 %x) {
 ; DBGINFO-NEXT:    call void @llvm.dbg.value(metadata i16 [[A]], metadata [[META66:![0-9]+]], metadata !DIExpression()), !dbg [[DBG70]]
 ; DBGINFO-NEXT:    [[M:%.*]] = mul nuw nsw i16 [[A]], [[A]], !dbg [[DBG71:![0-9]+]]
 ; DBGINFO-NEXT:    call void @llvm.dbg.value(metadata i16 [[M]], metadata [[META67:![0-9]+]], metadata !DIExpression()), !dbg [[DBG71]]
-; DBGINFO-NEXT:    [[R:%.*]] = zext i16 [[M]] to i32, !dbg [[DBG72:![0-9]+]]
+; DBGINFO-NEXT:    [[R:%.*]] = zext nneg i16 [[M]] to i32, !dbg [[DBG72:![0-9]+]]
 ; DBGINFO-NEXT:    call void @llvm.dbg.value(metadata i32 [[R]], metadata [[META68:![0-9]+]], metadata !DIExpression()), !dbg [[DBG72]]
 ; DBGINFO-NEXT:    ret i32 [[R]], !dbg [[DBG73:![0-9]+]]
 ;
@@ -183,13 +183,13 @@ define void @PR36225(i32 %a, i32 %b, i1 %c1, i3 %v1, i3 %v2) {
 ; CHECK-NEXT:    [[TOBOOL:%.*]] = icmp eq i32 [[B:%.*]], 0
 ; CHECK-NEXT:    [[SPEC_SELECT:%.*]] = select i1 [[TOBOOL]], i8 0, i8 4
 ; CHECK-NEXT:    switch i3 [[V1:%.*]], label [[EXIT:%.*]] [
-; CHECK-NEXT:    i3 0, label [[FOR_END:%.*]]
-; CHECK-NEXT:    i3 -1, label [[FOR_END]]
+; CHECK-NEXT:      i3 0, label [[FOR_END:%.*]]
+; CHECK-NEXT:      i3 -1, label [[FOR_END]]
 ; CHECK-NEXT:    ]
 ; CHECK:       for.body3:
 ; CHECK-NEXT:    switch i3 [[V2:%.*]], label [[EXIT]] [
-; CHECK-NEXT:    i3 0, label [[FOR_END]]
-; CHECK-NEXT:    i3 -1, label [[FOR_END]]
+; CHECK-NEXT:      i3 0, label [[FOR_END]]
+; CHECK-NEXT:      i3 -1, label [[FOR_END]]
 ; CHECK-NEXT:    ]
 ; CHECK:       for.end:
 ; CHECK-NEXT:    [[H:%.*]] = phi i8 [ [[SPEC_SELECT]], [[FOR_BODY3_US]] ], [ [[SPEC_SELECT]], [[FOR_BODY3_US]] ], [ 0, [[FOR_BODY3]] ], [ 0, [[FOR_BODY3]] ]
@@ -213,13 +213,13 @@ define void @PR36225(i32 %a, i32 %b, i1 %c1, i3 %v1, i3 %v2) {
 ; DBGINFO-NEXT:    [[SPEC_SELECT:%.*]] = select i1 [[TOBOOL]], i8 0, i8 4, !dbg [[DBG97:![0-9]+]]
 ; DBGINFO-NEXT:    call void @llvm.dbg.value(metadata i8 [[SPEC_SELECT]], metadata [[META90:![0-9]+]], metadata !DIExpression()), !dbg [[DBG97]]
 ; DBGINFO-NEXT:    switch i3 [[V1:%.*]], label [[EXIT:%.*]] [
-; DBGINFO-NEXT:    i3 0, label [[FOR_END:%.*]]
-; DBGINFO-NEXT:    i3 -1, label [[FOR_END]]
+; DBGINFO-NEXT:      i3 0, label [[FOR_END:%.*]]
+; DBGINFO-NEXT:      i3 -1, label [[FOR_END]]
 ; DBGINFO-NEXT:    ], !dbg [[DBG98:![0-9]+]]
 ; DBGINFO:       for.body3:
 ; DBGINFO-NEXT:    switch i3 [[V2:%.*]], label [[EXIT]] [
-; DBGINFO-NEXT:    i3 0, label [[FOR_END]]
-; DBGINFO-NEXT:    i3 -1, label [[FOR_END]]
+; DBGINFO-NEXT:      i3 0, label [[FOR_END]]
+; DBGINFO-NEXT:      i3 -1, label [[FOR_END]]
 ; DBGINFO-NEXT:    ], !dbg [[DBG99:![0-9]+]]
 ; DBGINFO:       for.end:
 ; DBGINFO-NEXT:    [[H:%.*]] = phi i8 [ [[SPEC_SELECT]], [[FOR_BODY3_US]] ], [ [[SPEC_SELECT]], [[FOR_BODY3_US]] ], [ 0, [[FOR_BODY3]] ], [ 0, [[FOR_BODY3]] ], !dbg [[DBG100:![0-9]+]]
diff --git a/llvm/test/Transforms/InstCombine/cast.ll b/llvm/test/Transforms/InstCombine/cast.ll
index 59e488f3f23d52a..afa7ac45e96dcb4 100644
--- a/llvm/test/Transforms/InstCombine/cast.ll
+++ b/llvm/test/Transforms/InstCombine/cast.ll
@@ -124,12 +124,12 @@ define void @test_invoke_vararg_cast(ptr %a, ptr %b) personality ptr @__gxx_pers
 ; ALL-LABEL: @test_invoke_vararg_cast(
 ; ALL-NEXT:  entry:
 ; ALL-NEXT:    invoke void (i32, ...) @varargs(i32 1, ptr [[B:%.*]], ptr [[A:%.*]])
-; ALL-NEXT:    to label [[INVOKE_CONT:%.*]] unwind label [[LPAD:%.*]]
+; ALL-NEXT:            to label [[INVOKE_CONT:%.*]] unwind label [[LPAD:%.*]]
 ; ALL:       invoke.cont:
 ; ALL-NEXT:    ret void
 ; ALL:       lpad:
 ; ALL-NEXT:    [[TMP0:%.*]] = landingpad { ptr, i32 }
-; ALL-NEXT:    cleanup
+; ALL-NEXT:            cleanup
 ; ALL-NEXT:    ret void
 ;
 entry:
@@ -619,7 +619,7 @@ define ...
[truncated]

dtcxzyw

LGTM. Thanks!
We can improve the compile time by adding the nneg flag when the zext is created (e.g., cttz(zext(x)) -> zext nneg(cttz(x))).

mikaelholmen · 2023-11-09T08:58:03Z

I think this patch causes miscompiles. Reproduce with
opt bbi-88690.ll -passes=instcombine -S -o -
So with this patch instcombine turns

@v_936 = global i16 -3276, align 1
@v_937 = global i24 0, align 1

define i16 @main() {
entry:
  %0 = load i16, ptr @v_936, align 1
  %unsclear = and i16 %0, 32767
  %resize = zext i16 %unsclear to i24
  %unsclear1 = and i24 %resize, 8388607
  store i24 %unsclear1, ptr @v_937, align 1
  ret i16 0
}

into

@v_936 = global i16 -3276, align 1
@v_937 = global i24 0, align 1

define i16 @main() {
entry:
  %0 = load i16, ptr @v_936, align 1
  %resize = zext nneg i16 %0 to i24
  store i24 %resize, ptr @v_937, align 1
  ret i16 0
}

I.e., the `and` with 32767 (0x7fff) is gone and the zext got "nneg" instead?
But the value in v_936 can be, and actually is, negative.
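To make the problem concrete, here is a small self-contained C++ sketch of the two zext flavours. It is a toy model under stated assumptions: `std::nullopt` stands in for poison, `uint32_t` stands in for the i24 destination type, and the helper names `zext16` and `zextNNeg16` are hypothetical.

```cpp
#include <cstdint>
#include <optional>

// Plain zext from i16 to a wider type: defined for every input.
uint32_t zext16(uint16_t v) { return v; }

// Toy model of `zext nneg`: if the source has its sign bit set, the result
// is poison, modelled here as std::nullopt. Otherwise it equals plain zext.
std::optional<uint32_t> zextNNeg16(uint16_t v) {
  if (v & 0x8000u)
    return std::nullopt; // negative source with nneg set: poison
  return static_cast<uint32_t>(v);
}
```

The initializer -3276 is 0xF334 as an i16 bit pattern. With the mask in place, `zextNNeg16(0xF334 & 0x7fff)` is well defined and yields 0x7334. Without the mask, `zextNNeg16(0xF334)` has no value in this model, which corresponds to the stored value being poison in the transformed IR.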

bbi-88690.ll.gz

dyung · 2023-11-09T09:47:12Z

We also have a couple of internal tests that seem to be failing after this commit. Consider the following code:

char print_tmp[1];
void print(char *, void *data, unsigned size) {
  unsigned char *bytes = (unsigned char *)data;
  for (unsigned i = 0; i != size; ++i)
    sprintf(print_tmp + i * 2, "%02x", bytes[size - 1 - i]);
  printf(print_tmp);
}
#define PRINT(VAR) print(#VAR, &VAR, sizeof(VAR))
struct {
  long b : 17;
} test141_struct_id29534;
struct test141_struct_id29574_ {
  test141_struct_id29574_() { INIT(172, *this); }
  unsigned a : 15;
} test141_struct_id29574;
int main() {
  long id29692 = test141_struct_id29534.b = test141_struct_id29574.a;
  PRINT(id29692);
}

When compiled without optimizations (and, before this change, with optimizations) it prints the value 0000000000002dac. After this change, with optimizations enabled, the program prints 000000000000adac instead.

You can see the difference at https://godbolt.org/z/vjPvGT5G9.
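The two printed values differ only in bit 15, which is the first bit above a 15-bit field. The short C++ sketch below only checks that arithmetic. It makes no claim about how the compiler lowers the bitfield access, and `lowBits` is a hypothetical helper.

```cpp
#include <cstdint>

// Keep only the low `bits` bits of a value, the way a read of an unsigned
// bitfield of that width would. Assumes 0 < bits < 64.
constexpr uint64_t lowBits(uint64_t value, unsigned bits) {
  return value & ((uint64_t{1} << bits) - 1);
}
```

Masking the wrong output to 15 bits, `lowBits(0xadac, 15)`, gives the expected 0x2dac, which is consistent with a 15-bit mask having been lost somewhere in the optimized code.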

dtcxzyw · 2023-11-09T10:17:11Z

Reduced test case: https://godbolt.org/z/d4ETPhbno

nikic · 2023-11-09T10:35:16Z

It looks like simplifyAssocCastAssoc() is the problematic transform. It modifies a zext in-place without clearing poison flags.
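A minimal sketch of why an in-place operand change has to clear such a flag is shown below. It is a toy model, not the code of simplifyAssocCastAssoc() or of the fix. `ToyZExt` and the three helpers are hypothetical names.

```cpp
#include <cstdint>

// Toy stand-in for a zext instruction: a 16-bit operand plus the nneg flag.
struct ToyZExt {
  uint16_t operand;
  bool nonNeg;
};

// Buggy pattern: swap the operand in place and leave the flag untouched.
void replaceOperandKeepingFlags(ToyZExt &z, uint16_t newOperand) {
  z.operand = newOperand;
}

// Safe pattern: the flag was proven for the old operand only, so drop it.
void replaceOperandDroppingFlags(ToyZExt &z, uint16_t newOperand) {
  z.operand = newOperand;
  z.nonNeg = false;
}

// The flag is sound if it is unset or the operand's sign bit is clear.
bool flagIsSound(const ToyZExt &z) {
  return !z.nonNeg || (z.operand & 0x8000u) == 0;
}
```

Starting from `ToyZExt{0x7334, true}` (the masked value, where the flag is sound) and switching the operand to the unmasked 0xF334, the first helper leaves an unsound flag behind while the second does not.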

nikic · 2023-11-09T10:58:23Z

Should be fixed by 1b1c817.

mikaelholmen · 2023-11-09T11:19:05Z

> Should be fixed by 1b1c817.

I've confirmed that the instances of the problem that we saw are fixed by 1b1c817.
Thanks!

This patch infers `nneg` flags for existing zext instructions in CVP. After #71534 and this patch, we can drop `zext -> zext nneg` transform in `RISCVCodeGenPrepare`: https://github.com/llvm/llvm-project/blob/40671bbdefb6ff83e2685576a3cb041b62f25bbe/llvm/lib/Target/RISCV/RISCVCodeGenPrepare.cpp#L74-L83 This is an alternative to #72049.

nikic requested review from preames, dtcxzyw and goldsteinn November 7, 2023 13:28

llvmbot added the llvm:transforms label Nov 7, 2023

[InstCombine] Infer zext nneg flag

e965141

Use KnownBits to infer the nneg flag on zext instructions.

nikic force-pushed the instcombine-zext-nneg branch from 9b66984 to e965141 Compare November 7, 2023 15:14

llvmbot added the clang Clang issues not falling into any other category label Nov 7, 2023

dtcxzyw mentioned this pull request Nov 7, 2023

test PR71534 plctlab/llvm-ci#777

Closed

dtcxzyw approved these changes Nov 7, 2023

View reviewed changes

goldsteinn approved these changes Nov 7, 2023

View reviewed changes

nikic merged commit 5918f62 into llvm:main Nov 8, 2023

nikic added a commit that referenced this pull request Nov 9, 2023

[InstCombine] Add test for zext nneg miscompile (NFC)

0bd0d72

Exposed by #71534 and reported there.

dtcxzyw mentioned this pull request Nov 10, 2023

[InstCombine] Infer zext nneg flag directly #71906

Closed

heiher mentioned this pull request Nov 12, 2023

[LoongArch] Broken code is generated after #71534 #72046

Closed

This was referenced Nov 12, 2023

[ValueTracking] Infer signedness from dom conditions #72049

Closed

[CVP] Infer nneg on existing zext #72052

Merged

[RISCV][CodeGenPrepare] Remove duplicated transform for zext. NFC. #72053

Merged

dtcxzyw added a commit that referenced this pull request Nov 13, 2023

[RISCV][CodeGenPrepare] Remove duplicated transform for zext. NFC. (#…

d64d5ea

…72053) After #71534 and #72052, the transform `zext -> zext nneg` in `RISCVCodeGenPrepare` is redundant.

qihangkong pushed a commit to rvgpu/llvm that referenced this pull request Apr 18, 2024

[InstCombine] Add test for zext nneg miscompile (NFC)

b62c929

Exposed by llvm/llvm-project#71534 and reported there.
