[InstCombine] Infer zext nneg flag #71534

Merged (1 commit) on Nov 8, 2023

nikic
Contributor

@nikic nikic commented Nov 7, 2023

Use KnownBits to infer the nneg flag on zext instructions.

Currently we only set nneg when converting sext -> zext, but don't set it when we have a zext in the first place. If we want to use it in optimizations, we should make sure the flag inference is consistent.

Compile-time impact exists but is pretty small: http://llvm-compile-time-tracker.com/compare.php?from=8f76522a61d01cf7d70debd39418259e969bb8d6&to=20a4612687e6f2d75097e67987c1c592359b3b96&stat=instructions:u

I don't see any obvious way to avoid it; I don't think we have any existing KnownBits calculations we can reuse here.
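
To illustrate the kind of reasoning the inference relies on, here is a minimal sketch of a known-bits check on a 16-bit value. `KnownBits16`, `andConst`, `lshrConst` and `shouldSetNNeg` are hypothetical names invented for this example; they are not the LLVM `KnownBits` class or the `isKnownNonNegative` helper used in the patch, only a toy model of the same idea: if the sign bit of the source is provably zero, `zext` and `zext nneg` agree and the flag can be set.

```cpp
#include <cstdint>

// Toy stand-in for known-bits tracking on a 16-bit value
// (hypothetical; not the llvm::KnownBits class).
struct KnownBits16 {
  uint16_t Zero; // bits known to be 0
  uint16_t One;  // bits known to be 1

  // The value is non-negative when the sign bit (bit 15) is known to be 0.
  constexpr bool isNonNegative() const { return (Zero & 0x8000u) != 0; }
};

// Nothing is known about an arbitrary value.
constexpr KnownBits16 unknown() { return {0, 0}; }

// Known bits of `and X, C`: every bit cleared in C is known zero;
// a bit is known one only if it is known one in X and set in C.
constexpr KnownBits16 andConst(KnownBits16 X, uint16_t C) {
  return {static_cast<uint16_t>(X.Zero | static_cast<uint16_t>(~C)),
          static_cast<uint16_t>(X.One & C)};
}

// Known bits of `lshr X, Amt` for Amt < 16: known bits move right and the
// vacated high bits become known zero.
constexpr KnownBits16 lshrConst(KnownBits16 X, unsigned Amt) {
  return {static_cast<uint16_t>((X.Zero >> Amt) |
                                static_cast<uint16_t>(~(0xFFFFu >> Amt))),
          static_cast<uint16_t>(X.One >> Amt)};
}

// Same shape as the new check in visitZExt: only set the flag if it is
// missing and the source is provably non-negative.
constexpr bool shouldSetNNeg(bool HasNNeg, KnownBits16 Src) {
  return !HasNNeg && Src.isNonNegative();
}
```

Under this model, `shouldSetNNeg(false, andConst(unknown(), 63))` is true, which matches test updates such as `and i32 %a, 63` followed by `zext nneg`, while an unmasked value gives no such guarantee.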

@llvmbot
Member

llvmbot commented Nov 7, 2023

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-llvm-transforms

Author: Nikita Popov (nikic)


Patch is 88.24 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/71534.diff

47 Files Affected:

  • (modified) llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp (+5)
  • (modified) llvm/test/Transforms/InstCombine/2010-11-01-lshr-mask.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/X86/x86-vector-shifts-inseltpoison.ll (+3-3)
  • (modified) llvm/test/Transforms/InstCombine/X86/x86-vector-shifts.ll (+5-5)
  • (modified) llvm/test/Transforms/InstCombine/adjust-for-minmax.ll (+2-2)
  • (modified) llvm/test/Transforms/InstCombine/and-narrow.ll (+4-4)
  • (modified) llvm/test/Transforms/InstCombine/and-xor-or.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/and.ll (+9-9)
  • (modified) llvm/test/Transforms/InstCombine/assoc-cast-assoc.ll (+2-2)
  • (modified) llvm/test/Transforms/InstCombine/binop-cast.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/cast-mul-select.ll (+10-10)
  • (modified) llvm/test/Transforms/InstCombine/cast.ll (+9-9)
  • (modified) llvm/test/Transforms/InstCombine/ctpop.ll (+2-2)
  • (modified) llvm/test/Transforms/InstCombine/cttz.ll (+4-4)
  • (modified) llvm/test/Transforms/InstCombine/fmul.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/freeze.ll (+13-13)
  • (modified) llvm/test/Transforms/InstCombine/load-bitcast-select.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/lshr.ll (+5-5)
  • (modified) llvm/test/Transforms/InstCombine/minmax-fold.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/minmax-intrinsics.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/narrow-math.ll (+5-5)
  • (modified) llvm/test/Transforms/InstCombine/negated-bitmask.ll (+2-2)
  • (modified) llvm/test/Transforms/InstCombine/overflow-mul.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/reduction-add-sext-zext-i1.ll (+3-3)
  • (modified) llvm/test/Transforms/InstCombine/reduction-xor-sext-zext-i1.ll (+2-2)
  • (modified) llvm/test/Transforms/InstCombine/rem.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/select-bitext-bitwise-ops.ll (+4-4)
  • (modified) llvm/test/Transforms/InstCombine/select-bitext.ll (+2-2)
  • (modified) llvm/test/Transforms/InstCombine/select-cmp-cttz-ctlz.ll (+9-9)
  • (modified) llvm/test/Transforms/InstCombine/select-ctlz-to-cttz.ll (+2-2)
  • (modified) llvm/test/Transforms/InstCombine/select-obo-peo-ops.ll (+8-8)
  • (modified) llvm/test/Transforms/InstCombine/select-with-bitwise-ops.ll (+4-4)
  • (modified) llvm/test/Transforms/InstCombine/shift-amount-reassociation-in-bittest-with-truncation-shl.ll (+4-4)
  • (modified) llvm/test/Transforms/InstCombine/shift.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/trunc-inseltpoison.ll (+10-10)
  • (modified) llvm/test/Transforms/InstCombine/trunc.ll (+10-10)
  • (modified) llvm/test/Transforms/InstCombine/udiv-simplify.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/udivrem-change-width.ll (+11-11)
  • (modified) llvm/test/Transforms/InstCombine/vector-casts-inseltpoison.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/vector-casts.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/wcslen-1.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/wcslen-3.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/zeroext-and-reduce.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/zext-or-icmp.ll (+5-5)
  • (modified) llvm/test/Transforms/InstCombine/zext.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopVectorize/ARM/mve-reductions.ll (+2-2)
  • (modified) llvm/test/Transforms/LoopVectorize/reduction-inloop.ll (+5-5)
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp b/llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp
index efd18b44657e5da..08ead599d525a71 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp
@@ -1219,6 +1219,11 @@ Instruction *InstCombinerImpl::visitZExt(ZExtInst &Zext) {
     }
   }
 
+  if (!Zext.hasNonNeg() && isKnownNonNegative(Src, DL, 0, &AC, &Zext, &DT)) {
+    Zext.setNonNeg();
+    return &Zext;
+  }
+
   return nullptr;
 }
 
diff --git a/llvm/test/Transforms/InstCombine/2010-11-01-lshr-mask.ll b/llvm/test/Transforms/InstCombine/2010-11-01-lshr-mask.ll
index eda4053cf0f6988..3081baa2db281e4 100644
--- a/llvm/test/Transforms/InstCombine/2010-11-01-lshr-mask.ll
+++ b/llvm/test/Transforms/InstCombine/2010-11-01-lshr-mask.ll
@@ -8,7 +8,7 @@ define i32 @main(i32 %argc) {
 ; CHECK-NEXT:    [[T3163:%.*]] = xor i8 [[T3151]], -1
 ; CHECK-NEXT:    [[TMP1:%.*]] = shl i8 [[T3163]], 5
 ; CHECK-NEXT:    [[T4127:%.*]] = and i8 [[TMP1]], 64
-; CHECK-NEXT:    [[T4086:%.*]] = zext i8 [[T4127]] to i32
+; CHECK-NEXT:    [[T4086:%.*]] = zext nneg i8 [[T4127]] to i32
 ; CHECK-NEXT:    ret i32 [[T4086]]
 ;
   %t3151 = trunc i32 %argc to i8
diff --git a/llvm/test/Transforms/InstCombine/X86/x86-vector-shifts-inseltpoison.ll b/llvm/test/Transforms/InstCombine/X86/x86-vector-shifts-inseltpoison.ll
index 8c2ba9701e72a5b..21d5723cbb82d63 100644
--- a/llvm/test/Transforms/InstCombine/X86/x86-vector-shifts-inseltpoison.ll
+++ b/llvm/test/Transforms/InstCombine/X86/x86-vector-shifts-inseltpoison.ll
@@ -2816,7 +2816,7 @@ define <8 x i32> @avx2_psrai_d_256_masked(<8 x i32> %v, i32 %a) {
 define <8 x i64> @avx512_psrai_q_512_masked(<8 x i64> %v, i32 %a) {
 ; CHECK-LABEL: @avx512_psrai_q_512_masked(
 ; CHECK-NEXT:    [[TMP1:%.*]] = and i32 [[A:%.*]], 63
-; CHECK-NEXT:    [[TMP2:%.*]] = zext i32 [[TMP1]] to i64
+; CHECK-NEXT:    [[TMP2:%.*]] = zext nneg i32 [[TMP1]] to i64
 ; CHECK-NEXT:    [[DOTSPLATINSERT:%.*]] = insertelement <8 x i64> poison, i64 [[TMP2]], i64 0
 ; CHECK-NEXT:    [[DOTSPLAT:%.*]] = shufflevector <8 x i64> [[DOTSPLATINSERT]], <8 x i64> poison, <8 x i32> zeroinitializer
 ; CHECK-NEXT:    [[TMP3:%.*]] = ashr <8 x i64> [[V:%.*]], [[DOTSPLAT]]
@@ -2843,7 +2843,7 @@ define <4 x i32> @sse2_psrli_d_128_masked(<4 x i32> %v, i32 %a) {
 define <4 x i64> @avx2_psrli_q_256_masked(<4 x i64> %v, i32 %a) {
 ; CHECK-LABEL: @avx2_psrli_q_256_masked(
 ; CHECK-NEXT:    [[TMP1:%.*]] = and i32 [[A:%.*]], 63
-; CHECK-NEXT:    [[TMP2:%.*]] = zext i32 [[TMP1]] to i64
+; CHECK-NEXT:    [[TMP2:%.*]] = zext nneg i32 [[TMP1]] to i64
 ; CHECK-NEXT:    [[DOTSPLATINSERT:%.*]] = insertelement <4 x i64> poison, i64 [[TMP2]], i64 0
 ; CHECK-NEXT:    [[DOTSPLAT:%.*]] = shufflevector <4 x i64> [[DOTSPLATINSERT]], <4 x i64> poison, <4 x i32> zeroinitializer
 ; CHECK-NEXT:    [[TMP3:%.*]] = lshr <4 x i64> [[V:%.*]], [[DOTSPLAT]]
@@ -2871,7 +2871,7 @@ define <32 x i16> @avx512_psrli_w_512_masked(<32 x i16> %v, i32 %a) {
 define <2 x i64> @sse2_pslli_q_128_masked(<2 x i64> %v, i32 %a) {
 ; CHECK-LABEL: @sse2_pslli_q_128_masked(
 ; CHECK-NEXT:    [[TMP1:%.*]] = and i32 [[A:%.*]], 63
-; CHECK-NEXT:    [[TMP2:%.*]] = zext i32 [[TMP1]] to i64
+; CHECK-NEXT:    [[TMP2:%.*]] = zext nneg i32 [[TMP1]] to i64
 ; CHECK-NEXT:    [[DOTSPLATINSERT:%.*]] = insertelement <2 x i64> poison, i64 [[TMP2]], i64 0
 ; CHECK-NEXT:    [[DOTSPLAT:%.*]] = shufflevector <2 x i64> [[DOTSPLATINSERT]], <2 x i64> poison, <2 x i32> zeroinitializer
 ; CHECK-NEXT:    [[TMP3:%.*]] = shl <2 x i64> [[V:%.*]], [[DOTSPLAT]]
diff --git a/llvm/test/Transforms/InstCombine/X86/x86-vector-shifts.ll b/llvm/test/Transforms/InstCombine/X86/x86-vector-shifts.ll
index 63e44fda81552e6..a3b14ef2b1c1bee 100644
--- a/llvm/test/Transforms/InstCombine/X86/x86-vector-shifts.ll
+++ b/llvm/test/Transforms/InstCombine/X86/x86-vector-shifts.ll
@@ -2772,8 +2772,8 @@ define <2 x i64> @sse2_psll_q_128_masked_bitcast(<2 x i64> %v, <2 x i64> %a) {
 ; CHECK-NEXT:    [[I:%.*]] = insertelement <4 x i32> [[M]], i32 0, i64 1
 ; CHECK-NEXT:    [[SHAMT:%.*]] = bitcast <4 x i32> [[I]] to <2 x i64>
 ; CHECK-NEXT:    [[TMP1:%.*]] = shufflevector <2 x i64> [[SHAMT]], <2 x i64> poison, <2 x i32> zeroinitializer
-; CHECK-NEXT:    [[TMP2:%.*]] = shl <2 x i64> [[V:%.*]], [[TMP1]]
-; CHECK-NEXT:    ret <2 x i64> [[TMP2]]
+; CHECK-NEXT:    [[R:%.*]] = shl <2 x i64> [[V:%.*]], [[TMP1]]
+; CHECK-NEXT:    ret <2 x i64> [[R]]
 ;
   %b = bitcast <2 x i64> %a to <4 x i32>
   %m = and <4 x i32> %b, <i32 31, i32 poison, i32 poison, i32 poison>
@@ -2856,7 +2856,7 @@ define <8 x i32> @avx2_psrai_d_256_masked(<8 x i32> %v, i32 %a) {
 define <8 x i64> @avx512_psrai_q_512_masked(<8 x i64> %v, i32 %a) {
 ; CHECK-LABEL: @avx512_psrai_q_512_masked(
 ; CHECK-NEXT:    [[TMP1:%.*]] = and i32 [[A:%.*]], 63
-; CHECK-NEXT:    [[TMP2:%.*]] = zext i32 [[TMP1]] to i64
+; CHECK-NEXT:    [[TMP2:%.*]] = zext nneg i32 [[TMP1]] to i64
 ; CHECK-NEXT:    [[DOTSPLATINSERT:%.*]] = insertelement <8 x i64> poison, i64 [[TMP2]], i64 0
 ; CHECK-NEXT:    [[DOTSPLAT:%.*]] = shufflevector <8 x i64> [[DOTSPLATINSERT]], <8 x i64> poison, <8 x i32> zeroinitializer
 ; CHECK-NEXT:    [[TMP3:%.*]] = ashr <8 x i64> [[V:%.*]], [[DOTSPLAT]]
@@ -2883,7 +2883,7 @@ define <4 x i32> @sse2_psrli_d_128_masked(<4 x i32> %v, i32 %a) {
 define <4 x i64> @avx2_psrli_q_256_masked(<4 x i64> %v, i32 %a) {
 ; CHECK-LABEL: @avx2_psrli_q_256_masked(
 ; CHECK-NEXT:    [[TMP1:%.*]] = and i32 [[A:%.*]], 63
-; CHECK-NEXT:    [[TMP2:%.*]] = zext i32 [[TMP1]] to i64
+; CHECK-NEXT:    [[TMP2:%.*]] = zext nneg i32 [[TMP1]] to i64
 ; CHECK-NEXT:    [[DOTSPLATINSERT:%.*]] = insertelement <4 x i64> poison, i64 [[TMP2]], i64 0
 ; CHECK-NEXT:    [[DOTSPLAT:%.*]] = shufflevector <4 x i64> [[DOTSPLATINSERT]], <4 x i64> poison, <4 x i32> zeroinitializer
 ; CHECK-NEXT:    [[TMP3:%.*]] = lshr <4 x i64> [[V:%.*]], [[DOTSPLAT]]
@@ -2911,7 +2911,7 @@ define <32 x i16> @avx512_psrli_w_512_masked(<32 x i16> %v, i32 %a) {
 define <2 x i64> @sse2_pslli_q_128_masked(<2 x i64> %v, i32 %a) {
 ; CHECK-LABEL: @sse2_pslli_q_128_masked(
 ; CHECK-NEXT:    [[TMP1:%.*]] = and i32 [[A:%.*]], 63
-; CHECK-NEXT:    [[TMP2:%.*]] = zext i32 [[TMP1]] to i64
+; CHECK-NEXT:    [[TMP2:%.*]] = zext nneg i32 [[TMP1]] to i64
 ; CHECK-NEXT:    [[DOTSPLATINSERT:%.*]] = insertelement <2 x i64> poison, i64 [[TMP2]], i64 0
 ; CHECK-NEXT:    [[DOTSPLAT:%.*]] = shufflevector <2 x i64> [[DOTSPLATINSERT]], <2 x i64> poison, <2 x i32> zeroinitializer
 ; CHECK-NEXT:    [[TMP3:%.*]] = shl <2 x i64> [[V:%.*]], [[DOTSPLAT]]
diff --git a/llvm/test/Transforms/InstCombine/adjust-for-minmax.ll b/llvm/test/Transforms/InstCombine/adjust-for-minmax.ll
index dced55944505370..76fc7a07be6bd61 100644
--- a/llvm/test/Transforms/InstCombine/adjust-for-minmax.ll
+++ b/llvm/test/Transforms/InstCombine/adjust-for-minmax.ll
@@ -414,7 +414,7 @@ define <2 x i64> @umax_zext_vec(<2 x i32> %a) {
 define i64 @umin_zext(i32 %a) {
 ; CHECK-LABEL: @umin_zext(
 ; CHECK-NEXT:    [[NARROW:%.*]] = call i32 @llvm.umin.i32(i32 [[A:%.*]], i32 2)
-; CHECK-NEXT:    [[MIN:%.*]] = zext i32 [[NARROW]] to i64
+; CHECK-NEXT:    [[MIN:%.*]] = zext nneg i32 [[NARROW]] to i64
 ; CHECK-NEXT:    ret i64 [[MIN]]
 ;
   %a_ext = zext i32 %a to i64
@@ -426,7 +426,7 @@ define i64 @umin_zext(i32 %a) {
 define <2 x i64> @umin_zext_vec(<2 x i32> %a) {
 ; CHECK-LABEL: @umin_zext_vec(
 ; CHECK-NEXT:    [[NARROW:%.*]] = call <2 x i32> @llvm.umin.v2i32(<2 x i32> [[A:%.*]], <2 x i32> <i32 2, i32 2>)
-; CHECK-NEXT:    [[MIN:%.*]] = zext <2 x i32> [[NARROW]] to <2 x i64>
+; CHECK-NEXT:    [[MIN:%.*]] = zext nneg <2 x i32> [[NARROW]] to <2 x i64>
 ; CHECK-NEXT:    ret <2 x i64> [[MIN]]
 ;
   %a_ext = zext <2 x i32> %a to <2 x i64>
diff --git a/llvm/test/Transforms/InstCombine/and-narrow.ll b/llvm/test/Transforms/InstCombine/and-narrow.ll
index 92894090ef66d71..c8c720f5fbc5534 100644
--- a/llvm/test/Transforms/InstCombine/and-narrow.ll
+++ b/llvm/test/Transforms/InstCombine/and-narrow.ll
@@ -47,7 +47,7 @@ define i16 @zext_lshr(i8 %x) {
 ; CHECK-LABEL: @zext_lshr(
 ; CHECK-NEXT:    [[TMP1:%.*]] = lshr i8 [[X:%.*]], 4
 ; CHECK-NEXT:    [[TMP2:%.*]] = and i8 [[TMP1]], [[X]]
-; CHECK-NEXT:    [[R:%.*]] = zext i8 [[TMP2]] to i16
+; CHECK-NEXT:    [[R:%.*]] = zext nneg i8 [[TMP2]] to i16
 ; CHECK-NEXT:    ret i16 [[R]]
 ;
   %z = zext i8 %x to i16
@@ -60,7 +60,7 @@ define i16 @zext_ashr(i8 %x) {
 ; CHECK-LABEL: @zext_ashr(
 ; CHECK-NEXT:    [[TMP1:%.*]] = lshr i8 [[X:%.*]], 2
 ; CHECK-NEXT:    [[TMP2:%.*]] = and i8 [[TMP1]], [[X]]
-; CHECK-NEXT:    [[R:%.*]] = zext i8 [[TMP2]] to i16
+; CHECK-NEXT:    [[R:%.*]] = zext nneg i8 [[TMP2]] to i16
 ; CHECK-NEXT:    ret i16 [[R]]
 ;
   %z = zext i8 %x to i16
@@ -125,7 +125,7 @@ define <2 x i16> @zext_lshr_vec(<2 x i8> %x) {
 ; CHECK-LABEL: @zext_lshr_vec(
 ; CHECK-NEXT:    [[TMP1:%.*]] = lshr <2 x i8> [[X:%.*]], <i8 4, i8 2>
 ; CHECK-NEXT:    [[TMP2:%.*]] = and <2 x i8> [[TMP1]], [[X]]
-; CHECK-NEXT:    [[R:%.*]] = zext <2 x i8> [[TMP2]] to <2 x i16>
+; CHECK-NEXT:    [[R:%.*]] = zext nneg <2 x i8> [[TMP2]] to <2 x i16>
 ; CHECK-NEXT:    ret <2 x i16> [[R]]
 ;
   %z = zext <2 x i8> %x to <2 x i16>
@@ -138,7 +138,7 @@ define <2 x i16> @zext_ashr_vec(<2 x i8> %x) {
 ; CHECK-LABEL: @zext_ashr_vec(
 ; CHECK-NEXT:    [[TMP1:%.*]] = lshr <2 x i8> [[X:%.*]], <i8 2, i8 3>
 ; CHECK-NEXT:    [[TMP2:%.*]] = and <2 x i8> [[TMP1]], [[X]]
-; CHECK-NEXT:    [[R:%.*]] = zext <2 x i8> [[TMP2]] to <2 x i16>
+; CHECK-NEXT:    [[R:%.*]] = zext nneg <2 x i8> [[TMP2]] to <2 x i16>
 ; CHECK-NEXT:    ret <2 x i16> [[R]]
 ;
   %z = zext <2 x i8> %x to <2 x i16>
diff --git a/llvm/test/Transforms/InstCombine/and-xor-or.ll b/llvm/test/Transforms/InstCombine/and-xor-or.ll
index 741fc1eca65d1e9..69a7890bee22f80 100644
--- a/llvm/test/Transforms/InstCombine/and-xor-or.ll
+++ b/llvm/test/Transforms/InstCombine/and-xor-or.ll
@@ -4207,7 +4207,7 @@ define i16 @and_zext_zext(i8 %x, i4 %y) {
 ; CHECK-SAME: (i8 [[X:%.*]], i4 [[Y:%.*]]) {
 ; CHECK-NEXT:    [[TMP1:%.*]] = zext i4 [[Y]] to i8
 ; CHECK-NEXT:    [[TMP2:%.*]] = and i8 [[TMP1]], [[X]]
-; CHECK-NEXT:    [[R:%.*]] = zext i8 [[TMP2]] to i16
+; CHECK-NEXT:    [[R:%.*]] = zext nneg i8 [[TMP2]] to i16
 ; CHECK-NEXT:    ret i16 [[R]]
 ;
   %zx = zext i8 %x to i16
diff --git a/llvm/test/Transforms/InstCombine/and.ll b/llvm/test/Transforms/InstCombine/and.ll
index 95b1b0e73ea5c7a..386ee3807050140 100644
--- a/llvm/test/Transforms/InstCombine/and.ll
+++ b/llvm/test/Transforms/InstCombine/and.ll
@@ -525,7 +525,7 @@ define <2 x i32> @and_demanded_bits_splat_vec(<2 x i32> %x) {
 define i32 @and_zext_demanded(i16 %x, i32 %y) {
 ; CHECK-LABEL: @and_zext_demanded(
 ; CHECK-NEXT:    [[S:%.*]] = lshr i16 [[X:%.*]], 8
-; CHECK-NEXT:    [[Z:%.*]] = zext i16 [[S]] to i32
+; CHECK-NEXT:    [[Z:%.*]] = zext nneg i16 [[S]] to i32
 ; CHECK-NEXT:    ret i32 [[Z]]
 ;
   %s = lshr i16 %x, 8
@@ -618,7 +618,7 @@ define i64 @test35(i32 %X) {
 ; CHECK-LABEL: @test35(
 ; CHECK-NEXT:    [[TMP1:%.*]] = sub i32 0, [[X:%.*]]
 ; CHECK-NEXT:    [[TMP2:%.*]] = and i32 [[TMP1]], 240
-; CHECK-NEXT:    [[RES:%.*]] = zext i32 [[TMP2]] to i64
+; CHECK-NEXT:    [[RES:%.*]] = zext nneg i32 [[TMP2]] to i64
 ; CHECK-NEXT:    ret i64 [[RES]]
 ;
   %zext = zext i32 %X to i64
@@ -631,7 +631,7 @@ define <2 x i64> @test35_uniform(<2 x i32> %X) {
 ; CHECK-LABEL: @test35_uniform(
 ; CHECK-NEXT:    [[TMP1:%.*]] = sub <2 x i32> zeroinitializer, [[X:%.*]]
 ; CHECK-NEXT:    [[TMP2:%.*]] = and <2 x i32> [[TMP1]], <i32 240, i32 240>
-; CHECK-NEXT:    [[RES:%.*]] = zext <2 x i32> [[TMP2]] to <2 x i64>
+; CHECK-NEXT:    [[RES:%.*]] = zext nneg <2 x i32> [[TMP2]] to <2 x i64>
 ; CHECK-NEXT:    ret <2 x i64> [[RES]]
 ;
   %zext = zext <2 x i32> %X to <2 x i64>
@@ -644,7 +644,7 @@ define i64 @test36(i32 %X) {
 ; CHECK-LABEL: @test36(
 ; CHECK-NEXT:    [[TMP1:%.*]] = add i32 [[X:%.*]], 7
 ; CHECK-NEXT:    [[TMP2:%.*]] = and i32 [[TMP1]], 240
-; CHECK-NEXT:    [[RES:%.*]] = zext i32 [[TMP2]] to i64
+; CHECK-NEXT:    [[RES:%.*]] = zext nneg i32 [[TMP2]] to i64
 ; CHECK-NEXT:    ret i64 [[RES]]
 ;
   %zext = zext i32 %X to i64
@@ -657,7 +657,7 @@ define <2 x i64> @test36_uniform(<2 x i32> %X) {
 ; CHECK-LABEL: @test36_uniform(
 ; CHECK-NEXT:    [[TMP1:%.*]] = add <2 x i32> [[X:%.*]], <i32 7, i32 7>
 ; CHECK-NEXT:    [[TMP2:%.*]] = and <2 x i32> [[TMP1]], <i32 240, i32 240>
-; CHECK-NEXT:    [[RES:%.*]] = zext <2 x i32> [[TMP2]] to <2 x i64>
+; CHECK-NEXT:    [[RES:%.*]] = zext nneg <2 x i32> [[TMP2]] to <2 x i64>
 ; CHECK-NEXT:    ret <2 x i64> [[RES]]
 ;
   %zext = zext <2 x i32> %X to <2 x i64>
@@ -683,7 +683,7 @@ define i64 @test37(i32 %X) {
 ; CHECK-LABEL: @test37(
 ; CHECK-NEXT:    [[TMP1:%.*]] = mul i32 [[X:%.*]], 7
 ; CHECK-NEXT:    [[TMP2:%.*]] = and i32 [[TMP1]], 240
-; CHECK-NEXT:    [[RES:%.*]] = zext i32 [[TMP2]] to i64
+; CHECK-NEXT:    [[RES:%.*]] = zext nneg i32 [[TMP2]] to i64
 ; CHECK-NEXT:    ret i64 [[RES]]
 ;
   %zext = zext i32 %X to i64
@@ -696,7 +696,7 @@ define <2 x i64> @test37_uniform(<2 x i32> %X) {
 ; CHECK-LABEL: @test37_uniform(
 ; CHECK-NEXT:    [[TMP1:%.*]] = mul <2 x i32> [[X:%.*]], <i32 7, i32 7>
 ; CHECK-NEXT:    [[TMP2:%.*]] = and <2 x i32> [[TMP1]], <i32 240, i32 240>
-; CHECK-NEXT:    [[RES:%.*]] = zext <2 x i32> [[TMP2]] to <2 x i64>
+; CHECK-NEXT:    [[RES:%.*]] = zext nneg <2 x i32> [[TMP2]] to <2 x i64>
 ; CHECK-NEXT:    ret <2 x i64> [[RES]]
 ;
   %zext = zext <2 x i32> %X to <2 x i64>
@@ -721,7 +721,7 @@ define <2 x i64> @test37_nonuniform(<2 x i32> %X) {
 define i64 @test38(i32 %X) {
 ; CHECK-LABEL: @test38(
 ; CHECK-NEXT:    [[TMP1:%.*]] = and i32 [[X:%.*]], 240
-; CHECK-NEXT:    [[RES:%.*]] = zext i32 [[TMP1]] to i64
+; CHECK-NEXT:    [[RES:%.*]] = zext nneg i32 [[TMP1]] to i64
 ; CHECK-NEXT:    ret i64 [[RES]]
 ;
   %zext = zext i32 %X to i64
@@ -733,7 +733,7 @@ define i64 @test38(i32 %X) {
 define i64 @test39(i32 %X) {
 ; CHECK-LABEL: @test39(
 ; CHECK-NEXT:    [[TMP1:%.*]] = and i32 [[X:%.*]], 240
-; CHECK-NEXT:    [[RES:%.*]] = zext i32 [[TMP1]] to i64
+; CHECK-NEXT:    [[RES:%.*]] = zext nneg i32 [[TMP1]] to i64
 ; CHECK-NEXT:    ret i64 [[RES]]
 ;
   %zext = zext i32 %X to i64
diff --git a/llvm/test/Transforms/InstCombine/assoc-cast-assoc.ll b/llvm/test/Transforms/InstCombine/assoc-cast-assoc.ll
index 04b530647d0a26e..a3485978471dc05 100644
--- a/llvm/test/Transforms/InstCombine/assoc-cast-assoc.ll
+++ b/llvm/test/Transforms/InstCombine/assoc-cast-assoc.ll
@@ -54,7 +54,7 @@ define <2 x i32> @OrZextOrVec(<2 x i2> %a) {
 define i5 @AndZextAnd(i3 %a) {
 ; CHECK-LABEL: @AndZextAnd(
 ; CHECK-NEXT:    [[TMP1:%.*]] = and i3 [[A:%.*]], 2
-; CHECK-NEXT:    [[OP2:%.*]] = zext i3 [[TMP1]] to i5
+; CHECK-NEXT:    [[OP2:%.*]] = zext nneg i3 [[TMP1]] to i5
 ; CHECK-NEXT:    ret i5 [[OP2]]
 ;
   %op1 = and i3 %a, 3
@@ -66,7 +66,7 @@ define i5 @AndZextAnd(i3 %a) {
 define <2 x i32> @AndZextAndVec(<2 x i8> %a) {
 ; CHECK-LABEL: @AndZextAndVec(
 ; CHECK-NEXT:    [[TMP1:%.*]] = and <2 x i8> [[A:%.*]], <i8 5, i8 0>
-; CHECK-NEXT:    [[OP2:%.*]] = zext <2 x i8> [[TMP1]] to <2 x i32>
+; CHECK-NEXT:    [[OP2:%.*]] = zext nneg <2 x i8> [[TMP1]] to <2 x i32>
 ; CHECK-NEXT:    ret <2 x i32> [[OP2]]
 ;
   %op1 = and <2 x i8> %a, <i8 7, i8 0>
diff --git a/llvm/test/Transforms/InstCombine/binop-cast.ll b/llvm/test/Transforms/InstCombine/binop-cast.ll
index 20d5814c05d3aa7..e3345194d0b3284 100644
--- a/llvm/test/Transforms/InstCombine/binop-cast.ll
+++ b/llvm/test/Transforms/InstCombine/binop-cast.ll
@@ -276,7 +276,7 @@ define i64 @PR63321(ptr %ptr, i64 %c) {
 define i64 @and_add_non_bool(ptr %ptr, i64 %c) {
 ; CHECK-LABEL: @and_add_non_bool(
 ; CHECK-NEXT:    [[VAL:%.*]] = load i8, ptr [[PTR:%.*]], align 1, !range [[RNG1:![0-9]+]]
-; CHECK-NEXT:    [[RHS:%.*]] = zext i8 [[VAL]] to i64
+; CHECK-NEXT:    [[RHS:%.*]] = zext nneg i8 [[VAL]] to i64
 ; CHECK-NEXT:    [[MASK:%.*]] = add nsw i64 [[RHS]], -1
 ; CHECK-NEXT:    [[RES:%.*]] = and i64 [[MASK]], [[C:%.*]]
 ; CHECK-NEXT:    ret i64 [[RES]]
diff --git a/llvm/test/Transforms/InstCombine/cast-mul-select.ll b/llvm/test/Transforms/InstCombine/cast-mul-select.ll
index 23e934de0baeb7e..454522b85a1e843 100644
--- a/llvm/test/Transforms/InstCombine/cast-mul-select.ll
+++ b/llvm/test/Transforms/InstCombine/cast-mul-select.ll
@@ -119,7 +119,7 @@ define i32 @eval_zext_multi_use_in_one_inst(i32 %x) {
 ; CHECK-NEXT:    [[T:%.*]] = trunc i32 [[X:%.*]] to i16
 ; CHECK-NEXT:    [[A:%.*]] = and i16 [[T]], 5
 ; CHECK-NEXT:    [[M:%.*]] = mul nuw nsw i16 [[A]], [[A]]
-; CHECK-NEXT:    [[R:%.*]] = zext i16 [[M]] to i32
+; CHECK-NEXT:    [[R:%.*]] = zext nneg i16 [[M]] to i32
 ; CHECK-NEXT:    ret i32 [[R]]
 ;
 ; DBGINFO-LABEL: @eval_zext_multi_use_in_one_inst(
@@ -129,7 +129,7 @@ define i32 @eval_zext_multi_use_in_one_inst(i32 %x) {
 ; DBGINFO-NEXT:    call void @llvm.dbg.value(metadata i16 [[A]], metadata [[META66:![0-9]+]], metadata !DIExpression()), !dbg [[DBG70]]
 ; DBGINFO-NEXT:    [[M:%.*]] = mul nuw nsw i16 [[A]], [[A]], !dbg [[DBG71:![0-9]+]]
 ; DBGINFO-NEXT:    call void @llvm.dbg.value(metadata i16 [[M]], metadata [[META67:![0-9]+]], metadata !DIExpression()), !dbg [[DBG71]]
-; DBGINFO-NEXT:    [[R:%.*]] = zext i16 [[M]] to i32, !dbg [[DBG72:![0-9]+]]
+; DBGINFO-NEXT:    [[R:%.*]] = zext nneg i16 [[M]] to i32, !dbg [[DBG72:![0-9]+]]
 ; DBGINFO-NEXT:    call void @llvm.dbg.value(metadata i32 [[R]], metadata [[META68:![0-9]+]], metadata !DIExpression()), !dbg [[DBG72]]
 ; DBGINFO-NEXT:    ret i32 [[R]], !dbg [[DBG73:![0-9]+]]
 ;
@@ -183,13 +183,13 @@ define void @PR36225(i32 %a, i32 %b, i1 %c1, i3 %v1, i3 %v2) {
 ; CHECK-NEXT:    [[TOBOOL:%.*]] = icmp eq i32 [[B:%.*]], 0
 ; CHECK-NEXT:    [[SPEC_SELECT:%.*]] = select i1 [[TOBOOL]], i8 0, i8 4
 ; CHECK-NEXT:    switch i3 [[V1:%.*]], label [[EXIT:%.*]] [
-; CHECK-NEXT:    i3 0, label [[FOR_END:%.*]]
-; CHECK-NEXT:    i3 -1, label [[FOR_END]]
+; CHECK-NEXT:      i3 0, label [[FOR_END:%.*]]
+; CHECK-NEXT:      i3 -1, label [[FOR_END]]
 ; CHECK-NEXT:    ]
 ; CHECK:       for.body3:
 ; CHECK-NEXT:    switch i3 [[V2:%.*]], label [[EXIT]] [
-; CHECK-NEXT:    i3 0, label [[FOR_END]]
-; CHECK-NEXT:    i3 -1, label [[FOR_END]]
+; CHECK-NEXT:      i3 0, label [[FOR_END]]
+; CHECK-NEXT:      i3 -1, label [[FOR_END]]
 ; CHECK-NEXT:    ]
 ; CHECK:       for.end:
 ; CHECK-NEXT:    [[H:%.*]] = phi i8 [ [[SPEC_SELECT]], [[FOR_BODY3_US]] ], [ [[SPEC_SELECT]], [[FOR_BODY3_US]] ], [ 0, [[FOR_BODY3]] ], [ 0, [[FOR_BODY3]] ]
@@ -213,13 +213,13 @@ define void @PR36225(i32 %a, i32 %b, i1 %c1, i3 %v1, i3 %v2) {
 ; DBGINFO-NEXT:    [[SPEC_SELECT:%.*]] = select i1 [[TOBOOL]], i8 0, i8 4, !dbg [[DBG97:![0-9]+]]
 ; DBGINFO-NEXT:    call void @llvm.dbg.value(metadata i8 [[SPEC_SELECT]], metadata [[META90:![0-9]+]], metadata !DIExpression()), !dbg [[DBG97]]
 ; DBGINFO-NEXT:    switch i3 [[V1:%.*]], label [[EXIT:%.*]] [
-; DBGINFO-NEXT:    i3 0, label [[FOR_END:%.*]]
-; DBGINFO-NEXT:    i3 -1, label [[FOR_END]]
+; DBGINFO-NEXT:      i3 0, label [[FOR_END:%.*]]
+; DBGINFO-NEXT:      i3 -1, label [[FOR_END]]
 ; DBGINFO-NEXT:    ], !dbg [[DBG98:![0-9]+]]
 ; DBGINFO:       for.body3:
 ; DBGINFO-NEXT:    switch i3 [[V2:%.*]], label [[EXIT]] [
-; DBGINFO-NEXT:    i3 0, label [[FOR_END]]
-; DBGINFO-NEXT:    i3 -1, label [[FOR_END]]
+; DBGINFO-NEXT:      i3 0, label [[FOR_END]]
+; DBGINFO-NEXT:      i3 -1, label [[FOR_END]]
 ; DBGINFO-NEXT:    ], !dbg [[DBG99:![0-9]+]]
 ; DBGINFO:       for.end:
 ; DBGINFO-NEXT:    [[H:%.*]] = phi i8 [ [[SPEC_SELECT]], [[FOR_BODY3_US]] ], [ [[SPEC_SELECT]], [[FOR_BODY3_US]] ], [ 0, [[FOR_BODY3]] ], [ 0, [[FOR_BODY3]] ], !dbg [[DBG100:![0-9]+]]
diff --git a/llvm/test/Transforms/InstCombine/cast.ll b/llvm/test/Transforms/InstCombine/cast.ll
index 59e488f3f23d52a..afa7ac45e96dcb4 100644
--- a/llvm/test/Transforms/InstCombine/cast.ll
+++ b/llvm/test/Transforms/InstCombine/cast.ll
@@ -124,12 +124,12 @@ define void @test_invoke_vararg_cast(ptr %a, ptr %b) personality ptr @__gxx_pers
 ; ALL-LABEL: @test_invoke_vararg_cast(
 ; ALL-NEXT:  entry:
 ; ALL-NEXT:    invoke void (i32, ...) @varargs(i32 1, ptr [[B:%.*]], ptr [[A:%.*]])
-; ALL-NEXT:    to label [[INVOKE_CONT:%.*]] unwind label [[LPAD:%.*]]
+; ALL-NEXT:            to label [[INVOKE_CONT:%.*]] unwind label [[LPAD:%.*]]
 ; ALL:       invoke.cont:
 ; ALL-NEXT:    ret void
 ; ALL:       lpad:
 ; ALL-NEXT:    [[TMP0:%.*]] = landingpad { ptr, i32 }
-; ALL-NEXT:    cleanup
+; ALL-NEXT:            cleanup
 ; ALL-NEXT:    ret void
 ;
 entry:
@@ -619,7 +619,7 @@ define ...
[truncated]

Use KnownBits to infer the nneg flag on zext instructions.
@nikic nikic force-pushed the instcombine-zext-nneg branch from 9b66984 to e965141 Compare November 7, 2023 15:14
@llvmbot llvmbot added the clang Clang issues not falling into any other category label Nov 7, 2023
Member

@dtcxzyw dtcxzyw left a comment

LGTM. Thanks!
We can improve the compile time by adding the nneg flag when the zext is created (e.g., cttz(zext(x)) -> zext nneg(cttz(x))).
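
As a rough sanity check of why a zext created by such a fold could carry nneg from the start, the sketch below exhaustively checks the 8-bit case. It is an illustration only, not InstCombine code: `cttz` and `narrowedCttzIsNonNegative` are hypothetical helpers, and the check is restricted to non-zero inputs as an assumption, to sidestep how a zero input is defined.

```cpp
#include <cstdint>

// Count trailing zeros of a non-zero value (portable loop, no intrinsics).
// Must not be called with 0.
constexpr unsigned cttz(uint32_t V) {
  unsigned N = 0;
  while ((V & 1u) == 0) {
    V >>= 1;
    ++N;
  }
  return N;
}

// For every non-zero 8-bit x, check that
//   cttz(zext i8 x to i16) == zext(cttz(i8 x))
// and that the narrow result has its sign bit (bit 7) clear, i.e. the
// operand of the newly created zext is non-negative.
constexpr bool narrowedCttzIsNonNegative() {
  for (unsigned X = 1; X <= 0xFFu; ++X) {
    unsigned Wide = cttz(static_cast<uint16_t>(X)); // count on the widened value
    uint8_t Narrow = static_cast<uint8_t>(cttz(X)); // count on the i8 value
    if (Wide != Narrow)
      return false;
    if ((Narrow & 0x80u) != 0)
      return false;
  }
  return true;
}
```

The narrow count is at most 7 for a non-zero i8 input, so its sign bit is always clear and no separate known-bits query would be needed to justify the flag in this case.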

@nikic nikic merged commit 5918f62 into llvm:main Nov 8, 2023
@mikaelholmen
Collaborator

I think this patch causes miscompiles. Reproduce with
opt bbi-88690.ll -passes=instcombine -S -o -
So with this patch instcombine turns

@v_936 = global i16 -3276, align 1
@v_937 = global i24 0, align 1

define i16 @main() {
entry:
  %0 = load i16, ptr @v_936, align 1
  %unsclear = and i16 %0, 32767
  %resize = zext i16 %unsclear to i24
  %unsclear1 = and i24 %resize, 8388607
  store i24 %unsclear1, ptr @v_937, align 1
  ret i16 0
}

into

@v_936 = global i16 -3276, align 1
@v_937 = global i24 0, align 1

define i16 @main() {
entry:
  %0 = load i16, ptr @v_936, align 1
  %resize = zext nneg i16 %0 to i24
  store i24 %resize, ptr @v_937, align 1
  ret i16 0
}

I.e. the `and` with 32767 (0x7fff) is gone and the zext got "nneg" instead.
But the value in v_936 can be, and actually is, negative.

bbi-88690.ll.gz
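
The difference between the two IR snippets above can be checked by hand. The following is a small sketch that evaluates both for the initializer `i16 -3276`, modelling `zext nneg` as producing poison when the operand's sign bit is set. `Val`, `zext16to24`, `zextNNeg16to24`, `before` and `after` are hypothetical names for this example and are not LLVM APIs.

```cpp
#include <cstdint>

// Result of evaluating a toy IR expression: a concrete value or poison.
struct Val {
  bool Poison;
  uint32_t Bits;
};

// zext i16 -> i24: always defined, upper bits are zero.
constexpr Val zext16to24(uint16_t X) { return {false, X}; }

// zext nneg i16 -> i24: poison when the operand's sign bit is set.
constexpr Val zextNNeg16to24(uint16_t X) {
  return (X & 0x8000u) ? Val{true, 0} : Val{false, X};
}

// Bit pattern of the initializer `i16 -3276`, which is 0xF334.
constexpr uint16_t Init = static_cast<uint16_t>(-3276);

// Original IR: and i16 with 32767, zext to i24, and i24 with 8388607.
constexpr Val before() {
  return {false, zext16to24(Init & 0x7FFFu).Bits & 0x7FFFFFu};
}

// IR after instcombine as quoted in the report: zext nneg of the raw load.
constexpr Val after() { return zextNNeg16to24(Init); }
```

In this model `before()` is the well-defined value 0x7334 (29492), while `after()` is poison because 0xF334 has its sign bit set, so the optimized function is not a refinement of the original.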

@dyung
Collaborator

dyung commented Nov 9, 2023

We also have a couple of internal tests that seem to be failing after this commit. Consider the following code:

char print_tmp[1];
void print(char *, void *data, unsigned size) {
  unsigned char *bytes = (unsigned char *)data;
  for (unsigned i = 0; i != size; ++i)
    sprintf(print_tmp + i * 2, "%02x", bytes[size - 1 - i]);
  printf(print_tmp);
}
#define PRINT(VAR) print(#VAR, &VAR, sizeof(VAR))
struct {
  long b : 17;
} test141_struct_id29534;
struct test141_struct_id29574_ {
  test141_struct_id29574_() { INIT(172, *this); }
  unsigned a : 15;
} test141_struct_id29574;
int main() {
  long id29692 = test141_struct_id29534.b = test141_struct_id29574.a;
  PRINT(id29692);
}

When compiled without optimizations (and, before this change, also with optimizations) it prints the value 0000000000002dac. After this change, with optimizations enabled, the program prints 000000000000adac.

You can see the difference at https://godbolt.org/z/vjPvGT5G9.

@dtcxzyw
Member

dtcxzyw commented Nov 9, 2023

Reduced test case: https://godbolt.org/z/d4ETPhbno

@nikic
Contributor Author

nikic commented Nov 9, 2023

It looks like simplifyAssocCastAssoc() is the problematic transform. It modifies a zext in-place without clearing poison flags.
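
The hazard described here can be sketched with a toy model: a flag that was justified by the old operand stops being justified once the operand is swapped in place. Everything below (`ToyZExt`, `inferNNeg`, `replaceOperandKeepFlags`, `replaceOperandDropFlags`, `flagIsSound`) is hypothetical and only illustrates the invariant; it is not the code of `simplifyAssocCastAssoc()` nor of the eventual fix.

```cpp
// Toy model of a zext instruction (hypothetical; not llvm::ZExtInst).
struct ToyZExt {
  bool OperandSignBitKnownZero; // what can be proven about the current operand
  bool NNeg;                    // the poison-generating flag
};

// Inference as in this PR: set nneg when the operand is provably non-negative.
constexpr ToyZExt inferNNeg(ToyZExt Z) {
  return {Z.OperandSignBitKnownZero, Z.NNeg || Z.OperandSignBitKnownZero};
}

// Buggy in-place rewrite: swaps in a new operand but keeps the old flag.
constexpr ToyZExt replaceOperandKeepFlags(ToyZExt Z, bool NewSignBitKnownZero) {
  return {NewSignBitKnownZero, Z.NNeg};
}

// Safe in-place rewrite: the flag was justified by the old operand only,
// so it is dropped and may be re-inferred later.
constexpr ToyZExt replaceOperandDropFlags(ToyZExt, bool NewSignBitKnownZero) {
  return {NewSignBitKnownZero, false};
}

// A flag is sound if it is only set when the current operand justifies it.
constexpr bool flagIsSound(ToyZExt Z) {
  return !Z.NNeg || Z.OperandSignBitKnownZero;
}
```

In the report above, the old operand was the masked value (`and i16 %0, 32767`) and the new operand is the raw load, which corresponds to calling the rewrite with `NewSignBitKnownZero == false`.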

nikic added a commit that referenced this pull request Nov 9, 2023
@nikic
Contributor Author

nikic commented Nov 9, 2023

Should be fixed by 1b1c817.

@mikaelholmen
Collaborator

Should be fixed by 1b1c817.

I've confirmed that the instances of the problem that we saw are fixed by 1b1c817.
Thanks!

dtcxzyw added a commit that referenced this pull request Nov 13, 2023
This patch infers `nneg` flags for existing zext instructions in CVP.
After #71534 and this patch, we
can drop `zext -> zext nneg` transform in `RISCVCodeGenPrepare`:


https://github.com/llvm/llvm-project/blob/40671bbdefb6ff83e2685576a3cb041b62f25bbe/llvm/lib/Target/RISCV/RISCVCodeGenPrepare.cpp#L74-L83

This is an alternative to #72049.
dtcxzyw added a commit that referenced this pull request Nov 13, 2023
…72053)

After #71534 and #72052, the transform `zext -> zext nneg` in
`RISCVCodeGenPrepare` is redundant.
zahiraam pushed a commit to zahiraam/llvm-project that referenced this pull request Nov 20, 2023
This patch infers `nneg` flags for existing zext instructions in CVP.
After llvm#71534 and this patch, we
can drop `zext -> zext nneg` transform in `RISCVCodeGenPrepare`:


https://github.com/llvm/llvm-project/blob/40671bbdefb6ff83e2685576a3cb041b62f25bbe/llvm/lib/Target/RISCV/RISCVCodeGenPrepare.cpp#L74-L83

This is an alternative to llvm#72049.
zahiraam pushed a commit to zahiraam/llvm-project that referenced this pull request Nov 20, 2023
…lvm#72053)

After llvm#71534 and llvm#72052, the transform `zext -> zext nneg` in
`RISCVCodeGenPrepare` is redundant.
qihangkong pushed a commit to rvgpu/llvm that referenced this pull request Apr 18, 2024
Labels
clang Clang issues not falling into any other category llvm:transforms
Projects
6 participants