Skip to content

[InstCombine] Add exact flags for ext idiom shr (shl X, Y), Y #72483

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Nov 16, 2023

Conversation

dtcxzyw
Copy link
Member

@dtcxzyw dtcxzyw commented Nov 16, 2023

This patch adds exact flags for sext/zext idiom shr (shl X, Y), Y.
Alive2: https://alive2.llvm.org/ce/z/xYFpfB

We can generalize it to handle pattern shr (shl X, Y), Z with Y u>= Z (e.g., non-splat vectors). But I don't think it's worth the effort.

This missed optimization is discovered with the help of AliveToolkit/alive2#962.

@llvmbot
Copy link
Member

llvmbot commented Nov 16, 2023

@llvm/pr-subscribers-llvm-transforms

Author: Yingwei Zheng (dtcxzyw)

Changes

This patch adds exact flags for sext/zext idiom shr (shl X, Y), Y.
Alive2: https://alive2.llvm.org/ce/z/xYFpfB

We can generalize it to handle pattern shr (shl X, Y), Z with Y u>= Z (e.g., non-splat vectors). But I don't think it's worth the effort.

This missed optimization is discovered with the help of AliveToolkit/alive2#962.


Patch is 63.27 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/72483.diff

21 Files Affected:

  • (modified) llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp (+6)
  • (modified) llvm/test/Transforms/InstCombine/canonicalize-low-bit-mask-v4-and-icmp-eq-to-icmp-ule.ll (+10-10)
  • (modified) llvm/test/Transforms/InstCombine/canonicalize-low-bit-mask-v4-and-icmp-ne-to-icmp-ugt.ll (+10-10)
  • (modified) llvm/test/Transforms/InstCombine/canonicalize-shl-lshr-to-masking.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-after-truncation-variant-d.ll (+7-7)
  • (modified) llvm/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-after-truncation-variant-e.ll (+3-3)
  • (modified) llvm/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-d.ll (+5-5)
  • (modified) llvm/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-e.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/redundant-left-shift-input-masking-after-truncation-variant-d.ll (+5-5)
  • (modified) llvm/test/Transforms/InstCombine/redundant-left-shift-input-masking-after-truncation-variant-e.ll (+5-5)
  • (modified) llvm/test/Transforms/InstCombine/redundant-left-shift-input-masking-after-truncation-variant-f.ll (+6-6)
  • (modified) llvm/test/Transforms/InstCombine/redundant-left-shift-input-masking-variant-d.ll (+12-12)
  • (modified) llvm/test/Transforms/InstCombine/redundant-left-shift-input-masking-variant-e.ll (+13-17)
  • (modified) llvm/test/Transforms/InstCombine/redundant-left-shift-input-masking-variant-f.ll (+14-18)
  • (modified) llvm/test/Transforms/InstCombine/redundant-right-shift-input-masking.ll (+4-4)
  • (modified) llvm/test/Transforms/InstCombine/sext.ll (+4-4)
  • (modified) llvm/test/Transforms/InstCombine/shift-by-signext.ll (+2-2)
  • (modified) llvm/test/Transforms/InstCombine/trunc-inseltpoison.ll (+2-2)
  • (modified) llvm/test/Transforms/InstCombine/trunc.ll (+2-2)
  • (modified) llvm/test/Transforms/InstCombine/variable-signext-of-variable-high-bit-extraction.ll (+6-6)
  • (modified) llvm/test/Transforms/PhaseOrdering/two-shifts-by-sext.ll (+6-6)
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp b/llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
index a234b916f72c0e2..9d4a2cc08cca30c 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
@@ -953,6 +953,12 @@ static bool setShiftFlags(BinaryOperator &I, const SimplifyQuery &Q) {
   } else {
     if (I.isExact())
       return false;
+
+    // shr (shl X, Y), Y
+    if (match(I.getOperand(0), m_Shl(m_Value(), m_Specific(I.getOperand(1))))) {
+      I.setIsExact();
+      return true;
+    }
   }
 
   // Compute what we know about shift count.
diff --git a/llvm/test/Transforms/InstCombine/canonicalize-low-bit-mask-v4-and-icmp-eq-to-icmp-ule.ll b/llvm/test/Transforms/InstCombine/canonicalize-low-bit-mask-v4-and-icmp-eq-to-icmp-ule.ll
index b774cd766a264c9..4ff27b787ed4e15 100644
--- a/llvm/test/Transforms/InstCombine/canonicalize-low-bit-mask-v4-and-icmp-eq-to-icmp-ule.ll
+++ b/llvm/test/Transforms/InstCombine/canonicalize-low-bit-mask-v4-and-icmp-eq-to-icmp-ule.ll
@@ -22,7 +22,7 @@ define i1 @p0(i8 %x, i8 %y) {
 ; CHECK-LABEL: @p0(
 ; CHECK-NEXT:    [[T0:%.*]] = shl i8 -1, [[Y:%.*]]
 ; CHECK-NEXT:    call void @use8(i8 [[T0]])
-; CHECK-NEXT:    [[T1:%.*]] = lshr i8 [[T0]], [[Y]]
+; CHECK-NEXT:    [[T1:%.*]] = lshr exact i8 [[T0]], [[Y]]
 ; CHECK-NEXT:    [[RET:%.*]] = icmp uge i8 [[T1]], [[X:%.*]]
 ; CHECK-NEXT:    ret i1 [[RET]]
 ;
@@ -42,7 +42,7 @@ define <2 x i1> @p1_vec(<2 x i8> %x, <2 x i8> %y) {
 ; CHECK-LABEL: @p1_vec(
 ; CHECK-NEXT:    [[T0:%.*]] = shl <2 x i8> <i8 -1, i8 -1>, [[Y:%.*]]
 ; CHECK-NEXT:    call void @use2i8(<2 x i8> [[T0]])
-; CHECK-NEXT:    [[T1:%.*]] = lshr <2 x i8> [[T0]], [[Y]]
+; CHECK-NEXT:    [[T1:%.*]] = lshr exact <2 x i8> [[T0]], [[Y]]
 ; CHECK-NEXT:    [[RET:%.*]] = icmp uge <2 x i8> [[T1]], [[X:%.*]]
 ; CHECK-NEXT:    ret <2 x i1> [[RET]]
 ;
@@ -58,7 +58,7 @@ define <3 x i1> @p2_vec_undef0(<3 x i8> %x, <3 x i8> %y) {
 ; CHECK-LABEL: @p2_vec_undef0(
 ; CHECK-NEXT:    [[T0:%.*]] = shl <3 x i8> <i8 -1, i8 undef, i8 -1>, [[Y:%.*]]
 ; CHECK-NEXT:    call void @use3i8(<3 x i8> [[T0]])
-; CHECK-NEXT:    [[T1:%.*]] = lshr <3 x i8> [[T0]], [[Y]]
+; CHECK-NEXT:    [[T1:%.*]] = lshr exact <3 x i8> [[T0]], [[Y]]
 ; CHECK-NEXT:    [[RET:%.*]] = icmp uge <3 x i8> [[T1]], [[X:%.*]]
 ; CHECK-NEXT:    ret <3 x i1> [[RET]]
 ;
@@ -80,7 +80,7 @@ define i1 @c0(i8 %y) {
 ; CHECK-LABEL: @c0(
 ; CHECK-NEXT:    [[T0:%.*]] = shl i8 -1, [[Y:%.*]]
 ; CHECK-NEXT:    call void @use8(i8 [[T0]])
-; CHECK-NEXT:    [[T1:%.*]] = lshr i8 [[T0]], [[Y]]
+; CHECK-NEXT:    [[T1:%.*]] = lshr exact i8 [[T0]], [[Y]]
 ; CHECK-NEXT:    [[X:%.*]] = call i8 @gen8()
 ; CHECK-NEXT:    [[RET:%.*]] = icmp ule i8 [[X]], [[T1]]
 ; CHECK-NEXT:    ret i1 [[RET]]
@@ -98,7 +98,7 @@ define i1 @c1(i8 %y) {
 ; CHECK-LABEL: @c1(
 ; CHECK-NEXT:    [[T0:%.*]] = shl i8 -1, [[Y:%.*]]
 ; CHECK-NEXT:    call void @use8(i8 [[T0]])
-; CHECK-NEXT:    [[T1:%.*]] = lshr i8 [[T0]], [[Y]]
+; CHECK-NEXT:    [[T1:%.*]] = lshr exact i8 [[T0]], [[Y]]
 ; CHECK-NEXT:    [[X:%.*]] = call i8 @gen8()
 ; CHECK-NEXT:    [[RET:%.*]] = icmp ule i8 [[X]], [[T1]]
 ; CHECK-NEXT:    ret i1 [[RET]]
@@ -116,7 +116,7 @@ define i1 @c2(i8 %y) {
 ; CHECK-LABEL: @c2(
 ; CHECK-NEXT:    [[T0:%.*]] = shl i8 -1, [[Y:%.*]]
 ; CHECK-NEXT:    call void @use8(i8 [[T0]])
-; CHECK-NEXT:    [[T1:%.*]] = lshr i8 [[T0]], [[Y]]
+; CHECK-NEXT:    [[T1:%.*]] = lshr exact i8 [[T0]], [[Y]]
 ; CHECK-NEXT:    [[X:%.*]] = call i8 @gen8()
 ; CHECK-NEXT:    [[RET:%.*]] = icmp ule i8 [[X]], [[T1]]
 ; CHECK-NEXT:    ret i1 [[RET]]
@@ -138,7 +138,7 @@ define i1 @oneuse0(i8 %x, i8 %y) {
 ; CHECK-LABEL: @oneuse0(
 ; CHECK-NEXT:    [[T0:%.*]] = shl i8 -1, [[Y:%.*]]
 ; CHECK-NEXT:    call void @use8(i8 [[T0]])
-; CHECK-NEXT:    [[T1:%.*]] = lshr i8 [[T0]], [[Y]]
+; CHECK-NEXT:    [[T1:%.*]] = lshr exact i8 [[T0]], [[Y]]
 ; CHECK-NEXT:    call void @use8(i8 [[T1]])
 ; CHECK-NEXT:    [[RET:%.*]] = icmp uge i8 [[T1]], [[X:%.*]]
 ; CHECK-NEXT:    ret i1 [[RET]]
@@ -156,7 +156,7 @@ define i1 @oneuse1(i8 %x, i8 %y) {
 ; CHECK-LABEL: @oneuse1(
 ; CHECK-NEXT:    [[T0:%.*]] = shl i8 -1, [[Y:%.*]]
 ; CHECK-NEXT:    call void @use8(i8 [[T0]])
-; CHECK-NEXT:    [[T1:%.*]] = lshr i8 [[T0]], [[Y]]
+; CHECK-NEXT:    [[T1:%.*]] = lshr exact i8 [[T0]], [[Y]]
 ; CHECK-NEXT:    [[T2:%.*]] = and i8 [[T1]], [[X:%.*]]
 ; CHECK-NEXT:    call void @use8(i8 [[T2]])
 ; CHECK-NEXT:    [[RET:%.*]] = icmp uge i8 [[T1]], [[X]]
@@ -175,7 +175,7 @@ define i1 @oneuse2(i8 %x, i8 %y) {
 ; CHECK-LABEL: @oneuse2(
 ; CHECK-NEXT:    [[T0:%.*]] = shl i8 -1, [[Y:%.*]]
 ; CHECK-NEXT:    call void @use8(i8 [[T0]])
-; CHECK-NEXT:    [[T1:%.*]] = lshr i8 [[T0]], [[Y]]
+; CHECK-NEXT:    [[T1:%.*]] = lshr exact i8 [[T0]], [[Y]]
 ; CHECK-NEXT:    call void @use8(i8 [[T1]])
 ; CHECK-NEXT:    [[T2:%.*]] = and i8 [[T1]], [[X:%.*]]
 ; CHECK-NEXT:    call void @use8(i8 [[T2]])
@@ -200,7 +200,7 @@ define i1 @n0(i8 %x, i8 %y, i8 %notx) {
 ; CHECK-LABEL: @n0(
 ; CHECK-NEXT:    [[T0:%.*]] = shl i8 -1, [[Y:%.*]]
 ; CHECK-NEXT:    call void @use8(i8 [[T0]])
-; CHECK-NEXT:    [[T1:%.*]] = lshr i8 [[T0]], [[Y]]
+; CHECK-NEXT:    [[T1:%.*]] = lshr exact i8 [[T0]], [[Y]]
 ; CHECK-NEXT:    [[T2:%.*]] = and i8 [[T1]], [[X:%.*]]
 ; CHECK-NEXT:    [[RET:%.*]] = icmp eq i8 [[T2]], [[NOTX:%.*]]
 ; CHECK-NEXT:    ret i1 [[RET]]
diff --git a/llvm/test/Transforms/InstCombine/canonicalize-low-bit-mask-v4-and-icmp-ne-to-icmp-ugt.ll b/llvm/test/Transforms/InstCombine/canonicalize-low-bit-mask-v4-and-icmp-ne-to-icmp-ugt.ll
index c4865404c2f28ed..3c69d6b4c14a762 100644
--- a/llvm/test/Transforms/InstCombine/canonicalize-low-bit-mask-v4-and-icmp-ne-to-icmp-ugt.ll
+++ b/llvm/test/Transforms/InstCombine/canonicalize-low-bit-mask-v4-and-icmp-ne-to-icmp-ugt.ll
@@ -22,7 +22,7 @@ define i1 @p0(i8 %x, i8 %y) {
 ; CHECK-LABEL: @p0(
 ; CHECK-NEXT:    [[T0:%.*]] = shl i8 -1, [[Y:%.*]]
 ; CHECK-NEXT:    call void @use8(i8 [[T0]])
-; CHECK-NEXT:    [[T1:%.*]] = lshr i8 [[T0]], [[Y]]
+; CHECK-NEXT:    [[T1:%.*]] = lshr exact i8 [[T0]], [[Y]]
 ; CHECK-NEXT:    [[RET:%.*]] = icmp ult i8 [[T1]], [[X:%.*]]
 ; CHECK-NEXT:    ret i1 [[RET]]
 ;
@@ -42,7 +42,7 @@ define <2 x i1> @p1_vec(<2 x i8> %x, <2 x i8> %y) {
 ; CHECK-LABEL: @p1_vec(
 ; CHECK-NEXT:    [[T0:%.*]] = shl <2 x i8> <i8 -1, i8 -1>, [[Y:%.*]]
 ; CHECK-NEXT:    call void @use2i8(<2 x i8> [[T0]])
-; CHECK-NEXT:    [[T1:%.*]] = lshr <2 x i8> [[T0]], [[Y]]
+; CHECK-NEXT:    [[T1:%.*]] = lshr exact <2 x i8> [[T0]], [[Y]]
 ; CHECK-NEXT:    [[RET:%.*]] = icmp ult <2 x i8> [[T1]], [[X:%.*]]
 ; CHECK-NEXT:    ret <2 x i1> [[RET]]
 ;
@@ -58,7 +58,7 @@ define <3 x i1> @p2_vec_undef0(<3 x i8> %x, <3 x i8> %y) {
 ; CHECK-LABEL: @p2_vec_undef0(
 ; CHECK-NEXT:    [[T0:%.*]] = shl <3 x i8> <i8 -1, i8 undef, i8 -1>, [[Y:%.*]]
 ; CHECK-NEXT:    call void @use3i8(<3 x i8> [[T0]])
-; CHECK-NEXT:    [[T1:%.*]] = lshr <3 x i8> [[T0]], [[Y]]
+; CHECK-NEXT:    [[T1:%.*]] = lshr exact <3 x i8> [[T0]], [[Y]]
 ; CHECK-NEXT:    [[RET:%.*]] = icmp ult <3 x i8> [[T1]], [[X:%.*]]
 ; CHECK-NEXT:    ret <3 x i1> [[RET]]
 ;
@@ -80,7 +80,7 @@ define i1 @c0(i8 %y) {
 ; CHECK-LABEL: @c0(
 ; CHECK-NEXT:    [[T0:%.*]] = shl i8 -1, [[Y:%.*]]
 ; CHECK-NEXT:    call void @use8(i8 [[T0]])
-; CHECK-NEXT:    [[T1:%.*]] = lshr i8 [[T0]], [[Y]]
+; CHECK-NEXT:    [[T1:%.*]] = lshr exact i8 [[T0]], [[Y]]
 ; CHECK-NEXT:    [[X:%.*]] = call i8 @gen8()
 ; CHECK-NEXT:    [[RET:%.*]] = icmp ugt i8 [[X]], [[T1]]
 ; CHECK-NEXT:    ret i1 [[RET]]
@@ -98,7 +98,7 @@ define i1 @c1(i8 %y) {
 ; CHECK-LABEL: @c1(
 ; CHECK-NEXT:    [[T0:%.*]] = shl i8 -1, [[Y:%.*]]
 ; CHECK-NEXT:    call void @use8(i8 [[T0]])
-; CHECK-NEXT:    [[T1:%.*]] = lshr i8 [[T0]], [[Y]]
+; CHECK-NEXT:    [[T1:%.*]] = lshr exact i8 [[T0]], [[Y]]
 ; CHECK-NEXT:    [[X:%.*]] = call i8 @gen8()
 ; CHECK-NEXT:    [[RET:%.*]] = icmp ugt i8 [[X]], [[T1]]
 ; CHECK-NEXT:    ret i1 [[RET]]
@@ -116,7 +116,7 @@ define i1 @c2(i8 %y) {
 ; CHECK-LABEL: @c2(
 ; CHECK-NEXT:    [[T0:%.*]] = shl i8 -1, [[Y:%.*]]
 ; CHECK-NEXT:    call void @use8(i8 [[T0]])
-; CHECK-NEXT:    [[T1:%.*]] = lshr i8 [[T0]], [[Y]]
+; CHECK-NEXT:    [[T1:%.*]] = lshr exact i8 [[T0]], [[Y]]
 ; CHECK-NEXT:    [[X:%.*]] = call i8 @gen8()
 ; CHECK-NEXT:    [[RET:%.*]] = icmp ugt i8 [[X]], [[T1]]
 ; CHECK-NEXT:    ret i1 [[RET]]
@@ -138,7 +138,7 @@ define i1 @oneuse0(i8 %x, i8 %y) {
 ; CHECK-LABEL: @oneuse0(
 ; CHECK-NEXT:    [[T0:%.*]] = shl i8 -1, [[Y:%.*]]
 ; CHECK-NEXT:    call void @use8(i8 [[T0]])
-; CHECK-NEXT:    [[T1:%.*]] = lshr i8 [[T0]], [[Y]]
+; CHECK-NEXT:    [[T1:%.*]] = lshr exact i8 [[T0]], [[Y]]
 ; CHECK-NEXT:    call void @use8(i8 [[T1]])
 ; CHECK-NEXT:    [[RET:%.*]] = icmp ult i8 [[T1]], [[X:%.*]]
 ; CHECK-NEXT:    ret i1 [[RET]]
@@ -156,7 +156,7 @@ define i1 @oneuse1(i8 %x, i8 %y) {
 ; CHECK-LABEL: @oneuse1(
 ; CHECK-NEXT:    [[T0:%.*]] = shl i8 -1, [[Y:%.*]]
 ; CHECK-NEXT:    call void @use8(i8 [[T0]])
-; CHECK-NEXT:    [[T1:%.*]] = lshr i8 [[T0]], [[Y]]
+; CHECK-NEXT:    [[T1:%.*]] = lshr exact i8 [[T0]], [[Y]]
 ; CHECK-NEXT:    [[T2:%.*]] = and i8 [[T1]], [[X:%.*]]
 ; CHECK-NEXT:    call void @use8(i8 [[T2]])
 ; CHECK-NEXT:    [[RET:%.*]] = icmp ult i8 [[T1]], [[X]]
@@ -175,7 +175,7 @@ define i1 @oneuse2(i8 %x, i8 %y) {
 ; CHECK-LABEL: @oneuse2(
 ; CHECK-NEXT:    [[T0:%.*]] = shl i8 -1, [[Y:%.*]]
 ; CHECK-NEXT:    call void @use8(i8 [[T0]])
-; CHECK-NEXT:    [[T1:%.*]] = lshr i8 [[T0]], [[Y]]
+; CHECK-NEXT:    [[T1:%.*]] = lshr exact i8 [[T0]], [[Y]]
 ; CHECK-NEXT:    call void @use8(i8 [[T1]])
 ; CHECK-NEXT:    [[T2:%.*]] = and i8 [[T1]], [[X:%.*]]
 ; CHECK-NEXT:    call void @use8(i8 [[T2]])
@@ -200,7 +200,7 @@ define i1 @n0(i8 %x, i8 %y, i8 %notx) {
 ; CHECK-LABEL: @n0(
 ; CHECK-NEXT:    [[T0:%.*]] = shl i8 -1, [[Y:%.*]]
 ; CHECK-NEXT:    call void @use8(i8 [[T0]])
-; CHECK-NEXT:    [[T1:%.*]] = lshr i8 [[T0]], [[Y]]
+; CHECK-NEXT:    [[T1:%.*]] = lshr exact i8 [[T0]], [[Y]]
 ; CHECK-NEXT:    [[T2:%.*]] = and i8 [[T1]], [[X:%.*]]
 ; CHECK-NEXT:    [[RET:%.*]] = icmp ne i8 [[T2]], [[NOTX:%.*]]
 ; CHECK-NEXT:    ret i1 [[RET]]
diff --git a/llvm/test/Transforms/InstCombine/canonicalize-shl-lshr-to-masking.ll b/llvm/test/Transforms/InstCombine/canonicalize-shl-lshr-to-masking.ll
index 7c1c18cc214a5c5..c1e871dcccddcf2 100644
--- a/llvm/test/Transforms/InstCombine/canonicalize-shl-lshr-to-masking.ll
+++ b/llvm/test/Transforms/InstCombine/canonicalize-shl-lshr-to-masking.ll
@@ -388,7 +388,7 @@ define i32 @negative_oneuse(i32 %x, i32 %y) {
 ; CHECK-LABEL: @negative_oneuse(
 ; CHECK-NEXT:    [[T0:%.*]] = shl i32 [[X:%.*]], [[Y:%.*]]
 ; CHECK-NEXT:    call void @use32(i32 [[T0]])
-; CHECK-NEXT:    [[RET:%.*]] = lshr i32 [[T0]], [[Y]]
+; CHECK-NEXT:    [[RET:%.*]] = lshr exact i32 [[T0]], [[Y]]
 ; CHECK-NEXT:    ret i32 [[RET]]
 ;
   %t0 = shl i32 %x, %y
diff --git a/llvm/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-after-truncation-variant-d.ll b/llvm/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-after-truncation-variant-d.ll
index cd0d633d58a68e3..86af88de08e2cb0 100644
--- a/llvm/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-after-truncation-variant-d.ll
+++ b/llvm/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-after-truncation-variant-d.ll
@@ -18,7 +18,7 @@ define i32 @t0_basic(i64 %x, i32 %nbits) {
 ; CHECK-LABEL: @t0_basic(
 ; CHECK-NEXT:    [[T0:%.*]] = zext i32 [[NBITS:%.*]] to i64
 ; CHECK-NEXT:    [[T1:%.*]] = shl i64 -1, [[T0]]
-; CHECK-NEXT:    [[T2:%.*]] = lshr i64 [[T1]], [[T0]]
+; CHECK-NEXT:    [[T2:%.*]] = lshr exact i64 [[T1]], [[T0]]
 ; CHECK-NEXT:    [[T3:%.*]] = add i32 [[NBITS]], -33
 ; CHECK-NEXT:    call void @use64(i64 [[T0]])
 ; CHECK-NEXT:    call void @use64(i64 [[T1]])
@@ -54,7 +54,7 @@ define <8 x i32> @t1_vec_splat(<8 x i64> %x, <8 x i32> %nbits) {
 ; CHECK-LABEL: @t1_vec_splat(
 ; CHECK-NEXT:    [[T0:%.*]] = zext <8 x i32> [[NBITS:%.*]] to <8 x i64>
 ; CHECK-NEXT:    [[T1:%.*]] = shl <8 x i64> <i64 -1, i64 -1, i64 -1, i64 -1, i64 -1, i64 -1, i64 -1, i64 -1>, [[T0]]
-; CHECK-NEXT:    [[T2:%.*]] = lshr <8 x i64> [[T1]], [[T0]]
+; CHECK-NEXT:    [[T2:%.*]] = lshr exact <8 x i64> [[T1]], [[T0]]
 ; CHECK-NEXT:    [[T3:%.*]] = add <8 x i32> [[NBITS]], <i32 -33, i32 -33, i32 -33, i32 -33, i32 -33, i32 -33, i32 -33, i32 -33>
 ; CHECK-NEXT:    call void @use8xi64(<8 x i64> [[T0]])
 ; CHECK-NEXT:    call void @use8xi64(<8 x i64> [[T1]])
@@ -85,7 +85,7 @@ define <8 x i32> @t2_vec_splat_undef(<8 x i64> %x, <8 x i32> %nbits) {
 ; CHECK-LABEL: @t2_vec_splat_undef(
 ; CHECK-NEXT:    [[T0:%.*]] = zext <8 x i32> [[NBITS:%.*]] to <8 x i64>
 ; CHECK-NEXT:    [[T1:%.*]] = shl <8 x i64> <i64 -1, i64 -1, i64 -1, i64 -1, i64 -1, i64 -1, i64 undef, i64 -1>, [[T0]]
-; CHECK-NEXT:    [[T2:%.*]] = lshr <8 x i64> [[T1]], [[T0]]
+; CHECK-NEXT:    [[T2:%.*]] = lshr exact <8 x i64> [[T1]], [[T0]]
 ; CHECK-NEXT:    [[T3:%.*]] = add <8 x i32> [[NBITS]], <i32 -33, i32 -33, i32 -33, i32 -33, i32 -33, i32 -33, i32 undef, i32 -33>
 ; CHECK-NEXT:    call void @use8xi64(<8 x i64> [[T0]])
 ; CHECK-NEXT:    call void @use8xi64(<8 x i64> [[T1]])
@@ -116,7 +116,7 @@ define <8 x i32> @t3_vec_nonsplat(<8 x i64> %x, <8 x i32> %nbits) {
 ; CHECK-LABEL: @t3_vec_nonsplat(
 ; CHECK-NEXT:    [[T0:%.*]] = zext <8 x i32> [[NBITS:%.*]] to <8 x i64>
 ; CHECK-NEXT:    [[T1:%.*]] = shl <8 x i64> <i64 -1, i64 -1, i64 -1, i64 -1, i64 -1, i64 -1, i64 undef, i64 -1>, [[T0]]
-; CHECK-NEXT:    [[T2:%.*]] = lshr <8 x i64> [[T1]], [[T0]]
+; CHECK-NEXT:    [[T2:%.*]] = lshr exact <8 x i64> [[T1]], [[T0]]
 ; CHECK-NEXT:    [[T3:%.*]] = add <8 x i32> [[NBITS]], <i32 -64, i32 -63, i32 -33, i32 -32, i32 63, i32 64, i32 undef, i32 65>
 ; CHECK-NEXT:    call void @use8xi64(<8 x i64> [[T0]])
 ; CHECK-NEXT:    call void @use8xi64(<8 x i64> [[T1]])
@@ -149,7 +149,7 @@ define i32 @n4_extrause0(i64 %x, i32 %nbits) {
 ; CHECK-LABEL: @n4_extrause0(
 ; CHECK-NEXT:    [[T0:%.*]] = zext i32 [[NBITS:%.*]] to i64
 ; CHECK-NEXT:    [[T1:%.*]] = shl i64 -1, [[T0]]
-; CHECK-NEXT:    [[T2:%.*]] = lshr i64 [[T1]], [[T0]]
+; CHECK-NEXT:    [[T2:%.*]] = lshr exact i64 [[T1]], [[T0]]
 ; CHECK-NEXT:    [[T3:%.*]] = add i32 [[NBITS]], -33
 ; CHECK-NEXT:    call void @use64(i64 [[T0]])
 ; CHECK-NEXT:    call void @use64(i64 [[T1]])
@@ -182,7 +182,7 @@ define i32 @n5_extrause1(i64 %x, i32 %nbits) {
 ; CHECK-LABEL: @n5_extrause1(
 ; CHECK-NEXT:    [[T0:%.*]] = zext i32 [[NBITS:%.*]] to i64
 ; CHECK-NEXT:    [[T1:%.*]] = shl i64 -1, [[T0]]
-; CHECK-NEXT:    [[T2:%.*]] = lshr i64 [[T1]], [[T0]]
+; CHECK-NEXT:    [[T2:%.*]] = lshr exact i64 [[T1]], [[T0]]
 ; CHECK-NEXT:    [[T3:%.*]] = add i32 [[NBITS]], -33
 ; CHECK-NEXT:    call void @use64(i64 [[T0]])
 ; CHECK-NEXT:    call void @use64(i64 [[T1]])
@@ -215,7 +215,7 @@ define i32 @n6_extrause2(i64 %x, i32 %nbits) {
 ; CHECK-LABEL: @n6_extrause2(
 ; CHECK-NEXT:    [[T0:%.*]] = zext i32 [[NBITS:%.*]] to i64
 ; CHECK-NEXT:    [[T1:%.*]] = shl i64 -1, [[T0]]
-; CHECK-NEXT:    [[T2:%.*]] = lshr i64 [[T1]], [[T0]]
+; CHECK-NEXT:    [[T2:%.*]] = lshr exact i64 [[T1]], [[T0]]
 ; CHECK-NEXT:    [[T3:%.*]] = add i32 [[NBITS]], -33
 ; CHECK-NEXT:    call void @use64(i64 [[T0]])
 ; CHECK-NEXT:    call void @use64(i64 [[T1]])
diff --git a/llvm/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-after-truncation-variant-e.ll b/llvm/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-after-truncation-variant-e.ll
index eb26bfac66f92af..9a4a5dd890eec26 100644
--- a/llvm/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-after-truncation-variant-e.ll
+++ b/llvm/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-after-truncation-variant-e.ll
@@ -137,7 +137,7 @@ define i32 @n4_extrause0(i64 %x, i32 %nbits) {
 ; CHECK-NEXT:    call void @use64(i64 [[T0]])
 ; CHECK-NEXT:    call void @use64(i64 [[T1]])
 ; CHECK-NEXT:    call void @use32(i32 [[T2]])
-; CHECK-NEXT:    [[T3:%.*]] = lshr i64 [[T1]], [[T0]]
+; CHECK-NEXT:    [[T3:%.*]] = lshr exact i64 [[T1]], [[T0]]
 ; CHECK-NEXT:    call void @use64(i64 [[T3]])
 ; CHECK-NEXT:    [[T4:%.*]] = trunc i64 [[T3]] to i32
 ; CHECK-NEXT:    [[T5:%.*]] = shl i32 [[T4]], [[T2]]
@@ -166,7 +166,7 @@ define i32 @n5_extrause1(i64 %x, i32 %nbits) {
 ; CHECK-NEXT:    call void @use64(i64 [[T0]])
 ; CHECK-NEXT:    call void @use64(i64 [[T1]])
 ; CHECK-NEXT:    call void @use32(i32 [[T2]])
-; CHECK-NEXT:    [[T3:%.*]] = lshr i64 [[T1]], [[T0]]
+; CHECK-NEXT:    [[T3:%.*]] = lshr exact i64 [[T1]], [[T0]]
 ; CHECK-NEXT:    [[T4:%.*]] = trunc i64 [[T3]] to i32
 ; CHECK-NEXT:    call void @use32(i32 [[T4]])
 ; CHECK-NEXT:    [[T5:%.*]] = shl i32 [[T4]], [[T2]]
@@ -195,7 +195,7 @@ define i32 @n6_extrause2(i64 %x, i32 %nbits) {
 ; CHECK-NEXT:    call void @use64(i64 [[T0]])
 ; CHECK-NEXT:    call void @use64(i64 [[T1]])
 ; CHECK-NEXT:    call void @use32(i32 [[T2]])
-; CHECK-NEXT:    [[T3:%.*]] = lshr i64 [[T1]], [[T0]]
+; CHECK-NEXT:    [[T3:%.*]] = lshr exact i64 [[T1]], [[T0]]
 ; CHECK-NEXT:    call void @use64(i64 [[T3]])
 ; CHECK-NEXT:    [[T4:%.*]] = trunc i64 [[T3]] to i32
 ; CHECK-NEXT:    call void @use32(i32 [[T4]])
diff --git a/llvm/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-d.ll b/llvm/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-d.ll
index f2a788ddba287d2..2031001404925e8 100644
--- a/llvm/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-d.ll
+++ b/llvm/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-d.ll
@@ -16,7 +16,7 @@ declare void @use32(i32)
 define i32 @t0_basic(i32 %x, i32 %nbits) {
 ; CHECK-LABEL: @t0_basic(
 ; CHECK-NEXT:    [[T0:%.*]] = shl i32 -1, [[NBITS:%.*]]
-; CHECK-NEXT:    [[T1:%.*]] = lshr i32 [[T0]], [[NBITS]]
+; CHECK-NEXT:    [[T1:%.*]] = lshr exact i32 [[T0]], [[NBITS]]
 ; CHECK-NEXT:    [[T3:%.*]] = add i32 [[NBITS]], -1
 ; CHECK-NEXT:    call void @use32(i32 [[T0]])
 ; CHECK-NEXT:    call void @use32(i32 [[T1]])
@@ -43,7 +43,7 @@ declare void @use8xi32(<8 x i32>)
 define <8 x i32> @t2_vec_splat(<8 x i32> %x, <8 x i32> %nbits) {
 ; CHECK-LABEL: @t2_vec_splat(
 ; CHECK-NEXT:    [[T0:%.*]] = shl <8 x i32> <i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 -1>, [[NBITS:%.*]]
-; CHECK-NEXT:    [[T1:%.*]] = lshr <8 x i32> [[T0]], [[NBITS]]
+; CHECK-NEXT:    [[T1:%.*]] = lshr exact <8 x i32> [[T0]], [[NBITS]]
 ; CHECK-NEXT:    [[T3:%.*]] = add <8 x i32> [[NBITS]], <i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 -1>
 ; CHECK-NEXT:    call void @use8xi32(<8 x i32> [[T0]])
 ; CHECK-NEXT:    call void @use8xi32(<8 x i32> [[T1]])
@@ -66,7 +66,7 @@ define <8 x i32> @t2_vec_splat(<8 x i32> %x, <8 x i32> %nbits) {
 define <8 x i32> @t2_vec_splat_undef(<8 x i32> %x, <8 x i32> %nbits) {
 ; CHECK-LABEL: @t2_vec_splat_undef(
 ; CHECK-NEXT:    [[T0:%.*]] = shl <8 x i32> <i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 undef, i32 -1>, [[NBITS:%.*]]
-; CHECK-NEXT:    [[T1:%.*]] = lshr <8 x i32> [[T0]], [[NBITS]]
+; CHECK-NEXT:    [[T1:%.*]] = lshr exact <8 x i32> [[T0]], [[NBITS]]
 ; CHECK-NEXT:    [[T3:%.*]] = add <8 x i32> [[NBITS]], <i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 undef, i32 -1>
 ; CHECK-NEXT:    call void @use8xi32(<8 x i32> [[T0]])
 ; CHECK-NEXT:    call void @use8xi32(<8 x i32> [[T1]])
@@ -89,7 +89,7 @@ define <8 x i32> @t2_vec_splat_undef(<8 x i32> %x, <8 x i32> %nbits) {
 define <8 x i32> @t2_vec_nonsplat(<8 x i32> %x, <8 x i32> %nbits) {
 ; CHECK-LABEL: @t2_vec_nonsplat(
 ; CHECK-NEXT:    [[T0:%.*]] = shl <8 x i32> <i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 -1>, [[NBITS:%.*]]
-; CHECK-NEXT:    [[T1:%.*]] = lshr <8 x i32> [[T0]], [[NBITS]]
+; CHECK-NEXT:    [[T1:%.*]] = lshr exact <8 x i32> [[T0]], [[NBITS]]
 ; CHECK-NEXT:    [[T3:%.*]] = add <8 x i32> [[NBITS]], <i32 -32, i32 -31, i32 -1, i32 0, i32 1, i32 31, i32 32, i32 33>
 ; CHECK-NEXT:    call void @use8xi32(<8 x i32> [[T0]])
 ; CHECK-NEXT:    call void @use8xi32(<8 x i32> [[T1]])
@@ -114,7 +114,7 @@ define <8 x i32> @t2_vec_nonsplat(<8 x i32> %x, <8 x i32> %nbits) {
 define i32 @n3_extrause(i32 %x, i32 %nbits) {
 ; CHECK-LABEL: @n3_extrause(
 ; CHECK-NEXT:    [[T0:%.*]] = shl i32 -1, [[NBITS:%.*]]
-; CHECK-NEXT:    [[T1:%.*]] = lshr i32 [[T0]], [[NBITS]]
+; CHECK-NEXT:    [[T1:%.*]] = lshr exact i32 [[T0]], [[NBITS]]
 ; CHECK-NEXT:    [[T2:%.*]] = and i32 [[T1]], [[X:%.*]]
 ; CHECK-...
[truncated]

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@dtcxzyw dtcxzyw merged commit e8fe15c into llvm:main Nov 16, 2023
@dtcxzyw dtcxzyw deleted the infer-exact-shr-shl branch November 16, 2023 09:30
sr-tream pushed a commit to sr-tream/llvm-project that referenced this pull request Nov 20, 2023
…#72483)

This patch adds exact flags for sext/zext idiom `shr (shl X, Y), Y`.
Alive2: https://alive2.llvm.org/ce/z/xYFpfB

We can generalize it to handle pattern `shr (shl X, Y), Z` with `Y u>=
Z` (e.g., non-splat vectors). But I don't think it's worth the effort.

This missed optimization is discovered with the help of
AliveToolkit/alive2#962.
zahiraam pushed a commit to zahiraam/llvm-project that referenced this pull request Nov 20, 2023
…#72483)

This patch adds exact flags for sext/zext idiom `shr (shl X, Y), Y`.
Alive2: https://alive2.llvm.org/ce/z/xYFpfB

We can generalize it to handle pattern `shr (shl X, Y), Z` with `Y u>=
Z` (e.g., non-splat vectors). But I don't think it's worth the effort.

This missed optimization is discovered with the help of
AliveToolkit/alive2#962.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants