[InstCombine] Add exact flags for ext idiom `shr (shl X, Y), Y` #72483

dtcxzyw · 2023-11-16T06:39:42Z

This patch adds the `exact` flag to the right shift in the sext/zext idiom `shr (shl X, Y), Y`.
Alive2: https://alive2.llvm.org/ce/z/xYFpfB
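As an informal sanity check alongside the Alive2 proof, here is a minimal brute-force sketch of why the flag is valid: `shl X, Y` leaves the low `Y` bits zero, so a following right shift by the same `Y` only discards zero bits. The helper names `shiftedOutBitsAreZero` and `idiomIsAlwaysExact` are hypothetical and not part of the patch. The sketch models `i8` with `uint8_t` and only covers in-range shift amounts, since out-of-range shifts are poison in LLVM IR.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if shifting `v` right by `amt` would discard only zero bits.
// This is the condition under which an `exact` right shift is not poison.
// (Hypothetical helper for illustration; not taken from the patch.)
bool shiftedOutBitsAreZero(uint8_t v, unsigned amt) {
  assert(amt < 8 && "shift amount must be smaller than the bit width");
  uint8_t lowMask = static_cast<uint8_t>((1u << amt) - 1u);
  return (v & lowMask) == 0;
}

// Exhaustively checks every i8 value X and every in-range shift amount Y:
// after `shl X, Y`, a right shift by the same Y must discard only zeros.
bool idiomIsAlwaysExact() {
  for (unsigned x = 0; x < 256; ++x) {
    for (unsigned y = 0; y < 8; ++y) {
      uint8_t shl = static_cast<uint8_t>(x << y); // models `shl i8 X, Y`
      if (!shiftedOutBitsAreZero(shl, y))
        return false;
    }
  }
  return true;
}
```

The check depends only on the bits that are shifted out, so it applies to both `lshr` and `ashr`, which differ only in the bits shifted in.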

We could generalize this to handle the pattern `shr (shl X, Y), Z` with `Y u>= Z` (e.g., non-splat vectors), but I don't think it's worth the effort.
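For reference, a small brute-force sketch of the generalized condition over `i8`, again with hypothetical helper names that do not appear in the patch. It checks that the right shift stays exact whenever `Z` is at most `Y`, and that exactness can fail once `Z` exceeds `Y`.

```cpp
#include <cassert>
#include <cstdint>

// True when the low `amt` bits of `v` are all zero, i.e. a right shift of
// `v` by `amt` would be exact. (Hypothetical helper for illustration.)
bool lowBitsAreZero(uint8_t v, unsigned amt) {
  assert(amt < 8 && "shift amount must be smaller than the bit width");
  return (v & ((1u << amt) - 1u)) == 0;
}

// Checks the generalization: for every i8 X and every Z <= Y < 8, the right
// shift in `shr (shl X, Y), Z` discards only zero bits.
bool generalizedPatternIsExact() {
  for (unsigned x = 0; x < 256; ++x)
    for (unsigned y = 0; y < 8; ++y)
      for (unsigned z = 0; z <= y; ++z)
        if (!lowBitsAreZero(static_cast<uint8_t>(x << y), z))
          return false;
  return true;
}

// Searches for an input with Z > Y where a set bit would be shifted out,
// showing that the `Y u>= Z` precondition cannot simply be dropped.
bool exactnessCanFailWhenZExceedsY() {
  for (unsigned x = 0; x < 256; ++x)
    for (unsigned y = 0; y < 8; ++y)
      for (unsigned z = y + 1; z < 8; ++z)
        if (!lowBitsAreZero(static_cast<uint8_t>(x << y), z))
          return true;
  return false;
}
```

The smallest failing case is `X = 1`, `Y = 0`, `Z = 1`, where the only set bit is shifted out.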

This missed optimization was discovered with the help of AliveToolkit/alive2#962.

llvmbot · 2023-11-16T06:40:08Z

@llvm/pr-subscribers-llvm-transforms

Author: Yingwei Zheng (dtcxzyw)

Patch is 63.27 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/72483.diff

21 Files Affected:

(modified) llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp (+6)
(modified) llvm/test/Transforms/InstCombine/canonicalize-low-bit-mask-v4-and-icmp-eq-to-icmp-ule.ll (+10-10)
(modified) llvm/test/Transforms/InstCombine/canonicalize-low-bit-mask-v4-and-icmp-ne-to-icmp-ugt.ll (+10-10)
(modified) llvm/test/Transforms/InstCombine/canonicalize-shl-lshr-to-masking.ll (+1-1)
(modified) llvm/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-after-truncation-variant-d.ll (+7-7)
(modified) llvm/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-after-truncation-variant-e.ll (+3-3)
(modified) llvm/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-d.ll (+5-5)
(modified) llvm/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-e.ll (+1-1)
(modified) llvm/test/Transforms/InstCombine/redundant-left-shift-input-masking-after-truncation-variant-d.ll (+5-5)
(modified) llvm/test/Transforms/InstCombine/redundant-left-shift-input-masking-after-truncation-variant-e.ll (+5-5)
(modified) llvm/test/Transforms/InstCombine/redundant-left-shift-input-masking-after-truncation-variant-f.ll (+6-6)
(modified) llvm/test/Transforms/InstCombine/redundant-left-shift-input-masking-variant-d.ll (+12-12)
(modified) llvm/test/Transforms/InstCombine/redundant-left-shift-input-masking-variant-e.ll (+13-17)
(modified) llvm/test/Transforms/InstCombine/redundant-left-shift-input-masking-variant-f.ll (+14-18)
(modified) llvm/test/Transforms/InstCombine/redundant-right-shift-input-masking.ll (+4-4)
(modified) llvm/test/Transforms/InstCombine/sext.ll (+4-4)
(modified) llvm/test/Transforms/InstCombine/shift-by-signext.ll (+2-2)
(modified) llvm/test/Transforms/InstCombine/trunc-inseltpoison.ll (+2-2)
(modified) llvm/test/Transforms/InstCombine/trunc.ll (+2-2)
(modified) llvm/test/Transforms/InstCombine/variable-signext-of-variable-high-bit-extraction.ll (+6-6)
(modified) llvm/test/Transforms/PhaseOrdering/two-shifts-by-sext.ll (+6-6)

diff --git a/llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp b/llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
index a234b916f72c0e2..9d4a2cc08cca30c 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
@@ -953,6 +953,12 @@ static bool setShiftFlags(BinaryOperator &I, const SimplifyQuery &Q) {
   } else {
     if (I.isExact())
       return false;
+
+    // shr (shl X, Y), Y
+    if (match(I.getOperand(0), m_Shl(m_Value(), m_Specific(I.getOperand(1))))) {
+      I.setIsExact();
+      return true;
+    }
   }
 
   // Compute what we know about shift count.
diff --git a/llvm/test/Transforms/InstCombine/canonicalize-low-bit-mask-v4-and-icmp-eq-to-icmp-ule.ll b/llvm/test/Transforms/InstCombine/canonicalize-low-bit-mask-v4-and-icmp-eq-to-icmp-ule.ll
index b774cd766a264c9..4ff27b787ed4e15 100644
--- a/llvm/test/Transforms/InstCombine/canonicalize-low-bit-mask-v4-and-icmp-eq-to-icmp-ule.ll
+++ b/llvm/test/Transforms/InstCombine/canonicalize-low-bit-mask-v4-and-icmp-eq-to-icmp-ule.ll
@@ -22,7 +22,7 @@ define i1 @p0(i8 %x, i8 %y) {
 ; CHECK-LABEL: @p0(
 ; CHECK-NEXT:    [[T0:%.*]] = shl i8 -1, [[Y:%.*]]
 ; CHECK-NEXT:    call void @use8(i8 [[T0]])
-; CHECK-NEXT:    [[T1:%.*]] = lshr i8 [[T0]], [[Y]]
+; CHECK-NEXT:    [[T1:%.*]] = lshr exact i8 [[T0]], [[Y]]
 ; CHECK-NEXT:    [[RET:%.*]] = icmp uge i8 [[T1]], [[X:%.*]]
 ; CHECK-NEXT:    ret i1 [[RET]]
 ;
@@ -42,7 +42,7 @@ define <2 x i1> @p1_vec(<2 x i8> %x, <2 x i8> %y) {
 ; CHECK-LABEL: @p1_vec(
 ; CHECK-NEXT:    [[T0:%.*]] = shl <2 x i8> <i8 -1, i8 -1>, [[Y:%.*]]
 ; CHECK-NEXT:    call void @use2i8(<2 x i8> [[T0]])
-; CHECK-NEXT:    [[T1:%.*]] = lshr <2 x i8> [[T0]], [[Y]]
+; CHECK-NEXT:    [[T1:%.*]] = lshr exact <2 x i8> [[T0]], [[Y]]
 ; CHECK-NEXT:    [[RET:%.*]] = icmp uge <2 x i8> [[T1]], [[X:%.*]]
 ; CHECK-NEXT:    ret <2 x i1> [[RET]]
 ;
@@ -58,7 +58,7 @@ define <3 x i1> @p2_vec_undef0(<3 x i8> %x, <3 x i8> %y) {
 ; CHECK-LABEL: @p2_vec_undef0(
 ; CHECK-NEXT:    [[T0:%.*]] = shl <3 x i8> <i8 -1, i8 undef, i8 -1>, [[Y:%.*]]
 ; CHECK-NEXT:    call void @use3i8(<3 x i8> [[T0]])
-; CHECK-NEXT:    [[T1:%.*]] = lshr <3 x i8> [[T0]], [[Y]]
+; CHECK-NEXT:    [[T1:%.*]] = lshr exact <3 x i8> [[T0]], [[Y]]
 ; CHECK-NEXT:    [[RET:%.*]] = icmp uge <3 x i8> [[T1]], [[X:%.*]]
 ; CHECK-NEXT:    ret <3 x i1> [[RET]]
 ;
@@ -80,7 +80,7 @@ define i1 @c0(i8 %y) {
 ; CHECK-LABEL: @c0(
 ; CHECK-NEXT:    [[T0:%.*]] = shl i8 -1, [[Y:%.*]]
 ; CHECK-NEXT:    call void @use8(i8 [[T0]])
-; CHECK-NEXT:    [[T1:%.*]] = lshr i8 [[T0]], [[Y]]
+; CHECK-NEXT:    [[T1:%.*]] = lshr exact i8 [[T0]], [[Y]]
 ; CHECK-NEXT:    [[X:%.*]] = call i8 @gen8()
 ; CHECK-NEXT:    [[RET:%.*]] = icmp ule i8 [[X]], [[T1]]
 ; CHECK-NEXT:    ret i1 [[RET]]
@@ -98,7 +98,7 @@ define i1 @c1(i8 %y) {
 ; CHECK-LABEL: @c1(
 ; CHECK-NEXT:    [[T0:%.*]] = shl i8 -1, [[Y:%.*]]
 ; CHECK-NEXT:    call void @use8(i8 [[T0]])
-; CHECK-NEXT:    [[T1:%.*]] = lshr i8 [[T0]], [[Y]]
+; CHECK-NEXT:    [[T1:%.*]] = lshr exact i8 [[T0]], [[Y]]
 ; CHECK-NEXT:    [[X:%.*]] = call i8 @gen8()
 ; CHECK-NEXT:    [[RET:%.*]] = icmp ule i8 [[X]], [[T1]]
 ; CHECK-NEXT:    ret i1 [[RET]]
@@ -116,7 +116,7 @@ define i1 @c2(i8 %y) {
 ; CHECK-LABEL: @c2(
 ; CHECK-NEXT:    [[T0:%.*]] = shl i8 -1, [[Y:%.*]]
 ; CHECK-NEXT:    call void @use8(i8 [[T0]])
-; CHECK-NEXT:    [[T1:%.*]] = lshr i8 [[T0]], [[Y]]
+; CHECK-NEXT:    [[T1:%.*]] = lshr exact i8 [[T0]], [[Y]]
 ; CHECK-NEXT:    [[X:%.*]] = call i8 @gen8()
 ; CHECK-NEXT:    [[RET:%.*]] = icmp ule i8 [[X]], [[T1]]
 ; CHECK-NEXT:    ret i1 [[RET]]
@@ -138,7 +138,7 @@ define i1 @oneuse0(i8 %x, i8 %y) {
 ; CHECK-LABEL: @oneuse0(
 ; CHECK-NEXT:    [[T0:%.*]] = shl i8 -1, [[Y:%.*]]
 ; CHECK-NEXT:    call void @use8(i8 [[T0]])
-; CHECK-NEXT:    [[T1:%.*]] = lshr i8 [[T0]], [[Y]]
+; CHECK-NEXT:    [[T1:%.*]] = lshr exact i8 [[T0]], [[Y]]
 ; CHECK-NEXT:    call void @use8(i8 [[T1]])
 ; CHECK-NEXT:    [[RET:%.*]] = icmp uge i8 [[T1]], [[X:%.*]]
 ; CHECK-NEXT:    ret i1 [[RET]]
@@ -156,7 +156,7 @@ define i1 @oneuse1(i8 %x, i8 %y) {
 ; CHECK-LABEL: @oneuse1(
 ; CHECK-NEXT:    [[T0:%.*]] = shl i8 -1, [[Y:%.*]]
 ; CHECK-NEXT:    call void @use8(i8 [[T0]])
-; CHECK-NEXT:    [[T1:%.*]] = lshr i8 [[T0]], [[Y]]
+; CHECK-NEXT:    [[T1:%.*]] = lshr exact i8 [[T0]], [[Y]]
 ; CHECK-NEXT:    [[T2:%.*]] = and i8 [[T1]], [[X:%.*]]
 ; CHECK-NEXT:    call void @use8(i8 [[T2]])
 ; CHECK-NEXT:    [[RET:%.*]] = icmp uge i8 [[T1]], [[X]]
@@ -175,7 +175,7 @@ define i1 @oneuse2(i8 %x, i8 %y) {
 ; CHECK-LABEL: @oneuse2(
 ; CHECK-NEXT:    [[T0:%.*]] = shl i8 -1, [[Y:%.*]]
 ; CHECK-NEXT:    call void @use8(i8 [[T0]])
-; CHECK-NEXT:    [[T1:%.*]] = lshr i8 [[T0]], [[Y]]
+; CHECK-NEXT:    [[T1:%.*]] = lshr exact i8 [[T0]], [[Y]]
 ; CHECK-NEXT:    call void @use8(i8 [[T1]])
 ; CHECK-NEXT:    [[T2:%.*]] = and i8 [[T1]], [[X:%.*]]
 ; CHECK-NEXT:    call void @use8(i8 [[T2]])
@@ -200,7 +200,7 @@ define i1 @n0(i8 %x, i8 %y, i8 %notx) {
 ; CHECK-LABEL: @n0(
 ; CHECK-NEXT:    [[T0:%.*]] = shl i8 -1, [[Y:%.*]]
 ; CHECK-NEXT:    call void @use8(i8 [[T0]])
-; CHECK-NEXT:    [[T1:%.*]] = lshr i8 [[T0]], [[Y]]
+; CHECK-NEXT:    [[T1:%.*]] = lshr exact i8 [[T0]], [[Y]]
 ; CHECK-NEXT:    [[T2:%.*]] = and i8 [[T1]], [[X:%.*]]
 ; CHECK-NEXT:    [[RET:%.*]] = icmp eq i8 [[T2]], [[NOTX:%.*]]
 ; CHECK-NEXT:    ret i1 [[RET]]
diff --git a/llvm/test/Transforms/InstCombine/canonicalize-low-bit-mask-v4-and-icmp-ne-to-icmp-ugt.ll b/llvm/test/Transforms/InstCombine/canonicalize-low-bit-mask-v4-and-icmp-ne-to-icmp-ugt.ll
index c4865404c2f28ed..3c69d6b4c14a762 100644
--- a/llvm/test/Transforms/InstCombine/canonicalize-low-bit-mask-v4-and-icmp-ne-to-icmp-ugt.ll
+++ b/llvm/test/Transforms/InstCombine/canonicalize-low-bit-mask-v4-and-icmp-ne-to-icmp-ugt.ll
@@ -22,7 +22,7 @@ define i1 @p0(i8 %x, i8 %y) {
 ; CHECK-LABEL: @p0(
 ; CHECK-NEXT:    [[T0:%.*]] = shl i8 -1, [[Y:%.*]]
 ; CHECK-NEXT:    call void @use8(i8 [[T0]])
-; CHECK-NEXT:    [[T1:%.*]] = lshr i8 [[T0]], [[Y]]
+; CHECK-NEXT:    [[T1:%.*]] = lshr exact i8 [[T0]], [[Y]]
 ; CHECK-NEXT:    [[RET:%.*]] = icmp ult i8 [[T1]], [[X:%.*]]
 ; CHECK-NEXT:    ret i1 [[RET]]
 ;
@@ -42,7 +42,7 @@ define <2 x i1> @p1_vec(<2 x i8> %x, <2 x i8> %y) {
 ; CHECK-LABEL: @p1_vec(
 ; CHECK-NEXT:    [[T0:%.*]] = shl <2 x i8> <i8 -1, i8 -1>, [[Y:%.*]]
 ; CHECK-NEXT:    call void @use2i8(<2 x i8> [[T0]])
-; CHECK-NEXT:    [[T1:%.*]] = lshr <2 x i8> [[T0]], [[Y]]
+; CHECK-NEXT:    [[T1:%.*]] = lshr exact <2 x i8> [[T0]], [[Y]]
 ; CHECK-NEXT:    [[RET:%.*]] = icmp ult <2 x i8> [[T1]], [[X:%.*]]
 ; CHECK-NEXT:    ret <2 x i1> [[RET]]
 ;
@@ -58,7 +58,7 @@ define <3 x i1> @p2_vec_undef0(<3 x i8> %x, <3 x i8> %y) {
 ; CHECK-LABEL: @p2_vec_undef0(
 ; CHECK-NEXT:    [[T0:%.*]] = shl <3 x i8> <i8 -1, i8 undef, i8 -1>, [[Y:%.*]]
 ; CHECK-NEXT:    call void @use3i8(<3 x i8> [[T0]])
-; CHECK-NEXT:    [[T1:%.*]] = lshr <3 x i8> [[T0]], [[Y]]
+; CHECK-NEXT:    [[T1:%.*]] = lshr exact <3 x i8> [[T0]], [[Y]]
 ; CHECK-NEXT:    [[RET:%.*]] = icmp ult <3 x i8> [[T1]], [[X:%.*]]
 ; CHECK-NEXT:    ret <3 x i1> [[RET]]
 ;
@@ -80,7 +80,7 @@ define i1 @c0(i8 %y) {
 ; CHECK-LABEL: @c0(
 ; CHECK-NEXT:    [[T0:%.*]] = shl i8 -1, [[Y:%.*]]
 ; CHECK-NEXT:    call void @use8(i8 [[T0]])
-; CHECK-NEXT:    [[T1:%.*]] = lshr i8 [[T0]], [[Y]]
+; CHECK-NEXT:    [[T1:%.*]] = lshr exact i8 [[T0]], [[Y]]
 ; CHECK-NEXT:    [[X:%.*]] = call i8 @gen8()
 ; CHECK-NEXT:    [[RET:%.*]] = icmp ugt i8 [[X]], [[T1]]
 ; CHECK-NEXT:    ret i1 [[RET]]
@@ -98,7 +98,7 @@ define i1 @c1(i8 %y) {
 ; CHECK-LABEL: @c1(
 ; CHECK-NEXT:    [[T0:%.*]] = shl i8 -1, [[Y:%.*]]
 ; CHECK-NEXT:    call void @use8(i8 [[T0]])
-; CHECK-NEXT:    [[T1:%.*]] = lshr i8 [[T0]], [[Y]]
+; CHECK-NEXT:    [[T1:%.*]] = lshr exact i8 [[T0]], [[Y]]
 ; CHECK-NEXT:    [[X:%.*]] = call i8 @gen8()
 ; CHECK-NEXT:    [[RET:%.*]] = icmp ugt i8 [[X]], [[T1]]
 ; CHECK-NEXT:    ret i1 [[RET]]
@@ -116,7 +116,7 @@ define i1 @c2(i8 %y) {
 ; CHECK-LABEL: @c2(
 ; CHECK-NEXT:    [[T0:%.*]] = shl i8 -1, [[Y:%.*]]
 ; CHECK-NEXT:    call void @use8(i8 [[T0]])
-; CHECK-NEXT:    [[T1:%.*]] = lshr i8 [[T0]], [[Y]]
+; CHECK-NEXT:    [[T1:%.*]] = lshr exact i8 [[T0]], [[Y]]
 ; CHECK-NEXT:    [[X:%.*]] = call i8 @gen8()
 ; CHECK-NEXT:    [[RET:%.*]] = icmp ugt i8 [[X]], [[T1]]
 ; CHECK-NEXT:    ret i1 [[RET]]
@@ -138,7 +138,7 @@ define i1 @oneuse0(i8 %x, i8 %y) {
 ; CHECK-LABEL: @oneuse0(
 ; CHECK-NEXT:    [[T0:%.*]] = shl i8 -1, [[Y:%.*]]
 ; CHECK-NEXT:    call void @use8(i8 [[T0]])
-; CHECK-NEXT:    [[T1:%.*]] = lshr i8 [[T0]], [[Y]]
+; CHECK-NEXT:    [[T1:%.*]] = lshr exact i8 [[T0]], [[Y]]
 ; CHECK-NEXT:    call void @use8(i8 [[T1]])
 ; CHECK-NEXT:    [[RET:%.*]] = icmp ult i8 [[T1]], [[X:%.*]]
 ; CHECK-NEXT:    ret i1 [[RET]]
@@ -156,7 +156,7 @@ define i1 @oneuse1(i8 %x, i8 %y) {
 ; CHECK-LABEL: @oneuse1(
 ; CHECK-NEXT:    [[T0:%.*]] = shl i8 -1, [[Y:%.*]]
 ; CHECK-NEXT:    call void @use8(i8 [[T0]])
-; CHECK-NEXT:    [[T1:%.*]] = lshr i8 [[T0]], [[Y]]
+; CHECK-NEXT:    [[T1:%.*]] = lshr exact i8 [[T0]], [[Y]]
 ; CHECK-NEXT:    [[T2:%.*]] = and i8 [[T1]], [[X:%.*]]
 ; CHECK-NEXT:    call void @use8(i8 [[T2]])
 ; CHECK-NEXT:    [[RET:%.*]] = icmp ult i8 [[T1]], [[X]]
@@ -175,7 +175,7 @@ define i1 @oneuse2(i8 %x, i8 %y) {
 ; CHECK-LABEL: @oneuse2(
 ; CHECK-NEXT:    [[T0:%.*]] = shl i8 -1, [[Y:%.*]]
 ; CHECK-NEXT:    call void @use8(i8 [[T0]])
-; CHECK-NEXT:    [[T1:%.*]] = lshr i8 [[T0]], [[Y]]
+; CHECK-NEXT:    [[T1:%.*]] = lshr exact i8 [[T0]], [[Y]]
 ; CHECK-NEXT:    call void @use8(i8 [[T1]])
 ; CHECK-NEXT:    [[T2:%.*]] = and i8 [[T1]], [[X:%.*]]
 ; CHECK-NEXT:    call void @use8(i8 [[T2]])
@@ -200,7 +200,7 @@ define i1 @n0(i8 %x, i8 %y, i8 %notx) {
 ; CHECK-LABEL: @n0(
 ; CHECK-NEXT:    [[T0:%.*]] = shl i8 -1, [[Y:%.*]]
 ; CHECK-NEXT:    call void @use8(i8 [[T0]])
-; CHECK-NEXT:    [[T1:%.*]] = lshr i8 [[T0]], [[Y]]
+; CHECK-NEXT:    [[T1:%.*]] = lshr exact i8 [[T0]], [[Y]]
 ; CHECK-NEXT:    [[T2:%.*]] = and i8 [[T1]], [[X:%.*]]
 ; CHECK-NEXT:    [[RET:%.*]] = icmp ne i8 [[T2]], [[NOTX:%.*]]
 ; CHECK-NEXT:    ret i1 [[RET]]
diff --git a/llvm/test/Transforms/InstCombine/canonicalize-shl-lshr-to-masking.ll b/llvm/test/Transforms/InstCombine/canonicalize-shl-lshr-to-masking.ll
index 7c1c18cc214a5c5..c1e871dcccddcf2 100644
--- a/llvm/test/Transforms/InstCombine/canonicalize-shl-lshr-to-masking.ll
+++ b/llvm/test/Transforms/InstCombine/canonicalize-shl-lshr-to-masking.ll
@@ -388,7 +388,7 @@ define i32 @negative_oneuse(i32 %x, i32 %y) {
 ; CHECK-LABEL: @negative_oneuse(
 ; CHECK-NEXT:    [[T0:%.*]] = shl i32 [[X:%.*]], [[Y:%.*]]
 ; CHECK-NEXT:    call void @use32(i32 [[T0]])
-; CHECK-NEXT:    [[RET:%.*]] = lshr i32 [[T0]], [[Y]]
+; CHECK-NEXT:    [[RET:%.*]] = lshr exact i32 [[T0]], [[Y]]
 ; CHECK-NEXT:    ret i32 [[RET]]
 ;
   %t0 = shl i32 %x, %y
diff --git a/llvm/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-after-truncation-variant-d.ll b/llvm/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-after-truncation-variant-d.ll
index cd0d633d58a68e3..86af88de08e2cb0 100644
--- a/llvm/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-after-truncation-variant-d.ll
+++ b/llvm/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-after-truncation-variant-d.ll
@@ -18,7 +18,7 @@ define i32 @t0_basic(i64 %x, i32 %nbits) {
 ; CHECK-LABEL: @t0_basic(
 ; CHECK-NEXT:    [[T0:%.*]] = zext i32 [[NBITS:%.*]] to i64
 ; CHECK-NEXT:    [[T1:%.*]] = shl i64 -1, [[T0]]
-; CHECK-NEXT:    [[T2:%.*]] = lshr i64 [[T1]], [[T0]]
+; CHECK-NEXT:    [[T2:%.*]] = lshr exact i64 [[T1]], [[T0]]
 ; CHECK-NEXT:    [[T3:%.*]] = add i32 [[NBITS]], -33
 ; CHECK-NEXT:    call void @use64(i64 [[T0]])
 ; CHECK-NEXT:    call void @use64(i64 [[T1]])
@@ -54,7 +54,7 @@ define <8 x i32> @t1_vec_splat(<8 x i64> %x, <8 x i32> %nbits) {
 ; CHECK-LABEL: @t1_vec_splat(
 ; CHECK-NEXT:    [[T0:%.*]] = zext <8 x i32> [[NBITS:%.*]] to <8 x i64>
 ; CHECK-NEXT:    [[T1:%.*]] = shl <8 x i64> <i64 -1, i64 -1, i64 -1, i64 -1, i64 -1, i64 -1, i64 -1, i64 -1>, [[T0]]
-; CHECK-NEXT:    [[T2:%.*]] = lshr <8 x i64> [[T1]], [[T0]]
+; CHECK-NEXT:    [[T2:%.*]] = lshr exact <8 x i64> [[T1]], [[T0]]
 ; CHECK-NEXT:    [[T3:%.*]] = add <8 x i32> [[NBITS]], <i32 -33, i32 -33, i32 -33, i32 -33, i32 -33, i32 -33, i32 -33, i32 -33>
 ; CHECK-NEXT:    call void @use8xi64(<8 x i64> [[T0]])
 ; CHECK-NEXT:    call void @use8xi64(<8 x i64> [[T1]])
@@ -85,7 +85,7 @@ define <8 x i32> @t2_vec_splat_undef(<8 x i64> %x, <8 x i32> %nbits) {
 ; CHECK-LABEL: @t2_vec_splat_undef(
 ; CHECK-NEXT:    [[T0:%.*]] = zext <8 x i32> [[NBITS:%.*]] to <8 x i64>
 ; CHECK-NEXT:    [[T1:%.*]] = shl <8 x i64> <i64 -1, i64 -1, i64 -1, i64 -1, i64 -1, i64 -1, i64 undef, i64 -1>, [[T0]]
-; CHECK-NEXT:    [[T2:%.*]] = lshr <8 x i64> [[T1]], [[T0]]
+; CHECK-NEXT:    [[T2:%.*]] = lshr exact <8 x i64> [[T1]], [[T0]]
 ; CHECK-NEXT:    [[T3:%.*]] = add <8 x i32> [[NBITS]], <i32 -33, i32 -33, i32 -33, i32 -33, i32 -33, i32 -33, i32 undef, i32 -33>
 ; CHECK-NEXT:    call void @use8xi64(<8 x i64> [[T0]])
 ; CHECK-NEXT:    call void @use8xi64(<8 x i64> [[T1]])
@@ -116,7 +116,7 @@ define <8 x i32> @t3_vec_nonsplat(<8 x i64> %x, <8 x i32> %nbits) {
 ; CHECK-LABEL: @t3_vec_nonsplat(
 ; CHECK-NEXT:    [[T0:%.*]] = zext <8 x i32> [[NBITS:%.*]] to <8 x i64>
 ; CHECK-NEXT:    [[T1:%.*]] = shl <8 x i64> <i64 -1, i64 -1, i64 -1, i64 -1, i64 -1, i64 -1, i64 undef, i64 -1>, [[T0]]
-; CHECK-NEXT:    [[T2:%.*]] = lshr <8 x i64> [[T1]], [[T0]]
+; CHECK-NEXT:    [[T2:%.*]] = lshr exact <8 x i64> [[T1]], [[T0]]
 ; CHECK-NEXT:    [[T3:%.*]] = add <8 x i32> [[NBITS]], <i32 -64, i32 -63, i32 -33, i32 -32, i32 63, i32 64, i32 undef, i32 65>
 ; CHECK-NEXT:    call void @use8xi64(<8 x i64> [[T0]])
 ; CHECK-NEXT:    call void @use8xi64(<8 x i64> [[T1]])
@@ -149,7 +149,7 @@ define i32 @n4_extrause0(i64 %x, i32 %nbits) {
 ; CHECK-LABEL: @n4_extrause0(
 ; CHECK-NEXT:    [[T0:%.*]] = zext i32 [[NBITS:%.*]] to i64
 ; CHECK-NEXT:    [[T1:%.*]] = shl i64 -1, [[T0]]
-; CHECK-NEXT:    [[T2:%.*]] = lshr i64 [[T1]], [[T0]]
+; CHECK-NEXT:    [[T2:%.*]] = lshr exact i64 [[T1]], [[T0]]
 ; CHECK-NEXT:    [[T3:%.*]] = add i32 [[NBITS]], -33
 ; CHECK-NEXT:    call void @use64(i64 [[T0]])
 ; CHECK-NEXT:    call void @use64(i64 [[T1]])
@@ -182,7 +182,7 @@ define i32 @n5_extrause1(i64 %x, i32 %nbits) {
 ; CHECK-LABEL: @n5_extrause1(
 ; CHECK-NEXT:    [[T0:%.*]] = zext i32 [[NBITS:%.*]] to i64
 ; CHECK-NEXT:    [[T1:%.*]] = shl i64 -1, [[T0]]
-; CHECK-NEXT:    [[T2:%.*]] = lshr i64 [[T1]], [[T0]]
+; CHECK-NEXT:    [[T2:%.*]] = lshr exact i64 [[T1]], [[T0]]
 ; CHECK-NEXT:    [[T3:%.*]] = add i32 [[NBITS]], -33
 ; CHECK-NEXT:    call void @use64(i64 [[T0]])
 ; CHECK-NEXT:    call void @use64(i64 [[T1]])
@@ -215,7 +215,7 @@ define i32 @n6_extrause2(i64 %x, i32 %nbits) {
 ; CHECK-LABEL: @n6_extrause2(
 ; CHECK-NEXT:    [[T0:%.*]] = zext i32 [[NBITS:%.*]] to i64
 ; CHECK-NEXT:    [[T1:%.*]] = shl i64 -1, [[T0]]
-; CHECK-NEXT:    [[T2:%.*]] = lshr i64 [[T1]], [[T0]]
+; CHECK-NEXT:    [[T2:%.*]] = lshr exact i64 [[T1]], [[T0]]
 ; CHECK-NEXT:    [[T3:%.*]] = add i32 [[NBITS]], -33
 ; CHECK-NEXT:    call void @use64(i64 [[T0]])
 ; CHECK-NEXT:    call void @use64(i64 [[T1]])
diff --git a/llvm/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-after-truncation-variant-e.ll b/llvm/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-after-truncation-variant-e.ll
index eb26bfac66f92af..9a4a5dd890eec26 100644
--- a/llvm/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-after-truncation-variant-e.ll
+++ b/llvm/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-after-truncation-variant-e.ll
@@ -137,7 +137,7 @@ define i32 @n4_extrause0(i64 %x, i32 %nbits) {
 ; CHECK-NEXT:    call void @use64(i64 [[T0]])
 ; CHECK-NEXT:    call void @use64(i64 [[T1]])
 ; CHECK-NEXT:    call void @use32(i32 [[T2]])
-; CHECK-NEXT:    [[T3:%.*]] = lshr i64 [[T1]], [[T0]]
+; CHECK-NEXT:    [[T3:%.*]] = lshr exact i64 [[T1]], [[T0]]
 ; CHECK-NEXT:    call void @use64(i64 [[T3]])
 ; CHECK-NEXT:    [[T4:%.*]] = trunc i64 [[T3]] to i32
 ; CHECK-NEXT:    [[T5:%.*]] = shl i32 [[T4]], [[T2]]
@@ -166,7 +166,7 @@ define i32 @n5_extrause1(i64 %x, i32 %nbits) {
 ; CHECK-NEXT:    call void @use64(i64 [[T0]])
 ; CHECK-NEXT:    call void @use64(i64 [[T1]])
 ; CHECK-NEXT:    call void @use32(i32 [[T2]])
-; CHECK-NEXT:    [[T3:%.*]] = lshr i64 [[T1]], [[T0]]
+; CHECK-NEXT:    [[T3:%.*]] = lshr exact i64 [[T1]], [[T0]]
 ; CHECK-NEXT:    [[T4:%.*]] = trunc i64 [[T3]] to i32
 ; CHECK-NEXT:    call void @use32(i32 [[T4]])
 ; CHECK-NEXT:    [[T5:%.*]] = shl i32 [[T4]], [[T2]]
@@ -195,7 +195,7 @@ define i32 @n6_extrause2(i64 %x, i32 %nbits) {
 ; CHECK-NEXT:    call void @use64(i64 [[T0]])
 ; CHECK-NEXT:    call void @use64(i64 [[T1]])
 ; CHECK-NEXT:    call void @use32(i32 [[T2]])
-; CHECK-NEXT:    [[T3:%.*]] = lshr i64 [[T1]], [[T0]]
+; CHECK-NEXT:    [[T3:%.*]] = lshr exact i64 [[T1]], [[T0]]
 ; CHECK-NEXT:    call void @use64(i64 [[T3]])
 ; CHECK-NEXT:    [[T4:%.*]] = trunc i64 [[T3]] to i32
 ; CHECK-NEXT:    call void @use32(i32 [[T4]])
diff --git a/llvm/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-d.ll b/llvm/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-d.ll
index f2a788ddba287d2..2031001404925e8 100644
--- a/llvm/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-d.ll
+++ b/llvm/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-d.ll
@@ -16,7 +16,7 @@ declare void @use32(i32)
 define i32 @t0_basic(i32 %x, i32 %nbits) {
 ; CHECK-LABEL: @t0_basic(
 ; CHECK-NEXT:    [[T0:%.*]] = shl i32 -1, [[NBITS:%.*]]
-; CHECK-NEXT:    [[T1:%.*]] = lshr i32 [[T0]], [[NBITS]]
+; CHECK-NEXT:    [[T1:%.*]] = lshr exact i32 [[T0]], [[NBITS]]
 ; CHECK-NEXT:    [[T3:%.*]] = add i32 [[NBITS]], -1
 ; CHECK-NEXT:    call void @use32(i32 [[T0]])
 ; CHECK-NEXT:    call void @use32(i32 [[T1]])
@@ -43,7 +43,7 @@ declare void @use8xi32(<8 x i32>)
 define <8 x i32> @t2_vec_splat(<8 x i32> %x, <8 x i32> %nbits) {
 ; CHECK-LABEL: @t2_vec_splat(
 ; CHECK-NEXT:    [[T0:%.*]] = shl <8 x i32> <i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 -1>, [[NBITS:%.*]]
-; CHECK-NEXT:    [[T1:%.*]] = lshr <8 x i32> [[T0]], [[NBITS]]
+; CHECK-NEXT:    [[T1:%.*]] = lshr exact <8 x i32> [[T0]], [[NBITS]]
 ; CHECK-NEXT:    [[T3:%.*]] = add <8 x i32> [[NBITS]], <i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 -1>
 ; CHECK-NEXT:    call void @use8xi32(<8 x i32> [[T0]])
 ; CHECK-NEXT:    call void @use8xi32(<8 x i32> [[T1]])
@@ -66,7 +66,7 @@ define <8 x i32> @t2_vec_splat(<8 x i32> %x, <8 x i32> %nbits) {
 define <8 x i32> @t2_vec_splat_undef(<8 x i32> %x, <8 x i32> %nbits) {
 ; CHECK-LABEL: @t2_vec_splat_undef(
 ; CHECK-NEXT:    [[T0:%.*]] = shl <8 x i32> <i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 undef, i32 -1>, [[NBITS:%.*]]
-; CHECK-NEXT:    [[T1:%.*]] = lshr <8 x i32> [[T0]], [[NBITS]]
+; CHECK-NEXT:    [[T1:%.*]] = lshr exact <8 x i32> [[T0]], [[NBITS]]
 ; CHECK-NEXT:    [[T3:%.*]] = add <8 x i32> [[NBITS]], <i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 undef, i32 -1>
 ; CHECK-NEXT:    call void @use8xi32(<8 x i32> [[T0]])
 ; CHECK-NEXT:    call void @use8xi32(<8 x i32> [[T1]])
@@ -89,7 +89,7 @@ define <8 x i32> @t2_vec_splat_undef(<8 x i32> %x, <8 x i32> %nbits) {
 define <8 x i32> @t2_vec_nonsplat(<8 x i32> %x, <8 x i32> %nbits) {
 ; CHECK-LABEL: @t2_vec_nonsplat(
 ; CHECK-NEXT:    [[T0:%.*]] = shl <8 x i32> <i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 -1>, [[NBITS:%.*]]
-; CHECK-NEXT:    [[T1:%.*]] = lshr <8 x i32> [[T0]], [[NBITS]]
+; CHECK-NEXT:    [[T1:%.*]] = lshr exact <8 x i32> [[T0]], [[NBITS]]
 ; CHECK-NEXT:    [[T3:%.*]] = add <8 x i32> [[NBITS]], <i32 -32, i32 -31, i32 -1, i32 0, i32 1, i32 31, i32 32, i32 33>
 ; CHECK-NEXT:    call void @use8xi32(<8 x i32> [[T0]])
 ; CHECK-NEXT:    call void @use8xi32(<8 x i32> [[T1]])
@@ -114,7 +114,7 @@ define <8 x i32> @t2_vec_nonsplat(<8 x i32> %x, <8 x i32> %nbits) {
 define i32 @n3_extrause(i32 %x, i32 %nbits) {
 ; CHECK-LABEL: @n3_extrause(
 ; CHECK-NEXT:    [[T0:%.*]] = shl i32 -1, [[NBITS:%.*]]
-; CHECK-NEXT:    [[T1:%.*]] = lshr i32 [[T0]], [[NBITS]]
+; CHECK-NEXT:    [[T1:%.*]] = lshr exact i32 [[T0]], [[NBITS]]
 ; CHECK-NEXT:    [[T2:%.*]] = and i32 [[T1]], [[X:%.*]]
 ; CHECK-...
[truncated]

nikic

LGTM

Commit ed6169c: [InstCombine] Add exact flags for ext idiom `shr (shl X, Y), Y`

dtcxzyw requested a review from goldsteinn November 16, 2023 06:39

dtcxzyw requested a review from nikic as a code owner November 16, 2023 06:39

llvmbot added the llvm:transforms label Nov 16, 2023

nikic approved these changes Nov 16, 2023

dtcxzyw merged commit e8fe15c into llvm:main Nov 16, 2023

dtcxzyw deleted the infer-exact-shr-shl branch November 16, 2023 09:30
