[ValueTracking] Let ComputeKnownSignBits handle (shl (zext X), C) #97693

Merged
merged 1 commit into from
Jul 19, 2024

Conversation

bjope
Collaborator

@bjope bjope commented Jul 4, 2024

Add simple support for looking through a zext when doing ComputeKnownSignBits for shl. This is valid when all extended bits are shifted out, because then the number of sign bits can be found by analysing the zext operand.

The solution here is simple in that it only handles a single zext (the remaining left-shift amount is not passed down during recursion). It could be generalized in the future, for example by passing an 'OffsetFromMSB' parameter to ComputeNumSignBitsImpl, telling it to calculate the number of sign bits starting at some offset from the most significant bit.

@bjope bjope requested a review from nikic as a code owner July 4, 2024 08:29
@llvmbot llvmbot added the llvm:analysis Includes value tracking, cost tables and constant folding label Jul 4, 2024
@llvmbot
Member

llvmbot commented Jul 4, 2024

@llvm/pr-subscribers-llvm-analysis

Author: Björn Pettersson (bjope)

Changes


Full diff: https://github.com/llvm/llvm-project/pull/97693.diff

2 Files Affected:

  • (modified) llvm/lib/Analysis/ValueTracking.cpp (+13-3)
  • (modified) llvm/unittests/Analysis/ValueTrackingTest.cpp (+30)
diff --git a/llvm/lib/Analysis/ValueTracking.cpp b/llvm/lib/Analysis/ValueTracking.cpp
index 5476dc5d851829..7d229f58d3f6a9 100644
--- a/llvm/lib/Analysis/ValueTracking.cpp
+++ b/llvm/lib/Analysis/ValueTracking.cpp
@@ -3757,11 +3757,21 @@ static unsigned ComputeNumSignBitsImpl(const Value *V,
     }
     case Instruction::Shl: {
       const APInt *ShAmt;
+      Value *X = nullptr;
       if (match(U->getOperand(1), m_APInt(ShAmt))) {
         // shl destroys sign bits.
-        Tmp = ComputeNumSignBits(U->getOperand(0), Depth + 1, Q);
-        if (ShAmt->uge(TyBits) ||   // Bad shift.
-            ShAmt->uge(Tmp)) break; // Shifted all sign bits out.
+        if (ShAmt->uge(TyBits))
+          break; // Bad shift.
+        // We can look through a zext (more or less treating it as a sext) if
+        // all extended bits are shifted out.
+        if (match(U->getOperand(0), m_ZExt(m_Value(X))) &&
+            ShAmt->uge(TyBits - X->getType()->getScalarSizeInBits())) {
+          Tmp = ComputeNumSignBits(X, Depth + 1, Q);
+          Tmp += TyBits - X->getType()->getScalarSizeInBits();
+        } else
+          Tmp = ComputeNumSignBits(U->getOperand(0), Depth + 1, Q);
+        if (ShAmt->uge(Tmp))
+          break; // Shifted all sign bits out.
         Tmp2 = ShAmt->getZExtValue();
         return Tmp - Tmp2;
       }
diff --git a/llvm/unittests/Analysis/ValueTrackingTest.cpp b/llvm/unittests/Analysis/ValueTrackingTest.cpp
index a30db468c77291..e850338ff8ff1e 100644
--- a/llvm/unittests/Analysis/ValueTrackingTest.cpp
+++ b/llvm/unittests/Analysis/ValueTrackingTest.cpp
@@ -697,6 +697,36 @@ TEST_F(ValueTrackingTest, ComputeNumSignBits_PR32045) {
   EXPECT_EQ(ComputeNumSignBits(A, M->getDataLayout()), 32u);
 }
 
+TEST_F(ValueTrackingTest, ComputeNumSignBits_shl_ext1) {
+  parseAssembly("define i32 @test(i8 %a) {\n"
+                "  %b = ashr i8 %a, 4\n"
+                "  %c = zext i8 %b to i32\n"
+                "  %A = shl i32 %c, 24\n"
+                "  ret i32 %A\n"
+                "}\n");
+  EXPECT_EQ(ComputeNumSignBits(A, M->getDataLayout()), 5u);
+}
+
+TEST_F(ValueTrackingTest, ComputeNumSignBits_shl_ext2) {
+  parseAssembly("define i32 @test(i8 %a) {\n"
+                "  %b = ashr i8 %a, 4\n"
+                "  %c = zext i8 %b to i32\n"
+                "  %A = shl i32 %c, 26\n"
+                "  ret i32 %A\n"
+                "}\n");
+  EXPECT_EQ(ComputeNumSignBits(A, M->getDataLayout()), 3u);
+}
+
+TEST_F(ValueTrackingTest, ComputeNumSignBits_shl_ext3) {
+  parseAssembly("define i32 @test(i8 %a) {\n"
+                "  %b = ashr i8 %a, 4\n"
+                "  %c = zext i8 %b to i32\n"
+                "  %A = shl i32 %c, 30\n"
+                "  ret i32 %A\n"
+                "}\n");
+  EXPECT_EQ(ComputeNumSignBits(A, M->getDataLayout()), 1u);
+}
+
 // No guarantees for canonical IR in this analysis, so this just bails out.
 TEST_F(ValueTrackingTest, ComputeNumSignBits_Shuffle) {
   parseAssembly(

@bjope
Collaborator Author

bjope commented Jul 4, 2024

SelectionDAG variant is here: #97695

@bjope bjope requested review from goldsteinn and dtcxzyw July 4, 2024 08:54
Contributor

@nikic nikic left a comment

This looks reasonable to me. Can you replace the unit tests with lit tests please?

@bjope
Collaborator Author

bjope commented Jul 4, 2024

This looks reasonable to me. Can you replace the unit tests with lit tests please?

I don't really know how to make good lit tests for ComputeNumSignBits.
All I've got is something like this

; RUN: opt -passes=instcombine -S < %s | FileCheck %s

define i16 @numsignbits_shl_zext(i8 %a) {
  %nsb6 = ashr i8 %a, 6
  %zext = zext i8 %nsb6 to i16
  %nsb4 = shl i16 %zext, 10
  %or = or i16 %nsb4, 1
  ; With four sign bits this should be nsw.
  %shl = add i16 %nsb4, 4096
  ret i16 %shl
}

to show that the add will get nsw. But that is far from verifying that ComputeNumSignBits actually returns the correct value. Isn't it much easier to verify the correct behavior via the unit test?
Maybe I should add a lit test like the one above as a complement, to show that it is possible to trigger this via a lit test as well (even though it doesn't validate that the calculation in ComputeNumSignBits is correct).

@goldsteinn
Contributor


You could use the:

  // (A & 2^C1) + A => A & (2^C1 - 1) iff bit C1 in A is a sign bit
  if (match(&I, m_c_Add(m_And(m_Value(A), m_APInt(C1)), m_Deferred(A))) &&
      C1->isPowerOf2() && (ComputeNumSignBits(A) > C1->countl_zero())) {
    Constant *NewMask = ConstantInt::get(RHS->getType(), *C1 - 1);
    return BinaryOperator::CreateAnd(A, NewMask);
  }

fold and be precise (although typically I would do this with an nsw flag on the shl, but that doesn't really work here).

" %A = shl i32 %c, 30\n"
" ret i32 %A\n"
"}\n");
EXPECT_EQ(ComputeNumSignBits(A, M->getDataLayout()), 1u);
Contributor

Probably should have a test (here or in lit) where ShAmt < 24

@bjope bjope force-pushed the valuetracking_shl branch from 112ee87 to 6506043 Compare July 4, 2024 15:18
@bjope
Copy link
Collaborator Author

bjope commented Jul 4, 2024

Replaced the unit test case with a lit-based test (thanks to @goldsteinn for the great tip about the and+add fold!).

dtcxzyw added a commit to dtcxzyw/llvm-opt-benchmark that referenced this pull request Jul 4, 2024
bjope added a commit to bjope/llvm-project that referenced this pull request Jul 16, 2024
When trying to improve value tracking in
   llvm#97693
some regressions were found due to a "weirdness" in simplify demanded
use bits for ashr. Normally an ashr is replaced by lshr when the
shifted-in bits aren't demanded. Some years ago (see commit
22178dd) there was a test case motivating keeping the ashr
when any sign bit (besides the shifted-in bits) was demanded. The
weird part is that the better we get at analysing known sign
bits, the less likely it is that we canonicalize from ashr to lshr.
That makes it hard to tune other combines to work based on the
canonicalization, and can also result in unexpected regressions
when improving value tracking.

This patch adds a test case for which it would be better to
canonicalize ashr into lshr when possible. Worth mentioning is also
that reverting 22178dd doesn't seem to cause regressions
in any other lit tests (not even the one added in 22178dd).
bjope added a commit to bjope/llvm-project that referenced this pull request Jul 16, 2024
The functional change here is to undo "llvm-svn: 311773", aka D36936,
aka commit 22178dd. That patch avoided converting AShr
into LShr in SimplifyDemandedUseBits based on known-sign-bits
analysis. Even if it would be legal to turn the shift into a
logical shift (given that the shifted-in bits weren't demanded),
that patch prevented converting the shift into LShr when any of
the original sign bits were demanded.
One side effect of the reverted functionality was that the better
we were at computing the number of sign bits, the less likely it
was that we would replace AShr by LShr during
SimplifyDemandedUseBits. This was seen in
llvm#97693 when an improvement of ComputeNumSignBits resulted in
regressions due to no longer rewriting AShr to LShr.
The test case from D36936 still passes after this commit. So it
seems like at least the compiler has been taught how to optimize
that scenario even if we do the AShr->LShr transform more
aggressively.
bjope added a commit that referenced this pull request Jul 18, 2024
bjope added a commit to bjope/llvm-project that referenced this pull request Jul 18, 2024
bjope added a commit to bjope/llvm-project that referenced this pull request Jul 18, 2024
@bjope bjope force-pushed the valuetracking_shl branch from 6506043 to 0deccda Compare July 18, 2024 10:11
dtcxzyw added a commit to dtcxzyw/llvm-opt-benchmark that referenced this pull request Jul 18, 2024
Copy link
Contributor

@nikic nikic left a comment

LGTM

Copy link
Member

@dtcxzyw dtcxzyw left a comment

LGTM. Thank you!

@bjope bjope force-pushed the valuetracking_shl branch from 0deccda to 098bd84 Compare July 19, 2024 10:45
@bjope bjope merged commit 098bd84 into llvm:main Jul 19, 2024
3 of 7 checks passed
yuxuanchen1997 pushed a commit that referenced this pull request Jul 25, 2024
yuxuanchen1997 pushed a commit that referenced this pull request Jul 25, 2024
yuxuanchen1997 pushed a commit that referenced this pull request Jul 25, 2024