[InstCombine] fold unsigned predicates on srem result #122520

jacobly0 · 2025-01-10T20:05:08Z

This allows optimization of more signed floor implementations when the divisor is a known power of two to an arithmetic shift.

Proof for the implemented optimizations:
https://alive2.llvm.org/ce/z/j6C-Nz

Proof for the test cases:
https://alive2.llvm.org/ce/z/M_PBjw

llvmbot · 2025-01-10T20:05:44Z

@llvm/pr-subscribers-llvm-transforms

Author: Jacob Young (jacobly0)

Changes

This allows optimization of more signed floor implementations when the divisor is a known power of two to an arithmetic shift.

Proof for the implemented optimizations:
https://alive2.llvm.org/ce/z/UtdPgv

Proof for the test cases:
https://alive2.llvm.org/ce/z/M_PBjw

Full diff: https://github.com/llvm/llvm-project/pull/122520.diff

2 Files Affected:

(modified) llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp (+18-4)
(modified) llvm/test/Transforms/InstCombine/add.ll (+26)

diff --git a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
index 2e457257599493..16cd4e847a3ad4 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
@@ -2679,7 +2679,8 @@ Instruction *InstCombinerImpl::foldICmpSRemConstant(ICmpInst &Cmp,
   // (X % pow2C) sgt/slt 0
   const ICmpInst::Predicate Pred = Cmp.getPredicate();
   if (Pred != ICmpInst::ICMP_SGT && Pred != ICmpInst::ICMP_SLT &&
-      Pred != ICmpInst::ICMP_EQ && Pred != ICmpInst::ICMP_NE)
+      Pred != ICmpInst::ICMP_EQ && Pred != ICmpInst::ICMP_NE &&
+      Pred != ICmpInst::ICMP_UGT && Pred != ICmpInst::ICMP_ULT)
     return nullptr;
 
   // TODO: The one-use check is standard because we do not typically want to
@@ -2692,23 +2693,36 @@ Instruction *InstCombinerImpl::foldICmpSRemConstant(ICmpInst &Cmp,
   if (!match(SRem->getOperand(1), m_Power2(DivisorC)))
     return nullptr;
 
+  Type *Ty = SRem->getType();
+  APInt SignMask = APInt::getSignMask(Ty->getScalarSizeInBits());
+  // Setting the lsb instead of adding one properly handles i1.
+  APInt SignMaskOrOne = SignMask | 1;
+
   // For cmp_sgt/cmp_slt only zero valued C is handled.
   // For cmp_eq/cmp_ne only positive valued C is handled.
+  // For cmp_ugt only signed min/max valued C is handled.
+  // For cmp_ult only signed min | 0/1 valued C is handled.
   if (((Pred == ICmpInst::ICMP_SGT || Pred == ICmpInst::ICMP_SLT) &&
        !C.isZero()) ||
       ((Pred == ICmpInst::ICMP_EQ || Pred == ICmpInst::ICMP_NE) &&
-       !C.isStrictlyPositive()))
+       !C.isStrictlyPositive()) ||
+      (Pred == ICmpInst::ICMP_UGT && !C.isMinSignedValue() &&
+       !C.isMaxSignedValue()) ||
+      (Pred == ICmpInst::ICMP_ULT && !C.isMinSignedValue() &&
+       C != SignMaskOrOne))
     return nullptr;
 
   // Mask off the sign bit and the modulo bits (low-bits).
-  Type *Ty = SRem->getType();
-  APInt SignMask = APInt::getSignMask(Ty->getScalarSizeInBits());
   Constant *MaskC = ConstantInt::get(Ty, SignMask | (*DivisorC - 1));
   Value *And = Builder.CreateAnd(SRem->getOperand(0), MaskC);
 
   if (Pred == ICmpInst::ICMP_EQ || Pred == ICmpInst::ICMP_NE)
     return new ICmpInst(Pred, And, ConstantInt::get(Ty, C));
 
+  if (Pred == ICmpInst::ICMP_ULT)
+    return new ICmpInst(ICmpInst::ICMP_ULT, And,
+                        ConstantInt::get(Ty, SignMaskOrOne));
+
   // For 'is positive?' check that the sign-bit is clear and at least 1 masked
   // bit is set. Example:
   // (i8 X % 32) s> 0 --> (X & 159) s> 0
diff --git a/llvm/test/Transforms/InstCombine/add.ll b/llvm/test/Transforms/InstCombine/add.ll
index 222f87fa3a5f18..495f99824652d6 100644
--- a/llvm/test/Transforms/InstCombine/add.ll
+++ b/llvm/test/Transforms/InstCombine/add.ll
@@ -3018,6 +3018,32 @@ define i32 @floor_sdiv_wrong_op(i32 %x, i32 %y) {
   ret i32 %r
 }
 
+define i32 @floor_sdiv_using_srem_by_8(i32 %x) {
+; CHECK-LABEL: @floor_sdiv_using_srem_by_8(
+; CHECK-NEXT:    [[F:%.*]] = ashr i32 [[X:%.*]], 3
+; CHECK-NEXT:    ret i32 [[F]]
+;
+  %d = sdiv i32 %x, 8
+  %r = srem i32 %x, 8
+  %i = icmp ugt i32 %r, -2147483648
+  %s = sext i1 %i to i32
+  %f = add i32 %d, %s
+  ret i32 %f
+}
+
+define i32 @floor_sdiv_using_srem_by_2(i32 %x) {
+; CHECK-LABEL: @floor_sdiv_using_srem_by_2(
+; CHECK-NEXT:    [[F:%.*]] = ashr i32 [[X:%.*]], 1
+; CHECK-NEXT:    ret i32 [[F]]
+;
+  %d = sdiv i32 %x, 2
+  %r = srem i32 %x, 2
+  %i = icmp ugt i32 %r, -2147483648
+  %s = sext i1 %i to i32
+  %f = add i32 %d, %s
+  ret i32 %f
+}
+
 ; (X s>> (BW - 1)) + (zext (X s> 0)) --> (X s>> (BW - 1)) | (zext (X != 0))
 
 define i8 @signum_i8_i8(i8 %x) {

github-actions · 2025-01-10T20:06:49Z

⚠️ We detected that you are using a GitHub private e-mail address to contribute to the repo.
Please turn off Keep my email addresses private setting in your account.
See LLVM Discourse for more information.

dtcxzyw

I think this optimization is too specific.
For your motivating case, we can simply convert icmp ugt (srem i32 %x, 8), -2147483648 into icmp slt (srem i32 %x, 8), 0. Then InstCombine will convert the remain expression into ashr X, 3: https://godbolt.org/z/Y7K5hd7c3

Generalized proof: https://alive2.llvm.org/ce/z/gST2A5

jacobly0 · 2025-01-11T12:24:55Z

I think this optimization is too specific. For your motivating case, we can simply convert icmp ugt (srem i32 %x, 8), -2147483648 into icmp slt (srem i32 %x, 8), 0. Then InstCombine will convert the remain expression into ashr X, 3: https://godbolt.org/z/Y7K5hd7c3

Generalized proof: https://alive2.llvm.org/ce/z/gST2A5

Thanks for the hint, it should now be generalized to any comparison constant outside of the srem result range.

dtcxzyw

Please add some negative tests.

llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp

dtcxzyw

LGTM. Thank you!
Please wait for additional approval from other reviewers :)

llvm/test/Transforms/InstCombine/icmp-srem.ll

llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp

This allows optimization of more signed floor implementations when the divisor is a known power of two to an arithmetic shift. Proof for the implemented optimizations: https://alive2.llvm.org/ce/z/j6C-Nz Proof for the test cases: https://alive2.llvm.org/ce/z/M_PBjw

goldsteinn · 2025-01-12T16:40:50Z

LGTM

This is a generalization of llvm#122520 to other instructions with constrained result ranges.

jacobly0 requested a review from nikic as a code owner January 10, 2025 20:05

llvmbot added llvm:instcombine Covers the InstCombine, InstSimplify and AggressiveInstCombine passes llvm:transforms labels Jan 10, 2025

This was referenced Jan 11, 2025

Task submission dtcxzyw/llvm-opt-benchmark#1312

Open

pre-commit: PR122520 dtcxzyw/llvm-opt-benchmark#1948

Closed

dtcxzyw requested changes Jan 11, 2025

View reviewed changes

jacobly0 force-pushed the floor-sdiv branch 2 times, most recently from f98ced2 to ccc9c7d Compare January 11, 2025 12:21

jacobly0 requested a review from dtcxzyw January 11, 2025 12:22

jacobly0 force-pushed the floor-sdiv branch from ccc9c7d to 09a08b2 Compare January 11, 2025 13:03

This was referenced Jan 11, 2025

Fuzz PR122520 dtcxzyw/llvm-mutation-based-fuzz-service#22

Closed

pre-commit: PR122520 dtcxzyw/llvm-opt-benchmark#1952

Closed

dtcxzyw reviewed Jan 12, 2025

View reviewed changes

llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp Outdated Show resolved Hide resolved

llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp Show resolved Hide resolved

jacobly0 force-pushed the floor-sdiv branch 2 times, most recently from 9b77759 to e15c88e Compare January 12, 2025 12:47

dtcxzyw approved these changes Jan 12, 2025

View reviewed changes

llvm/test/Transforms/InstCombine/icmp-srem.ll Outdated Show resolved Hide resolved

dtcxzyw requested a review from goldsteinn January 12, 2025 12:55

[InstCombine][NFC] precommit tests for signed floor division

f9ddc44

jacobly0 force-pushed the floor-sdiv branch from e15c88e to 92661e0 Compare January 12, 2025 15:28

goldsteinn reviewed Jan 12, 2025

View reviewed changes

llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp Outdated Show resolved Hide resolved

jacobly0 force-pushed the floor-sdiv branch from 92661e0 to d13aae7 Compare January 12, 2025 16:30

jacobly0 requested a review from goldsteinn January 12, 2025 16:37

jacobly0 added a commit to jacobly0/llvm-project that referenced this pull request Jan 14, 2025

[InstCombine] canonicalize sign bit checks

4ae0905

This is a generalization of llvm#122520 to other instructions with constrained result ranges.

jacobly0 mentioned this pull request Jan 14, 2025

[InstCombine] canonicalize sign bit checks #122962

Closed

jacobly0 added a commit to jacobly0/llvm-project that referenced this pull request Jan 15, 2025

[InstCombine] canonicalize sign bit checks

32ab6ee

This is a generalization of llvm#122520 to other instructions with constrained result ranges.

andrewrk merged commit bb59eb8 into llvm:main Jan 18, 2025
8 checks passed

jacobly0 deleted the floor-sdiv branch January 18, 2025 22:08

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[InstCombine] fold unsigned predicates on srem result #122520

[InstCombine] fold unsigned predicates on srem result #122520

Uh oh!

jacobly0 commented Jan 10, 2025 •

edited

Loading

Uh oh!

llvmbot commented Jan 10, 2025

Uh oh!

github-actions bot commented Jan 10, 2025

Uh oh!

dtcxzyw left a comment

Uh oh!

jacobly0 commented Jan 11, 2025 •

edited

Loading

Uh oh!

dtcxzyw left a comment

Uh oh!

Uh oh!

Uh oh!

dtcxzyw left a comment

Uh oh!

Uh oh!

Uh oh!

goldsteinn commented Jan 12, 2025

Uh oh!

Uh oh!

Uh oh!

[InstCombine] fold unsigned predicates on srem result #122520

[InstCombine] fold unsigned predicates on srem result #122520

Uh oh!

Conversation

jacobly0 commented Jan 10, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented Jan 10, 2025

Uh oh!

github-actions bot commented Jan 10, 2025

Uh oh!

dtcxzyw left a comment

Choose a reason for hiding this comment

Uh oh!

jacobly0 commented Jan 11, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dtcxzyw left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

dtcxzyw left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

goldsteinn commented Jan 12, 2025

Uh oh!

Uh oh!

Uh oh!

jacobly0 commented Jan 10, 2025 •

edited

Loading

jacobly0 commented Jan 11, 2025 •

edited

Loading