Skip to content

[ConstraintElimination] Add eq/ne facts to signed constraint system #121423

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jan 23, 2025

Conversation

zsrkmyn
Copy link
Member

@zsrkmyn zsrkmyn commented Jan 1, 2025

Facts of eq/ne were added to unsigned system only, causing some missing
optimizations. This patch adds eq/ne facts to both signed & unsigned
constraint system.

Fixes #117961.

@llvmbot
Copy link
Member

llvmbot commented Jan 2, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Stephen Senran Zhang (zsrkmyn)

Changes

Facts of eq/ne were added to unsigned system only, causing some missing
optimizations. This patch adds eq/ne facts to both signed & unsigned
constraint system.

Fixes #117961.


Full diff: https://github.com/llvm/llvm-project/pull/121423.diff

4 Files Affected:

  • (modified) llvm/lib/Transforms/Scalar/ConstraintElimination.cpp (+31-7)
  • (modified) llvm/test/Transforms/ConstraintElimination/ne.ll (+2-4)
  • (modified) llvm/test/Transforms/ConstraintElimination/pr105785.ll (+1-2)
  • (added) llvm/test/Transforms/ConstraintElimination/pr117961.ll (+22)
diff --git a/llvm/lib/Transforms/Scalar/ConstraintElimination.cpp b/llvm/lib/Transforms/Scalar/ConstraintElimination.cpp
index 91a3c3f0d392a1..1c086d01a18c4c 100644
--- a/llvm/lib/Transforms/Scalar/ConstraintElimination.cpp
+++ b/llvm/lib/Transforms/Scalar/ConstraintElimination.cpp
@@ -313,7 +313,8 @@ class ConstraintInfo {
   /// New variables that need to be added to the system are collected in
   /// \p NewVariables.
   ConstraintTy getConstraint(CmpInst::Predicate Pred, Value *Op0, Value *Op1,
-                             SmallVectorImpl<Value *> &NewVariables) const;
+                             SmallVectorImpl<Value *> &NewVariables,
+                             bool ForceSignedSystem = false) const;
 
   /// Turns a comparison of the form \p Op0 \p Pred \p Op1 into a vector of
   /// constraints using getConstraint. Returns an empty constraint if the result
@@ -330,6 +331,14 @@ class ConstraintInfo {
   void transferToOtherSystem(CmpInst::Predicate Pred, Value *A, Value *B,
                              unsigned NumIn, unsigned NumOut,
                              SmallVectorImpl<StackEntry> &DFSInStack);
+
+private:
+  /// Adds facts into constraint system. \p ForceSignedSystem can be set when
+  /// the \p Pred is eq/ne, and signed constraint system is used when it's
+  /// specified.
+  void addFactImpl(CmpInst::Predicate Pred, Value *A, Value *B, unsigned NumIn,
+                   unsigned NumOut, SmallVectorImpl<StackEntry> &DFSInStack,
+                   bool ForceSignedSystem);
 };
 
 /// Represents a (Coefficient * Variable) entry after IR decomposition.
@@ -636,8 +645,13 @@ static Decomposition decompose(Value *V,
 
 ConstraintTy
 ConstraintInfo::getConstraint(CmpInst::Predicate Pred, Value *Op0, Value *Op1,
-                              SmallVectorImpl<Value *> &NewVariables) const {
+                              SmallVectorImpl<Value *> &NewVariables,
+                              bool ForceSignedSystem) const {
   assert(NewVariables.empty() && "NewVariables must be empty when passed in");
+  assert((!ForceSignedSystem || Pred == CmpInst::ICMP_EQ ||
+          Pred == CmpInst::ICMP_NE) &&
+         "signed system can only be forced on eq/ne");
+
   bool IsEq = false;
   bool IsNe = false;
 
@@ -652,7 +666,7 @@ ConstraintInfo::getConstraint(CmpInst::Predicate Pred, Value *Op0, Value *Op1,
     break;
   }
   case CmpInst::ICMP_EQ:
-    if (match(Op1, m_Zero())) {
+    if (!ForceSignedSystem && match(Op1, m_Zero())) {
       Pred = CmpInst::ICMP_ULE;
     } else {
       IsEq = true;
@@ -660,7 +674,7 @@ ConstraintInfo::getConstraint(CmpInst::Predicate Pred, Value *Op0, Value *Op1,
     }
     break;
   case CmpInst::ICMP_NE:
-    if (match(Op1, m_Zero())) {
+    if (!ForceSignedSystem && match(Op1, m_Zero())) {
       Pred = CmpInst::getSwappedPredicate(CmpInst::ICMP_UGT);
       std::swap(Op0, Op1);
     } else {
@@ -677,7 +691,7 @@ ConstraintInfo::getConstraint(CmpInst::Predicate Pred, Value *Op0, Value *Op1,
     return {};
 
   SmallVector<ConditionTy, 4> Preconditions;
-  bool IsSigned = CmpInst::isSigned(Pred);
+  bool IsSigned = ForceSignedSystem || CmpInst::isSigned(Pred);
   auto &Value2Index = getValue2Index(IsSigned);
   auto ADec = decompose(Op0->stripPointerCastsSameRepresentation(),
                         Preconditions, IsSigned, DL);
@@ -737,7 +751,7 @@ ConstraintInfo::getConstraint(CmpInst::Predicate Pred, Value *Op0, Value *Op1,
   int64_t OffsetSum;
   if (AddOverflow(Offset1, Offset2, OffsetSum))
     return {};
-  if (Pred == (IsSigned ? CmpInst::ICMP_SLT : CmpInst::ICMP_ULT))
+  if (Pred == CmpInst::ICMP_SLT || Pred == CmpInst::ICMP_ULT)
     if (AddOverflow(OffsetSum, int64_t(-1), OffsetSum))
       return {};
   R[0] = OffsetSum;
@@ -1580,10 +1594,20 @@ static bool checkOrAndOpImpliedByOther(
 void ConstraintInfo::addFact(CmpInst::Predicate Pred, Value *A, Value *B,
                              unsigned NumIn, unsigned NumOut,
                              SmallVectorImpl<StackEntry> &DFSInStack) {
+  addFactImpl(Pred, A, B, NumIn, NumOut, DFSInStack, false);
+  // If the Pred is eq/ne, also add the fact to signed system.
+  if (Pred == ICmpInst::ICMP_EQ || Pred == ICmpInst::ICMP_NE)
+    addFactImpl(Pred, A, B, NumIn, NumOut, DFSInStack, true);
+}
+
+void ConstraintInfo::addFactImpl(CmpInst::Predicate Pred, Value *A, Value *B,
+                                 unsigned NumIn, unsigned NumOut,
+                                 SmallVectorImpl<StackEntry> &DFSInStack,
+                                 bool ForceSignedSystem) {
   // If the constraint has a pre-condition, skip the constraint if it does not
   // hold.
   SmallVector<Value *> NewVariables;
-  auto R = getConstraint(Pred, A, B, NewVariables);
+  auto R = getConstraint(Pred, A, B, NewVariables, ForceSignedSystem);
 
   // TODO: Support non-equality for facts as well.
   if (!R.isValid(*this) || R.isNe())
diff --git a/llvm/test/Transforms/ConstraintElimination/ne.ll b/llvm/test/Transforms/ConstraintElimination/ne.ll
index 566e73dc8d626e..4753860db2851b 100644
--- a/llvm/test/Transforms/ConstraintElimination/ne.ll
+++ b/llvm/test/Transforms/ConstraintElimination/ne.ll
@@ -71,8 +71,7 @@ define i1 @test_ne_eq_0(i8 %a, i8 %b) {
 ; CHECK-NEXT:    [[RES_13:%.*]] = xor i1 [[RES_12]], false
 ; CHECK-NEXT:    [[RES_14:%.*]] = xor i1 [[RES_13]], false
 ; CHECK-NEXT:    [[RES_15:%.*]] = xor i1 [[RES_14]], false
-; CHECK-NEXT:    [[C_12:%.*]] = icmp sgt i8 [[A]], 0
-; CHECK-NEXT:    [[RES_16:%.*]] = xor i1 [[RES_15]], [[C_12]]
+; CHECK-NEXT:    [[RES_16:%.*]] = xor i1 [[RES_15]], false
 ; CHECK-NEXT:    ret i1 [[RES_16]]
 ;
 entry:
@@ -209,8 +208,7 @@ define i1 @test_ne_eq_1(i8 %a, i8 %b) {
 ; CHECK-NEXT:    [[RES_13:%.*]] = xor i1 [[RES_12]], true
 ; CHECK-NEXT:    [[RES_14:%.*]] = xor i1 [[RES_13]], true
 ; CHECK-NEXT:    [[RES_15:%.*]] = xor i1 [[RES_14]], false
-; CHECK-NEXT:    [[C_12:%.*]] = icmp sgt i8 [[A]], 0
-; CHECK-NEXT:    [[RES_16:%.*]] = xor i1 [[RES_15]], [[C_12]]
+; CHECK-NEXT:    [[RES_16:%.*]] = xor i1 [[RES_15]], true
 ; CHECK-NEXT:    ret i1 [[RES_16]]
 ;
 entry:
diff --git a/llvm/test/Transforms/ConstraintElimination/pr105785.ll b/llvm/test/Transforms/ConstraintElimination/pr105785.ll
index 6c340a11dd2e2c..83b7461720f09c 100644
--- a/llvm/test/Transforms/ConstraintElimination/pr105785.ll
+++ b/llvm/test/Transforms/ConstraintElimination/pr105785.ll
@@ -15,8 +15,7 @@ define void @pr105785(ptr %p) {
 ; CHECK-NEXT:    [[CMP2:%.*]] = icmp ult i32 [[FOR_IND2]], 3
 ; CHECK-NEXT:    br i1 [[CMP2]], label %[[FOR_BODY3]], label %[[FOR_COND]]
 ; CHECK:       [[FOR_BODY3]]:
-; CHECK-NEXT:    [[SCMP:%.*]] = call i32 @llvm.scmp.i32.i32(i32 [[FOR_IND]], i32 1)
-; CHECK-NEXT:    store i32 [[SCMP]], ptr [[P]], align 4
+; CHECK-NEXT:    store i32 -1, ptr [[P]], align 4
 ; CHECK-NEXT:    [[INC]] = add nuw nsw i32 [[FOR_IND2]], 1
 ; CHECK-NEXT:    br label %[[FOR_COND1]]
 ; CHECK:       [[FOR_END6]]:
diff --git a/llvm/test/Transforms/ConstraintElimination/pr117961.ll b/llvm/test/Transforms/ConstraintElimination/pr117961.ll
new file mode 100644
index 00000000000000..346196a04b0014
--- /dev/null
+++ b/llvm/test/Transforms/ConstraintElimination/pr117961.ll
@@ -0,0 +1,22 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -passes=constraint-elimination -S %s | FileCheck %s
+
+define i1 @f(i32 noundef %v0, i32 noundef %v1, i32 noundef %v2)  {
+; CHECK-LABEL: define i1 @f(
+; CHECK-SAME: i32 noundef [[V0:%.*]], i32 noundef [[V1:%.*]], i32 noundef [[V2:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[CMP:%.*]] = icmp eq i32 [[V2]], [[V0]]
+; CHECK-NEXT:    [[CMP1:%.*]] = icmp sge i32 [[V0]], [[V1]]
+; CHECK-NEXT:    [[AND0:%.*]] = and i1 [[CMP1]], [[CMP]]
+; CHECK-NEXT:    [[CMP4:%.*]] = icmp sgt i32 [[V1]], [[V2]]
+; CHECK-NEXT:    [[AND1:%.*]] = and i1 false, [[AND0]]
+; CHECK-NEXT:    ret i1 [[AND1]]
+;
+entry:
+  %cmp = icmp eq i32 %v2, %v0
+  %cmp1 = icmp sge i32 %v0, %v1
+  %and0 = and i1 %cmp1, %cmp
+  %cmp4 = icmp sgt i32 %v1, %v2
+  %and1 = and i1 %cmp4, %and0
+  ret i1 %and1
+}

; CHECK-LABEL: define i1 @f(
; CHECK-SAME: i32 noundef [[V0:%.*]], i32 noundef [[V1:%.*]], i32 noundef [[V2:%.*]]) {
; CHECK-NEXT: [[ENTRY:.*:]]
; CHECK-NEXT: [[CMP:%.*]] = icmp eq i32 [[V2]], [[V0]]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a test with icmp ne and or?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, addFact doesn't handle ne cases for now. See

if (!R.isValid(*this) || R.isNe())

I think the reason is that the constraint system now models all constraints as c0 * var0 + c1 * var1 + ... <= const, and for x eq y, and we can model it as x - y <= 0 + -x + y <= 0; but for ne, we cannot do so. @fhahn do I understand it right?

For the completeness of the patch, I still try to add ne into the signed system, and hope we'll support it in the future :-D But I'm fine to remove it if it's unnecessary.

@nikic nikic self-requested a review January 2, 2025 08:48
@antoniofrighetto antoniofrighetto self-requested a review January 3, 2025 15:05
@zsrkmyn zsrkmyn self-assigned this Jan 8, 2025
@zsrkmyn
Copy link
Member Author

zsrkmyn commented Jan 14, 2025

A soft ping.

Copy link
Member

@dtcxzyw dtcxzyw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thank you!

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please also add a test that shows that we use signed decomposition for the signed equality constraints?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Facts of eq/ne were added to unsigned system only, causing some missing
optimizations. This patch adds eq/ne facts to both signed & unsigned
constraint system.

Fixes llvm#117961.
@zsrkmyn
Copy link
Member Author

zsrkmyn commented Jan 16, 2025

Compile-time: http://llvm-compile-time-tracker.com/compare.php?from=193ea83dd7e879ddd4e3dfb1fa74a676b528e4a6&to=b4b2d9643dfef63550aecaea008ece92c549247f&stat=instructions:u

I'm thinking if we can only add facts to signed system only for eq, but not ne, at line 1598, to avoid unnecessary getConstraint & decompose for ne cases to save compiling time, as ne cannot be handled for signed system for now.

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@nikic nikic merged commit 7fb97be into llvm:main Jan 23, 2025
8 checks passed
@nikic
Copy link
Contributor

nikic commented Jan 23, 2025

I'm thinking if we can only add facts to signed system only for eq, but not ne, at line 1598, to avoid unnecessary getConstraint & decompose for ne cases to save compiling time, as ne cannot be handled for signed system for now.

An early exit for ne constraints where not supported makes sense to me. Do you plan to submit a patch for this?

@zsrkmyn
Copy link
Member Author

zsrkmyn commented Jan 23, 2025

An early exit for ne constraints where not supported makes sense to me. Do you plan to submit a patch for this?

Ok, I'll submit a patch for this tomorrow. :-)

@llvm-ci
Copy link
Collaborator

llvm-ci commented Jan 23, 2025

LLVM Buildbot has detected a new failure on builder openmp-offload-libc-amdgpu-runtime running on omp-vega20-1 while building llvm at step 7 "Add check check-offload".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/73/builds/12259

Here is the relevant piece of the build log for the reference
Step 7 (Add check check-offload) failure: test (failure)
******************** TEST 'libomptarget :: amdgcn-amd-amdhsa :: offloading/pgo1.c' FAILED ********************
Exit Code: 1

Command Output (stdout):
--
# RUN: at line 1
/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./bin/clang -fopenmp    -I /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test -I /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -L /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib -L /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src  -nogpulib -Wl,-rpath,/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -Wl,-rpath,/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib  -fopenmp-targets=amdgcn-amd-amdhsa /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test/offloading/pgo1.c -o /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/offloading/Output/pgo1.c.tmp -Xoffload-linker -lc -Xoffload-linker -lm /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib/libomptarget.devicertl.a -fprofile-instr-generate      -Xclang "-fprofile-instrument=clang"
# executed command: /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./bin/clang -fopenmp -I /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test -I /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -L /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib -L /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -nogpulib -Wl,-rpath,/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -Wl,-rpath,/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib -fopenmp-targets=amdgcn-amd-amdhsa /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test/offloading/pgo1.c -o /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/offloading/Output/pgo1.c.tmp -Xoffload-linker -lc -Xoffload-linker -lm /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib/libomptarget.devicertl.a -fprofile-instr-generate -Xclang -fprofile-instrument=clang
# RUN: at line 3
/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/offloading/Output/pgo1.c.tmp 2>&1 | /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./bin/FileCheck /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test/offloading/pgo1.c      --check-prefix="CLANG-PGO"
# executed command: /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/offloading/Output/pgo1.c.tmp
# executed command: /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./bin/FileCheck /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test/offloading/pgo1.c --check-prefix=CLANG-PGO
# .---command stderr------------
# | /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test/offloading/pgo1.c:32:20: error: CLANG-PGO-NEXT: expected string not found in input
# | // CLANG-PGO-NEXT: [ 0 11 20 ]
# |                    ^
# | <stdin>:3:28: note: scanning from here
# | ======== Counters =========
# |                            ^
# | <stdin>:4:1: note: possible intended match here
# | [ 0 12 20 ]
# | ^
# | 
# | Input file: <stdin>
# | Check file: /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test/offloading/pgo1.c
# | 
# | -dump-input=help explains the following input dump.
# | 
# | Input was:
# | <<<<<<
# |            1: ======= GPU Profile ======= 
# |            2: Target: amdgcn-amd-amdhsa 
# |            3: ======== Counters ========= 
# | next:32'0                                X error: no match found
# |            4: [ 0 12 20 ] 
# | next:32'0     ~~~~~~~~~~~~
# | next:32'1     ?            possible intended match
# |            5: [ 10 ] 
# | next:32'0     ~~~~~~~
# |            6: [ 20 ] 
# | next:32'0     ~~~~~~~
# |            7: ========== Data =========== 
# | next:32'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |            8: { 9515997471539760012 4749112401 0xffffffffffffffd8 0x0 0x0 0x0 3 [...] 0 } 
# | next:32'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |            9: { 3666282617048535130 24 0xffffffffffffffb0 0x0 0x0 0x0 1 [...] 0 } 
# | next:32'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |            .
# |            .
# |            .
...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

constraint elimination missing a case related to != and ==
5 participants