Skip to content

[SCCP] Infer return attributes in SCCP as well #106732

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Sep 2, 2024
Merged

Conversation

nikic
Copy link
Contributor

@nikic nikic commented Aug 30, 2024

We can infer the range/nonnull attributes in non-interprocedural SCCP as well. The results may be better after the function has been simplified.

We can infer the range/nonnull attributes in non-interprocedural
SCCP as well. The results may be better after the function has
been simplified.
@llvmbot
Copy link
Member

llvmbot commented Aug 30, 2024

@llvm/pr-subscribers-clang
@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-function-specialization

Author: Nikita Popov (nikic)

Changes

We can infer the range/nonnull attributes in non-interprocedural SCCP as well. The results may be better after the function has been simplified.


Full diff: https://github.com/llvm/llvm-project/pull/106732.diff

8 Files Affected:

  • (modified) llvm/include/llvm/Transforms/Utils/SCCPSolver.h (+3-1)
  • (modified) llvm/lib/Transforms/IPO/SCCP.cpp (+3-25)
  • (modified) llvm/lib/Transforms/Scalar/SCCP.cpp (+8)
  • (modified) llvm/lib/Transforms/Utils/SCCPSolver.cpp (+31-1)
  • (modified) llvm/test/Transforms/PhaseOrdering/icmp-ashr-breaking-select-idiom.ll (+2-2)
  • (modified) llvm/test/Transforms/SCCP/exact-flags.ll (+2-2)
  • (modified) llvm/test/Transforms/SCCP/phis.ll (+2-2)
  • (modified) llvm/test/Transforms/SCCP/pointer-nonnull.ll (+3-7)
diff --git a/llvm/include/llvm/Transforms/Utils/SCCPSolver.h b/llvm/include/llvm/Transforms/Utils/SCCPSolver.h
index 1f959311295258..61a500b82875fb 100644
--- a/llvm/include/llvm/Transforms/Utils/SCCPSolver.h
+++ b/llvm/include/llvm/Transforms/Utils/SCCPSolver.h
@@ -137,7 +137,7 @@ class SCCPSolver {
   const ValueLatticeElement &getLatticeValueFor(Value *V) const;
 
   /// getTrackedRetVals - Get the inferred return value map.
-  const MapVector<Function *, ValueLatticeElement> &getTrackedRetVals();
+  const MapVector<Function *, ValueLatticeElement> &getTrackedRetVals() const;
 
   /// getTrackedGlobals - Get and return the set of inferred initializers for
   /// global variables.
@@ -190,6 +190,8 @@ class SCCPSolver {
   bool removeNonFeasibleEdges(BasicBlock *BB, DomTreeUpdater &DTU,
                               BasicBlock *&NewUnreachableBB) const;
 
+  void inferReturnAttributes() const;
+
   bool tryToReplaceWithConstant(Value *V);
 
   // Helper to check if \p LV is either a constant or a constant
diff --git a/llvm/lib/Transforms/IPO/SCCP.cpp b/llvm/lib/Transforms/IPO/SCCP.cpp
index 5ef08c4a2d725d..cbf511c53d29da 100644
--- a/llvm/lib/Transforms/IPO/SCCP.cpp
+++ b/llvm/lib/Transforms/IPO/SCCP.cpp
@@ -277,34 +277,12 @@ static bool runIPSCCP(
   // whether other functions are optimizable.
   SmallVector<ReturnInst*, 8> ReturnsToZap;
 
+  Solver.inferReturnAttributes();
   for (const auto &I : Solver.getTrackedRetVals()) {
     Function *F = I.first;
     const ValueLatticeElement &ReturnValue = I.second;
-
-    // If there is a known constant range for the return value, add range
-    // attribute to the return value.
-    if (ReturnValue.isConstantRange() &&
-        !ReturnValue.getConstantRange().isSingleElement()) {
-      // Do not add range metadata if the return value may include undef.
-      if (ReturnValue.isConstantRangeIncludingUndef())
-        continue;
-
-      // Take the intersection of the existing attribute and the inferred range.
-      ConstantRange CR = ReturnValue.getConstantRange();
-      if (F->hasRetAttribute(Attribute::Range))
-        CR = CR.intersectWith(F->getRetAttribute(Attribute::Range).getRange());
-      F->addRangeRetAttr(CR);
-      continue;
-    }
-    // Infer nonnull return attribute.
-    if (F->getReturnType()->isPointerTy() && ReturnValue.isNotConstant() &&
-        ReturnValue.getNotConstant()->isNullValue() &&
-        !F->hasRetAttribute(Attribute::NonNull)) {
-      F->addRetAttr(Attribute::NonNull);
-      continue;
-    }
-    if (F->getReturnType()->isVoidTy())
-      continue;
+    assert(!F->getReturnType()->isVoidTy() &&
+           "should not track void functions");
     if (SCCPSolver::isConstant(ReturnValue) || ReturnValue.isUnknownOrUndef())
       findReturnsToZap(*F, ReturnsToZap, Solver);
   }
diff --git a/llvm/lib/Transforms/Scalar/SCCP.cpp b/llvm/lib/Transforms/Scalar/SCCP.cpp
index caf9f890418e29..0330460e7df8ab 100644
--- a/llvm/lib/Transforms/Scalar/SCCP.cpp
+++ b/llvm/lib/Transforms/Scalar/SCCP.cpp
@@ -24,6 +24,7 @@
 #include "llvm/Analysis/DomTreeUpdater.h"
 #include "llvm/Analysis/GlobalsModRef.h"
 #include "llvm/Analysis/TargetLibraryInfo.h"
+#include "llvm/Analysis/ValueLatticeUtils.h"
 #include "llvm/Analysis/ValueTracking.h"
 #include "llvm/IR/BasicBlock.h"
 #include "llvm/IR/Constant.h"
@@ -66,6 +67,11 @@ static bool runSCCP(Function &F, const DataLayout &DL,
       DL, [TLI](Function &F) -> const TargetLibraryInfo & { return *TLI; },
       F.getContext());
 
+  // While we don't do any actual inter-procedural analysis, still track
+  // return values so we can infer attributes.
+  if (canTrackReturnsInterprocedurally(&F))
+    Solver.addTrackedFunction(&F);
+
   // Mark the first block of the function as being executable.
   Solver.markBlockExecutable(&F.front());
 
@@ -115,6 +121,8 @@ static bool runSCCP(Function &F, const DataLayout &DL,
     if (!DeadBB->hasAddressTaken())
       DTU.deleteBB(DeadBB);
 
+  Solver.inferReturnAttributes();
+
   return MadeChanges;
 }
 
diff --git a/llvm/lib/Transforms/Utils/SCCPSolver.cpp b/llvm/lib/Transforms/Utils/SCCPSolver.cpp
index 59775d2199ca61..5cb0fc3ec5d633 100644
--- a/llvm/lib/Transforms/Utils/SCCPSolver.cpp
+++ b/llvm/lib/Transforms/Utils/SCCPSolver.cpp
@@ -354,6 +354,36 @@ bool SCCPSolver::removeNonFeasibleEdges(BasicBlock *BB, DomTreeUpdater &DTU,
   return true;
 }
 
+void SCCPSolver::inferReturnAttributes() const {
+  for (const auto &I : getTrackedRetVals()) {
+    Function *F = I.first;
+    const ValueLatticeElement &ReturnValue = I.second;
+
+    // If there is a known constant range for the return value, add range
+    // attribute to the return value.
+    if (ReturnValue.isConstantRange() &&
+        !ReturnValue.getConstantRange().isSingleElement()) {
+      // Do not add range metadata if the return value may include undef.
+      if (ReturnValue.isConstantRangeIncludingUndef())
+        continue;
+
+      // Take the intersection of the existing attribute and the inferred range.
+      ConstantRange CR = ReturnValue.getConstantRange();
+      if (F->hasRetAttribute(Attribute::Range))
+        CR = CR.intersectWith(F->getRetAttribute(Attribute::Range).getRange());
+      F->addRangeRetAttr(CR);
+      continue;
+    }
+    // Infer nonnull return attribute.
+    if (F->getReturnType()->isPointerTy() && ReturnValue.isNotConstant() &&
+        ReturnValue.getNotConstant()->isNullValue() &&
+        !F->hasRetAttribute(Attribute::NonNull)) {
+      F->addRetAttr(Attribute::NonNull);
+      continue;
+    }
+  }
+}
+
 /// Helper class for SCCPSolver. This implements the instruction visitor and
 /// holds all the state.
 class SCCPInstVisitor : public InstVisitor<SCCPInstVisitor> {
@@ -2168,7 +2198,7 @@ const ValueLatticeElement &SCCPSolver::getLatticeValueFor(Value *V) const {
 }
 
 const MapVector<Function *, ValueLatticeElement> &
-SCCPSolver::getTrackedRetVals() {
+SCCPSolver::getTrackedRetVals() const {
   return Visitor->getTrackedRetVals();
 }
 
diff --git a/llvm/test/Transforms/PhaseOrdering/icmp-ashr-breaking-select-idiom.ll b/llvm/test/Transforms/PhaseOrdering/icmp-ashr-breaking-select-idiom.ll
index 35d5ceeb91950f..871615dbd62852 100644
--- a/llvm/test/Transforms/PhaseOrdering/icmp-ashr-breaking-select-idiom.ll
+++ b/llvm/test/Transforms/PhaseOrdering/icmp-ashr-breaking-select-idiom.ll
@@ -2,7 +2,7 @@
 ; RUN: opt -O1 -S < %s  | FileCheck %s
 
 define i32 @testa(i32 %mul) {
-; CHECK-LABEL: define range(i32 -65536, 65536) i32 @testa(
+; CHECK-LABEL: define range(i32 -65536, 32768) i32 @testa(
 ; CHECK-SAME: i32 [[MUL:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] {
 ; CHECK-NEXT:    [[SHR:%.*]] = ashr i32 [[MUL]], 15
 ; CHECK-NEXT:    [[SPEC_SELECT_I:%.*]] = tail call i32 @llvm.smin.i32(i32 [[SHR]], i32 32767)
@@ -16,7 +16,7 @@ define i32 @testa(i32 %mul) {
 }
 
 define i32 @testb(i32 %mul) {
-; CHECK-LABEL: define range(i32 -16777216, 16777216) i32 @testb(
+; CHECK-LABEL: define range(i32 -128, 128) i32 @testb(
 ; CHECK-SAME: i32 [[MUL:%.*]]) local_unnamed_addr #[[ATTR0]] {
 ; CHECK-NEXT:    [[SHR102:%.*]] = ashr i32 [[MUL]], 7
 ; CHECK-NEXT:    [[TMP1:%.*]] = tail call i32 @llvm.smax.i32(i32 [[SHR102]], i32 -128)
diff --git a/llvm/test/Transforms/SCCP/exact-flags.ll b/llvm/test/Transforms/SCCP/exact-flags.ll
index a5e3bf111bbd9d..f860ddb6fe9cfb 100644
--- a/llvm/test/Transforms/SCCP/exact-flags.ll
+++ b/llvm/test/Transforms/SCCP/exact-flags.ll
@@ -2,7 +2,7 @@
 ; RUN: opt -passes=sccp < %s -S | FileCheck %s
 
 define i8 @ashr_to_lshr(i8 %x, i8 %y) {
-; CHECK-LABEL: define i8 @ashr_to_lshr(
+; CHECK-LABEL: define range(i8 0, -128) i8 @ashr_to_lshr(
 ; CHECK-SAME: i8 [[X:%.*]], i8 [[Y:%.*]]) {
 ; CHECK-NEXT:    [[P:%.*]] = and i8 [[X]], 127
 ; CHECK-NEXT:    [[R:%.*]] = lshr exact i8 [[P]], [[Y]]
@@ -14,7 +14,7 @@ define i8 @ashr_to_lshr(i8 %x, i8 %y) {
 }
 
 define i8 @sdiv_to_udiv(i8 %x, i8 %y) {
-; CHECK-LABEL: define i8 @sdiv_to_udiv(
+; CHECK-LABEL: define range(i8 0, -128) i8 @sdiv_to_udiv(
 ; CHECK-SAME: i8 [[X:%.*]], i8 [[Y:%.*]]) {
 ; CHECK-NEXT:    [[X1:%.*]] = and i8 [[X]], 127
 ; CHECK-NEXT:    [[Y1:%.*]] = and i8 [[Y]], 127
diff --git a/llvm/test/Transforms/SCCP/phis.ll b/llvm/test/Transforms/SCCP/phis.ll
index 9264a6eaefb85d..dae843ca955955 100644
--- a/llvm/test/Transforms/SCCP/phis.ll
+++ b/llvm/test/Transforms/SCCP/phis.ll
@@ -100,7 +100,7 @@ end:
 }
 
 define <2 x i16> @phi_vector_merge1(i1 %c, <2 x i8> %a) {
-; CHECK-LABEL: define <2 x i16> @phi_vector_merge1(
+; CHECK-LABEL: define range(i16 2, 259) <2 x i16> @phi_vector_merge1(
 ; CHECK-SAME: i1 [[C:%.*]], <2 x i8> [[A:%.*]]) {
 ; CHECK-NEXT:  [[ENTRY:.*]]:
 ; CHECK-NEXT:    [[ZEXT:%.*]] = zext <2 x i8> [[A]] to <2 x i16>
@@ -126,7 +126,7 @@ join:
 }
 
 define <2 x i16> @phi_vector_merge2(i1 %c, <2 x i8> %a) {
-; CHECK-LABEL: define <2 x i16> @phi_vector_merge2(
+; CHECK-LABEL: define range(i16 2, 259) <2 x i16> @phi_vector_merge2(
 ; CHECK-SAME: i1 [[C:%.*]], <2 x i8> [[A:%.*]]) {
 ; CHECK-NEXT:  [[ENTRY:.*]]:
 ; CHECK-NEXT:    [[ZEXT:%.*]] = zext <2 x i8> [[A]] to <2 x i16>
diff --git a/llvm/test/Transforms/SCCP/pointer-nonnull.ll b/llvm/test/Transforms/SCCP/pointer-nonnull.ll
index 08d4a76345bb63..c3a6a762e31744 100644
--- a/llvm/test/Transforms/SCCP/pointer-nonnull.ll
+++ b/llvm/test/Transforms/SCCP/pointer-nonnull.ll
@@ -232,13 +232,9 @@ define i1 @ip_test_nonnull_caller(ptr %p) {
 }
 
 define ptr @ret_nonnull_pointer(ptr nonnull %p) {
-; SCCP-LABEL: define ptr @ret_nonnull_pointer(
-; SCCP-SAME: ptr nonnull [[P:%.*]]) {
-; SCCP-NEXT:    ret ptr [[P]]
-;
-; IPSCCP-LABEL: define nonnull ptr @ret_nonnull_pointer(
-; IPSCCP-SAME: ptr nonnull [[P:%.*]]) {
-; IPSCCP-NEXT:    ret ptr [[P]]
+; CHECK-LABEL: define nonnull ptr @ret_nonnull_pointer(
+; CHECK-SAME: ptr nonnull [[P:%.*]]) {
+; CHECK-NEXT:    ret ptr [[P]]
 ;
   ret ptr %p
 }

@nikic
Copy link
Contributor Author

nikic commented Aug 30, 2024

This causes some compile-time regression on lencod with LTO (https://llvm-compile-time-tracker.com/compare.php?from=eaf87d32754beb5bec10bab517bf56e25575b48e&to=b03af0f9bc4e83de8ed78b55b1e0fc0abb9af24e&stat=instructions%3Au). I've seen a similar regression when testing a similar change in CVP (#99620), so it seems that the extra range information has some kind of negative compile-time effect. Unfortunately, I have not been able to reproduce it locally. callgrind on the link step only gives a pretty small difference in instructions.

dtcxzyw added a commit to dtcxzyw/llvm-opt-benchmark that referenced this pull request Aug 30, 2024
@llvmbot llvmbot added the clang Clang issues not falling into any other category label Aug 30, 2024
Copy link
Member

@dtcxzyw dtcxzyw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LG. After this patch, some redundant null checks/error handlings are eliminated :)

@dtcxzyw
Copy link
Member

dtcxzyw commented Aug 30, 2024

This causes some compile-time regression on lencod with LTO (https://llvm-compile-time-tracker.com/compare.php?from=eaf87d32754beb5bec10bab517bf56e25575b48e&to=b03af0f9bc4e83de8ed78b55b1e0fc0abb9af24e&stat=instructions%3Au). I've seen a similar regression when testing a similar change in CVP (#99620), so it seems that the extra range information has some kind of negative compile-time effect. Unfortunately, I have not been able to reproduce it locally. callgrind on the link step only gives a pretty small difference in instructions.

Compilation time impact looks acceptable. If you want to further reduce compile-time impact, you can remove/stop inferring unnecessary range attributes on intrinsic calls (e.g., call i32 range(i32 0, -2147483648) @llvm.abs(i32 %x)). These trivial cases are already handled by getRangeForIntrinsic.

if (F->hasRetAttribute(Attribute::Range))
CR = CR.intersectWith(F->getRetAttribute(Attribute::Range).getRange());
F->addRangeRetAttr(CR);
continue;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we ever want to infer nonnull in addition to range attr?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nonnull applies only to pointers, while range applies only to (vectors of) integers, so they can not apply at the same time.

// attribute to the return value.
if (ReturnValue.isConstantRange() &&
!ReturnValue.getConstantRange().isSingleElement()) {
// Do not add range metadata if the return value may include undef.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any tests for this.

Is the reason for this to avoid creating UB where we where just propagating poison beforehand? If so, do you also need noundef for propagating nonnull?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any tests for this.

This is pre-existing code moved to a different place.

Is the reason for this to avoid creating UB where we where just propagating poison beforehand? If so, do you also need noundef for propagating nonnull?

ConstantRangeIncludingUndef means ConstantRange|undef. In the undef case any value may be picked, so the result is a full range.

@nikic
Copy link
Contributor Author

nikic commented Aug 30, 2024

Compilation time impact looks acceptable. If you want to further reduce compile-time impact, you can remove/stop inferring unnecessary range attributes on intrinsic calls (e.g., call i32 range(i32 0, -2147483648) @llvm.abs(i32 %x)). These trivial cases are already handled by getRangeForIntrinsic.

SCCP only infers ranges on function definitions. I think the extra ranges on intrinsic calls are because we first infer a range on the calling function and then it gets inlined, which back-propagates the range to the intrinsic call.

@nikic nikic merged commit 24fe1d4 into llvm:main Sep 2, 2024
8 checks passed
@nikic nikic deleted the sccp-attr-infer branch September 2, 2024 09:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang Clang issues not falling into any other category function-specialization llvm:transforms
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants