Skip to content

[ArgPromotion] Perform alias analysis on actual arguments of Calls #106216

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Sep 27, 2024

Conversation

hazzlim
Copy link
Contributor

@hazzlim hazzlim commented Aug 27, 2024

Teach Argument Promotion to perform alias analysis on actual arguments of Calls to a Function, to try to prove that all Calls to the Function do not modify the memory pointed to by an argument. This surfaces more opportunities to perform Argument Promotion in cases where simply looking at a Function's instructions is insufficient to prove that the pointer argument is not invalidated before all loads from it.

@llvmbot
Copy link
Member

llvmbot commented Aug 27, 2024

@llvm/pr-subscribers-llvm-transforms

Author: Hari Limaye (hazzlim)

Changes

Teach Argument Promotion to perform alias analysis on actual arguments of Calls to a Function, to try to prove that all Calls to the Function do not modify the memory pointed to by an argument. This surfaces more opportunities to perform Argument Promotion in cases where simply looking at a Function's instructions is insufficient to prove that the pointer argument is not invalidated before all loads from it.


Full diff: https://github.com/llvm/llvm-project/pull/106216.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/IPO/ArgumentPromotion.cpp (+48-10)
  • (modified) llvm/test/Transforms/ArgumentPromotion/actual-arguments.ll (+12-13)
diff --git a/llvm/lib/Transforms/IPO/ArgumentPromotion.cpp b/llvm/lib/Transforms/IPO/ArgumentPromotion.cpp
index 452fff7898d0ea..008a3792ab31b9 100644
--- a/llvm/lib/Transforms/IPO/ArgumentPromotion.cpp
+++ b/llvm/lib/Transforms/IPO/ArgumentPromotion.cpp
@@ -474,7 +474,8 @@ static bool allCallersPassValidPointerForArgument(
 /// parts it can be promoted into.
 static bool findArgParts(Argument *Arg, const DataLayout &DL, AAResults &AAR,
                          unsigned MaxElements, bool IsRecursive,
-                         SmallVectorImpl<OffsetAndArgPart> &ArgPartsVec) {
+                         SmallVectorImpl<OffsetAndArgPart> &ArgPartsVec,
+                         bool ArgNotModified) {
   // Quick exit for unused arguments
   if (Arg->use_empty())
     return true;
@@ -696,8 +697,10 @@ static bool findArgParts(Argument *Arg, const DataLayout &DL, AAResults &AAR,
 
   // If store instructions are allowed, the path from the entry of the function
   // to each load may be not free of instructions that potentially invalidate
-  // the load, and this is an admissible situation.
-  if (AreStoresAllowed)
+  // the load, and this is an admissible situation. If we have already
+  // determined that the pointer Arg is not modified in the function (for all
+  // Calls) then we can similarly conclude analysis here.
+  if (AreStoresAllowed || ArgNotModified)
     return true;
 
   // Okay, now we know that the argument is only used by load instructions, and
@@ -745,6 +748,33 @@ static bool areTypesABICompatible(ArrayRef<Type *> Types, const Function &F,
   });
 }
 
+// Try to prove that all Calls to F do not modify the memory pointed to by Arg.
+// This can provide us with more opportunities to perform Argument Promotion in
+// cases where simply looking at a Function's instructions is insufficient to
+// prove that the pointer argument is not invalidated before all loads from it.
+static bool callDoesNotModifyArg(Function *F, unsigned ArgNo,
+                                 FunctionAnalysisManager &FAM) {
+  // Find all Users of F that are Calls, and see if they may modify Arg.
+  for (User *U : F->users()) {
+    auto *Call = dyn_cast<CallInst>(U);
+    if (!Call)
+      continue;
+
+    Value *ArgOp = Call->getArgOperand(ArgNo);
+    assert(ArgOp->getType()->isPointerTy() && "Argument must be Pointer Type!");
+
+    MemoryLocation Loc = MemoryLocation::getForArgument(Call, ArgNo, nullptr);
+
+    AAResults &AAR = FAM.getResult<AAManager>(*Call->getFunction());
+    // Bail out as soon as we find a Call where Arg may be modified.
+    if (isModSet(AAR.getModRefInfo(Call, Loc)))
+      return false;
+  }
+
+  // All Calls do not modify the Arg.
+  return true;
+}
+
 /// PromoteArguments - This method checks the specified function to see if there
 /// are any promotable arguments and if it is safe to promote the function (for
 /// example, all callers are direct).  If safe to promote some arguments, it
@@ -775,11 +805,13 @@ static Function *promoteArguments(Function *F, FunctionAnalysisManager &FAM,
     return nullptr;
 
   // First check: see if there are any pointer arguments!  If not, quick exit.
-  SmallVector<Argument *, 16> PointerArgs;
-  for (Argument &I : F->args())
-    if (I.getType()->isPointerTy())
-      PointerArgs.push_back(&I);
-  if (PointerArgs.empty())
+  SmallVector<unsigned, 16> PointerArgNos;
+  for (unsigned I = 0; I < F->arg_size(); ++I) {
+    Argument *Arg = F->getArg(I);
+    if (Arg->getType()->isPointerTy())
+      PointerArgNos.push_back(I);
+  }
+  if (PointerArgNos.empty())
     return nullptr;
 
   // Second check: make sure that all callers are direct callers.  We can't
@@ -814,7 +846,8 @@ static Function *promoteArguments(Function *F, FunctionAnalysisManager &FAM,
   // add it to ArgsToPromote.
   DenseMap<Argument *, SmallVector<OffsetAndArgPart, 4>> ArgsToPromote;
   unsigned NumArgsAfterPromote = F->getFunctionType()->getNumParams();
-  for (Argument *PtrArg : PointerArgs) {
+  for (const auto &ArgIdx : PointerArgNos) {
+    Argument *PtrArg = F->getArg(ArgIdx);
     // Replace sret attribute with noalias. This reduces register pressure by
     // avoiding a register copy.
     if (PtrArg->hasStructRetAttr()) {
@@ -828,10 +861,15 @@ static Function *promoteArguments(Function *F, FunctionAnalysisManager &FAM,
       }
     }
 
+    // Check if we can determine ahead of time that the argument is never
+    // modified by a call to this function.
+    bool ArgNotModified = callDoesNotModifyArg(F, ArgIdx, FAM);
+
     // If we can promote the pointer to its value.
     SmallVector<OffsetAndArgPart, 4> ArgParts;
 
-    if (findArgParts(PtrArg, DL, AAR, MaxElements, IsRecursive, ArgParts)) {
+    if (findArgParts(PtrArg, DL, AAR, MaxElements, IsRecursive, ArgParts,
+                     ArgNotModified)) {
       SmallVector<Type *, 4> Types;
       for (const auto &Pair : ArgParts)
         Types.push_back(Pair.second.Ty);
diff --git a/llvm/test/Transforms/ArgumentPromotion/actual-arguments.ll b/llvm/test/Transforms/ArgumentPromotion/actual-arguments.ll
index 63366ba998c7bb..d52bfc3111779d 100644
--- a/llvm/test/Transforms/ArgumentPromotion/actual-arguments.ll
+++ b/llvm/test/Transforms/ArgumentPromotion/actual-arguments.ll
@@ -75,11 +75,9 @@ define internal i32 @test_cannot_promote_3(ptr %p, ptr nocapture readonly %test_
 ;
 define internal i32 @test_can_promote_1(ptr %p, ptr nocapture readonly %test_c) {
 ; CHECK-LABEL: define {{[^@]+}}@test_can_promote_1
-; CHECK-SAME: (ptr [[P:%.*]], ptr nocapture readonly [[TEST_C:%.*]]) {
-; CHECK-NEXT:    [[TEST_C_VAL:%.*]] = load i32, ptr [[TEST_C]], align 4
-; CHECK-NEXT:    [[RES:%.*]] = call i32 @callee(ptr [[P]], i32 [[TEST_C_VAL]])
-; CHECK-NEXT:    [[LTEST_C:%.*]] = load i32, ptr [[TEST_C]], align 4
-; CHECK-NEXT:    [[SUM:%.*]] = add i32 [[LTEST_C]], [[RES]]
+; CHECK-SAME: (ptr [[P:%.*]], i32 [[TEST_C_0_VAL:%.*]]) {
+; CHECK-NEXT:    [[RES:%.*]] = call i32 @callee(ptr [[P]], i32 [[TEST_C_0_VAL]])
+; CHECK-NEXT:    [[SUM:%.*]] = add i32 [[TEST_C_0_VAL]], [[RES]]
 ; CHECK-NEXT:    ret i32 [[SUM]]
 ;
   %res = call i32 @callee(ptr %p, ptr %test_c)
@@ -99,11 +97,9 @@ define internal i32 @test_can_promote_1(ptr %p, ptr nocapture readonly %test_c)
 ;
 define internal i32 @test_can_promote_2(ptr %p, ptr nocapture readonly %test_c) {
 ; CHECK-LABEL: define {{[^@]+}}@test_can_promote_2
-; CHECK-SAME: (ptr [[P:%.*]], ptr nocapture readonly [[TEST_C:%.*]]) {
-; CHECK-NEXT:    [[TEST_C_VAL:%.*]] = load i32, ptr [[TEST_C]], align 4
-; CHECK-NEXT:    [[RES:%.*]] = call i32 @callee(ptr [[P]], i32 [[TEST_C_VAL]])
-; CHECK-NEXT:    [[LTEST_C:%.*]] = load i32, ptr [[TEST_C]], align 4
-; CHECK-NEXT:    [[SUM:%.*]] = add i32 [[LTEST_C]], [[RES]]
+; CHECK-SAME: (ptr [[P:%.*]], i32 [[TEST_C_0_VAL:%.*]]) {
+; CHECK-NEXT:    [[RES:%.*]] = call i32 @callee(ptr [[P]], i32 [[TEST_C_0_VAL]])
+; CHECK-NEXT:    [[SUM:%.*]] = add i32 [[TEST_C_0_VAL]], [[RES]]
 ; CHECK-NEXT:    ret i32 [[SUM]]
 ;
   %res = call i32 @callee(ptr %p, ptr %test_c)
@@ -186,8 +182,10 @@ define i32 @caller_safe_args_1(i64 %n) {
 ; CHECK-NEXT:    [[CALLER_C:%.*]] = alloca i32, align 4
 ; CHECK-NEXT:    store i32 5, ptr [[CALLER_C]], align 4
 ; CHECK-NEXT:    [[RES1:%.*]] = call i32 @test_cannot_promote_3(ptr [[P]], ptr [[CALLER_C]])
-; CHECK-NEXT:    [[RES2:%.*]] = call i32 @test_can_promote_1(ptr [[P]], ptr [[CALLER_C]])
-; CHECK-NEXT:    [[RES3:%.*]] = call i32 @test_can_promote_2(ptr [[P]], ptr [[CALLER_C]])
+; CHECK-NEXT:    [[CALLER_C_VAL:%.*]] = load i32, ptr [[CALLER_C]], align 4
+; CHECK-NEXT:    [[RES2:%.*]] = call i32 @test_can_promote_1(ptr [[P]], i32 [[CALLER_C_VAL]])
+; CHECK-NEXT:    [[CALLER_C_VAL1:%.*]] = load i32, ptr [[CALLER_C]], align 4
+; CHECK-NEXT:    [[RES3:%.*]] = call i32 @test_can_promote_2(ptr [[P]], i32 [[CALLER_C_VAL1]])
 ; CHECK-NEXT:    [[RES12:%.*]] = add i32 [[RES1]], [[RES2]]
 ; CHECK-NEXT:    [[RES:%.*]] = add i32 [[RES12]], [[RES3]]
 ; CHECK-NEXT:    ret i32 [[RES]]
@@ -215,7 +213,8 @@ define i32 @caller_safe_args_2(i64 %n, ptr %p) {
 ; CHECK-NEXT:    call void @memset(ptr [[P]], i64 0, i64 [[N]])
 ; CHECK-NEXT:    [[CALLER_C:%.*]] = alloca i32, align 4
 ; CHECK-NEXT:    store i32 5, ptr [[CALLER_C]], align 4
-; CHECK-NEXT:    [[RES:%.*]] = call i32 @test_can_promote_2(ptr [[P]], ptr [[CALLER_C]])
+; CHECK-NEXT:    [[CALLER_C_VAL:%.*]] = load i32, ptr [[CALLER_C]], align 4
+; CHECK-NEXT:    [[RES:%.*]] = call i32 @test_can_promote_2(ptr [[P]], i32 [[CALLER_C_VAL]])
 ; CHECK-NEXT:    ret i32 [[RES]]
 ;
   call void @memset(ptr %p, i64 0, i64 %n)

for (User *U : F->users()) {
auto *Call = dyn_cast<CallInst>(U);
if (!Call)
continue;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we find an unexpected use of the function, should we bail?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point - done.

Value *ArgOp = Call->getArgOperand(ArgNo);
assert(ArgOp->getType()->isPointerTy() && "Argument must be Pointer Type!");

MemoryLocation Loc = MemoryLocation::getForArgument(Call, ArgNo, nullptr);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we construct a more precise memory location based on the actual loads in the callee? This is basically equivalent to MemoryLocation::getBeforeOrAfter(ArgOp)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, I don't think that we can do this - because BasicAA does not support inter-procedural analysis, we cannot pass a Loc derived from a load in the callee to the AAR query based on the caller (where the Call to F is).

ModRefInfo BasicAAResult::getModRefInfo(const CallBase *Call,
const MemoryLocation &Loc,
AAQueryInfo &AAQI) {
assert(notDifferentParent(Call, Loc.Ptr) &&
"AliasAnalysis query involving multiple functions!");

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, use the pointer from the caller, but compute the size of the memory region using the loads in the callee.

Copy link
Contributor Author

@hazzlim hazzlim Sep 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see, apologies I was misunderstanding you before. I have added this in Address review 3

If I understand correctly we still run into an issue because we are doing an alias query on a CallInst, and because BasicAA is inherently non-interprocedural it disregards the Size of the Loc and uses getBeforeOrAfter anyway when checking if the pointer aliases any of the Call's arguments:

// If this is a no-capture pointer argument, see if we can tell that it
// is impossible to alias the pointer we're checking.
AliasResult AR =
AAQI.AAR.alias(MemoryLocation::getBeforeOrAfter(*CI),
MemoryLocation::getBeforeOrAfter(Object), AAQI);
// Operand doesn't alias 'Object', continue looking for other aliases

// Try to prove that all Calls to F do not modify the memory pointed to by Arg.
// This can provide us with more opportunities to perform Argument Promotion in
// cases where simply looking at a Function's instructions is insufficient to
// prove that the pointer argument is not invalidated before all loads from it.
Copy link
Collaborator

@efriedma-quic efriedma-quic Aug 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if I'm following correctly, you're doing basically the same analysis in two different ways. The first way is using local to the caller: if the pointed-to memory isn't modified at all, the transform is fine. The second way is local to the callee: if the argument memory can't ever be modified before it's loaded, no matter what the arguments are, the transform is fine.

In theory, you could do an interprocedural analysis... but it would be expensive to compute, and ArgumentPromotion doesn't have infrastructure for that. So I guess this approach is fine, but it could use better comments explaining that you're using two different approaches to compute the same thing.

Also, is there some reason callDoesNotModifyArg() needs to run before we reach the relevant bit of findArgParts()? If the two checks were together, it would be more obvious they're related.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes exactly - ideally we would be able to do a more complete inter-procedural analysis, but in lieu of that this seemed like a reasonable way to get ArgPromotion to handle the motivating case (which comes from the SPEC bwaves benchmark).

I've moved the call to the relevant place in findArgParts, and tried to make things clearer in the comments.

Thank you for reviewing the PR :)

Comment on lines 490 to 491
static bool isArgUnmodifiedByAllCalls(Function *F, unsigned ArgNo,
FunctionAnalysisManager &FAM) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we pass just in an Argument* here and use Argument->getParent() and Argument->getArgNo() to simplify the function signature?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, that's much cleaner - I've made this change in commit Address review comments 2

return false;
}

// All Users are Calls which do not modify the Arg.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment, the one on line 492, the one on line 488/499, and the function name all pretty much say the same thing. Maybe delete the ones in the function body

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed redundant comments in Address review comments 2

Comment on lines 500 to 501
Value *ArgOp = Call->getArgOperand(ArgNo);
assert(ArgOp->getType()->isPointerTy() && "Argument must be Pointer Type!");
Copy link
Contributor

@MDevereau MDevereau Sep 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think its possible for Arg to not be a pointer type at this stage, since PointerArgs in promoteArguments does a pointer check on the arguments before adding them. I think this check can be deleted.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair point - I've removed this assert in Address review comments 2

Teach Argument Promotion to perform alias analysis on actual arguments
of Calls to a Function, to try to prove that all Calls to the Function
do not modify the memory pointed to by an argument. This surfaces more
opportunities to perform Argument Promotion in cases where simply
looking at a Function's instructions is insufficient to prove that the
pointer argument is not invalidated before all loads from it.
- Bail on unexpected (non CallInst) use of function
- Move checks into findArgParts to improve clarity and only run when
  necessary
- Remove FIXMEs from test
- Remove redundant comments
- Refactor function signature
- Remove unnecessary assert
Copy link

github-actions bot commented Sep 6, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

- Compute the size of the memory region using the loads in the callee
@hazzlim
Copy link
Contributor Author

hazzlim commented Sep 18, 2024

Polite ping on this :)

LocationSize Size = LocationSize::precise(0);
for (LoadInst *Load : Loads) {
Size = Size.unionWith(MemoryLocation::get(Load).Size);
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This loop doesn't compute what the comment says it does. Say you have an array of 3 uint32_t to promote; that's 12 bytes. This computes a size of 4 bytes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry about this - I got that completely wrong there.

I have fixed this to re-use the Size that is computed during the check that the parts are non-overlapping - which I believe is correct. I think that this is more conservative than it could be, as it computes the Size of the Memory Location from base pointer Arg, but you could have the case where all loads are from some non-zero offset from Arg. However it's not obvious to me how we could "add" this offset to the Pointer from the Caller in the Memory Location used in isArgUnmodifiedByAllCalls().

I struggled to come up with a test for the Size of the Memory Location, I don't know if you have any ideas there?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the case where it would matter is if you pass in the same pointer twice (or a related pointer). Which, umm.... I guess we can't prove aliasing in that case anyway. I mean, it could matter in the future if getModRefInfo becomes more powerful, but I don't think getModRefInfo can prove something like that at the moment.

Sorry, I didn't realize when I made the suggestion.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries! Refining with the Size of the memory accessed through the pointer (given that it is already calculated) does seem like a reasonable thing to do, even though getModRefInfo does not currently make much use of it. I'll leave this in if that makes sense to you?

- Fix incorrect calculation of the size of the memory region accessed by
  the loads through Arg.
Copy link
Collaborator

@efriedma-quic efriedma-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little concerned that because we can't write tests, it'll be harder to track down issues later if there's something wrong with the size computation.

// If we can determine that no call to the Function modifies the memory region
// accessed through Arg, through alias analysis using actual arguments in the
// callers, we know that it is guaranteed to be safe to promote the argument.
if (isArgUnmodifiedByAllCalls(Arg, LocationSize::precise(Offset), FAM))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this is quite right; can an Offset be negative?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes - I was mistakenly thinking that we bail when any load has a negative offset from Arg, but this is only the case for loads in the entry block that are guaranteed to execute. And I think what I've done here is in fact incorrect for all cases where the lowest offset is non-zero.

Given that it is not really possible to test this at this stage, I have reverted to not using any Size information if you think that seems appropriate?

Copy link
Collaborator

@efriedma-quic efriedma-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@hazzlim hazzlim merged commit 1c26e2b into llvm:main Sep 27, 2024
8 checks passed
for (User *U : Arg->getParent()->users()) {

// Bail if we find an unexpected (non CallInst) use of the function.
auto *Call = dyn_cast<CallInst>(U);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this checking for CallInst rather than CallBase? This should also work for invokes, right?

And I think then you could also replace it with an assert, as the function only being used as CallBase callee is a precondition of trying the transform in the first place.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies, I was perhaps a bit trigger happy with committing this. I have put up #110335 to address this.

hazzlim added a commit to hazzlim/llvm-project that referenced this pull request Sep 27, 2024
Check that all users of a Function are CallBase rather than CallInst
when performing alias analysis using actual arguments in the calling
function, as this check is also valid for Invoke instructions.

This allows replacing the existing check with an assert, as the Function
only being used by CallBase derived instructions is a precondition of
the transform.

This addresses post-commit review on llvm#106216.
Sterling-Augustine pushed a commit to Sterling-Augustine/llvm-project that referenced this pull request Sep 27, 2024
…lvm#106216)

Teach Argument Promotion to perform alias analysis on actual arguments
of Calls to a Function, to try to prove that all Calls to the Function
do not modify the memory pointed to by an argument. This surfaces more
opportunities to perform Argument Promotion in cases where simply
looking at a Function's instructions is insufficient to prove that the
pointer argument is not invalidated before all loads from it.
hazzlim added a commit that referenced this pull request Oct 4, 2024
Check that all users of a Function are CallBase rather than CallInst
when performing alias analysis using actual arguments in the calling
function, as this check is also valid for Invoke instructions.

This allows replacing the existing check with an assert, as the Function
only being used by CallBase derived instructions is a precondition of
the transform.

This addresses post-commit review on #106216.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants