Skip to content

[MemDep] Use EarliestEscapeInfo #69727

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Oct 23, 2023
Merged

[MemDep] Use EarliestEscapeInfo #69727

merged 1 commit into from
Oct 23, 2023

Conversation

nikic
Copy link
Contributor

@nikic nikic commented Oct 20, 2023

Use BatchAA with EarliestEscapeInfo instead of callCapturesBefore() in MemDepAnalysis. The advantage of this is that it will also take not-captured-before information into account for non-calls (see test_store_before_capture for a representative example), and that this is a cached analysis. The disadvantage is that EII is slightly less precise than full CapturedBefore analysis.

In practice the impact is positive, with gvn.NumGVNLoad going from 22022 to 22808 on test-suite. Full list of diffs: https://gist.github.com/nikic/9b6b4187d96d25249e9d874b47be99df

The impact to compile-time is also positive, mainly in the ThinLTO configuration: http://llvm-compile-time-tracker.com/compare.php?from=97e06a0d83a55ff67e5fd0c7328233c44415de33&to=4921af8845260f2cea9aed0a350aec4e8c7b10e6&stat=instructions:u

Use BatchAA with EarliestEscapeInfo instead of callCapturesBefore().
The advantage of this is that it will also take not-captured-before
information into account for non-calls, and that this is a cached
analysis. The disadvantage is that EII is slightly less precise
than full CapturedBefore analysis.

In practice the impact is positive, with gvn.NumGVNLoad going
from 22022 to 22808 on test-suite.
@nikic nikic requested review from fhahn, alinas and skachkov-sc October 20, 2023 13:46
@llvmbot llvmbot added llvm:analysis Includes value tracking, cost tables and constant folding llvm:transforms labels Oct 20, 2023
@llvmbot
Copy link
Member

llvmbot commented Oct 20, 2023

@llvm/pr-subscribers-llvm-analysis

@llvm/pr-subscribers-llvm-transforms

Author: Nikita Popov (nikic)

Changes

Use BatchAA with EarliestEscapeInfo instead of callCapturesBefore() in MemDepAnalysis. The advantage of this is that it will also take not-captured-before information into account for non-calls (see test_store_before_capture for a representative example), and that this is a cached analysis. The disadvantage is that EII is slightly less precise than full CapturedBefore analysis.

In practice the impact is positive, with gvn.NumGVNLoad going from 22022 to 22808 on test-suite. Full list of diffs: https://gist.github.com/nikic/9b6b4187d96d25249e9d874b47be99df

The impact to compile-time is also positive, mainly in the ThinLTO configuration: http://llvm-compile-time-tracker.com/compare.php?from=7c7896b1beeb1d275a06557111363b99b754398a&to=72a9442c743de0ad3fe84e2f8042f392aacda2c7&stat=instructions%3Au


Full diff: https://github.com/llvm/llvm-project/pull/69727.diff

4 Files Affected:

  • (modified) llvm/include/llvm/Analysis/MemoryDependenceAnalysis.h (+3-2)
  • (modified) llvm/lib/Analysis/MemoryDependenceAnalysis.cpp (+5-7)
  • (modified) llvm/test/CodeGen/BPF/CORE/no-narrow-load.ll (+4-4)
  • (modified) llvm/test/Transforms/GVN/captured-before.ll (+1-2)
diff --git a/llvm/include/llvm/Analysis/MemoryDependenceAnalysis.h b/llvm/include/llvm/Analysis/MemoryDependenceAnalysis.h
index 27185aa9942e4e3..d5b2eb6253db959 100644
--- a/llvm/include/llvm/Analysis/MemoryDependenceAnalysis.h
+++ b/llvm/include/llvm/Analysis/MemoryDependenceAnalysis.h
@@ -18,6 +18,7 @@
 #include "llvm/ADT/PointerIntPair.h"
 #include "llvm/ADT/PointerSumType.h"
 #include "llvm/ADT/SmallPtrSet.h"
+#include "llvm/Analysis/AliasAnalysis.h"
 #include "llvm/Analysis/MemoryLocation.h"
 #include "llvm/IR/PassManager.h"
 #include "llvm/IR/PredIteratorCache.h"
@@ -27,7 +28,6 @@
 
 namespace llvm {
 
-class AAResults;
 class AssumptionCache;
 class BatchAAResults;
 class DominatorTree;
@@ -356,6 +356,7 @@ class MemoryDependenceResults {
   const TargetLibraryInfo &TLI;
   DominatorTree &DT;
   PredIteratorCache PredCache;
+  EarliestEscapeInfo EII;
 
   unsigned DefaultBlockScanLimit;
 
@@ -367,7 +368,7 @@ class MemoryDependenceResults {
   MemoryDependenceResults(AAResults &AA, AssumptionCache &AC,
                           const TargetLibraryInfo &TLI, DominatorTree &DT,
                           unsigned DefaultBlockScanLimit)
-      : AA(AA), AC(AC), TLI(TLI), DT(DT),
+      : AA(AA), AC(AC), TLI(TLI), DT(DT), EII(DT),
         DefaultBlockScanLimit(DefaultBlockScanLimit) {}
 
   /// Handle invalidation in the new PM.
diff --git a/llvm/lib/Analysis/MemoryDependenceAnalysis.cpp b/llvm/lib/Analysis/MemoryDependenceAnalysis.cpp
index 49a31a52cf0e850..1c1a0873ac520f9 100644
--- a/llvm/lib/Analysis/MemoryDependenceAnalysis.cpp
+++ b/llvm/lib/Analysis/MemoryDependenceAnalysis.cpp
@@ -268,7 +268,7 @@ MemDepResult MemoryDependenceResults::getPointerDependencyFrom(
 MemDepResult MemoryDependenceResults::getPointerDependencyFrom(
     const MemoryLocation &MemLoc, bool isLoad, BasicBlock::iterator ScanIt,
     BasicBlock *BB, Instruction *QueryInst, unsigned *Limit) {
-  BatchAAResults BatchAA(AA);
+  BatchAAResults BatchAA(AA, &EII);
   return getPointerDependencyFrom(MemLoc, isLoad, ScanIt, BB, QueryInst, Limit,
                                   BatchAA);
 }
@@ -645,11 +645,7 @@ MemDepResult MemoryDependenceResults::getSimplePointerDependencyFrom(
         continue;
 
     // See if this instruction (e.g. a call or vaarg) mod/ref's the pointer.
-    ModRefInfo MR = BatchAA.getModRefInfo(Inst, MemLoc);
-    // If necessary, perform additional analysis.
-    if (isModAndRefSet(MR))
-      MR = BatchAA.callCapturesBefore(Inst, MemLoc, &DT);
-    switch (MR) {
+    switch (BatchAA.getModRefInfo(Inst, MemLoc)) {
     case ModRefInfo::NoModRef:
       // If the call has no effect on the queried pointer, just ignore it.
       continue;
@@ -1227,7 +1223,7 @@ bool MemoryDependenceResults::getNonLocalPointerDepFromBB(
   bool GotWorklistLimit = false;
   LLVM_DEBUG(AssertSorted(*Cache));
 
-  BatchAAResults BatchAA(AA);
+  BatchAAResults BatchAA(AA, &EII);
   while (!Worklist.empty()) {
     BasicBlock *BB = Worklist.pop_back_val();
 
@@ -1539,6 +1535,8 @@ void MemoryDependenceResults::invalidateCachedPredecessors() {
 }
 
 void MemoryDependenceResults::removeInstruction(Instruction *RemInst) {
+  EII.removeInstruction(RemInst);
+
   // Walk through the Non-local dependencies, removing this one as the value
   // for any cached queries.
   NonLocalDepMapType::iterator NLDI = NonLocalDepsMap.find(RemInst);
diff --git a/llvm/test/CodeGen/BPF/CORE/no-narrow-load.ll b/llvm/test/CodeGen/BPF/CORE/no-narrow-load.ll
index b154b1d29fa24bf..dbfc4d2f99fc56d 100644
--- a/llvm/test/CodeGen/BPF/CORE/no-narrow-load.ll
+++ b/llvm/test/CodeGen/BPF/CORE/no-narrow-load.ll
@@ -65,10 +65,10 @@ lor.end:                                          ; preds = %lor.end.critedge, %
   ret void, !dbg !53
 }
 
-; CHECK: r[[LOAD1:[0-9]+]] = *(u32 *)(r{{[0-9]+}} + 4)
-; CHECK: r[[LOAD1]] &= 65536
-; CHECK: r[[LOAD2:[0-9]+]] = *(u32 *)(r{{[0-9]+}} + 4)
-; CHECK: r[[LOAD2]] &= 32768
+; CHECK: r[[LOAD:[0-9]+]] = *(u32 *)(r{{[0-9]+}} + 4)
+; CHECK: r[[COPY:[0-9]+]] = r[[LOAD]]
+; CHECK: r[[COPY]] &= 65536
+; CHECK: r[[LOAD]] &= 32768
 
 ; Function Attrs: nounwind readnone speculatable willreturn
 declare void @llvm.dbg.declare(metadata, metadata, metadata) #1
diff --git a/llvm/test/Transforms/GVN/captured-before.ll b/llvm/test/Transforms/GVN/captured-before.ll
index 4e025b3f772e103..6d012edea34610f 100644
--- a/llvm/test/Transforms/GVN/captured-before.ll
+++ b/llvm/test/Transforms/GVN/captured-before.ll
@@ -46,9 +46,8 @@ define i32 @test_store_before_capture(ptr %p) {
 ; CHECK-NEXT:    store i32 123, ptr [[A]], align 4
 ; CHECK-NEXT:    [[P2:%.*]] = load ptr, ptr [[P]], align 8
 ; CHECK-NEXT:    store i32 42, ptr [[P2]], align 4
-; CHECK-NEXT:    [[V:%.*]] = load i32, ptr [[A]], align 4
 ; CHECK-NEXT:    call void @capture(ptr [[A]])
-; CHECK-NEXT:    ret i32 [[V]]
+; CHECK-NEXT:    ret i32 123
 ;
   %a = alloca i32
   store i32 123, ptr %a

Copy link
Contributor

@alinas alinas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ZequanWu
Copy link
Contributor

It seems to cause mis-compile to our tests: https://bugs.chromium.org/p/chromium/issues/detail?id=1495809#c3

Attached the good & bad IR for the test function body with:
good.txt
bad.txt

@nikic
Copy link
Contributor Author

nikic commented Oct 26, 2023

@ZequanWu I looked at the test case, and I don't think this is a miscompile. The faulting load just gets optimized away.

The code uses the pattern

  int buffer0_contents = *buffer0;
  EXPECT_EQ(buffer0_contents, *buffer0);

which previously managed to prevent the optimization, but no longer does.

You should be able to force the load to not be optimized away by using a volatile load instead. Something like *(volatile int *)buffer0 should do.

@ZequanWu
Copy link
Contributor

Thanks, that seems to be the cause.

fhahn pushed a commit to fhahn/llvm-project that referenced this pull request Feb 5, 2024
Use BatchAA with EarliestEscapeInfo instead of callCapturesBefore() in
MemDepAnalysis. The advantage of this is that it will also take
not-captured-before information into account for non-calls (see
test_store_before_capture for a representative example), and that this
is a cached analysis. The disadvantage is that EII is slightly less
precise than full CapturedBefore analysis.

In practice the impact is positive, with gvn.NumGVNLoad going from 22022
to 22808 on test-suite.

The impact to compile-time is also positive, mainly in the ThinLTO
configuration.

(cherry-picked from 2ad9fde)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
llvm:analysis Includes value tracking, cost tables and constant folding llvm:transforms
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants