Skip to content

[MemDepAnalysis] Don't reuse NonLocalPointerDeps cache if memory location size differs #116936

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Nov 21, 2024

Conversation

aeubanks
Copy link
Contributor

As seen in #111585, we can end up using a previous cache entry where the size was too large and was UB.

Compile time impact: https://llvm-compile-time-tracker.com/compare.php?from=6a863f7e2679a60f2f38ae6a920d0b6e1a2c1690&to=faccf4e1f47fcd5360a438de2a56d02b770ad498&stat=instructions:u.

Fixes #111585.

@aeubanks aeubanks requested review from nikic and alinas November 20, 2024 07:28
@llvmbot llvmbot added the llvm:analysis Includes value tracking, cost tables and constant folding label Nov 20, 2024
@llvmbot
Copy link
Member

llvmbot commented Nov 20, 2024

@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-llvm-analysis

Author: Arthur Eubanks (aeubanks)

Changes

As seen in #111585, we can end up using a previous cache entry where the size was too large and was UB.

Compile time impact: https://llvm-compile-time-tracker.com/compare.php?from=6a863f7e2679a60f2f38ae6a920d0b6e1a2c1690&to=faccf4e1f47fcd5360a438de2a56d02b770ad498&stat=instructions:u.

Fixes #111585.


Full diff: https://github.com/llvm/llvm-project/pull/116936.diff

1 Files Affected:

  • (modified) llvm/lib/Analysis/MemoryDependenceAnalysis.cpp (+12-34)
diff --git a/llvm/lib/Analysis/MemoryDependenceAnalysis.cpp b/llvm/lib/Analysis/MemoryDependenceAnalysis.cpp
index c5fba184cd0850..2a0c9612d4abc8 100644
--- a/llvm/lib/Analysis/MemoryDependenceAnalysis.cpp
+++ b/llvm/lib/Analysis/MemoryDependenceAnalysis.cpp
@@ -1066,40 +1066,18 @@ bool MemoryDependenceResults::getNonLocalPointerDepFromBB(
   // Invariant loads don't participate in caching. Thus no need to reconcile.
   if (!isInvariantLoad && !Pair.second) {
     if (CacheInfo->Size != Loc.Size) {
-      bool ThrowOutEverything;
-      if (CacheInfo->Size.hasValue() && Loc.Size.hasValue()) {
-        // FIXME: We may be able to do better in the face of results with mixed
-        // precision. We don't appear to get them in practice, though, so just
-        // be conservative.
-        ThrowOutEverything =
-            CacheInfo->Size.isPrecise() != Loc.Size.isPrecise() ||
-            !TypeSize::isKnownGE(CacheInfo->Size.getValue(),
-                                 Loc.Size.getValue());
-      } else {
-        // For our purposes, unknown size > all others.
-        ThrowOutEverything = !Loc.Size.hasValue();
-      }
-
-      if (ThrowOutEverything) {
-        // The query's Size is greater than the cached one. Throw out the
-        // cached data and proceed with the query at the greater size.
-        CacheInfo->Pair = BBSkipFirstBlockPair();
-        CacheInfo->Size = Loc.Size;
-        for (auto &Entry : CacheInfo->NonLocalDeps)
-          if (Instruction *Inst = Entry.getResult().getInst())
-            RemoveFromReverseMap(ReverseNonLocalPtrDeps, Inst, CacheKey);
-        CacheInfo->NonLocalDeps.clear();
-        // The cache is cleared (in the above line) so we will have lost
-        // information about blocks we have already visited. We therefore must
-        // assume that the cache information is incomplete.
-        IsIncomplete = true;
-      } else {
-        // This query's Size is less than the cached one. Conservatively restart
-        // the query using the greater size.
-        return getNonLocalPointerDepFromBB(
-            QueryInst, Pointer, Loc.getWithNewSize(CacheInfo->Size), isLoad,
-            StartBB, Result, Visited, SkipFirstBlock, IsIncomplete);
-      }
+      // The query's Size is greater than the cached one. Throw out the
+      // cached data and proceed with the query at the greater size.
+      CacheInfo->Pair = BBSkipFirstBlockPair();
+      CacheInfo->Size = Loc.Size;
+      for (auto &Entry : CacheInfo->NonLocalDeps)
+        if (Instruction *Inst = Entry.getResult().getInst())
+          RemoveFromReverseMap(ReverseNonLocalPtrDeps, Inst, CacheKey);
+      CacheInfo->NonLocalDeps.clear();
+      // The cache is cleared (in the above line) so we will have lost
+      // information about blocks we have already visited. We therefore must
+      // assume that the cache information is incomplete.
+      IsIncomplete = true;
     }
 
     // If the query's AATags are inconsistent with the cached one,

@aeubanks
Copy link
Contributor Author

Still trying to understand the code to see if this is the correct fix, but figured I'd send this out to get thoughts

I'm having trouble coming up with a test case that isn't brittle.

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please update the failing Transforms/GVN/PRE/rle.ll test?

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that in principle, not reusing cache entries from a greater size for a smaller one is correct. As you point out, it may be incorrect if the larger size is UB.

It should still be fine to reuse a cache entry without size for one with, and it would also be valid to use an upperBound() larger size instead of a precise() one. But possibly it's not worth bothering with it.

@aeubanks
Copy link
Contributor Author

updated test

; CHECK-NEXT: br i1 [[COND:%.*]], label [[T:%.*]], label [[F:%.*]]
; CHECK: T:
; CHECK-NEXT: ret float [[TMP1]]
; CHECK: F:
; CHECK-NEXT: ret float [[TMP1]]
; CHECK-NEXT: [[X:%.*]] = bitcast i32 [[A]] to float
; CHECK-NEXT: ret float [[X]]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this fail to fold now? There are no access size changes going on here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, I had messed up which version of opt I had used for UTC

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me, but I'd still like to have a test case, even if it's brittle.

aeubanks added a commit that referenced this pull request Nov 21, 2024
Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@aeubanks aeubanks merged commit 6f68d03 into llvm:main Nov 21, 2024
8 checks passed
@aeubanks aeubanks deleted the memdep-throwouteverything branch November 21, 2024 17:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
llvm:analysis Includes value tracking, cost tables and constant folding llvm:transforms
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Miscompile in GVN due to MemDepAnalysis caching
3 participants