-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[GlobalOpt] Look through non-PointerType constant expressions. #125205
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Allow looking through constant expressions. Constant expressions cannot read, modify or leak the global themselves. I might be missing something, but using analyzeGlobalAux should ensure all (instruction) users that may read, modify or leak the global are checked. This fixes another regression exposed by llvm#123518.
@llvm/pr-subscribers-llvm-transforms Author: Florian Hahn (fhahn) ChangesAllow looking through constant expressions. Constant expressions cannot read, modify or leak the global themselves. I might be missing something, but using analyzeGlobalAux should ensure all (instruction) users that may read, modify or leak the global are checked. This fixes another regression exposed by Full diff: https://github.com/llvm/llvm-project/pull/125205.diff 3 Files Affected:
diff --git a/llvm/lib/Transforms/Utils/GlobalStatus.cpp b/llvm/lib/Transforms/Utils/GlobalStatus.cpp
index 0b3016a86e2875..54a72824129b86 100644
--- a/llvm/lib/Transforms/Utils/GlobalStatus.cpp
+++ b/llvm/lib/Transforms/Utils/GlobalStatus.cpp
@@ -78,8 +78,14 @@ static bool analyzeGlobalAux(const Value *V, GlobalStatus &GS,
return true;
} else {
// Ignore dead constant users.
- if (!isSafeToDestroyConstant(C))
- return true;
+ if (!isSafeToDestroyConstant(C)) {
+ if (CE) {
+ if (VisitedUsers.insert(CE).second)
+ if (analyzeGlobalAux(CE, GS, VisitedUsers))
+ return true;
+ } else
+ return true;
+ }
}
} else if (const Instruction *I = dyn_cast<Instruction>(UR)) {
if (!GS.HasMultipleAccessingFunctions) {
@@ -166,9 +172,12 @@ static bool analyzeGlobalAux(const Value *V, GlobalStatus &GS,
if (MTI->getArgOperand(1) == V)
GS.IsLoaded = true;
} else if (const MemSetInst *MSI = dyn_cast<MemSetInst>(I)) {
- assert(MSI->getArgOperand(0) == V && "Memset only takes one pointer!");
+ if (MSI->getArgOperand(0) != V || MSI->getArgOperand(1) == V)
+ return true;
if (MSI->isVolatile())
return true;
+ assert(MSI->getArgOperand(0) == V &&
+ "First argument must be the pointer!");
GS.StoredType = GlobalStatus::Stored;
} else if (const auto *CB = dyn_cast<CallBase>(I)) {
if (CB->getIntrinsicID() == Intrinsic::threadlocal_address) {
diff --git a/llvm/test/Transforms/GlobalOpt/cleanup-pointer-root-users-ptrtoint-add-constexpr.ll b/llvm/test/Transforms/GlobalOpt/cleanup-pointer-root-users-ptrtoint-add-constexpr.ll
index 1e87d9266cb380..52bdc4343851ab 100644
--- a/llvm/test/Transforms/GlobalOpt/cleanup-pointer-root-users-ptrtoint-add-constexpr.ll
+++ b/llvm/test/Transforms/GlobalOpt/cleanup-pointer-root-users-ptrtoint-add-constexpr.ll
@@ -12,8 +12,6 @@ declare i32 @fn1()
define void @stores_single_use_gep_constexpr() {
; CHECK-LABEL: @stores_single_use_gep_constexpr(
; CHECK-NEXT: entry:
-; CHECK-NEXT: store ptr @fn0, ptr @global.20ptr, align 8
-; CHECK-NEXT: store ptr @fn1, ptr getelementptr inbounds ([[STRUCT_GLOBAL_20PTR:%.*]], ptr @global.20ptr, i64 0, i32 1), align 8
; CHECK-NEXT: ret void
;
entry:
diff --git a/llvm/test/Transforms/GlobalOpt/read-with-constexpr-users.ll b/llvm/test/Transforms/GlobalOpt/read-with-constexpr-users.ll
index 9df31887102f23..1e39cac9722ca5 100644
--- a/llvm/test/Transforms/GlobalOpt/read-with-constexpr-users.ll
+++ b/llvm/test/Transforms/GlobalOpt/read-with-constexpr-users.ll
@@ -5,15 +5,14 @@
@H = internal global [2 x i64 ] zeroinitializer
;.
-; CHECK: @G = internal global [2 x i64] zeroinitializer
+; CHECK: @G = internal constant [2 x i64] zeroinitializer
; CHECK: @H = internal global [2 x i64] zeroinitializer
;.
define i64 @G_used_by_gep_inttoptr_exprs() {
; CHECK-LABEL: define i64 @G_used_by_gep_inttoptr_exprs() local_unnamed_addr {
-; CHECK-NEXT: [[L:%.*]] = load i64, ptr @G, align 8
; CHECK-NEXT: [[GEP:%.*]] = getelementptr i8, ptr inttoptr (i64 add (i64 ptrtoint (ptr getelementptr inbounds nuw (i8, ptr @G, i64 16) to i64), i64 8) to ptr), i64 1
; CHECK-NEXT: [[C:%.*]] = icmp eq ptr [[GEP]], @G
-; CHECK-NEXT: [[SEL:%.*]] = select i1 [[C]], i64 [[L]], i64 9
+; CHECK-NEXT: [[SEL:%.*]] = select i1 [[C]], i64 0, i64 9
; CHECK-NEXT: ret i64 [[SEL]]
;
%l = load i64, ptr @G, align 8
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is safe. We generally cannot analyze past a ptrtoint cast due to its provenance exposure side effect.
Why does your change result in this ptrtoint+add+inttoptr arithmetic being introduced in the first place?
With the patch, we only look through it to determine if the base object may be read or written. Are there any provenance issues that may lead as to miss reads/writes?
It is not removed/folded in the same way, because we simplify to a compare earlier. Here is roughly what's going on We start out with
With the patch, SimplifyCFG converts this to icmps
Then instcombine folds this to
Without the patch, the switch will only get simplified later
And later instcombine removes the
|
My primary concern here would be that the result of the inttoptr might point to a different global. That isn't relevant in the context of an icmp (icmp ignores provenance anyway), but could be relevant in other contexts. |
Are you thinking about something like below? GlobalStatus determines how a particular global is used. If starting from
|
Say you have something like the following:
You have a write to g2, but the store itself doesn't mention g2 at all. Maybe that's already broken even without your patch, though? I'm suspicious of the way "IsCompared" works. If comparisons aren't involved, I guess it's safe to recurse through ptrtoint/inttoptr? At least, I can't come up with any way to break it in this context; like you say, it's conservatively correct. |
Hmm I see. I assumed there's already code to rule out those cases. I checked and the compare based on integers is handled correctly by GlobalOpt due to considering I think we can also run into this issue with non-inbounds GEPs unless I am missing something and that then wouldn't be handled correctly by GlobalOpt: https://llvm.godbolt.org/z/YroTa6Pc6 So with the change we would be incorrect in more cases. Maybe we need to be more careful with compares in general, and if any constant is involved we should make sure that the pointers on both sides are inbounds? |
The GEP version is UB due to provenance. But yes, we probably want need to restrict more optimizations on comparison operators. |
Allow looking through constant expressions. Constant expressions cannot read, modify or leak the global themselves.
I might be missing something, but using analyzeGlobalAux should ensure all (instruction) users that may read, modify or leak the global are checked.
This fixes another regression exposed by
#123518.