-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[CodeGenPrepare] Filter out unrecreatable addresses from memory optimization #143566
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
@llvm/pr-subscribers-llvm-transforms Author: Evgenii Kudriashov (e-kud) ChangesFull diff: https://github.com/llvm/llvm-project/pull/143566.diff 2 Files Affected:
diff --git a/llvm/lib/CodeGen/CodeGenPrepare.cpp b/llvm/lib/CodeGen/CodeGenPrepare.cpp
index 32348a899683d..7529ead8af202 100644
--- a/llvm/lib/CodeGen/CodeGenPrepare.cpp
+++ b/llvm/lib/CodeGen/CodeGenPrepare.cpp
@@ -5790,7 +5790,8 @@ static BasicBlock::iterator findInsertPos(Value *Addr, Instruction *MemoryInst,
// instruction after it.
if (SunkAddr) {
if (Instruction *AddrInst = dyn_cast<Instruction>(SunkAddr))
- return std::next(AddrInst->getIterator());
+ return AddrInst->isTerminator() ? MemoryInst->getIterator()
+ : std::next(AddrInst->getIterator());
}
// Find the first user of Addr in current BB.
diff --git a/llvm/test/Transforms/CodeGenPrepare/X86/sink-addr-reuse.ll b/llvm/test/Transforms/CodeGenPrepare/X86/sink-addr-reuse.ll
index 019f311406550..5f084fc7ea2b5 100644
--- a/llvm/test/Transforms/CodeGenPrepare/X86/sink-addr-reuse.ll
+++ b/llvm/test/Transforms/CodeGenPrepare/X86/sink-addr-reuse.ll
@@ -42,3 +42,38 @@ bb3:
bb7:
ret void
}
+
+; Test a scenario when the address is the last instruction in the basic block.
+
+define void @adress_terminator() personality ptr null {
+; CHECK-LABEL: define void @adress_terminator() personality ptr null {
+; CHECK-NEXT: [[ENTRY:.*:]]
+; CHECK-NEXT: [[PTR:%.*]] = invoke ptr null(i64 0)
+; CHECK-NEXT: to label %[[BODY_1:.*]] unwind label %[[EHCLEANUP:.*]]
+; CHECK: [[EHCLEANUP]]:
+; CHECK-NEXT: [[TMP0:%.*]] = cleanuppad within none []
+; CHECK-NEXT: cleanupret from [[TMP0]] unwind to caller
+; CHECK: [[BODY_1]]:
+; CHECK-NEXT: [[TMP1:%.*]] = bitcast ptr [[PTR]] to ptr
+; CHECK-NEXT: store <4 x i32> zeroinitializer, ptr [[TMP1]], align 4
+; CHECK-NEXT: [[V0:%.*]] = load <4 x i32>, ptr [[PTR]], align 4
+; CHECK-NEXT: store <4 x i32> zeroinitializer, ptr [[PTR]], align 4
+; CHECK-NEXT: ret void
+;
+entry:
+ %ptr = invoke ptr null(i64 0) to label %body.1 unwind label %ehcleanup
+
+body.2:
+ %v0 = load <4 x i32>, ptr %2, align 4
+ store <4 x i32> zeroinitializer, ptr %2, align 4
+ ret void
+
+ehcleanup:
+ %1 = cleanuppad within none []
+ cleanupret from %1 unwind to caller
+
+body.1:
+ %2 = getelementptr { i32 }, ptr %ptr, i64 0, i32 0
+ store <4 x i32> zeroinitializer, ptr %2, align 4
+ br label %body.2
+}
|
llvm/lib/CodeGen/CodeGenPrepare.cpp
Outdated
@@ -5790,7 +5790,8 @@ static BasicBlock::iterator findInsertPos(Value *Addr, Instruction *MemoryInst, | |||
// instruction after it. | |||
if (SunkAddr) { | |||
if (Instruction *AddrInst = dyn_cast<Instruction>(SunkAddr)) | |||
return std::next(AddrInst->getIterator()); | |||
return AddrInst->isTerminator() ? MemoryInst->getIterator() | |||
: std::next(AddrInst->getIterator()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Found Invoke instruction at this position is already wrong. The problem is at
if (!ResultIndex) {
SunkAddr = ResultPtr;
} else {
- We should make sure ResultPtr is in current BB before directly using it as SunkAddr.
- If ResultPtr is not in current BB we can clone it as the SunkAddr in current BB.
- Invoke instruction is not a good candidate for SunkAddr.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should make sure ResultPtr is in current BB before directly using it as SunkAddr.
We use global pointers safely because we can't get iterator in a BB for them. Do we want to handle them similarly to instructions in other blocks?
If ResultPtr is not in current BB we can clone it as the SunkAddr in current BB.
Clone -- do you mean create GEP with 0?
Invoke instruction is not a good candidate for SunkAddr.
Not obvious to me, why is it different from instructions in other blocks or from globals?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I agree with @weiguozhi . From looking at the test case, why are we trying to sink the GEP upwards toward the entry block? It seems like something has gone wrong earlier.
The proposed fix is like adding a null check when the correct fix was to avoid passing null at a higher level.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The proposed fix is like adding a null check when the correct fix was to avoid passing null at a higher level.
Yeah, I agree but the same we can say about checking on Instruction
to bypass arguments and globals.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The purpose of optimizeMemoryInst is to find address computation in PrevBB, and the address is used in a memory access instruction in a SuccBB, it then duplicate the address computation in SuccBB, SelectionDAG can fold the address computation into memory access.
We should make sure ResultPtr is in current BB before directly using it as SunkAddr.
We use global pointers safely because we can't get iterator in a BB for them. Do we want to handle them similarly to instructions in other blocks?
Global pointers is absolutely OK in a memory access instruction, but it doesn't fit into this optimization.
If ResultPtr is not in current BB we can clone it as the SunkAddr in current BB.
Clone -- do you mean create GEP with 0?
Duplicate the address computation instruction. (thus SunkAddr)
Invoke instruction is not a good candidate for SunkAddr.
Not obvious to me, why is it different from instructions in other blocks or from globals?
We can't fold Invoke instruction into a memory access instruction.
// this pointer comes from a different from the current basic block we | ||
// need to know how to recreate it in another basic block. | ||
// Currently we don't support recreation of any of instruction. | ||
if (PtrInst && PtrInst->getParent() != MemoryInst->getParent()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here we can change the predicate to
if (!PtrInst || PtrInst->getParent() != MemoryInst->getParent())
To skip arguments and global pointers. I'm not sure if it is completely correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer this condition. We should not modify a memory access instruction with pointer comes from argument or global value directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. I also added tests for arguments and globals. @weiguozhi you may be interested that we don't reuse addresses in these cases: https://godbolt.org/z/q8fzce4ov
llvm/lib/CodeGen/CodeGenPrepare.cpp
Outdated
@@ -6099,6 +6099,13 @@ bool CodeGenPrepare::optimizeMemoryInst(Instruction *MemoryInst, Value *Addr, | |||
} | |||
|
|||
if (!ResultIndex) { | |||
auto PtrInst = dyn_cast<Instruction>(ResultPtr); | |||
// Here we know that we have just a pointer without any offsets. If | |||
// this pointer comes from a different from the current basic block we |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
different BB
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Thanks for the fix.
@weiguozhi I think we need to skip only instructions. There are two failed test cases:
First one is about propagating Or maybe bypass constants additionally to instructions inside the same block? |
Then it's OK to skip instructions only. |
Follow up on #139303