Skip to content

[CodeGenPrepare] Filter out unrecreatable addresses from memory optimization #143566

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

e-kud
Copy link
Contributor

@e-kud e-kud commented Jun 10, 2025

Follow up on #139303

@llvmbot
Copy link
Member

llvmbot commented Jun 10, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Evgenii Kudriashov (e-kud)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/143566.diff

2 Files Affected:

  • (modified) llvm/lib/CodeGen/CodeGenPrepare.cpp (+2-1)
  • (modified) llvm/test/Transforms/CodeGenPrepare/X86/sink-addr-reuse.ll (+35)
diff --git a/llvm/lib/CodeGen/CodeGenPrepare.cpp b/llvm/lib/CodeGen/CodeGenPrepare.cpp
index 32348a899683d..7529ead8af202 100644
--- a/llvm/lib/CodeGen/CodeGenPrepare.cpp
+++ b/llvm/lib/CodeGen/CodeGenPrepare.cpp
@@ -5790,7 +5790,8 @@ static BasicBlock::iterator findInsertPos(Value *Addr, Instruction *MemoryInst,
   // instruction after it.
   if (SunkAddr) {
     if (Instruction *AddrInst = dyn_cast<Instruction>(SunkAddr))
-      return std::next(AddrInst->getIterator());
+      return AddrInst->isTerminator() ? MemoryInst->getIterator()
+                                      : std::next(AddrInst->getIterator());
   }
 
   // Find the first user of Addr in current BB.
diff --git a/llvm/test/Transforms/CodeGenPrepare/X86/sink-addr-reuse.ll b/llvm/test/Transforms/CodeGenPrepare/X86/sink-addr-reuse.ll
index 019f311406550..5f084fc7ea2b5 100644
--- a/llvm/test/Transforms/CodeGenPrepare/X86/sink-addr-reuse.ll
+++ b/llvm/test/Transforms/CodeGenPrepare/X86/sink-addr-reuse.ll
@@ -42,3 +42,38 @@ bb3:
 bb7:
   ret void
 }
+
+; Test a scenario when the address is the last instruction in the basic block.
+
+define void @adress_terminator() personality ptr null {
+; CHECK-LABEL: define void @adress_terminator() personality ptr null {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[PTR:%.*]] = invoke ptr null(i64 0)
+; CHECK-NEXT:            to label %[[BODY_1:.*]] unwind label %[[EHCLEANUP:.*]]
+; CHECK:       [[EHCLEANUP]]:
+; CHECK-NEXT:    [[TMP0:%.*]] = cleanuppad within none []
+; CHECK-NEXT:    cleanupret from [[TMP0]] unwind to caller
+; CHECK:       [[BODY_1]]:
+; CHECK-NEXT:    [[TMP1:%.*]] = bitcast ptr [[PTR]] to ptr
+; CHECK-NEXT:    store <4 x i32> zeroinitializer, ptr [[TMP1]], align 4
+; CHECK-NEXT:    [[V0:%.*]] = load <4 x i32>, ptr [[PTR]], align 4
+; CHECK-NEXT:    store <4 x i32> zeroinitializer, ptr [[PTR]], align 4
+; CHECK-NEXT:    ret void
+;
+entry:
+  %ptr = invoke ptr null(i64 0) to label %body.1 unwind label %ehcleanup
+
+body.2:
+  %v0 = load <4 x i32>, ptr %2, align 4
+  store <4 x i32> zeroinitializer, ptr %2, align 4
+  ret void
+
+ehcleanup:
+  %1 = cleanuppad within none []
+  cleanupret from %1 unwind to caller
+
+body.1:
+  %2 = getelementptr { i32 }, ptr %ptr, i64 0, i32 0
+  store <4 x i32> zeroinitializer, ptr %2, align 4
+  br label %body.2
+}

@@ -5790,7 +5790,8 @@ static BasicBlock::iterator findInsertPos(Value *Addr, Instruction *MemoryInst,
// instruction after it.
if (SunkAddr) {
if (Instruction *AddrInst = dyn_cast<Instruction>(SunkAddr))
return std::next(AddrInst->getIterator());
return AddrInst->isTerminator() ? MemoryInst->getIterator()
: std::next(AddrInst->getIterator());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found Invoke instruction at this position is already wrong. The problem is at

      if (!ResultIndex) {
        SunkAddr = ResultPtr;
      } else {
  1. We should make sure ResultPtr is in current BB before directly using it as SunkAddr.
  2. If ResultPtr is not in current BB we can clone it as the SunkAddr in current BB.
  3. Invoke instruction is not a good candidate for SunkAddr.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should make sure ResultPtr is in current BB before directly using it as SunkAddr.

We use global pointers safely because we can't get iterator in a BB for them. Do we want to handle them similarly to instructions in other blocks?

If ResultPtr is not in current BB we can clone it as the SunkAddr in current BB.

Clone -- do you mean create GEP with 0?

Invoke instruction is not a good candidate for SunkAddr.

Not obvious to me, why is it different from instructions in other blocks or from globals?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I agree with @weiguozhi . From looking at the test case, why are we trying to sink the GEP upwards toward the entry block? It seems like something has gone wrong earlier.

The proposed fix is like adding a null check when the correct fix was to avoid passing null at a higher level.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The proposed fix is like adding a null check when the correct fix was to avoid passing null at a higher level.

Yeah, I agree but the same we can say about checking on Instruction to bypass arguments and globals.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The purpose of optimizeMemoryInst is to find address computation in PrevBB, and the address is used in a memory access instruction in a SuccBB, it then duplicate the address computation in SuccBB, SelectionDAG can fold the address computation into memory access.

We should make sure ResultPtr is in current BB before directly using it as SunkAddr.

We use global pointers safely because we can't get iterator in a BB for them. Do we want to handle them similarly to instructions in other blocks?

Global pointers is absolutely OK in a memory access instruction, but it doesn't fit into this optimization.

If ResultPtr is not in current BB we can clone it as the SunkAddr in current BB.

Clone -- do you mean create GEP with 0?

Duplicate the address computation instruction. (thus SunkAddr)

Invoke instruction is not a good candidate for SunkAddr.

Not obvious to me, why is it different from instructions in other blocks or from globals?

We can't fold Invoke instruction into a memory access instruction.

// this pointer comes from a different from the current basic block we
// need to know how to recreate it in another basic block.
// Currently we don't support recreation of any of instruction.
if (PtrInst && PtrInst->getParent() != MemoryInst->getParent())
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we can change the predicate to

if (!PtrInst || PtrInst->getParent() != MemoryInst->getParent())

To skip arguments and global pointers. I'm not sure if it is completely correct.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer this condition. We should not modify a memory access instruction with pointer comes from argument or global value directly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I also added tests for arguments and globals. @weiguozhi you may be interested that we don't reuse addresses in these cases: https://godbolt.org/z/q8fzce4ov

@@ -6099,6 +6099,13 @@ bool CodeGenPrepare::optimizeMemoryInst(Instruction *MemoryInst, Value *Addr,
}

if (!ResultIndex) {
auto PtrInst = dyn_cast<Instruction>(ResultPtr);
// Here we know that we have just a pointer without any offsets. If
// this pointer comes from a different from the current basic block we
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

different BB

@e-kud e-kud changed the title [CodeGenPrepare] Handle address sinking obtained from invoke [CodeGenPrepare] Filter out unrecreatable addresses from memory optimization Jun 13, 2025
Copy link
Contributor

@weiguozhi weiguozhi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.
Thanks for the fix.

@e-kud
Copy link
Contributor Author

e-kud commented Jun 26, 2025

@weiguozhi I think we need to skip only instructions. There are two failed test cases:

 LLVM :: Transforms/CodeGenPrepare/ARM/dead-gep.ll
 LLVM :: Transforms/CodeGenPrepare/NVPTX/dont-introduce-addrspacecast.ll

First one is about propagating undef base that is Value. However for the second we duplicate series of inttoptr ptrtoint casts because base is an argument, looks like the expected change.

Or maybe bypass constants additionally to instructions inside the same block?

@e-kud e-kud requested a review from weiguozhi June 26, 2025 21:52
@weiguozhi
Copy link
Contributor

Then it's OK to skip instructions only.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants