Skip to content

[InstCombine] Improve inbounds preservation for ADD+GEP -> GEP+GEP #135155

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Apr 14, 2025

Conversation

bjope
Copy link
Collaborator

@bjope bjope commented Apr 10, 2025

Given that we have a "add nuw" and a "getelementptr inbounds nuw" like this:
%idx = add nuw i64 %idx1, %idx2
%gep = getelementptr inbounds nuw i32, ptr %ptr, i64 %idx

Then we can preserve the "inbounds nuw" flag when transforming that into two getelementptr instructions:
%gep1 = getelementptr inbounds nuw i32, ptr %ptr, i64 %idx1
%gep = getelementptr inbounds nuw i32, ptr %ptr, i64 %idx2

Similarly for just having "nuw" instead of "inbounds nuw" on the getelementptr.

Proof: https://alive2.llvm.org/ce/z/4uhfDq

@bjope bjope requested a review from nikic as a code owner April 10, 2025 10:42
@llvmbot llvmbot added llvm:instcombine Covers the InstCombine, InstSimplify and AggressiveInstCombine passes llvm:transforms labels Apr 10, 2025
@llvmbot
Copy link
Member

llvmbot commented Apr 10, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Björn Pettersson (bjope)

Changes

Given that we have a "add nuw" and a "getelementptr inbounds nuw" like this:
%idx = add nuw i64 %idx1, %idx2
%gep = getelementptr inbounds nuw i32, ptr %ptr, i64 %idx

Then we can preserve the "inbounds nuw" flag when transforming that into two getelementptr instructions:
%gep1 = getelementptr inbounds nuw i32, ptr %ptr, i64 %idx1
%gep = getelementptr inbounds nuw i32, ptr %ptr, i64 %idx2

Similarly for just having "nuw" instead of "inbounds nuw" on the getelementptr.

Proof: https://alive2.llvm.org/ce/z/4uhfDq


Full diff: https://github.com/llvm/llvm-project/pull/135155.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/InstCombine/InstructionCombining.cpp (+26-17)
  • (modified) llvm/test/Transforms/InstCombine/array.ll (+4-6)
diff --git a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
index 856e02c9f1ddb..19a818f4baa30 100644
--- a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
@@ -3087,12 +3087,22 @@ Instruction *InstCombinerImpl::visitGetElementPtrInst(GetElementPtrInst &GEP) {
     return nullptr;
 
   if (GEP.getNumIndices() == 1) {
-    // We can only preserve inbounds if the original gep is inbounds, the add
-    // is nsw, and the add operands are non-negative.
-    auto CanPreserveInBounds = [&](bool AddIsNSW, Value *Idx1, Value *Idx2) {
+    auto CanPreserveNoWrapFlags = [&](bool AddIsNSW, bool AddIsNUW, Value *Idx1,
+                                      Value *Idx2) {
+      // Preserve "inbounds nuw" if the original gep is "inbounds nuw",
+      // and the add is "nuw".
+      if (GEP.isInBounds() && GEP.hasNoUnsignedWrap() && AddIsNUW)
+        return GEPNoWrapFlags::inBounds() | GEPNoWrapFlags::noUnsignedWrap();
+      // Preserve "inbounds" if the original gep is "inbounds", the add
+      // is "nsw", and the add operands are non-negative.
       SimplifyQuery Q = SQ.getWithInstruction(&GEP);
-      return GEP.isInBounds() && AddIsNSW && isKnownNonNegative(Idx1, Q) &&
-             isKnownNonNegative(Idx2, Q);
+      if (GEP.isInBounds() && AddIsNSW && isKnownNonNegative(Idx1, Q) &&
+          isKnownNonNegative(Idx2, Q))
+        return GEPNoWrapFlags::inBounds();
+      // Preserve "nuw" if the original gep is "nuw", and the add is "nuw".
+      if (GEP.hasNoUnsignedWrap() && AddIsNUW)
+        return GEPNoWrapFlags::noUnsignedWrap();
+      return GEPNoWrapFlags::none();
     };
 
     // Try to replace ADD + GEP with GEP + GEP.
@@ -3104,15 +3114,15 @@ Instruction *InstCombinerImpl::visitGetElementPtrInst(GetElementPtrInst &GEP) {
       // as:
       //   %newptr = getelementptr i32, ptr %ptr, i64 %idx1
       //   %newgep = getelementptr i32, ptr %newptr, i64 %idx2
-      bool IsInBounds = CanPreserveInBounds(
-          cast<OverflowingBinaryOperator>(GEP.getOperand(1))->hasNoSignedWrap(),
-          Idx1, Idx2);
+      bool NSW = match(GEP.getOperand(1), m_NSWAddLike(m_Value(), m_Value()));
+      bool NUW = match(GEP.getOperand(1), m_NUWAddLike(m_Value(), m_Value()));
+      GEPNoWrapFlags NWFlags = CanPreserveNoWrapFlags(NSW, NUW, Idx1, Idx2);
       auto *NewPtr =
           Builder.CreateGEP(GEP.getSourceElementType(), GEP.getPointerOperand(),
-                            Idx1, "", IsInBounds);
-      return replaceInstUsesWith(
-          GEP, Builder.CreateGEP(GEP.getSourceElementType(), NewPtr, Idx2, "",
-                                 IsInBounds));
+                            Idx1, "", NWFlags);
+      return replaceInstUsesWith(GEP,
+                                 Builder.CreateGEP(GEP.getSourceElementType(),
+                                                   NewPtr, Idx2, "", NWFlags));
     }
     ConstantInt *C;
     if (match(GEP.getOperand(1), m_OneUse(m_SExtLike(m_OneUse(m_NSWAdd(
@@ -3123,17 +3133,16 @@ Instruction *InstCombinerImpl::visitGetElementPtrInst(GetElementPtrInst &GEP) {
       // as:
       // %newptr = getelementptr i32, ptr %ptr, i32 %idx1
       // %newgep = getelementptr i32, ptr %newptr, i32 idx2
-      bool IsInBounds = CanPreserveInBounds(
-          /*IsNSW=*/true, Idx1, C);
+      GEPNoWrapFlags NWFlags = CanPreserveNoWrapFlags(
+          /*IsNSW=*/true, /*IsNUW=*/false, Idx1, C);
       auto *NewPtr = Builder.CreateGEP(
           GEP.getSourceElementType(), GEP.getPointerOperand(),
-          Builder.CreateSExt(Idx1, GEP.getOperand(1)->getType()), "",
-          IsInBounds);
+          Builder.CreateSExt(Idx1, GEP.getOperand(1)->getType()), "", NWFlags);
       return replaceInstUsesWith(
           GEP,
           Builder.CreateGEP(GEP.getSourceElementType(), NewPtr,
                             Builder.CreateSExt(C, GEP.getOperand(1)->getType()),
-                            "", IsInBounds));
+                            "", NWFlags));
     }
   }
 
diff --git a/llvm/test/Transforms/InstCombine/array.ll b/llvm/test/Transforms/InstCombine/array.ll
index 763c6e77f89ee..5d389958173a5 100644
--- a/llvm/test/Transforms/InstCombine/array.ll
+++ b/llvm/test/Transforms/InstCombine/array.ll
@@ -122,12 +122,11 @@ define ptr @gep_inbounds_nuwaddlike(ptr %ptr, i64 %a, i64 %b) {
   ret ptr %gep
 }
 
-; FIXME: Preserve "inbounds nuw".
 define ptr @gep_inbounds_add_nuw(ptr %ptr, i64 %a, i64 %b) {
 ; CHECK-LABEL: define ptr @gep_inbounds_add_nuw(
 ; CHECK-SAME: ptr [[PTR:%.*]], i64 [[A:%.*]], i64 [[B:%.*]]) {
-; CHECK-NEXT:    [[TMP1:%.*]] = getelementptr i32, ptr [[PTR]], i64 [[A]]
-; CHECK-NEXT:    [[GEP:%.*]] = getelementptr i32, ptr [[TMP1]], i64 [[B]]
+; CHECK-NEXT:    [[TMP1:%.*]] = getelementptr inbounds nuw i32, ptr [[PTR]], i64 [[A]]
+; CHECK-NEXT:    [[GEP:%.*]] = getelementptr inbounds nuw i32, ptr [[TMP1]], i64 [[B]]
 ; CHECK-NEXT:    ret ptr [[GEP]]
 ;
   %add = add nuw i64 %a, %b
@@ -135,12 +134,11 @@ define ptr @gep_inbounds_add_nuw(ptr %ptr, i64 %a, i64 %b) {
   ret ptr %gep
 }
 
-; FIXME: Preserve "nuw".
 define ptr @gep_add_nuw(ptr %ptr, i64 %a, i64 %b) {
 ; CHECK-LABEL: define ptr @gep_add_nuw(
 ; CHECK-SAME: ptr [[PTR:%.*]], i64 [[A:%.*]], i64 [[B:%.*]]) {
-; CHECK-NEXT:    [[TMP1:%.*]] = getelementptr i32, ptr [[PTR]], i64 [[A]]
-; CHECK-NEXT:    [[GEP:%.*]] = getelementptr i32, ptr [[TMP1]], i64 [[B]]
+; CHECK-NEXT:    [[TMP1:%.*]] = getelementptr nuw i32, ptr [[PTR]], i64 [[A]]
+; CHECK-NEXT:    [[GEP:%.*]] = getelementptr nuw i32, ptr [[TMP1]], i64 [[B]]
 ; CHECK-NEXT:    ret ptr [[GEP]]
 ;
   %add = add nuw i64 %a, %b

// We can only preserve inbounds if the original gep is inbounds, the add
// is nsw, and the add operands are non-negative.
auto CanPreserveInBounds = [&](bool AddIsNSW, Value *Idx1, Value *Idx2) {
auto CanPreserveNoWrapFlags = [&](bool AddIsNSW, bool AddIsNUW, Value *Idx1,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename this to GetPreservedNoWrapFlags or something.

Comment on lines 3094 to 3095
if (GEP.isInBounds() && GEP.hasNoUnsignedWrap() && AddIsNUW)
return GEPNoWrapFlags::inBounds() | GEPNoWrapFlags::noUnsignedWrap();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (GEP.isInBounds() && GEP.hasNoUnsignedWrap() && AddIsNUW)
return GEPNoWrapFlags::inBounds() | GEPNoWrapFlags::noUnsignedWrap();
if (GEP.hasNoUnsignedWrap() && AddIsNUW)
return GEP.getNoWrapFlags();

Would this work to subsume both this case and the only nuw one below?

isKnownNonNegative(Idx2, Q);
if (GEP.isInBounds() && AddIsNSW && isKnownNonNegative(Idx1, Q) &&
isKnownNonNegative(Idx2, Q))
return GEPNoWrapFlags::inBounds();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it actually still necessary to explicitly handle this case? If we have an add nsw with nonneg operands, I think we should infer nuw on both add and gep and can then use the new code path?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, we can probably skip it, but then we need to move the "nusw + nneg -> nuw" case (from line 3171) to make sure we infer nuw on the gep prior to this pattern.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or well, we also need to update the pattern below that involve a sext operation. That call to CanPreserveNoWrapFlags is currently not passing AddIsNUW.

Copy link

github-actions bot commented Apr 10, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

if (GEP.hasNoUnsignedWrap() && AddIsNUW) {
Flags |= GEPNoWrapFlags::noUnsignedWrap();
if (GEP.isInBounds())
Flags |= GEPNoWrapFlags::inBounds();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment here, we should just be returning the original Flags instead of reconstructing them. Note that this works for all of nuw, inbounds nuw and nusw nuw: https://alive2.llvm.org/ce/z/QSweWW

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing to 16-bit ptr/index size in your example seem to indicate that this isn't valid for just nuw (at least no unless we check that the types used for the offset satisfy the index size): https://alive2.llvm.org/ce/z/oyAv6T

My idea of reconstructing flags was to make the code more defensive. I have no clue what flags will be added in the future. If someone adds a "PointerIsEven" flag (or whatever) in the future, it might be wrong to preserve all flags. So I used a more defensive programming style to only preserve the flags that I explicitly handle. But maybe that isn't the LLVM style of programming? But maybe adding a helper in GEPNoWrapFlags (a reverse version of intersectForOffsetAdd) would be better then. Then one would need to consider updating that helper if adding/removing flags in GEPNoWrapFlags.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing to 16-bit ptr/index size in your example seem to indicate that this isn't valid for just nuw (at least no unless we check that the types used for the offset satisfy the index size): https://alive2.llvm.org/ce/z/oyAv6T

If you don't match the index size, you are introducing an implicit sext() on the indices. The transform is not valid in that case.

My idea of reconstructing flags was to make the code more defensive. I have no clue what flags will be added in the future. If someone adds a "PointerIsEven" flag (or whatever) in the future, it might be wrong to preserve all flags. So I used a more defensive programming style to only preserve the flags that I explicitly handle. But maybe that isn't the LLVM style of programming? But maybe adding a helper in GEPNoWrapFlags (a reverse version of intersectForOffsetAdd) would be better then. Then one would need to consider updating that helper if adding/removing flags in GEPNoWrapFlags.

I don't think we should try too hard to predict the future and go with the simple solution here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you don't match the index size, you are introducing an implicit sext() on the indices. The transform is not valid in that case.
Oh, right, the transform isn't even valid if dropping the flags in that case.

It puzzeled me for awhile how implicit sext/trunc of indices would impact these rewrites (also for the legacy code with NSW considering that there are no checks on index size). Now I found out that we make sure that the gep index operand is using the correct index type size earlier in this function. So we can expect that we match with the index size when doing the rewrites into GEP+GEP here.

return GEP.isInBounds() && AddIsNSW && isKnownNonNegative(Idx1, Q) &&
isKnownNonNegative(Idx2, Q);
auto GetPreservedNoWrapFlags = [&](bool AddIsNUW, Value *Idx1,
Value *Idx2) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to pass Idx1 / Idx2 anymore.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Done.

; CHECK-NEXT: [[TMP1:%.*]] = getelementptr i32, ptr [[PTR]], i64 [[A]]
; CHECK-NEXT: [[GEP:%.*]] = getelementptr i32, ptr [[TMP1]], i64 [[B]]
; CHECK-NEXT: [[TMP1:%.*]] = getelementptr inbounds nuw i32, ptr [[PTR]], i64 [[A]]
; CHECK-NEXT: [[GEP:%.*]] = getelementptr inbounds nuw i32, ptr [[TMP1]], i64 [[B]]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please also add a test for nusw nuw for completeness?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Added @gep_inbounds_add_nusw_nuw.

@bjope bjope force-pushed the users/bjope/addlikegep_2 branch from 4895b27 to 51ec9b0 Compare April 11, 2025 11:00
Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

return &GEP;
}

// These rewrites is trying to preserve inbounds/nuw attributes. So we want to
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// These rewrites is trying to preserve inbounds/nuw attributes. So we want to
// These rewrites are trying to preserve inbounds/nuw attributes. So we want to

return GEP.isInBounds() && AddIsNSW && isKnownNonNegative(Idx1, Q) &&
isKnownNonNegative(Idx2, Q);
// Given (gep p, x+y) we want to determine the common nowrap flags for both
// gep if transforming into (gep (gep p, x), y).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// gep if transforming into (gep (gep p, x), y).
// geps if transforming into (gep (gep p, x), y).

Base automatically changed from users/bjope/addlikegep_1 to main April 14, 2025 07:34
bjope added 7 commits April 14, 2025 09:50
Given that we have a "add nuw" and a "getelementptr inbounds nuw" like
this:
   %idx = add nuw i64 %idx1, %idx2
   %gep = getelementptr inbounds nuw i32, ptr %ptr, i64 %idx

Then we can preserve the "inbounds nuw" flag when transforming that
into two getelementptr instructions:
   %gep1 = getelementptr inbounds nuw i32, ptr %ptr, i64 %idx1
   %gep = getelementptr inbounds nuw i32, ptr %ptr, i64 %idx2

Similarly for just having "nuw" or "nusw nuw" instead of "inbounds nuw"
on the getelementptr.

Proof: https://alive2.llvm.org/ce/z/QSweWW
- Make sure we try to derive inbound + nuw before doing the
  ADD+GEP->GEP+GEP rewrites. This to make it possible to rely
  on "nuw" being present when trying to preserve flags.

- Make sure the special ADDNSW+SEXTLIKE+GEP->GEP+GEP case is checking
  for ADDNUW+ZEXTNNEG+GEP->GEP+GEP when preserving flags.
  This way we do not need to check for non-negative operands.
@bjope bjope force-pushed the users/bjope/addlikegep_2 branch from eb5fbb8 to dd8cdaf Compare April 14, 2025 08:31
@bjope bjope merged commit 29555ad into main Apr 14, 2025
7 of 11 checks passed
@bjope bjope deleted the users/bjope/addlikegep_2 branch April 14, 2025 09:03
var-const pushed a commit to ldionne/llvm-project that referenced this pull request Apr 17, 2025
…lvm#135155)

Given that we have a "add nuw" and a "getelementptr inbounds nuw" like
this:
   %idx = add nuw i64 %idx1, %idx2
   %gep = getelementptr inbounds nuw i32, ptr %ptr, i64 %idx

Then we can preserve the "inbounds nuw" flag when transforming that into
two getelementptr instructions:
   %gep1 = getelementptr inbounds nuw i32, ptr %ptr, i64 %idx1
   %gep = getelementptr inbounds nuw i32, ptr %ptr, i64 %idx2

Similarly for just having "nuw", and "nusw nuw" instead of "inbounds nuw"
on the getelementptr.

Proof: https://alive2.llvm.org/ce/z/QSweWW
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
llvm:instcombine Covers the InstCombine, InstSimplify and AggressiveInstCombine passes llvm:transforms
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants