[SeparateConstOffsetFromGEP] propagate const offset through GEP chains #143470


Open · wants to merge 3 commits into base: main

Conversation

AlexMaclean (Member):

When separating the constant offset from a GEP, if the pointer operand is a constant ptradd (likely generated when we performed this transform on that GEP), we accumulate that offset into the current offset. This ensures that when there is a chain of GEPs, the constant offset reaches the final memory instruction, where it can likely be folded into the addressing.
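As a rough illustration (not part of the patch; the values %p, %i, %j and the i8-based offsets are made up, though the shape mirrors the @basic test added below), the intent is to take a chain such as

  %off1 = add i64 %i, 8
  %gep1 = getelementptr i8, ptr %p, i64 %off1
  %off2 = add i64 %j, 16
  %gep2 = getelementptr i8, ptr %gep1, i64 %off2
  %v    = load i32, ptr %gep2, align 4

and accumulate both constants into a single trailing ptradd next to the load, roughly:

  %b1   = getelementptr i8, ptr %p, i64 %i
  %b2   = getelementptr i8, ptr %b1, i64 %j
  %addr = getelementptr i8, ptr %b2, i64 24
  %v    = load i32, ptr %addr, align 4

so that the 24-byte constant offset sits next to the memory instruction, where the backend can fold it into the addressing mode.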

@llvmbot (Member) commented Jun 10, 2025:

@llvm/pr-subscribers-backend-amdgpu
@llvm/pr-subscribers-backend-powerpc
@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-backend-nvptx

Author: Alex MacLean (AlexMaclean)

Changes



Patch is 20.67 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/143470.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp (+37-11)
  • (added) llvm/test/Transforms/SeparateConstOffsetFromGEP/NVPTX/gep-chain.ll (+383)
diff --git a/llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp b/llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp
index 320b79203c0b3..b478c72a14a71 100644
--- a/llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp
+++ b/llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp
@@ -1048,19 +1048,31 @@ bool SeparateConstOffsetFromGEP::splitGEP(GetElementPtrInst *GEP) {
   if (GEP->getType()->isVectorTy())
     return false;
 
+  // If the base of this GEP is a ptradd of a constant, let's pass the
+  // constant along. This ensures that when we have a chain of GEPs, the
+  // constant offset from each is accumulated.
+  Value *NewBase;
+  const APInt *BaseOffset;
+  const bool ExtractBase =
+      match(GEP->getPointerOperand(),
+            m_PtrAdd(m_Value(NewBase), m_APInt(BaseOffset)));
+
+  const int64_t BaseByteOffset = ExtractBase ? BaseOffset->getSExtValue() : 0;
+
   // The backend can already nicely handle the case where all indices are
   // constant.
-  if (GEP->hasAllConstantIndices())
+  if (GEP->hasAllConstantIndices() && !ExtractBase)
     return false;
 
   bool Changed = canonicalizeArrayIndicesToIndexSize(GEP);
 
   bool NeedsExtraction;
-  int64_t AccumulativeByteOffset = accumulateByteOffset(GEP, NeedsExtraction);
+  int64_t AccumulativeByteOffset =
+      BaseByteOffset + accumulateByteOffset(GEP, NeedsExtraction);
 
   TargetTransformInfo &TTI = GetTTI(*GEP->getFunction());
 
-  if (!NeedsExtraction) {
+  if (!NeedsExtraction && !ExtractBase) {
     Changed |= reorderGEP(GEP, TTI);
     return Changed;
   }
@@ -1084,7 +1096,9 @@ bool SeparateConstOffsetFromGEP::splitGEP(GetElementPtrInst *GEP) {
 
   // Track information for preserving GEP flags.
   bool AllOffsetsNonNegative = AccumulativeByteOffset >= 0;
-  bool AllNUWPreserved = true;
+  bool AllNUWPreserved = GEP->hasNoUnsignedWrap();
+  bool NewGEPInBounds = GEP->isInBounds();
+  bool NewGEPNUSW = GEP->hasNoUnsignedSignedWrap();
 
   // Remove the constant offset in each sequential index. The resultant GEP
   // computes the variadic base.
@@ -1120,6 +1134,16 @@ bool SeparateConstOffsetFromGEP::splitGEP(GetElementPtrInst *GEP) {
       }
     }
   }
+  if (ExtractBase) {
+    GEPOperator *Base = cast<GEPOperator>(GEP->getPointerOperand());
+    AllNUWPreserved &= Base->hasNoUnsignedWrap();
+    NewGEPInBounds &= Base->isInBounds();
+    NewGEPNUSW &= Base->hasNoUnsignedSignedWrap();
+    AllOffsetsNonNegative &= BaseByteOffset >= 0;
+
+    GEP->setOperand(0, NewBase);
+    RecursivelyDeleteTriviallyDeadInstructions(Base);
+  }
 
   // Clear the inbounds attribute because the new index may be off-bound.
   // e.g.,
@@ -1147,7 +1171,7 @@ bool SeparateConstOffsetFromGEP::splitGEP(GetElementPtrInst *GEP) {
 
   // If the initial GEP was NUW and all operations that we reassociate were NUW
   // additions, the resulting GEPs are also NUW.
-  if (GEP->hasNoUnsignedWrap() && AllNUWPreserved) {
+  if (AllNUWPreserved) {
     NewGEPFlags |= GEPNoWrapFlags::noUnsignedWrap();
     // If the initial GEP additionally had NUSW (or inbounds, which implies
     // NUSW), we know that the indices in the initial GEP must all have their
@@ -1155,13 +1179,13 @@ bool SeparateConstOffsetFromGEP::splitGEP(GetElementPtrInst *GEP) {
     // add-operands therefore also don't have their signbit set. Therefore, all
     // indices of the resulting GEPs are non-negative -> we can preserve
     // the inbounds/nusw flag.
-    CanPreserveInBoundsNUSW |= GEP->hasNoUnsignedSignedWrap();
+    CanPreserveInBoundsNUSW |= NewGEPNUSW;
   }
 
   if (CanPreserveInBoundsNUSW) {
-    if (GEP->isInBounds())
+    if (NewGEPInBounds)
       NewGEPFlags |= GEPNoWrapFlags::inBounds();
-    else if (GEP->hasNoUnsignedSignedWrap())
+    else if (NewGEPNUSW)
       NewGEPFlags |= GEPNoWrapFlags::noUnsignedSignedWrap();
   }
 
@@ -1242,11 +1266,13 @@ bool SeparateConstOffsetFromGEP::run(Function &F) {
 
   DL = &F.getDataLayout();
   bool Changed = false;
-  for (BasicBlock &B : F) {
-    if (!DT->isReachableFromEntry(&B))
+
+  ReversePostOrderTraversal<Function *> RPOT(&F);
+  for (BasicBlock *B : RPOT) {
+    if (!DT->isReachableFromEntry(B))
       continue;
 
-    for (Instruction &I : llvm::make_early_inc_range(B))
+    for (Instruction &I : llvm::make_early_inc_range(*B))
       if (GetElementPtrInst *GEP = dyn_cast<GetElementPtrInst>(&I))
         Changed |= splitGEP(GEP);
     // No need to split GEP ConstantExprs because all its indices are constant
diff --git a/llvm/test/Transforms/SeparateConstOffsetFromGEP/NVPTX/gep-chain.ll b/llvm/test/Transforms/SeparateConstOffsetFromGEP/NVPTX/gep-chain.ll
new file mode 100644
index 0000000000000..2587fd26fdef2
--- /dev/null
+++ b/llvm/test/Transforms/SeparateConstOffsetFromGEP/NVPTX/gep-chain.ll
@@ -0,0 +1,383 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -mtriple=nvptx64-nvidia-cuda -S -passes=separate-const-offset-from-gep  < %s | FileCheck %s
+
+%struct.uchar4 = type { i8, i8, i8, i8 }
+
+define ptr @basic(ptr %ptr, i64 %offset1, i64 %offset2) {
+; CHECK-LABEL: define ptr @basic(
+; CHECK-SAME: ptr [[PTR:%.*]], i64 [[OFFSET1:%.*]], i64 [[OFFSET2:%.*]]) {
+; CHECK-NEXT:    [[TMP1:%.*]] = getelementptr [[STRUCT_UCHAR4:%.*]], ptr [[PTR]], i64 [[OFFSET1]]
+; CHECK-NEXT:    [[TMP2:%.*]] = getelementptr [[STRUCT_UCHAR4]], ptr [[TMP1]], i64 [[OFFSET2]]
+; CHECK-NEXT:    [[GEP24:%.*]] = getelementptr i8, ptr [[TMP2]], i64 72
+; CHECK-NEXT:    ret ptr [[GEP24]]
+;
+  %offset3 = add i64 %offset1, 8
+  %gep1 = getelementptr %struct.uchar4, ptr %ptr, i64 %offset3
+  %offset4 = add i64 %offset2, 10
+  %gep2 = getelementptr %struct.uchar4, ptr %gep1, i64 %offset4
+  ret ptr %gep2
+}
+
+define i32 @more_interesting(ptr %ptr, i32 %offset1, i32 %offset2) {
+; CHECK-LABEL: define i32 @more_interesting(
+; CHECK-SAME: ptr [[PTR:%.*]], i32 [[OFFSET1:%.*]], i32 [[OFFSET2:%.*]]) {
+; CHECK-NEXT:    [[IDXPROM:%.*]] = sext i32 [[OFFSET1]] to i64
+; CHECK-NEXT:    [[GEP1:%.*]] = getelementptr inbounds [[STRUCT_UCHAR4:%.*]], ptr [[PTR]], i64 [[IDXPROM]]
+; CHECK-NEXT:    [[IDXPROM1:%.*]] = sext i32 [[OFFSET2]] to i64
+; CHECK-NEXT:    [[TMP1:%.*]] = getelementptr inbounds [[STRUCT_UCHAR4]], ptr [[GEP1]], i64 [[IDXPROM1]]
+; CHECK-NEXT:    [[TMP2:%.*]] = getelementptr inbounds i8, ptr [[TMP1]], i64 8
+; CHECK-NEXT:    [[V1:%.*]] = load i32, ptr [[TMP2]], align 4
+; CHECK-NEXT:    [[IDXPROM2:%.*]] = sext i32 [[OFFSET2]] to i64
+; CHECK-NEXT:    [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_UCHAR4]], ptr [[TMP1]], i64 [[IDXPROM2]]
+; CHECK-NEXT:    [[V2:%.*]] = load i32, ptr [[TMP4]], align 4
+; CHECK-NEXT:    [[R:%.*]] = add i32 [[V1]], [[V2]]
+; CHECK-NEXT:    ret i32 [[R]]
+;
+  %gep1 = getelementptr inbounds %struct.uchar4, ptr %ptr, i32 %offset1
+  %gep2 = getelementptr inbounds nuw i8, ptr %gep1, i32 8
+  %gep3 = getelementptr inbounds %struct.uchar4, ptr %gep2, i32 %offset2
+  %v1 = load i32, ptr %gep3, align 4
+  %gep4 = getelementptr inbounds i8, ptr %gep3, i32 -8
+  %gep5 = getelementptr inbounds %struct.uchar4, ptr %gep4, i32 %offset2
+  %v2 = load i32, ptr %gep5, align 4
+  %r = add i32 %v1, %v2
+  ret i32 %r
+}
+
+;; Check nuw/nusw/inbounds flag propagation
+
+; GEPs with nusw flag. All indices and offsets are non-negative.
+define ptr @test_0(ptr %p, i32 %i) {
+; CHECK-LABEL: define ptr @test_0(
+; CHECK-SAME: ptr [[P:%.*]], i32 [[I:%.*]]) {
+; CHECK-NEXT:    [[I_PROM:%.*]] = zext i32 [[I]] to i64
+; CHECK-NEXT:    [[TMP1:%.*]] = getelementptr nusw i8, ptr [[P]], i64 [[I_PROM]]
+; CHECK-NEXT:    [[ARRAYIDX22:%.*]] = getelementptr nusw i8, ptr [[TMP1]], i64 2
+; CHECK-NEXT:    ret ptr [[ARRAYIDX22]]
+;
+  %ptradd = getelementptr nuw inbounds i8, ptr %p, i32 1
+  %i.prom = zext i32 %i to i64
+  %idx2 = add i64 %i.prom, 1
+  %arrayidx2 = getelementptr nusw i8, ptr %ptradd, i64 %idx2
+  ret ptr %arrayidx2
+}
+
+; GEPs with inbounds flag. All indices and offsets are non-negative.
+define ptr @test_1(ptr %p, i32 %i) {
+; CHECK-LABEL: define ptr @test_1(
+; CHECK-SAME: ptr [[P:%.*]], i32 [[I:%.*]]) {
+; CHECK-NEXT:    [[I_PROM:%.*]] = zext i32 [[I]] to i64
+; CHECK-NEXT:    [[TMP1:%.*]] = getelementptr inbounds i8, ptr [[P]], i64 [[I_PROM]]
+; CHECK-NEXT:    [[ARRAYIDX22:%.*]] = getelementptr inbounds i8, ptr [[TMP1]], i64 2
+; CHECK-NEXT:    ret ptr [[ARRAYIDX22]]
+;
+  %ptradd = getelementptr inbounds i8, ptr %p, i32 1
+  %i.prom = zext i32 %i to i64
+  %idx2 = add i64 %i.prom, 1
+  %arrayidx2 = getelementptr nuw inbounds i8, ptr %ptradd, i64 %idx2
+  ret ptr %arrayidx2
+}
+
+; GEPs with nusw flag. All indices and offsets are non-negative.
+define ptr @test_2(ptr %p, i32 %i) {
+; CHECK-LABEL: define ptr @test_2(
+; CHECK-SAME: ptr [[P:%.*]], i32 [[I:%.*]]) {
+; CHECK-NEXT:    [[I_PROM:%.*]] = zext i32 [[I]] to i64
+; CHECK-NEXT:    [[TMP1:%.*]] = getelementptr nusw i8, ptr [[P]], i64 [[I_PROM]]
+; CHECK-NEXT:    [[ARRAYIDX22:%.*]] = getelementptr nusw i8, ptr [[TMP1]], i64 2
+; CHECK-NEXT:    ret ptr [[ARRAYIDX22]]
+;
+  %ptradd = getelementptr nusw i8, ptr %p, i32 1
+  %i.prom = zext i32 %i to i64
+  %idx2 = add nuw i64 %i.prom, 1
+  %arrayidx2 = getelementptr nuw inbounds i8, ptr %ptradd, i64 %idx2
+  ret ptr %arrayidx2
+}
+
+; GEPs with inbounds flag. All indices and offsets are non-negative.
+define ptr @test_3(ptr %p, i32 %i) {
+; CHECK-LABEL: define ptr @test_3(
+; CHECK-SAME: ptr [[P:%.*]], i32 [[I:%.*]]) {
+; CHECK-NEXT:    [[I_PROM:%.*]] = zext i32 [[I]] to i64
+; CHECK-NEXT:    [[TMP1:%.*]] = getelementptr inbounds i8, ptr [[P]], i64 [[I_PROM]]
+; CHECK-NEXT:    [[ARRAYIDX22:%.*]] = getelementptr inbounds i8, ptr [[TMP1]], i64 2
+; CHECK-NEXT:    ret ptr [[ARRAYIDX22]]
+;
+  %ptradd = getelementptr nuw inbounds i8, ptr %p, i32 1
+  %i.prom = zext i32 %i to i64
+  %idx2 = add nuw i64 %i.prom, 1
+  %arrayidx2 = getelementptr inbounds i8, ptr %ptradd, i64 %idx2
+  ret ptr %arrayidx2
+}
+
+; GEPs with nusw flag. All indices and offsets are non-negative.
+define ptr @test_4(ptr %p, i32 %i) {
+; CHECK-LABEL: define ptr @test_4(
+; CHECK-SAME: ptr [[P:%.*]], i32 [[I:%.*]]) {
+; CHECK-NEXT:    [[I_PROM:%.*]] = zext i32 [[I]] to i64
+; CHECK-NEXT:    [[TMP1:%.*]] = getelementptr nusw i8, ptr [[P]], i64 [[I_PROM]]
+; CHECK-NEXT:    [[ARRAYIDX22:%.*]] = getelementptr nusw i8, ptr [[TMP1]], i64 2
+; CHECK-NEXT:    ret ptr [[ARRAYIDX22]]
+;
+  %ptradd = getelementptr nuw inbounds i8, ptr %p, i32 1
+  %i.prom = zext i32 %i to i64
+  %idx2 = add i64 %i.prom, 1
+  %arrayidx2 = getelementptr nuw nusw i8, ptr %ptradd, i64 %idx2
+  ret ptr %arrayidx2
+}
+
+; GEPs with inbounds flag. All indices and offsets are non-negative.
+define ptr @test_5(ptr %p, i32 %i) {
+; CHECK-LABEL: define ptr @test_5(
+; CHECK-SAME: ptr [[P:%.*]], i32 [[I:%.*]]) {
+; CHECK-NEXT:    [[I_PROM:%.*]] = zext i32 [[I]] to i64
+; CHECK-NEXT:    [[TMP1:%.*]] = getelementptr inbounds i8, ptr [[P]], i64 [[I_PROM]]
+; CHECK-NEXT:    [[ARRAYIDX22:%.*]] = getelementptr inbounds i8, ptr [[TMP1]], i64 2
+; CHECK-NEXT:    ret ptr [[ARRAYIDX22]]
+;
+  %ptradd = getelementptr nuw inbounds i8, ptr %p, i32 1
+  %i.prom = zext i32 %i to i64
+  %idx2 = add i64 %i.prom, 1
+  %arrayidx2 = getelementptr nuw inbounds i8, ptr %ptradd, i64 %idx2
+  ret ptr %arrayidx2
+}
+
+; GEPs with nuw flag. Negative offsets.
+define ptr @test_6(ptr %p, i64 %i) {
+; CHECK-LABEL: define ptr @test_6(
+; CHECK-SAME: ptr [[P:%.*]], i64 [[I:%.*]]) {
+; CHECK-NEXT:    [[TMP1:%.*]] = getelementptr nuw i8, ptr [[P]], i64 [[I]]
+; CHECK-NEXT:    [[ARRAYIDX22:%.*]] = getelementptr nuw i8, ptr [[TMP1]], i64 -11
+; CHECK-NEXT:    ret ptr [[ARRAYIDX22]]
+;
+  %ptradd = getelementptr nuw nusw i8, ptr %p, i32 -1
+  %idx2 = add nuw i64 %i, -10
+  %arrayidx2 = getelementptr nuw i8, ptr %ptradd, i64 %idx2
+  ret ptr %arrayidx2
+}
+
+; GEPs with nuw and nusw flags. Negative offsets.
+define ptr @test_7(ptr %p, i64 %i) {
+; CHECK-LABEL: define ptr @test_7(
+; CHECK-SAME: ptr [[P:%.*]], i64 [[I:%.*]]) {
+; CHECK-NEXT:    [[TMP1:%.*]] = getelementptr nusw nuw i8, ptr [[P]], i64 [[I]]
+; CHECK-NEXT:    [[ARRAYIDX22:%.*]] = getelementptr nusw nuw i8, ptr [[TMP1]], i64 -11
+; CHECK-NEXT:    ret ptr [[ARRAYIDX22]]
+;
+  %ptradd = getelementptr nuw inbounds i8, ptr %p, i32 -1
+  %idx2 = add nuw i64 %i, -10
+  %arrayidx2 = getelementptr nuw nusw i8, ptr %ptradd, i64 %idx2
+  ret ptr %arrayidx2
+}
+
+; GEPs with nuw flag. Mixed positive/negative offsets.
+define ptr @test_8(ptr %p, i64 %i) {
+; CHECK-LABEL: define ptr @test_8(
+; CHECK-SAME: ptr [[P:%.*]], i64 [[I:%.*]]) {
+; CHECK-NEXT:    [[TMP1:%.*]] = getelementptr nuw i8, ptr [[P]], i64 [[I]]
+; CHECK-NEXT:    [[ARRAYIDX22:%.*]] = getelementptr nuw i8, ptr [[TMP1]], i64 -9
+; CHECK-NEXT:    ret ptr [[ARRAYIDX22]]
+;
+  %ptradd = getelementptr nuw i8, ptr %p, i32 1
+  %idx2 = add nuw i64 %i, -10
+  %arrayidx2 = getelementptr nuw nusw i8, ptr %ptradd, i64 %idx2
+  ret ptr %arrayidx2
+}
+
+; GEPs with nuw and nusw flags. Mixed positive/negative offsets.
+define ptr @test_9(ptr %p, i64 %i) {
+; CHECK-LABEL: define ptr @test_9(
+; CHECK-SAME: ptr [[P:%.*]], i64 [[I:%.*]]) {
+; CHECK-NEXT:    [[TMP1:%.*]] = getelementptr nusw nuw i8, ptr [[P]], i64 [[I]]
+; CHECK-NEXT:    [[ARRAYIDX22:%.*]] = getelementptr nusw nuw i8, ptr [[TMP1]], i64 -9
+; CHECK-NEXT:    ret ptr [[ARRAYIDX22]]
+;
+  %ptradd = getelementptr nuw inbounds i8, ptr %p, i32 1
+  %idx2 = add nuw i64 %i, -10
+  %arrayidx2 = getelementptr nuw nusw i8, ptr %ptradd, i64 %idx2
+  ret ptr %arrayidx2
+}
+
+; GEPs with nuw flag. Mixed negative/positive offsets.
+define ptr @test_10(ptr %p, i64 %i) {
+; CHECK-LABEL: define ptr @test_10(
+; CHECK-SAME: ptr [[P:%.*]], i64 [[I:%.*]]) {
+; CHECK-NEXT:    [[TMP1:%.*]] = getelementptr nuw i8, ptr [[P]], i64 [[I]]
+; CHECK-NEXT:    [[ARRAYIDX22:%.*]] = getelementptr nuw i8, ptr [[TMP1]], i64 9
+; CHECK-NEXT:    ret ptr [[ARRAYIDX22]]
+;
+  %ptradd = getelementptr nuw i8, ptr %p, i32 -1
+  %idx2 = add nuw i64 %i, 10
+  %arrayidx2 = getelementptr nuw nusw i8, ptr %ptradd, i64 %idx2
+  ret ptr %arrayidx2
+}
+
+; GEPs with nuw and nusw flags. Mixed negative/positive offsets.
+define ptr @test_11(ptr %p, i64 %i) {
+; CHECK-LABEL: define ptr @test_11(
+; CHECK-SAME: ptr [[P:%.*]], i64 [[I:%.*]]) {
+; CHECK-NEXT:    [[TMP1:%.*]] = getelementptr nusw nuw i8, ptr [[P]], i64 [[I]]
+; CHECK-NEXT:    [[ARRAYIDX22:%.*]] = getelementptr nusw nuw i8, ptr [[TMP1]], i64 9
+; CHECK-NEXT:    ret ptr [[ARRAYIDX22]]
+;
+  %ptradd = getelementptr nuw inbounds i8, ptr %p, i32 -1
+  %idx2 = add nuw i64 %i, 10
+  %arrayidx2 = getelementptr nuw nusw i8, ptr %ptradd, i64 %idx2
+  ret ptr %arrayidx2
+}
+
+; GEPs with nuw flag. All positive offsets.
+define ptr @test_12(ptr %p, i64 %i) {
+; CHECK-LABEL: define ptr @test_12(
+; CHECK-SAME: ptr [[P:%.*]], i64 [[I:%.*]]) {
+; CHECK-NEXT:    [[TMP1:%.*]] = getelementptr nuw i8, ptr [[P]], i64 [[I]]
+; CHECK-NEXT:    [[ARRAYIDX22:%.*]] = getelementptr nuw i8, ptr [[TMP1]], i64 2
+; CHECK-NEXT:    ret ptr [[ARRAYIDX22]]
+;
+  %ptradd = getelementptr nuw nusw i8, ptr %p, i32 1
+  %idx2 = add nuw i64 %i, 1
+  %arrayidx2 = getelementptr nuw i8, ptr %ptradd, i64 %idx2
+  ret ptr %arrayidx2
+}
+
+; GEPs with nuw and nusw flags. All positive offsets.
+define ptr @test_13(ptr %p, i64 %i) {
+; CHECK-LABEL: define ptr @test_13(
+; CHECK-SAME: ptr [[P:%.*]], i64 [[I:%.*]]) {
+; CHECK-NEXT:    [[TMP1:%.*]] = getelementptr nusw nuw i8, ptr [[P]], i64 [[I]]
+; CHECK-NEXT:    [[ARRAYIDX22:%.*]] = getelementptr nusw nuw i8, ptr [[TMP1]], i64 2
+; CHECK-NEXT:    ret ptr [[ARRAYIDX22]]
+;
+  %ptradd = getelementptr nuw inbounds i8, ptr %p, i32 1
+  %idx2 = add nuw i64 %i, 1
+  %arrayidx2 = getelementptr nuw nusw i8, ptr %ptradd, i64 %idx2
+  ret ptr %arrayidx2
+}
+
+; GEPs with nuw flag. Zext index with negative offsets.
+define ptr @test_14(ptr %p, i32 %i) {
+; CHECK-LABEL: define ptr @test_14(
+; CHECK-SAME: ptr [[P:%.*]], i32 [[I:%.*]]) {
+; CHECK-NEXT:    [[I_PROM:%.*]] = zext i32 [[I]] to i64
+; CHECK-NEXT:    [[TMP1:%.*]] = getelementptr nuw i8, ptr [[P]], i64 [[I_PROM]]
+; CHECK-NEXT:    [[ARRAYIDX22:%.*]] = getelementptr nuw i8, ptr [[TMP1]], i64 -11
+; CHECK-NEXT:    ret ptr [[ARRAYIDX22]]
+;
+  %ptradd = getelementptr nuw nusw i8, ptr %p, i32 -1
+  %i.prom = zext i32 %i to i64
+  %idx2 = add nuw i64 %i.prom, -10
+  %arrayidx2 = getelementptr nuw i8, ptr %ptradd, i64 %idx2
+  ret ptr %arrayidx2
+}
+
+; GEPs with nuw and nusw flags. Zext index with negative offsets.
+define ptr @test_15(ptr %p, i32 %i) {
+; CHECK-LABEL: define ptr @test_15(
+; CHECK-SAME: ptr [[P:%.*]], i32 [[I:%.*]]) {
+; CHECK-NEXT:    [[I_PROM:%.*]] = zext i32 [[I]] to i64
+; CHECK-NEXT:    [[TMP1:%.*]] = getelementptr nusw nuw i8, ptr [[P]], i64 [[I_PROM]]
+; CHECK-NEXT:    [[ARRAYIDX22:%.*]] = getelementptr nusw nuw i8, ptr [[TMP1]], i64 -11
+; CHECK-NEXT:    ret ptr [[ARRAYIDX22]]
+;
+  %ptradd = getelementptr nuw nusw i8, ptr %p, i32 -1
+  %i.prom = zext i32 %i to i64
+  %idx2 = add nuw i64 %i.prom, -10
+  %arrayidx2 = getelementptr nuw inbounds i8, ptr %ptradd, i64 %idx2
+  ret ptr %arrayidx2
+}
+
+; GEPs with nuw flag. Zext index with mixed positive/negative offsets.
+define ptr @test_16(ptr %p, i32 %i) {
+; CHECK-LABEL: define ptr @test_16(
+; CHECK-SAME: ptr [[P:%.*]], i32 [[I:%.*]]) {
+; CHECK-NEXT:    [[I_PROM:%.*]] = zext i32 [[I]] to i64
+; CHECK-NEXT:    [[TMP1:%.*]] = getelementptr nuw i8, ptr [[P]], i64 [[I_PROM]]
+; CHECK-NEXT:    [[ARRAYIDX22:%.*]] = getelementptr nuw i8, ptr [[TMP1]], i64 -9
+; CHECK-NEXT:    ret ptr [[ARRAYIDX22]]
+;
+  %ptradd = getelementptr nuw i8, ptr %p, i32 1
+  %i.prom = zext i32 %i to i64
+  %idx2 = add nuw i64 %i.prom, -10
+  %arrayidx2 = getelementptr nuw nusw i8, ptr %ptradd, i64 %idx2
+  ret ptr %arrayidx2
+}
+
+; GEPs with nuw and nusw flags. Zext index with mixed positive/negative offsets.
+define ptr @test_17(ptr %p, i32 %i) {
+; CHECK-LABEL: define ptr @test_17(
+; CHECK-SAME: ptr [[P:%.*]], i32 [[I:%.*]]) {
+; CHECK-NEXT:    [[I_PROM:%.*]] = zext i32 [[I]] to i64
+; CHECK-NEXT:    [[TMP1:%.*]] = getelementptr nusw nuw i8, ptr [[P]], i64 [[I_PROM]]
+; CHECK-NEXT:    [[ARRAYIDX22:%.*]] = getelementptr nusw nuw i8, ptr [[TMP1]], i64 -9
+; CHECK-NEXT:    ret ptr [[ARRAYIDX22]]
+;
+  %ptradd = getelementptr nuw nusw i8, ptr %p, i32 1
+  %i.prom = zext i32 %i to i64
+  %idx2 = add nuw i64 %i.prom, -10
+  %arrayidx2 = getelementptr nuw inbounds i8, ptr %ptradd, i64 %idx2
+  ret ptr %arrayidx2
+}
+
+; GEPs with nuw flag. Zext index with mixed negative/positive offsets.
+define ptr @test_18(ptr %p, i32 %i) {
+; CHECK-LABEL: define ptr @test_18(
+; CHECK-SAME: ptr [[P:%.*]], i32 [[I:%.*]]) {
+; CHECK-NEXT:    [[I_PROM:%.*]] = zext i32 [[I]] to i64
+; CHECK-NEXT:    [[TMP1:%.*]] = getelementptr nuw i8, ptr [[P]], i64 [[I_PROM]]
+; CHECK-NEXT:    [[ARRAYIDX22:%.*]] = getelementptr nuw i8, ptr [[TMP1]], i64 9
+; CHECK-NEXT:    ret ptr [[ARRAYIDX22]]
+;
+  %ptradd = getelementptr nuw nusw i8, ptr %p, i32 -1
+  %i.prom = zext i32 %i to i64
+  %idx2 = add nuw i64 %i.prom, 10
+  %arrayidx2 = getelementptr nuw i8, ptr %ptradd, i64 %idx2
+  ret ptr %arrayidx2
+}
+
+; GEPs with nuw and nusw flags. Zext index with mixed negative/positive offsets.
+define ptr @test_19(ptr %p, i32 %i) {
+; CHECK-LABEL: define ptr @test_19(
+; CHECK-SAME: ptr [[P:%.*]], i32 [[I:%.*]]) {
+; CHECK-NEXT:    [[I_PROM:%.*]] = zext i32 [[I]] to i64
+; CHECK-NEXT:    [[TMP1:%.*]] = getelementptr nusw nuw i8, ptr [[P]], i64 [[I_PROM]]
+; CHECK-NEXT:    [[ARRAYIDX22:%.*]] = getelementptr nusw nuw i8, ptr [[TMP1]], i64 9
+; CHECK-NEXT:    ret ptr [[ARRAYIDX22]]
+;
+  %ptradd = getelementptr nuw nusw i8, ptr %p, i32 -1
+  %i.prom = zext i32 %i to i64
+  %idx2 = add nuw i64 %i.prom, 10
+  %arrayidx2 = getelementptr nuw inbounds i8, ptr %ptradd, i64 %idx2
+  ret ptr %arrayidx2
+}
+
+; GEPs with nuw flag. Zext index with all positive offsets.
+define ptr @test_20(ptr %p, i32 %i) {
+; CHECK-LABEL: define ptr @test_20(
+; CHECK-SAME: ptr [[P:%.*]], i32 [[I:%.*]]) {
+; CHECK-NEXT:    [[I_PROM:%.*]] = zext i32 [[I]] to i64
+; CHECK-NEXT:    [[TMP1:%.*]] = getelementptr nuw i8, ptr [[P]], i64 [[I_PROM]]
+; CHECK-NEXT:    [[ARRAYIDX22:%.*]] = getelementptr nuw i8, ptr [[TMP1]], i64 2
+; CHECK-NEXT:    ret ptr [[ARRAYIDX22]]
+;
+  %ptradd = getelementptr nuw i8, ptr %p, i32 1
+  %i.prom = zext i32 %i to i64
+  %idx2 = add nuw i64 %i.prom, 1
+  %arrayidx2 = getelementptr nuw nusw i8, ptr %ptradd, i64 %idx2
+  ret ptr %arrayidx2
...
[truncated]

@nikic (Contributor) left a comment:
Looks like there are some test failures.

This looks plausible, but I also feel like this is something we should be doing in InstCombine? (That is, trying to combine multiple constant offset GEPs into one, even if there are intermediate GEPs, as long as use-constraints allow.)

Inline review comment on the changed traversal code:

    for (BasicBlock &B : F) {
      if (!DT->isReachableFromEntry(&B))

    ReversePostOrderTraversal<Function *> RPOT(&F);

Contributor:

How is this related to the rest of the patch?

Member (Author):

The intent is to ensure that when there is a chain of GEPs, we always separate constant offsets in order. This way, the logic that picks up the constant offset from the first GEP and passes it along through the second always works as expected. My understanding is that a standard iteration doesn't have very strong guarantees about the ordering of blocks, while an RPOT ensures GEPs are transformed in order.
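For illustration only (hypothetical IR, not taken from the patch or its tests): if the first GEP of a chain lives in a predecessor block of its user, an RPO walk guarantees the defining block is visited first, so the constant ptradd produced by splitting the first GEP already exists when the dependent GEP is processed.

  define i32 @chain(ptr %p, i64 %i, i64 %j) {
  entry:
    %off1 = add i64 %i, 8
    ; Visited first under RPO; splitting leaves a trailing constant ptradd of 8.
    %gep1 = getelementptr i8, ptr %p, i64 %off1
    br label %use

  use:
    %off2 = add i64 %j, 16
    ; By the time this GEP is split, its pointer operand has been rewritten to the
    ; constant ptradd produced above, so the 8 can be accumulated into this GEP's
    ; constant offset and carried down to the load.
    %gep2 = getelementptr i8, ptr %gep1, i64 %off2
    %v = load i32, ptr %gep2, align 4
    ret i32 %v
  }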

@AlexMaclean (Member, Author):

> This looks plausible, but I also feel like this is something we should be doing in InstCombine? (That is, trying to combine multiple constant offset GEPs into one, even if there are intermediate GEPs, as long as use-constraints allow.)

As things are currently divided, I think it makes a lot of sense to keep this logic here. This pass isn't run by most targets, and its behavior is very dependent on TTI.isLegalAddressingMode. Since we assume that the constant offsets we're generating will be used in the addressing of some memory instruction, we're also less concerned about use constraints than we would be in InstCombine. While I agree there are likely some limited ways to do similar transformations in IC, here I think it's safe to be a bit more aggressive and rely on target-specific information.

@AlexMaclean (Member, Author):

@nikic @jrbyrnes @krzysz00 Ping for review

AlexMaclean requested review from arsenm and Artem-B (June 17, 2025, 14:26)
@krzysz00 (Contributor) left a comment:
I can't see anything wrong here, so weak approve from me (I'd give it a bit of time to see if other folks have thoughts).

@nikic (Contributor) commented Jun 17, 2025:

> This looks plausible, but I also feel like this is something we should be doing in InstCombine? (That is, trying to combine multiple constant offset GEPs into one, even if there are intermediate GEPs, as long as use-constraints allow.)

> As things are currently divided, I think it makes a lot of sense to keep this logic here. This pass isn't run by most targets, and its behavior is very dependent on TTI.isLegalAddressingMode. Since we assume that the constant offsets we're generating will be used in the addressing of some memory instruction, we're also less concerned about use constraints than we would be in InstCombine. While I agree there are likely some limited ways to do similar transformations in IC, here I think it's safe to be a bit more aggressive and rely on target-specific information.

I don't disagree. My general thinking here was more along the lines that we should try to canonicalize this as far as we can in InstCombine, and have SeparateConstOffsetFromGEP pick up the parts that InstCombine can't do (such as use constraints, as you mention).

We're moving towards splitting up GEPs in InstCombine (see #137297 for a first step), which will increase our ability to make these kinds of transforms at that level.
