Skip to content

[llvm] Fix __builtin_object_size interaction between Negative Offset … #111827

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Nov 2, 2024

Conversation

serge-sans-paille
Copy link
Collaborator

…and Select/Phi

When picking a SizeOffsetAPInt through combineSizeOffset, the behavior differs if we're going to apply a constant offset that's positive or negative: If it's positive, then we need to compare the remaining bytes (i.e. Size

  • Offset), but if it's negative, we need to compare the preceding bytes (i.e. Offset).

Fix #111709

@llvmbot llvmbot added llvm:analysis Includes value tracking, cost tables and constant folding llvm:transforms labels Oct 10, 2024
@llvmbot
Copy link
Member

llvmbot commented Oct 10, 2024

@llvm/pr-subscribers-llvm-analysis

@llvm/pr-subscribers-llvm-transforms

Author: None (serge-sans-paille)

Changes

…and Select/Phi

When picking a SizeOffsetAPInt through combineSizeOffset, the behavior differs if we're going to apply a constant offset that's positive or negative: If it's positive, then we need to compare the remaining bytes (i.e. Size

  • Offset), but if it's negative, we need to compare the preceding bytes (i.e. Offset).

Fix #111709


Full diff: https://github.com/llvm/llvm-project/pull/111827.diff

3 Files Affected:

  • (modified) llvm/include/llvm/Analysis/MemoryBuiltins.h (+2)
  • (modified) llvm/lib/Analysis/MemoryBuiltins.cpp (+45-13)
  • (modified) llvm/test/Transforms/LowerConstantIntrinsics/builtin-object-size-phi.ll (+107)
diff --git a/llvm/include/llvm/Analysis/MemoryBuiltins.h b/llvm/include/llvm/Analysis/MemoryBuiltins.h
index 7b48844cc9e8e9..01c642d4f48abd 100644
--- a/llvm/include/llvm/Analysis/MemoryBuiltins.h
+++ b/llvm/include/llvm/Analysis/MemoryBuiltins.h
@@ -160,6 +160,7 @@ struct ObjectSizeOpts {
   /// though they can't be evaluated. Otherwise, null is always considered to
   /// point to a 0 byte region of memory.
   bool NullIsUnknownSize = false;
+
   /// If set, used for more accurate evaluation
   AAResults *AA = nullptr;
 };
@@ -230,6 +231,7 @@ class ObjectSizeOffsetVisitor
   ObjectSizeOpts Options;
   unsigned IntTyBits;
   APInt Zero;
+  APInt ConstantOffset;
   SmallDenseMap<Instruction *, SizeOffsetAPInt, 8> SeenInsts;
   unsigned InstructionsVisited;
 
diff --git a/llvm/lib/Analysis/MemoryBuiltins.cpp b/llvm/lib/Analysis/MemoryBuiltins.cpp
index e1abf5e4d885ec..eb6e139a6b9d6a 100644
--- a/llvm/lib/Analysis/MemoryBuiltins.cpp
+++ b/llvm/lib/Analysis/MemoryBuiltins.cpp
@@ -570,6 +570,14 @@ static APInt getSizeWithOverflow(const SizeOffsetAPInt &Data) {
   return Size - Offset;
 }
 
+static APInt getOffsetWithOverflow(const SizeOffsetAPInt &Data) {
+  APInt Size = Data.Size;
+  APInt Offset = Data.Offset;
+  if (Offset.isNegative())
+    return APInt(Size.getBitWidth(), 0);
+  return Offset;
+}
+
 /// Compute the size of the object pointed by Ptr. Returns true and the
 /// object size in Size if successful, and false otherwise.
 /// If RoundToAlign is true, then Size is rounded up to the alignment of
@@ -697,7 +705,8 @@ SizeOffsetAPInt ObjectSizeOffsetVisitor::computeImpl(Value *V) {
   // the index type size and if we stripped address space casts we have to
   // readjust the APInt as we pass it upwards in order for the APInt to match
   // the type the caller passed in.
-  APInt Offset(InitialIntTyBits, 0);
+
+  APInt Offset = APInt{InitialIntTyBits, 0};
   V = V->stripAndAccumulateConstantOffsets(
       DL, Offset, /* AllowNonInbounds */ true, /* AllowInvariantGroup */ true);
 
@@ -706,7 +715,9 @@ SizeOffsetAPInt ObjectSizeOffsetVisitor::computeImpl(Value *V) {
   IntTyBits = DL.getIndexTypeSizeInBits(V->getType());
   Zero = APInt::getZero(IntTyBits);
 
+  std::swap(Offset, ConstantOffset);
   SizeOffsetAPInt SOT = computeValue(V);
+  std::swap(Offset, ConstantOffset);
 
   bool IndexTypeSizeChanged = InitialIntTyBits != IntTyBits;
   if (!IndexTypeSizeChanged && Offset.isZero())
@@ -981,18 +992,39 @@ ObjectSizeOffsetVisitor::combineSizeOffset(SizeOffsetAPInt LHS,
                                            SizeOffsetAPInt RHS) {
   if (!LHS.bothKnown() || !RHS.bothKnown())
     return ObjectSizeOffsetVisitor::unknown();
-
-  switch (Options.EvalMode) {
-  case ObjectSizeOpts::Mode::Min:
-    return (getSizeWithOverflow(LHS).slt(getSizeWithOverflow(RHS))) ? LHS : RHS;
-  case ObjectSizeOpts::Mode::Max:
-    return (getSizeWithOverflow(LHS).sgt(getSizeWithOverflow(RHS))) ? LHS : RHS;
-  case ObjectSizeOpts::Mode::ExactSizeFromOffset:
-    return (getSizeWithOverflow(LHS).eq(getSizeWithOverflow(RHS)))
-               ? LHS
-               : ObjectSizeOffsetVisitor::unknown();
-  case ObjectSizeOpts::Mode::ExactUnderlyingSizeAndOffset:
-    return LHS == RHS ? LHS : ObjectSizeOffsetVisitor::unknown();
+  // If the ConstantOffset we add in the end is negative, then we're actually
+  // interested in selecting the nodes based on their offset rather than their
+  // size.
+  if (ConstantOffset.isNegative()) {
+    switch (Options.EvalMode) {
+    case ObjectSizeOpts::Mode::Min:
+      return (getOffsetWithOverflow(LHS).slt(getOffsetWithOverflow(RHS))) ? LHS
+                                                                          : RHS;
+    case ObjectSizeOpts::Mode::Max:
+      return (getOffsetWithOverflow(LHS).sgt(getOffsetWithOverflow(RHS))) ? LHS
+                                                                          : RHS;
+    case ObjectSizeOpts::Mode::ExactSizeFromOffset:
+      return (getOffsetWithOverflow(LHS).eq(getOffsetWithOverflow(RHS)))
+                 ? LHS
+                 : ObjectSizeOffsetVisitor::unknown();
+    case ObjectSizeOpts::Mode::ExactUnderlyingSizeAndOffset:
+      return LHS == RHS ? LHS : ObjectSizeOffsetVisitor::unknown();
+    }
+  } else {
+    switch (Options.EvalMode) {
+    case ObjectSizeOpts::Mode::Min:
+      return (getSizeWithOverflow(LHS).slt(getSizeWithOverflow(RHS))) ? LHS
+                                                                      : RHS;
+    case ObjectSizeOpts::Mode::Max:
+      return (getSizeWithOverflow(LHS).sgt(getSizeWithOverflow(RHS))) ? LHS
+                                                                      : RHS;
+    case ObjectSizeOpts::Mode::ExactSizeFromOffset:
+      return (getSizeWithOverflow(LHS).eq(getSizeWithOverflow(RHS)))
+                 ? LHS
+                 : ObjectSizeOffsetVisitor::unknown();
+    case ObjectSizeOpts::Mode::ExactUnderlyingSizeAndOffset:
+      return LHS == RHS ? LHS : ObjectSizeOffsetVisitor::unknown();
+    }
   }
   llvm_unreachable("missing an eval mode");
 }
diff --git a/llvm/test/Transforms/LowerConstantIntrinsics/builtin-object-size-phi.ll b/llvm/test/Transforms/LowerConstantIntrinsics/builtin-object-size-phi.ll
index 4f4d6a88e1693b..27cbc391d52c3a 100644
--- a/llvm/test/Transforms/LowerConstantIntrinsics/builtin-object-size-phi.ll
+++ b/llvm/test/Transforms/LowerConstantIntrinsics/builtin-object-size-phi.ll
@@ -117,3 +117,110 @@ if.end:
   %size = call i64 @llvm.objectsize.i64.p0(ptr %p, i1 true, i1 true, i1 false)
   ret i64 %size
 }
+
+define i64 @pick_negative_offset(i32 %n) {
+; CHECK-LABEL: @pick_negative_offset(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[BUFFER0:%.*]] = alloca i8, i64 20, align 1
+; CHECK-NEXT:    [[OFFSETED0:%.*]] = getelementptr i8, ptr [[BUFFER0]], i64 20
+; CHECK-NEXT:    [[COND:%.*]] = icmp eq i32 [[N:%.*]], 0
+; CHECK-NEXT:    br i1 [[COND]], label [[IF_ELSE:%.*]], label [[IF_END:%.*]]
+; CHECK:       if.else:
+; CHECK-NEXT:    [[BUFFER1:%.*]] = alloca i8, i64 20, align 1
+; CHECK-NEXT:    [[OFFSETED1:%.*]] = getelementptr i8, ptr [[BUFFER1]], i64 20
+; CHECK-NEXT:    br label [[IF_END]]
+; CHECK:       if.end:
+; CHECK-NEXT:    [[P:%.*]] = phi ptr [ [[OFFSETED1]], [[IF_ELSE]] ], [ [[OFFSETED0]], [[ENTRY:%.*]] ]
+; CHECK-NEXT:    [[POFFSETED:%.*]] = getelementptr i8, ptr [[P]], i64 -4
+; CHECK-NEXT:    ret i64 4
+;
+entry:
+  %buffer0 = alloca i8, i64 20
+  %offseted0 = getelementptr i8, ptr %buffer0, i64 20
+  %cond = icmp eq i32 %n, 0
+  br i1 %cond, label %if.else, label %if.end
+
+if.else:
+  %buffer1 = alloca i8, i64 20
+  %offseted1 = getelementptr i8, ptr %buffer1, i64 20
+  br label %if.end
+
+if.end:
+  %p = phi ptr [ %offseted1, %if.else ], [ %offseted0, %entry ]
+  %poffseted = getelementptr i8, ptr %p, i64 -4
+  %size = call i64 @llvm.objectsize.i64.p0(ptr %poffseted, i1 false, i1 false, i1 false)
+  ret i64 %size
+}
+
+define i64 @pick_negative_offset_with_nullptr(i32 %n) {
+; CHECK-LABEL: @pick_negative_offset_with_nullptr(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[BUFFER0:%.*]] = alloca i8, i64 20, align 1
+; CHECK-NEXT:    [[OFFSETED0:%.*]] = getelementptr i8, ptr [[BUFFER0]], i64 20
+; CHECK-NEXT:    [[COND:%.*]] = icmp eq i32 [[N:%.*]], 0
+; CHECK-NEXT:    br i1 [[COND]], label [[IF_ELSE:%.*]], label [[IF_END:%.*]]
+; CHECK:       if.else:
+; CHECK-NEXT:    br label [[IF_END]]
+; CHECK:       if.end:
+; CHECK-NEXT:    [[P0:%.*]] = phi ptr [ [[OFFSETED0]], [[ENTRY:%.*]] ], [ null, [[IF_ELSE]] ]
+; CHECK-NEXT:    [[P1:%.*]] = phi ptr [ null, [[IF_ELSE]] ], [ [[OFFSETED0]], [[ENTRY]] ]
+; CHECK-NEXT:    [[P0OFFSETED:%.*]] = getelementptr i8, ptr [[P0]], i64 -4
+; CHECK-NEXT:    [[P1OFFSETED:%.*]] = getelementptr i8, ptr [[P1]], i64 -4
+; CHECK-NEXT:    [[SIZE:%.*]] = select i1 [[COND]], i64 0, i64 4
+; CHECK-NEXT:    ret i64 [[SIZE]]
+;
+entry:
+  %buffer0 = alloca i8, i64 20
+  %offseted0 = getelementptr i8, ptr %buffer0, i64 20
+  %cond = icmp eq i32 %n, 0
+  br i1 %cond, label %if.else, label %if.end
+
+if.else:
+  br label %if.end
+
+if.end:
+  %p0 = phi ptr [ %offseted0, %entry ], [ null, %if.else ]
+  %p1 = phi ptr [ null, %if.else ], [ %offseted0, %entry ]
+  %p0offseted = getelementptr i8, ptr %p0, i64 -4
+  %p1offseted = getelementptr i8, ptr %p1, i64 -4
+  %size0 = call i64 @llvm.objectsize.i64.p0(ptr %p0offseted, i1 false, i1 false, i1 false)
+  %size1 = call i64 @llvm.objectsize.i64.p0(ptr %p1offseted, i1 false, i1 false, i1 false)
+  %size = select i1 %cond, i64 %size0, i64 %size1
+  ret i64 %size
+}
+
+define i64 @pick_negative_offset_with_unsized_nullptr(i32 %n) {
+; CHECK-LABEL: @pick_negative_offset_with_unsized_nullptr(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[BUFFER0:%.*]] = alloca i8, i64 20, align 1
+; CHECK-NEXT:    [[OFFSETED0:%.*]] = getelementptr i8, ptr [[BUFFER0]], i64 20
+; CHECK-NEXT:    [[COND:%.*]] = icmp eq i32 [[N:%.*]], 0
+; CHECK-NEXT:    br i1 [[COND]], label [[IF_ELSE:%.*]], label [[IF_END:%.*]]
+; CHECK:       if.else:
+; CHECK-NEXT:    br label [[IF_END]]
+; CHECK:       if.end:
+; CHECK-NEXT:    [[P0:%.*]] = phi ptr [ [[OFFSETED0]], [[ENTRY:%.*]] ], [ null, [[IF_ELSE]] ]
+; CHECK-NEXT:    [[P1:%.*]] = phi ptr [ null, [[IF_ELSE]] ], [ [[OFFSETED0]], [[ENTRY]] ]
+; CHECK-NEXT:    [[P0OFFSETED:%.*]] = getelementptr i8, ptr [[P0]], i64 -4
+; CHECK-NEXT:    [[P1OFFSETED:%.*]] = getelementptr i8, ptr [[P1]], i64 -4
+; CHECK-NEXT:    ret i64 -1
+;
+entry:
+  %buffer0 = alloca i8, i64 20
+  %offseted0 = getelementptr i8, ptr %buffer0, i64 20
+  %cond = icmp eq i32 %n, 0
+  br i1 %cond, label %if.else, label %if.end
+
+if.else:
+  br label %if.end
+
+if.end:
+  %p0 = phi ptr [ %offseted0, %entry ], [ null, %if.else ]
+  %p1 = phi ptr [ null, %if.else ], [ %offseted0, %entry ]
+  %p0offseted = getelementptr i8, ptr %p0, i64 -4
+  %p1offseted = getelementptr i8, ptr %p1, i64 -4
+  %size0 = call i64 @llvm.objectsize.i64.p0(ptr %p0offseted, i1 false, i1 true, i1 false)
+  %size1 = call i64 @llvm.objectsize.i64.p0(ptr %p1offseted, i1 false, i1 true, i1 false)
+  %size = select i1 %cond, i64 %size0, i64 %size1
+  ret i64 %size
+}

return LHS == RHS ? LHS : ObjectSizeOffsetVisitor::unknown();
// If the ConstantOffset we add in the end is negative, then we're actually
// interested in selecting the nodes based on their offset rather than their
// size.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we? I cannot really understand how this logic works, sorry. When Options.EvalMode==ObjectSizeOpts::Mode::ExactSizeFromOffset, that specifies that we should combine LHS and RHS when the remaining sizes of LHS and RHS are the same, that is how it is documented, and it feels risky to add a field to ObjectSizeOffsetVisitor that makes it so that this function no longer behaves as documented.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that if we end up applying a negative offset, we actually care about the space before the object, and not after :-/

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That being said, I do see a way for us to keep the mentioned behavior, let me try a few things.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But even with a negative offset, __builtin_object_size returns the size that is available after applying that negative offset, so I think we still care about the size as well.

@hvdijk
Copy link
Contributor

hvdijk commented Oct 10, 2024

If I apply this PR locally, although I no longer see the UBSan error, I still get a number of differences in __builtin_object_size between Clang and GCC when I change my test to print __builtin_object_size's result directly. I am not sure what the status of __builtin_object_size in Clang is, are we meant to return the same results as GCC?

@serge-sans-paille
Copy link
Collaborator Author

If I apply this PR locally, although I no longer see the UBSan error, I still get a number of differences in __builtin_object_size between Clang and GCC when I change my test to print __builtin_object_size's result directly. I am not sure what the status of __builtin_object_size in Clang is, are we meant to return the same results as GCC?

We're allowed to have a different behavior than GCC if we produce accurate result instead of -1/0. Otherwise that's probably an error. Would you mind sharing the differences?

@hvdijk
Copy link
Contributor

hvdijk commented Oct 10, 2024

We're allowed to have a different behavior than GCC if we produce accurate result instead of -1/0. Otherwise that's probably an error. Would you mind sharing the differences?

Sure. For the avoidance of confusion, in general this is not specifically about negative offsets, but here is an example where we produced a different result to GCC, we still produce a different result to GCC with your PR, but no longer the same result as before.

#include <stdio.h>
int x;
int main(void) {
  int array1[4] = {0};
  int array2[4] = {0};
  int *ptr;
  if (x) {
    ptr = array1 + 2;
  } else {
    ptr = array2 + 3;
  }
  printf("%zu\n", __builtin_object_size(ptr - 1, 0));
}

GCC prints 16. Clang used to print 12. Clang with your PR prints 8.

I am not sure which result is correct, I initially thought it was GCC's, but my thinking was wrong so I do not know if my conclusion was right.

@serge-sans-paille
Copy link
Collaborator Author

I've fixed my PR, it now respects the semantic as you hinted, and I had a small bug that I got rid of.
It now prints 12 in the example you just shared. I do think it's the correct result, you can replace the ptr - 1 by ptr in your example and see that gcc and clang agree on the result (8 on my laptop, which is indeed 12 - sizeof(int))

@hvdijk
Copy link
Contributor

hvdijk commented Oct 10, 2024

Yeah, I think you're right that 12 is the correct result in that test. Your updated PR looks good to me at a quick glance, thanks! I'll do some additional testing tomorrow.

@hvdijk
Copy link
Contributor

hvdijk commented Oct 11, 2024

The PR makes it so that ObjectSizeOffsetVisitor::visitPHINode can return different results for the same PHI node when called with a different ConstantOffset, but that seems like it would interfere with the caching done in ObjectSizeOffsetEvaluator::compute_. I do not have a test case that shows wrong behaviour because of that though.

Here is another test demonstrating the PHI handling is not quite right yet, this one not related to negative offsets:

#include <stdio.h>
int x;
int main(void) {
  int array1[4] = {0};
  int array2[4] = {0};
  int *ptr;
  if (x) {
    ptr = array1;
  } else {
    ptr = array2;
  }
  printf("%zu\n", __builtin_object_size(ptr, 3));
}

Regardless of whether it's array1 or array2, 16 bytes are available, so that is what I would expect to be printed.

@serge-sans-paille
Copy link
Collaborator Author

#include <stdio.h>
int x;
int main(void) {
  int array1[4] = {0};
  int array2[4] = {0};
  int *ptr;
  if (x) {
    ptr = array1;
  } else {
    ptr = array2;
  }
  printf("%zu\n", __builtin_object_size(ptr, 3));
}

I'll have a look to that caching issue.

For the other example, LLVM doesn't seem to support Type=3, see

https://github.com/llvm/llvm-project/blob/main/clang/lib/CodeGen/CGBuiltin.cpp#L1122

@serge-sans-paille
Copy link
Collaborator Author

The PR makes it so that ObjectSizeOffsetVisitor::visitPHINode can return different results for the same PHI node when called with a different ConstantOffset, but that seems like it would interfere with the caching done in ObjectSizeOffsetEvaluator::compute_. I do not have a test case that shows wrong behaviour because of that though.

The cache belongs to the ObjectSizeOffsetEvaluator while ConstantOffset belongs to ObjectSizeOffsetVisitor. I don't see any state shared between the two so I don't see the issue (but it was worth double checking, thanks for having me do that).

@hvdijk
Copy link
Contributor

hvdijk commented Oct 12, 2024

Thanks for pointing me to Type = 3 just not being generally supported, and I agree with your explanation of why the caching should be a non-issue.

Unfortunately, more testing reveals that since this only works when the constant offset is known, it does not yet handle all cases, there are cases where the constant offset is not visible. With a modification to my test, I still get a false UBSan positive.

int x;
int main(void) {
  int array[4] = {0};
  int *ptr;
  if (x) {
    ptr = 0;
  } else {
    ptr = array + 2;
  }
  if (x) {
    ptr = 0;
  } else {
    ptr = ptr + 2;
  }
  return ptr[-1];
}

@serge-sans-paille
Copy link
Collaborator Author

Thanks for this test case! Bug fixed and test case added to the test suite.

@@ -706,7 +715,11 @@ SizeOffsetAPInt ObjectSizeOffsetVisitor::computeImpl(Value *V) {
IntTyBits = DL.getIndexTypeSizeInBits(V->getType());
Zero = APInt::getZero(IntTyBits);

APInt PrevConstantOffset = ConstantOffset;

ConstantOffset = ConstantOffset.sextOrTrunc(Offset.getBitWidth()) + Offset;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it safe to possibly truncate here? The below bit with IndexTypeSizeChanged tries to very carefully handle the possibility that pointers (and hence indices on those pointers) have different sizes, it's a bit hard to see how exactly that should be handled here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've moved to a call to CheckedZextOrTrunc which bails out in case of failure.

@hvdijk
Copy link
Contributor

hvdijk commented Oct 16, 2024

I'm still a bit wary: since this relies on the offset being known, since we get incorrect results if the offset isn't known, it suggests to me that if we ever end up in ObjectSizeOffsetEvaluator::visitGEPOperator for a non-constant offset, where that then calls compute_(GEP.getPointerOperand()) where compute_ first tries to get a result via ObjectSizeOffsetVisitor, that ObjectSizeOffsetVisitor could still return an incorrect result. I cannot come up with a test case where this actually fails though.

At the same time, this PR looks to me like it makes things strictly better: if there are cases that get mishandled, they would already get mishandled even in current LLVM. So even if we end up remaining unsure that this fix is complete, it may still be fine to merge.

I would appreciate it if someone more familiar with this code could also take a look, but if no one does in a reasonable time, I think merging this should be okay, thanks.

@serge-sans-paille
Copy link
Collaborator Author

I'm still a bit wary: since this relies on the offset being known, since we get incorrect results if the offset isn't known

Actually if we report an offset as unknown, __builtin_object_size returns an error code (either 0 or -1 depending on its arguments) and the caller can handle that gracefully, so it's not an error.

Error occur when we report an incorrect size, which is what happened in your original code, so I think we're not regressing in any way.

@nikic nikic requested a review from efriedma-quic October 16, 2024 20:05
@hvdijk
Copy link
Contributor

hvdijk commented Oct 16, 2024

Actually if we report an offset as unknown, __builtin_object_size returns an error code (either 0 or -1 depending on its arguments) and the caller can handle that gracefully, so it's not an error.

That is true for __builtin_object_size, but ObjectSizeOffsetEvaluator::visitGEPOperator is not used for that, it is used for __builtin_dynamic_object_size (as well as some other -fsanitize options) which does handle unknown offsets. That is the one where I worry there may still be cases not caught.

@hvdijk
Copy link
Contributor

hvdijk commented Oct 17, 2024

And now I do have a testcase:

#include <stdio.h>
int x, i = -1;
int main() {
  int array1[4];
  int array2[8];
  int *ptr;
  if (x) {
    ptr = array1;
  } else {
    ptr = array2 + 4;
  }
  printf("%zu\n", __builtin_dynamic_object_size(ptr + i, 0));
}

This prints 0, but must print 20 (5 * sizeof(int)) or higher. It is the same problem: the PHI for ptr has incoming values that resolve to (size 16, offset 0) and (size 32, offset 16), and since both specify 16 remaining bytes, they are resolved to the former. And then, __builtin_dynamic_object_size concludes that i = -1 is out of range as an index and therefore 0 bytes are available, but that conclusion is wrong.

@serge-sans-paille
Copy link
Collaborator Author

% cat /tmp/a.c
#include <stdio.h>
int x, i = -1;
int main() {
  int array1[4];
  int array2[8];
  int *ptr;
  if (x) {
    ptr = array1;
  } else {
    ptr = array2 + 4;
  }
  printf("%zd\n", __builtin_dynamic_object_size(ptr + i, 0));
}
% ./bin/clang /tmp/a.c -o /tmp/a && /tmp/a
-1

which is totally fine (I've just changed the printf call to output signed value instead of unsigned in your example)

@hvdijk
Copy link
Contributor

hvdijk commented Oct 18, 2024

The return value is unsigned and printing it as signed makes it trickier to explain what values are correct: instead of 20 or greater, it's 20 or greater, or negative, where the reason negative numbers are okay is because they aren't actually negative. But anyway, that is not the reason you are getting a different output, it is because you have not enabled any optimisations. The previous test cases have been with optimisations enabled, this one is too. At any optimisation level (-Og, -O1, -O2, -O3, -Os, -Oz) it doesn't print -1, it prints 0. -1 is a valid output, 0 is not.

@serge-sans-paille
Copy link
Collaborator Author

Thanks! Indeed I can reproduce (and the problem seems to predate my change!) and I also have a patch. I'll test it and add it to this patchset.

@hvdijk
Copy link
Contributor

hvdijk commented Oct 19, 2024

Sorry, but that looks like a very wrong approach to me. The code you're modifying in your last commit was already correct, and this new version means we no longer detect accesses before the start of an object as out of bounds.

#include <stdio.h>
int i = -1;
int main() {
  int array[4];
  printf("%zu\n", __builtin_dynamic_object_size(array + i, 0));
}

Here, we used to print 0, and 0 is the best possible value to print. Your update makes it so that we print 20.

When you've got an index that you know is before the start of an object, you know it's invalid to access any bytes, that's what the code there was correctly handling. The problem isn't that check, it's that the PHI handling wrongly concludes that we know we're at the start of an object when we're not, and that is the bit that needs to be fixed.

@hvdijk
Copy link
Contributor

hvdijk commented Oct 20, 2024

Which makes me think that in both cases, if we have the choice between two object with the same size + offset value, then choosing the one with the larger offset solves the issue. I'll check that.

That works for this test, though for __builtin_dynamic_object_size(ptr + index, 2) it might cause a too large value to be returned.

I think we need to, at least in the __builtin_dynamic_object_size case, use ObjectSizeOpts::Mode::ExactUnderlyingSizeAndOffset in which case we can return that we do not statically know the size of the object. That way, we end up in ObjectSizeOffsetEvaluator::visitPHINode and can handle each incoming value separately and return the correct value.

@serge-sans-paille serge-sans-paille force-pushed the fix/bos-select branch 2 times, most recently from ca870da to 8d1ade6 Compare October 21, 2024 11:54
@serge-sans-paille
Copy link
Collaborator Author

Just pushed a variant that makes a more educated pick between the two combined offset in case of equality. It seems to me it passes all the test you've been crafting so far ^^!

@hvdijk
Copy link
Contributor

hvdijk commented Oct 21, 2024

The problem is not limited to the case where remaining object sizes are equal though, it's just that those are easiest to come up with test cases for. Consider:

#include <stdio.h>
int x;
int main(void) {
  int array1[4];
  int array2[8];
  int *ptr;
  if (x) {
    printf("hello\n");
    ptr = array1 + 1;
  } else {
    ptr = array2 + 6;
  }
  printf("%zu\n", __builtin_object_size(ptr - 2, 0));
}

This must print 16 or higher, but prints 0 (when optimisations are enabled). During InstCombine __builtin_object_size cannot be resolved yet, so instead it is saved for LowerConstantIntrinsics. During LowerConstantIntrinsics, it evaluates with ObjectSizeOpts::Mode::Max, sees incoming PHI values {16, 4} and {32, 24}. The former says 12 bytes are available, the latter 8, so it picks the former. Then it applies the negative offset of -8 bytes to it, and because it thinks it knows it's 4 bytes past the start of an object, it wrongly concludes the pointer arithmetic produces an out of bounds value.

@serge-sans-paille
Copy link
Collaborator Author

The problem is not limited to the case where remaining object sizes are equal though, it's just that those are easiest to come up with test cases for. Consider:

#include <stdio.h>
int x;
int main(void) {
  int array1[4];
  int array2[8];
  int *ptr;
  if (x) {
    printf("hello\n");
    ptr = array1 + 1;
  } else {
    ptr = array2 + 6;
  }
  printf("%zu\n", __builtin_object_size(ptr - 2, 0));
}

This must print 16 or higher, but prints 0 (when optimisations are enabled). During InstCombine __builtin_object_size cannot be resolved yet, so instead it is saved for LowerConstantIntrinsics. During LowerConstantIntrinsics, it evaluates with ObjectSizeOpts::Mode::Max, sees incoming PHI values {16, 4} and {32, 24}. The former says 12 bytes are available, the latter 8, so it picks the former. Then it applies the negative offset of -8 bytes to it, and because it thinks it knows it's 4 bytes past the start of an object, it wrongly concludes the pointer arithmetic produces an out of bounds value.

Indeed. Looks like my previous approach which propagates the constant (negative) offset would have worked in that situation. But it couldn't work with a dynamic negative offset!
Looks like we should be able to pick different bounds based on the offset sign, I'll explore that path.

In any case, thanks a lot for your patience regarding this issue. 🙇

@hvdijk
Copy link
Contributor

hvdijk commented Oct 25, 2024

Indeed. Looks like my previous approach which propagates the constant (negative) offset would have worked in that situation. But it couldn't work with a dynamic negative offset!

You're right, that would handle that one. Can I suggest an alternative though?

Right now, we act on SizeOffsetAPInt objects which contain a size (optional) and offset (optional), and the problem with the PHI handling is that we intentionally risk returning an incorrect offset and an incorrect size, on the assumption that we know how the result will be used and the two errors will cancel out. It turns out they do not cancel out in all cases because we do not have a full understanding of how they are used.

What if, instead, we track three APInts? What if we track TotalSize, Offset, RemainingSize?

For my original test, we have PHIs where the incoming values resolve to SizeOffsetAPInts of {0, 0} and {16, 16}. With my idea, they would instead resolve to {0, 0, 0} and {16, 16, 0}. This can then be correctly combined into {unknown, unknown, 0}, which gives us enough information to say that if a subsequently applied offset is negative, we cannot draw any conclusions. At the same time, this fully preserves the RemainingSize to allow the currently intended optimisations, which assume positive offsets, to continue to be performed: for any positive offset, only RemainingSize needs to be looked at.

It does not quite handle all the cases that your ConstantOffset approach handles, but it would, I believe, be easier to understand and reason to be correct, it would be easier to assure ourselves we are not missing any cases.

@serge-sans-paille
Copy link
Collaborator Author

@hvdijk I went from your idea, but using only two values: one to track the amount of memory available before current point, and one to track the amount of memory available after current point. It is indeed easy to reason about, as in your proposal, and behaves in a very elegant manner wrt. the combine operation.
And it passes all our examples \o/
Please let me know what you think of this.

@hvdijk
Copy link
Contributor

hvdijk commented Oct 25, 2024

Nice work! Using a new class rather than extending SizeOffsetAPInt is probably good, it helps maintain compatibility, which is a plus if it means we can also backport this to LLVM 19 to fix the wrong UBSAN errors there.

The change to llvm/test/Transforms/InstCombine/builtin-object-size-offset.ll should not be needed. With -passes=instcombine, it is supposed to check for the size to be known exactly. Here, the size was previously known exactly and it is a regression that we no longer pick that up. The reason this is failing is because in ObjectSizeOffsetVisitor::compute, you have if (!Span.bothKnown()) return {};, but if EvalMode != ObjectSizeOpts::Mode::ExactUnderlyingSizeAndOffset, that is not the best way of handling it. For ExactSizeFromOffset specifically, the idea is that Before does not matter to the caller, you can construct the result from After alone. By requiring both Before and After to be known, you are effectively making it so that ExactSizeFromOffset and ExactUnderlyingSizeAndOffset do the same thing, which they should not.

However, if ExactSizeFromOffset is restored to handle that case, it then becomes important to ensure that nothing calls ObjectSizeOffsetVisitor::compute with Options.EvalMode = ObjectSizeOpts::Mode::ExactSizeFromOffset if actually it does care about the offset. Yet ObjectSizeOffsetEvaluator::compute_ does exactly that. Fixing that is possible by making sure ObjectSizeOffsetVisitor has some public compute version that returns OffsetSpan, and making ObjectSizeOffsetEvaluator::compute_ use that as well. Alternatively, it may be possible to get ObjectSizeOffsetEvaluator::compute_ to use a different EvalMode, depending on how exactly ObjectSizeOffsetVisitor::compute gets changed.


P.S.: The reason why I suggested keeping TotalSize is that this case can give an upper bound when we know we are indexing into a specific object, but do not know the index (e.g. __builtin_object_size(array + i, 0) where i is only known at runtime can still be bound by sizeof array). I thought we already handled that, but we don't, so it is fine that your PR leaves that unhandled.

I will do more extensive testing later.

@serge-sans-paille
Copy link
Collaborator Author

patch updated with your suggestion to adjust the 'before' field when it actually doesn't matter. Works like a charm!

@hvdijk
Copy link
Contributor

hvdijk commented Oct 26, 2024

I did write in my previous comment:

However, if ExactSizeFromOffset is restored to handle that case, it then becomes important to ensure that nothing calls ObjectSizeOffsetVisitor::compute with Options.EvalMode = ObjectSizeOpts::Mode::ExactSizeFromOffset if actually it does care about the offset. Yet ObjectSizeOffsetEvaluator::compute_ does exactly that.

The result is that this latest iteration does not handle my test case from #111827 (comment): that now prints 0 again, but should not.

The possible approach I mentioned in my previous comment to use a different EvalMode is simple to do:

--- a/llvm/lib/Analysis/MemoryBuiltins.cpp
+++ b/llvm/lib/Analysis/MemoryBuiltins.cpp
@@ -1083,7 +1083,9 @@ SizeOffsetValue ObjectSizeOffsetEvaluator::compute(Value *V) {
 }
 
 SizeOffsetValue ObjectSizeOffsetEvaluator::compute_(Value *V) {
-  ObjectSizeOffsetVisitor Visitor(DL, TLI, Context, EvalOpts);
+  ObjectSizeOpts VisitorEvalOpts(EvalOpts);
+  VisitorEvalOpts.EvalMode = ObjectSizeOpts::Mode::ExactUnderlyingSizeAndOffset;
+  ObjectSizeOffsetVisitor Visitor(DL, TLI, Context, VisitorEvalOpts);
   SizeOffsetAPInt Const = Visitor.compute(V);
   if (Const.bothKnown())
     return SizeOffsetValue(ConstantInt::get(Context, Const.Size),

This does also need builtin-object-size-phi.ll to be updated, but I think the effect that it has on that is correct, I think that adequately tests that the bug is fixed.

@serge-sans-paille
Copy link
Collaborator Author

I've applied your patch with a few extra comments in the diff and in the commit.

Copy link
Contributor

@hvdijk hvdijk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, at this stage, I believe your PR is correct and fixes all the bugs! In one case, it results in slightly worse results (but still valid results) though, I've left a comment on the specific bit of code that causes this, please take a look and see if you agree.

@serge-sans-paille serge-sans-paille force-pushed the fix/bos-select branch 3 times, most recently from 4b5c836 to 9b2807e Compare October 28, 2024 20:38
Copy link
Contributor

@hvdijk hvdijk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am happy with this version, I think it is correct, thank you for the hard work! Please leave a little time before merging to give others time to review if they want, but I will mark it as approved from my end.

…and Select/Phi

Change ObjectSizeOffsetVisitor internal data structure to represent the
allocated memory space from (size, offset) to (before, after), i.e.
memory available before and after the considered point.

This is an internal change that doesn't impact current interface that
still returns (Size, Offset), as we can switch between one
representation to another, but when picking the Span before/after
representation, we can make more fine-grain decision when computing
max/min.

Also change ObjectSizeOffsetEvaluator to always call
ObjectSizeOffsetVisitor in Mode::ExactUnderlyingSizeAndOffset to
correctly handle negative dynamic offset that are incompatible with
Mode::ExactSizeFromOffset.

Fix llvm#111709 (and the many test cases which appeared in that thread).
@serge-sans-paille
Copy link
Collaborator Author

(rebased on main, I'll merge afterward)

@serge-sans-paille serge-sans-paille merged commit 01a103b into llvm:main Nov 2, 2024
8 checks passed
smallp-o-p pushed a commit to smallp-o-p/llvm-project that referenced this pull request Nov 3, 2024
llvm#111827)

…and Select/Phi

When picking a SizeOffsetAPInt through combineSizeOffset, the behavior
differs if we're going to apply a constant offset that's positive or
negative: If it's positive, then we need to compare the remaining bytes
(i.e. Size
- Offset), but if it's negative, we need to compare the preceding bytes
(i.e. Offset).

Fix llvm#111709
NoumanAmir657 pushed a commit to NoumanAmir657/llvm-project that referenced this pull request Nov 4, 2024
llvm#111827)

…and Select/Phi

When picking a SizeOffsetAPInt through combineSizeOffset, the behavior
differs if we're going to apply a constant offset that's positive or
negative: If it's positive, then we need to compare the remaining bytes
(i.e. Size
- Offset), but if it's negative, we need to compare the preceding bytes
(i.e. Offset).

Fix llvm#111709
hvdijk pushed a commit to llvmbot/llvm-project that referenced this pull request Nov 12, 2024
llvm#111827)

…and Select/Phi

When picking a SizeOffsetAPInt through combineSizeOffset, the behavior
differs if we're going to apply a constant offset that's positive or
negative: If it's positive, then we need to compare the remaining bytes
(i.e. Size
- Offset), but if it's negative, we need to compare the preceding bytes
(i.e. Offset).

Fix llvm#111709

(cherry picked from commit 01a103b)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
llvm:analysis Includes value tracking, cost tables and constant folding llvm:transforms
Projects
None yet
Development

Successfully merging this pull request may close these issues.

UBSan false positive: runtime error: load of address with insufficient space for an object of type 'int'
3 participants