
[llvm] Fix behavior of llvm.objectsize in presence of negative / large offset #115504


Merged: 2 commits into llvm:main on Nov 18, 2024

Conversation

@serge-sans-paille (Collaborator) commented Nov 8, 2024

The internal structure used to carry intermediate computations holds
signed values. If an object size happens to overflow the signed
representation, we can get an invalid result, so make sure this situation
never happens.

This is not very restrictive, as static allocations of such large values
should rarely happen.
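To illustrate the hazard in isolation, here is a minimal standalone sketch (plain C++, not LLVM code; the conversion wraps as two's complement, as on all targets LLVM supports):

#include <cstdint>
#include <cstdio>

int main() {
  // A 2^63-byte allocation size is a perfectly valid unsigned quantity...
  uint64_t Size = 1ull << 63;
  // ...but reinterpreted as the signed value used internally it is negative,
  // so any offset arithmetic built on top of it goes wrong.
  int64_t Signed = (int64_t)Size;
  printf("unsigned: %llu, signed: %lld\n",
         (unsigned long long)Size, (long long)Signed);
  return 0;
}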

@llvmbot added the llvm:analysis (includes value tracking, cost tables and constant folding) and llvm:transforms labels on Nov 8, 2024
@llvmbot (Member) commented Nov 8, 2024

@llvm/pr-subscribers-llvm-analysis

Author: None (serge-sans-paille)

Changes


The internal structure used to carry intermediate computations holds signed values. If an object size happens to overflow the signed representation, we can get an invalid result, so make sure this situation never happens.

This is not very restrictive, as static allocations of such large values should rarely happen.


Full diff: https://github.com/llvm/llvm-project/pull/115504.diff

4 Files Affected:

  • (modified) llvm/include/llvm/Analysis/MemoryBuiltins.h (+5-1)
  • (modified) llvm/lib/Analysis/MemoryBuiltins.cpp (+18-6)
  • (modified) llvm/test/Transforms/LowerConstantIntrinsics/builtin-object-size-phi.ll (+14)
  • (modified) llvm/test/Transforms/LowerConstantIntrinsics/objectsize_basic.ll (+56)
diff --git a/llvm/include/llvm/Analysis/MemoryBuiltins.h b/llvm/include/llvm/Analysis/MemoryBuiltins.h
index c3b11cdf5cf5db..e8f43140564509 100644
--- a/llvm/include/llvm/Analysis/MemoryBuiltins.h
+++ b/llvm/include/llvm/Analysis/MemoryBuiltins.h
@@ -223,6 +223,10 @@ struct SizeOffsetAPInt : public SizeOffsetType<APInt, SizeOffsetAPInt> {
 
 /// OffsetSpan - Used internally by \p ObjectSizeOffsetVisitor. Represents a
 /// point in memory as a pair of allocated bytes before and after it.
+///
+/// \c Before and \c After fields are signed values. It makes it possible to
+/// represent out-of-bound access, e.g. as a result of a GEP, at the expense of
+/// not being able to represent very large allocation.
 struct OffsetSpan {
   APInt Before; /// Number of allocated bytes before this point.
   APInt After;  /// Number of allocated bytes after this point.
@@ -255,7 +259,7 @@ class ObjectSizeOffsetVisitor
   SmallDenseMap<Instruction *, OffsetSpan, 8> SeenInsts;
   unsigned InstructionsVisited;
 
-  APInt align(APInt Size, MaybeAlign Align);
+  APInt alignAndCap(APInt Size, MaybeAlign Align);
 
   static OffsetSpan unknown() { return OffsetSpan(); }
 
diff --git a/llvm/lib/Analysis/MemoryBuiltins.cpp b/llvm/lib/Analysis/MemoryBuiltins.cpp
index d9769c48f26560..f5dc1f29db2c4a 100644
--- a/llvm/lib/Analysis/MemoryBuiltins.cpp
+++ b/llvm/lib/Analysis/MemoryBuiltins.cpp
@@ -668,7 +668,13 @@ STATISTIC(ObjectVisitorArgument,
 STATISTIC(ObjectVisitorLoad,
           "Number of load instructions with unsolved size and offset");
 
-APInt ObjectSizeOffsetVisitor::align(APInt Size, MaybeAlign Alignment) {
+/// Align \p Size according to \p Alignment. If \p Size is greater than
+/// getSignedMaxValue(), set it as unknown as we can only represent signed value
+/// in OffsetSpan.
+APInt ObjectSizeOffsetVisitor::alignAndCap(APInt Size, MaybeAlign Alignment) {
+  if (Size.isNegative())
+    return APInt();
+
   if (Options.RoundToAlign && Alignment)
     return APInt(IntTyBits, alignTo(Size.getZExtValue(), *Alignment));
   return Size;
@@ -780,8 +786,9 @@ OffsetSpan ObjectSizeOffsetVisitor::visitAllocaInst(AllocaInst &I) {
   if (!isUIntN(IntTyBits, ElemSize.getKnownMinValue()))
     return ObjectSizeOffsetVisitor::unknown();
   APInt Size(IntTyBits, ElemSize.getKnownMinValue());
+
   if (!I.isArrayAllocation())
-    return OffsetSpan(Zero, align(Size, I.getAlign()));
+    return OffsetSpan(Zero, alignAndCap(Size, I.getAlign()));
 
   Value *ArraySize = I.getArraySize();
   if (const ConstantInt *C = dyn_cast<ConstantInt>(ArraySize)) {
@@ -791,8 +798,9 @@ OffsetSpan ObjectSizeOffsetVisitor::visitAllocaInst(AllocaInst &I) {
 
     bool Overflow;
     Size = Size.umul_ov(NumElems, Overflow);
+
     return Overflow ? ObjectSizeOffsetVisitor::unknown()
-                    : OffsetSpan(Zero, align(Size, I.getAlign()));
+                    : OffsetSpan(Zero, alignAndCap(Size, I.getAlign()));
   }
   return ObjectSizeOffsetVisitor::unknown();
 }
@@ -806,12 +814,16 @@ OffsetSpan ObjectSizeOffsetVisitor::visitArgument(Argument &A) {
   }
 
   APInt Size(IntTyBits, DL.getTypeAllocSize(MemoryTy));
-  return OffsetSpan(Zero, align(Size, A.getParamAlign()));
+  return OffsetSpan(Zero, alignAndCap(Size, A.getParamAlign()));
 }
 
 OffsetSpan ObjectSizeOffsetVisitor::visitCallBase(CallBase &CB) {
-  if (std::optional<APInt> Size = getAllocSize(&CB, TLI))
+  if (std::optional<APInt> Size = getAllocSize(&CB, TLI)) {
+    // Very large unsigned value cannot be represented as OffsetSpan.
+    if (Size->isNegative())
+      return ObjectSizeOffsetVisitor::unknown();
     return OffsetSpan(Zero, *Size);
+  }
   return ObjectSizeOffsetVisitor::unknown();
 }
 
@@ -852,7 +864,7 @@ OffsetSpan ObjectSizeOffsetVisitor::visitGlobalVariable(GlobalVariable &GV) {
     return ObjectSizeOffsetVisitor::unknown();
 
   APInt Size(IntTyBits, DL.getTypeAllocSize(GV.getValueType()));
-  return OffsetSpan(Zero, align(Size, GV.getAlign()));
+  return OffsetSpan(Zero, alignAndCap(Size, GV.getAlign()));
 }
 
 OffsetSpan ObjectSizeOffsetVisitor::visitIntToPtrInst(IntToPtrInst &) {
diff --git a/llvm/test/Transforms/LowerConstantIntrinsics/builtin-object-size-phi.ll b/llvm/test/Transforms/LowerConstantIntrinsics/builtin-object-size-phi.ll
index 2974228e6a8303..fbb37560fb2d0e 100644
--- a/llvm/test/Transforms/LowerConstantIntrinsics/builtin-object-size-phi.ll
+++ b/llvm/test/Transforms/LowerConstantIntrinsics/builtin-object-size-phi.ll
@@ -118,6 +118,20 @@ if.end:
   ret i64 %size
 }
 
+define dso_local i64 @pick_max_large(i1 %c) local_unnamed_addr {
+; CHECK-LABEL: @pick_max_large(
+; CHECK-NEXT:    [[BUFFER:%.*]] = alloca i8, i64 -7, align 1
+; CHECK-NEXT:    [[S:%.*]] = select i1 [[C:%.*]], ptr null, ptr [[BUFFER]]
+; CHECK-NEXT:    ret i64 -1
+;
+  %buffer = alloca i8, i64 -7 ; Actually a very large positive integer
+  %s = select i1 %c, ptr null, ptr %buffer
+  %objsize = tail call i64 @llvm.objectsize.i64.p0(ptr %s, i1 false, i1 false, i1 false)
+  ret i64 %objsize
+
+}
+
+
 define i64 @pick_negative_offset(i32 %n) {
 ; CHECK-LABEL: @pick_negative_offset(
 ; CHECK-NEXT:  entry:
diff --git a/llvm/test/Transforms/LowerConstantIntrinsics/objectsize_basic.ll b/llvm/test/Transforms/LowerConstantIntrinsics/objectsize_basic.ll
index 568070a8660698..ae7399a1eafe52 100644
--- a/llvm/test/Transforms/LowerConstantIntrinsics/objectsize_basic.ll
+++ b/llvm/test/Transforms/LowerConstantIntrinsics/objectsize_basic.ll
@@ -195,6 +195,62 @@ define i64 @out_of_bound_gep() {
   ret i64 %objsize
 }
 
+define i64 @wrapping_gep() {
+; CHECK-LABEL: @wrapping_gep(
+; CHECK-NEXT:    [[OBJ:%.*]] = alloca i8, i64 4, align 1
+; CHECK-NEXT:    [[SLIDE:%.*]] = getelementptr i8, ptr [[OBJ]], i64 9223372036854775807
+; CHECK-NEXT:    [[SLIDE_BIS:%.*]] = getelementptr i8, ptr [[SLIDE]], i64 9223372036854775807
+; CHECK-NEXT:    ret i64 0
+;
+  %obj = alloca i8, i64 4
+  %slide = getelementptr i8, ptr %obj, i64 9223372036854775807
+  %slide.bis = getelementptr i8, ptr %slide, i64 9223372036854775807
+  %objsize = call i64 @llvm.objectsize.i64(ptr %slide.bis, i1 false, i1 false, i1 false)
+  ret i64 %objsize
+}
+
+define i64 @wrapping_gep_neg() {
+; CHECK-LABEL: @wrapping_gep_neg(
+; CHECK-NEXT:    [[OBJ:%.*]] = alloca i8, i64 9223372036854775807, align 1
+; CHECK-NEXT:    [[SLIDE:%.*]] = getelementptr i8, ptr [[OBJ]], i64 9223372036854775807
+; CHECK-NEXT:    [[SLIDE_BIS:%.*]] = getelementptr i8, ptr [[SLIDE]], i64 3
+; CHECK-NEXT:    [[SLIDE_TER:%.*]] = getelementptr i8, ptr [[SLIDE_BIS]], i64 -4
+; CHECK-NEXT:    ret i64 1
+;
+  %obj = alloca i8, i64 9223372036854775807
+  %slide = getelementptr i8, ptr %obj, i64 9223372036854775807
+  %slide.bis = getelementptr i8, ptr %slide, i64 3
+  %slide.ter = getelementptr i8, ptr %slide.bis, i64 -4
+  %objsize = call i64 @llvm.objectsize.i64(ptr %slide.ter, i1 false, i1 false, i1 false)
+  ret i64 %objsize
+}
+
+; We don't analyze allocations larger than platform's ptrdiff_t
+define i64 @large_alloca() {
+; CHECK-LABEL: @large_alloca(
+; CHECK-NEXT:    [[OBJ:%.*]] = alloca i8, i64 -9223372036854775808, align 1
+; CHECK-NEXT:    [[SLIDE:%.*]] = getelementptr i8, ptr [[OBJ]], i64 9223372036854775807
+; CHECK-NEXT:    ret i64 -1
+;
+  %obj = alloca i8, i64 9223372036854775808
+  %slide = getelementptr i8, ptr %obj, i64 9223372036854775807
+  %objsize = call i64 @llvm.objectsize.i64(ptr %slide, i1 false, i1 false, i1 false)
+  ret i64 %objsize
+}
+
+; We don't analyze allocations larger than platform's ptrdiff_t
+define i64 @large_malloc() {
+; CHECK-LABEL: @large_malloc(
+; CHECK-NEXT:    [[OBJ:%.*]] = call ptr @malloc(i64 -9223372036854775808)
+; CHECK-NEXT:    [[SLIDE:%.*]] = getelementptr i8, ptr [[OBJ]], i64 9223372036854775807
+; CHECK-NEXT:    ret i64 -1
+;
+  %obj = call ptr @malloc(i64 9223372036854775808)
+  %slide = getelementptr i8, ptr %obj, i64 9223372036854775807
+  %objsize = call i64 @llvm.objectsize.i64(ptr %slide, i1 false, i1 false, i1 false)
+  ret i64 %objsize
+}
+
 define i64 @out_of_bound_negative_gep() {
 ; CHECK-LABEL: @out_of_bound_negative_gep(
 ; CHECK-NEXT:    [[OBJ:%.*]] = alloca i8, i32 4, align 1

@llvmbot (Member) commented Nov 8, 2024

@llvm/pr-subscribers-llvm-transforms
@serge-sans-paille changed the title from "[llvm] Fix handling of very large allocation in llvm.objectsize expansion" to "[llvm] Fix behavior of llvm.objectsize in presence of negative / large offset" on Nov 12, 2024
///
/// \c Before and \c After fields are signed values. It makes it possible to
/// represent out-of-bound access, e.g. as a result of a GEP, at the expense of
/// not being able to represent very large allocation.
Contributor

Suggested change
/// not being able to represent very large allocation.
/// not being able to represent very large allocations.

@@ -255,7 +259,7 @@ class ObjectSizeOffsetVisitor
SmallDenseMap<Instruction *, OffsetSpan, 8> SeenInsts;
unsigned InstructionsVisited;

APInt align(APInt Size, MaybeAlign Align);
APInt alignAndCap(APInt Size, MaybeAlign Align);
Contributor

I think the original name, align, is actually better: "cap" suggests to me some sort of saturation. That isn't what happens: when the size exceeds the representable values, you don't return the highest representable value, you return a special value indicating "unknown".
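To make that distinction concrete, a minimal standalone sketch of the two policies (hypothetical helper names, not the LLVM API; the patch implements the second, returning an empty APInt as the "unknown" sentinel):

#include <algorithm>
#include <cstdint>
#include <optional>

// Saturation: clamp to the largest representable signed value.
int64_t capSaturating(uint64_t Size) {
  return (int64_t)std::min<uint64_t>(Size, (uint64_t)INT64_MAX);
}

// The patch's policy: give up and report "unknown".
std::optional<int64_t> capToUnknown(uint64_t Size) {
  if (Size > (uint64_t)INT64_MAX)
    return std::nullopt; // "unknown"; eventually lowered to (size_t)-1
  return (int64_t)Size;
}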

// We could compute an over-approximation in that situation, maybe if
// Opts.EvalMode == Max, but let's be conservative and bail out.
if (Data.Offset.isNegative())
  return false;
Contributor

I am not sure this is right. If offsets should always be treated as signed, which is what I think this PR is saying they should be, then a negative offset results in an out of bounds pointer and should result in a size of zero. I worry that this will cause UBSan to stop reporting certain kinds of out of bounds accesses. Isn't the out_of_bound_negative_gep test update in this PR showing that exactly that will happen?

Collaborator Author

I disagree :-)
Let's consider the following fortified access:

#include <stdlib.h> /* for abort() */

inline int access(int* ptr, int n) {
    long bos = __builtin_object_size(ptr, 0);
    if (/* failed to perform the static check */ bos < 0 || /* in-bounds access */ n < bos)
        return ptr[n];
    else
        abort(); // we're sure the access is invalid
}

which looks like a decent usage of __builtin_object_size.

Then what happens with the following call site?

int caller() {
    int a[2] = {-1, -1};
    return access(a - 2, 3);
}
  1. this behavior is undefined (forming a pointer before the start of the actual object); -fsanitize=undefined correctly catches it (see https://godbolt.org/z/6Ejojc7Y5 for an equivalent scenario).
  2. the access call will incorrectly abort, raising a false positive. It can be argued that it's UB so it's OK (and it is!), but returning -1 looks like the path of least surprise to me. See https://godbolt.org/z/nG5voPco5 for an equivalent scenario.

Contributor

which looks like a decent usage of __builtin_object_size.

I actually don't think it is. The documentation says that __builtin_object_size(ptr, type) "is a built-in construct that returns a constant number of bytes from ptr to the end of the object ptr pointer points to (if known at compile time)." If your function access is specifically relying on ptr not pointing to an object, it should not be using __builtin_object_size. The typical users of it are the fortified string functions, where it is important for the fortification that out-of-bounds pointers result in a reported size of zero. This general concept can be seen in e.g. char array[10]; strcpy(array - 4, "hello, world!"); which results in "*** buffer overflow detected ***: terminated" when compiled with -D_FORTIFY_SOURCE=2, both with GCC and Clang. That array - 4 should not be interpreted as the start of 14 potentially accessible bytes of memory. (This particular example does continue to behave as expected even with your PR.)
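For reference, a minimal standalone rendering of the fortified-strcpy scenario just described (source and flags follow the comment above; the abort message is the glibc behaviour it reports):

#include <cstring>

int main() {
  char array[10];
  // Out-of-bounds destination: per the comment above, both GCC and Clang
  // turn this into a checked call under fortification, and glibc aborts at
  // run time with "*** buffer overflow detected ***: terminated".
  strcpy(array - 4, "hello, world!");
  return 0;
}

// Build as described above: clang -O2 -D_FORTIFY_SOURCE=2 example.cpp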

Collaborator Author

I'd say that the relevant part is

 If it is not possible to determine which objects ptr points to at compile time, __builtin_object_size should return (size_t) -1

In our case we don't know which object it points to (it's somewhere before the original array), so -1?
Actually, any returned value would be correct, because the access is UB.

A motivating example is the following:

https://godbolt.org/z/Evqbj5v4f

define i64 @pain(i1 %0) {
  %p = alloca [2 x i8], align 1
  br i1 %0, label %if.then, label %if.else

if.then:
  %p.then = getelementptr  [2 x i8], ptr %p, i64 0, i64 1
  br label %if.end

if.else:
  %p.else = getelementptr  [2 x i8], ptr %p, i64 0, i64 -1
  br label %if.end

if.end:
  %p.end = phi ptr [%p.else, %if.else], [%p.then, %if.then]
  %objsize = call i64 @llvm.objectsize.i64.p0(ptr %p.end, i1 false, i1 true, i1 false)
  ret i64 %objsize
}

for which we currently return 3, and we used to return 1. If we do so, it's because we consider that %p.else points to an object of size 3. That is inconsistent with the fact that

define i64 @pain(i1 %0) {
  %p = alloca [2 x i8], align 1
  %p.end = getelementptr  [2 x i8], ptr %p, i64 0, i64 -1
  %objsize = call i64 @llvm.objectsize.i64.p0(ptr %p.end, i1 false, i1 true, i1 false)
  ret i64 %objsize
}

returns 0 (and always has). If we want the first example to be consistent with the second, which I don't hate, I think we have to make sure the first example returns 1.

Contributor

for which we currently return 3, and we used to return 1

Yes, that is a good example, thank you, I agree that 1 is a better result here. We have a phi with two incoming values, %p.then and %p.else.

For %p.then, the start of the object is 1 byte before the pointer, and the end of the object is 1 byte after the pointer.
For %p.else, the start of the object is 1 byte after the pointer, and the end of the object is 3 bytes after the pointer.

Since we are evaluating with EvalMode==Max, we merge these to conclude that we have a pointer at most 1 byte before the start of the object, and at most 3 bytes before the end of the object. This is all correct, and yet the result is not the result we want. We no longer have enough information to return the result we want, and I am not sure we can fix that, short of a bigger reworking of the code, without returning incorrect results in other cases.
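A worked sketch of that merge with the numbers above (the struct and merge function are illustrative stand-ins for OffsetSpan and the visitor's combining logic, not the actual LLVM code):

#include <algorithm>
#include <cstdio>

struct Span {
  long Before, After; // bytes of the object before / after the point
};

// EvalMode==Max: keep the larger extent on each side.
Span mergeMax(Span A, Span B) {
  return {std::max(A.Before, B.Before), std::max(A.After, B.After)};
}

int main() {
  Span Then = {1, 1};  // %p.then: 1 byte before the pointer, 1 byte after
  Span Else = {-1, 3}; // %p.else: the object only starts 1 byte after it
  Span M = mergeMax(Then, Else);
  // Prints Before=1 After=3: the reported size is 3, although taking the
  // maximum of the per-path answers (1 and 0) would have given 1.
  printf("Before=%ld After=%ld\n", M.Before, M.After);
  return 0;
}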

Collaborator Author

Fixed with the current version!
I think it matches what we expect in all the above situations. And it is consistent \o/

Contributor

I think your updated version still has the same potential problem. At the LLVM IR level, out-of-bounds pointers are not necessarily UB; it depends on whether a getelementptr has the inbounds flag. In LLVM IR, it is possible to do the equivalent of __builtin_object_size(array - 2 + 2, 0) while keeping behaviour defined. To have this work, computeImpl needs to return the real OffsetSpan even for out-of-bounds pointers.

Contributor

Thinking about why we are able to see that the best result is 1: it is because we don't combine the phis the way the code does. We simply go over all possibilities: we have an object that is either %p.then, in which case the result should ideally be 1, or %p.else, in which case the result should ideally be 0. The maximum of those is 1. Since we do not mentally combine the ranges of %p.then and %p.else the way the code does, but only combine the final results, we avoid losing information in the earlier combining.

The combining of ranges makes sense to prevent exponential time complexity, but there may be other ways of preventing that. Your earlier ConstantOffset PR was a limited form of passing information into the visitor about what subsequent operations will be performed on the ranges. I think revisiting that might be a good idea. Its initial version did not handle all cases, but combined with the other improvements I think it can, and I think it would let us return 1 here while keeping the out-of-bounds pointer handling in other cases intact.

Collaborator Author

I tend to agree. Let's explore that in another PR though; this one already looks like a nice step forward and I'm not eager to see it grow beyond its current state.

[llvm] Fix handling of very large allocation in llvm.objectsize expansion

The internal structure used to carry intermediate computations holds
signed values. If an object size happens to overflow the signed
representation, we can get an invalid result, so make sure this situation
never happens.

This is not very restrictive, as static allocations of such large values
should rarely happen.
@serge-sans-paille force-pushed the fix/phi-large branch 2 times, most recently from 2ec5ced to 3bdf857, on November 13, 2024 at 20:02
github-actions bot commented Nov 13, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

When an object is located before its allocation point, e.g.

   char a[10];
   char *b = &a[-3];

and we ask for the maximum amount of memory addressable from `b` through

   __builtin_object_size(b, 0)

it is better to return -1, even if we actually know everything about the
allocation point, than to return 0, which we currently do and which leads
to sanitizers raising invalid/incorrect diagnostics.
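A hypothetical source-level rendering of that scenario (forming b is already UB; the snippet only illustrates the result the commit message argues for):

#include <cstdio>

char a[10];

int main() {
  char *b = a - 3; // points before the allocation
  // Expectation after this patch: (size_t)-1 ("don't know"), rather than 0,
  // which per the commit message led sanitizers to raise invalid diagnostics.
  printf("%zu\n", __builtin_object_size(b, 0));
  return 0;
}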
@serge-sans-paille (Collaborator Author)

(The build failure seems unrelated: Python import issues in MLIR on Windows.)

@serge-sans-paille (Collaborator Author)

@hvdijk: Are you fine with me landing this as part of #115522?
We can elaborate on the negative GEP composition aspects in a later patch.

@hvdijk (Contributor) commented Nov 15, 2024

@hvdijk : Are you fine with me landing this as part of #115522 ?

I'm fine with combining the two PRs, but make sure that the approvals in the other PR cover everything. When I asked "To avoid confusion: this includes the changes from #115504 unmodified, and only the additional commit on top of that should be reviewed here?", you gave a thumbs up to that, and I think it's reasonable to assume other people who approve your other PR will also read that and only look at the most recent commit.

I still think that this PR, even if it results in improved results in some cases, will result in incorrect results in other cases, so personally, I do not think it should be merged as is, but if there is agreement with other people and those possible incorrect results will soon be addressed in a followup PR, I'm also not going to stand in the way.

@serge-sans-paille merged commit 1dcb3db into llvm:main on Nov 18, 2024 (6 of 8 checks passed)
@serge-sans-paille (Collaborator Author)

Merging this, and merging #115522 once the validation train is good

@nathanchance (Member)

Hi @serge-sans-paille, I bisected a boot failure that I am seeing when building the Linux kernel with CONFIG_UBSAN_BOUNDS, CONFIG_UBSAN_TRAP, and CONFIG_FORTIFY_SOURCE to this change:

# bad: [c526eb891bda371c0481fcdc1507adc496431d03] Fix buildbots with no x86 target.
# good: [315519917368dce841f1cb1e7b296846d13497c3] Thread safety analysis: Eliminate unneeded const_cast, NFC
git bisect start 'c526eb891bda371c0481fcdc1507adc496431d03' '315519917368dce841f1cb1e7b296846d13497c3'
# bad: [6bf8f08989420ccd10efed5fac88052ca16e1250] [memprof] Add InstrProfWriter::addMemProfData (#116528)
git bisect bad 6bf8f08989420ccd10efed5fac88052ca16e1250
# bad: [0c04d43e8060f7b5bd4745c3400431abb3ad10b6] [RISCV][NFC] refactor CFI emitting (#114227)
git bisect bad 0c04d43e8060f7b5bd4745c3400431abb3ad10b6
# good: [1c4f335ec29c6bb269d0f8b2d6149d439312c69a] [PAC][lld] Use braa instr in PAC PLT sequence with valid PAuth core info (#113945)
git bisect good 1c4f335ec29c6bb269d0f8b2d6149d439312c69a
# bad: [63b926af5ff43a90dac285bbe0750e41e622eb3f] [mlir] Add apply_patterns.linalg.generalize_pack_unpack TD Op (#116373)
git bisect bad 63b926af5ff43a90dac285bbe0750e41e622eb3f
# good: [5ff52436fd0c7739765f1d849992713a3e9ae237] Relax clang/test/CodeGen/tbaa-pointers.c for -Asserts.
git bisect good 5ff52436fd0c7739765f1d849992713a3e9ae237
# good: [3c31ee740669fc80ff1bb4ae3724aa778cd1659e] [APInt] Call countTrailingZerosSlowCase() directly from isShiftedMask. NFC
git bisect good 3c31ee740669fc80ff1bb4ae3724aa778cd1659e
# bad: [1dcb3db0ac1255bf556bf6b62d03a113bd5191d8] [llvm] Fix behavior of llvm.objectsize in presence of negative / large offset (#115504)
git bisect bad 1dcb3db0ac1255bf556bf6b62d03a113bd5191d8
# good: [b4c0ef18226b7d1f82d71fc0171b99caec0d8d12] [flang][OpenMP] Add MLIR lowering for `loop ... bind` (#114219)
git bisect good b4c0ef18226b7d1f82d71fc0171b99caec0d8d12
# first bad commit: [1dcb3db0ac1255bf556bf6b62d03a113bd5191d8] [llvm] Fix behavior of llvm.objectsize in presence of negative / large offset (#115504)

The initial error I saw originated from mm/slub.c, but after disabling UBSAN for that translation unit another error popped up from lib/crypto/sha256.c. Looking at lib/crypto/sha256.o, it seems this change causes sha256_transform_blocks() to get optimized to a simple ud2, so any call to it will just panic.

$ llvm-readelf -p .comment sha256.o
String dump of section '.comment':
[     1] ClangBuiltLinux clang version 20.0.0git (https://github.com/llvm/llvm-project.git 1dcb3db0ac1255bf556bf6b62d03a113bd5191d8)

$ llvm-objdump --disassemble-symbols=sha256_transform_blocks -r sha256.o

sha256.o: file format elf64-x86-64

Disassembly of section .text:

0000000000000100 <sha256_transform_blocks>:
     100: 0f 0b                         ud2
     102: 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00     nopw    %cs:(%rax,%rax)

At the immediately previous change, this does not happen.

cvise spits out:

typedef unsigned long size_t;
struct sha256_state {
  long count;
  unsigned char buf[];
} lib_sha256_base_do_finalize_sctx;
typedef void(sha256_block_fn)(struct sha256_state *, unsigned char const *,
                              int);
int BLEND_OP___trans_tmp_2, sha256_transform_blocks_i;
int sha256_transform_blocks_W[64];
void *memset();
void memzero_explicit(void *s, size_t count) {
  memset(s, 0, count);
  __asm__ __volatile__("" : : "r"(s) : "memory");
}
int lib_sha256_base_do_update(sha256_block_fn block_fn) { return 0; }
int lib_sha256_base_do_finalize(sha256_block_fn block_fn) {
  long *bits = 0;
  block_fn(&lib_sha256_base_do_finalize_sctx,
           lib_sha256_base_do_finalize_sctx.buf, 1);
  *bits = 0;
  block_fn(&lib_sha256_base_do_finalize_sctx,
           lib_sha256_base_do_finalize_sctx.buf, 1);
  return 0;
}
static void sha256_transform_blocks(struct sha256_state *sctx,
                                    const unsigned char *input, int blocks) {
  int *__trans_tmp_3 = sha256_transform_blocks_W, *W = __trans_tmp_3;
  for (sha256_transform_blocks_i = 6; sha256_transform_blocks_i < 64;
       sha256_transform_blocks_i += 8) {
    {
      int *__trans_tmp_4 = W, *__trans_tmp_5 = W, *__trans_tmp_6 = W;
      {
        int I = sha256_transform_blocks_i;
        ;
        W[I] = __trans_tmp_4[7] + BLEND_OP___trans_tmp_2 + __trans_tmp_4[I - 6];
      }
      {
        int I = sha256_transform_blocks_i, word = __trans_tmp_5[5];
        BLEND_OP___trans_tmp_2 = word << 7;
        W[I] = __trans_tmp_5[7] + BLEND_OP___trans_tmp_2 + __trans_tmp_5[6];
      }
      {
        int I = sha256_transform_blocks_i, word = __trans_tmp_6[5];
        BLEND_OP___trans_tmp_2 = word;
        W[I] = __trans_tmp_6[7] + BLEND_OP___trans_tmp_2 + __trans_tmp_6[6];
      }
    }
  }
  memzero_explicit(sha256_transform_blocks_W,
                   sizeof(sha256_transform_blocks_W));
}
void sha256_update() { lib_sha256_base_do_update(sha256_transform_blocks); }
void sha224_final() { lib_sha256_base_do_finalize(sha256_transform_blocks); }

but I am not sure this is equivalent to the original code, so I have also uploaded the full preprocessed file.

The behavior above can be reproduced with the following clang command with either file.

$ clang --target=x86_64-linux-gnu -fno-strict-overflow -fsanitize=local-bounds -mno-sse -O2 -c sha256.i

@mikaelholmen (Collaborator)

Hi @serge-sans-paille,

We also see a miscompile with this patch. I've tried to extract a reproducer and I think that we see it with:
opt -passes="bounds-checking" bbi-101449.ll -S -o -
Before this patch the above resulted in

@str = global [100 x i8] zeroinitializer, align 1

define i16 @main() {
entry:
  br label %for.cond

for.cond:                                         ; preds = %for.inc, %entry
  %i.0 = phi i8 [ 65, %entry ], [ %inc, %for.inc ]
  %exitcond.not = icmp eq i8 %i.0, 76
  br i1 %exitcond.not, label %for.end, label %for.inc

for.inc:                                          ; preds = %for.cond
  %i.0.c = sext i8 %i.0 to i64
  %0 = add i64 -65, %i.0.c
  %gep = getelementptr i8, ptr getelementptr (i8, ptr @str, i8 -65), i8 %i.0
  %1 = sub i64 100, %0
  store i8 %i.0, ptr %gep, align 1
  %inc = add nuw nsw i8 %i.0, 1
  br label %for.cond

for.end:                                          ; preds = %for.cond
  ret i16 0
}

which I think is ok, but with this patch we instead get

@str = global [100 x i8] zeroinitializer, align 1

define i16 @main() {
entry:
  br label %for.cond

for.cond:                                         ; preds = %4, %entry
  %i.0 = phi i8 [ 65, %entry ], [ %inc, %4 ]
  %exitcond.not = icmp eq i8 %i.0, 76
  br i1 %exitcond.not, label %for.end, label %for.inc

for.inc:                                          ; preds = %for.cond
  %i.0.c = sext i8 %i.0 to i64
  %0 = add i64 0, %i.0.c
  %gep = getelementptr i8, ptr getelementptr (i8, ptr @str, i8 -65), i8 %i.0
  %1 = sub i64 0, %0
  %2 = icmp ult i64 0, %0
  %3 = or i1 %2, false
  br i1 %3, label %trap, label %4

4:                                                ; preds = %for.inc
  store i8 %i.0, ptr %gep, align 1
  %inc = add nuw nsw i8 %i.0, 1
  br label %for.cond

for.end:                                          ; preds = %for.cond
  ret i16 0

trap:                                             ; preds = %for.inc
  call void @llvm.trap() #1
  unreachable
}

which I think is wrong as it makes execution end up at the trap.
bbi-101449.ll.gz
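For orientation, a plausible hand-reconstruction of the C source behind the reproducer, inferred from the IR above rather than taken from the report: every store lands within str's 100 bytes, so the inserted bounds check should never branch to the trap.

char str[100];

int main() {
  // i runs from 65 ('A') to 75: indices i - 65 are 0..10, all in bounds.
  for (char i = 65; i != 76; ++i)
    str[i - 65] = i;
  return 0;
}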

@serge-sans-paille (Collaborator Author)

I've got a patch. I'll submit that soon.

@serge-sans-paille (Collaborator Author)

Fixed by #116955

@mikaelholmen (Collaborator)

Fixed by #116955

I've verified that the miscompiles I've seen disappear with that fix.
