Skip to content

[Clang] Correct handling of negative and out-of-bounds indices #71877

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Nov 20, 2023

Conversation

bwendling
Copy link
Collaborator

GCC returns 0 for a negative index on an array in a structure. It also
returns 0 for an array index that goes beyond the extent of the array.
In addition. a pointer to a struct field returns that field's size, not
the size of it plus the rest of the struct, unless it's the first field
in the struct.

struct s {
int count;
char dummy;
int array[] __attribute((counted_by(count)));
};

struct s *p = malloc(...);

p->count = 10;

A __bdos on the elements of p return:

__bdos(p, 0) == 30
__bdos(p->array, 0) == 10
__bdos(&p->array[0], 0) == 10
__bdos(&p->array[-1], 0) == 0
__bdos(&p->array[42], 0) == 0

Also perform some refactoring, putting the "counted_by" calculations in
their own function.

GCC returns 0 for a negative index on an array in a structure. It also
returns 0 for an array index that goes beyond the extent of the array.
In addition. a pointer to a struct field returns that field's size, not
the size of it plus the rest of the struct, unless it's the first field
in the struct.

  struct s {
    int count;
    char dummy;
    int array[] __attribute((counted_by(count)));
  };

  struct s *p = malloc(...);

  p->count = 10;

A __bdos on the elements of p return:

  __bdos(p, 0) == 30
  __bdos(p->array, 0) == 10
  __bdos(&p->array[0], 0) == 10
  __bdos(&p->array[-1], 0) == 0
  __bdos(&p->array[42], 0) == 0

Also perform some refactoring, putting the "counted_by" calculations in
their own function.
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:codegen IR generation bugs: mangling, exceptions, etc. labels Nov 9, 2023
@llvmbot
Copy link
Member

llvmbot commented Nov 9, 2023

@llvm/pr-subscribers-clang-codegen

@llvm/pr-subscribers-clang

Author: Bill Wendling (bwendling)

Changes

GCC returns 0 for a negative index on an array in a structure. It also
returns 0 for an array index that goes beyond the extent of the array.
In addition. a pointer to a struct field returns that field's size, not
the size of it plus the rest of the struct, unless it's the first field
in the struct.

struct s {
int count;
char dummy;
int array[] __attribute((counted_by(count)));
};

struct s *p = malloc(...);

p->count = 10;

A __bdos on the elements of p return:

__bdos(p, 0) == 30
__bdos(p->array, 0) == 10
__bdos(&p->array[0], 0) == 10
__bdos(&p->array[-1], 0) == 0
__bdos(&p->array[42], 0) == 0

Also perform some refactoring, putting the "counted_by" calculations in
their own function.


Patch is 42.70 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/71877.diff

3 Files Affected:

  • (modified) clang/lib/CodeGen/CGBuiltin.cpp (+166-143)
  • (modified) clang/lib/CodeGen/CodeGenFunction.h (+3)
  • (modified) clang/test/CodeGen/attr-counted-by.c (+99-89)
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 03ea7ad187e53dc..ebbd5016160b22f 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -827,6 +827,165 @@ CodeGenFunction::evaluateOrEmitBuiltinObjectSize(const Expr *E, unsigned Type,
   return ConstantInt::get(ResType, ObjectSize, /*isSigned=*/true);
 }
 
+llvm::Value *
+CodeGenFunction::emitFlexibleArrayMemberSize(const Expr *E, unsigned Type,
+                                             llvm::IntegerType *ResType) {
+  // The code generated here calculates the size of a struct with a flexible
+  // array member that uses the counted_by attribute. There are two instances
+  // we handle:
+  //
+  //       struct s {
+  //         unsigned long flags;
+  //         int count;
+  //         int array[] __attribute__((counted_by(count)));
+  //       }
+  //
+  //   1) bdos of the flexible array itself:
+  //
+  //     __builtin_dynamic_object_size(p->array, 1) ==
+  //         p->count * sizeof(*p->array)
+  //
+  //   2) bdos of a pointer into the flexible array:
+  //
+  //     __builtin_dynamic_object_size(&p->array[42], 1) ==
+  //         (p->count - 42) * sizeof(*p->array)
+  //
+  //   2) bdos of the whole struct, including the flexible array:
+  //
+  //     __builtin_dynamic_object_size(p, 1) ==
+  //        max(sizeof(struct s),
+  //            offsetof(struct s, array) + p->count * sizeof(*p->array))
+  //
+  ASTContext &Ctx = getContext();
+  const Expr *Base = E->IgnoreParenImpCasts();
+  const Expr *Idx = nullptr;
+
+  if (const auto *UO = dyn_cast<UnaryOperator>(Base);
+      UO && UO->getOpcode() == UO_AddrOf) {
+    Expr *SubExpr = UO->getSubExpr()->IgnoreParenImpCasts();
+    if (const auto *ASE = dyn_cast<ArraySubscriptExpr>(SubExpr)) {
+      Base = ASE->getBase()->IgnoreParenImpCasts();
+      Idx = ASE->getIdx()->IgnoreParenImpCasts();
+
+      if (const auto *IL = dyn_cast<IntegerLiteral>(Idx)) {
+        int64_t Val = IL->getValue().getSExtValue();
+        if (Val < 0)
+          // __bdos returns 0 for negative indexes into an array in a struct.
+          return getDefaultBuiltinObjectSizeResult(Type, ResType);
+
+        if (Val == 0)
+          // The index is 0, so we don't need to take it into account.
+          Idx = nullptr;
+      }
+    } else {
+      // Potential pointer to another element in the struct.
+      Base = SubExpr;
+    }
+  }
+
+  // Get the flexible array member Decl.
+  const ValueDecl *FAMDecl = nullptr;
+  if (const auto *ME = dyn_cast<MemberExpr>(Base)) {
+    // Check if \p Base is referencing the FAM itself.
+    if (const ValueDecl *MD = ME->getMemberDecl()) {
+      const LangOptions::StrictFlexArraysLevelKind StrictFlexArraysLevel =
+          getLangOpts().getStrictFlexArraysLevel();
+      if (!Decl::isFlexibleArrayMemberLike(
+              Ctx, MD, MD->getType(), StrictFlexArraysLevel,
+              /*IgnoreTemplateOrMacroSubstitution=*/true))
+        return nullptr;
+
+      FAMDecl = MD;
+    }
+  } else if (const auto *DRE = dyn_cast<DeclRefExpr>(Base)) {
+    // Check if we're pointing to the whole struct.
+    QualType Ty = DRE->getDecl()->getType();
+    if (Ty->isPointerType())
+      Ty = Ty->getPointeeType();
+
+    if (const auto *RD = Ty->getAsRecordDecl())
+      // Don't use the outer lexical record because the FAM might be in a
+      // different RecordDecl.
+      FAMDecl = FindFlexibleArrayMemberField(Ctx, RD);
+  }
+
+  if (!FAMDecl || !FAMDecl->hasAttr<CountedByAttr>())
+    // No flexible array member found or it doesn't have the "counted_by"
+    // attribute.
+    return nullptr;
+
+  const ValueDecl *CountedByFD = FindCountedByField(Base);
+  if (!CountedByFD)
+    // Can't find the field referenced by the "counted_by" attribute.
+    return nullptr;
+
+  // Build a load of the counted_by field.
+  bool IsSigned = CountedByFD->getType()->isSignedIntegerType();
+  const Expr *CountedByExpr = BuildCountedByFieldExpr(Base, CountedByFD);
+  Value *CountedByInst = EmitAnyExprToTemp(CountedByExpr).getScalarVal();
+  llvm::Type *CountedByTy = CountedByInst->getType();
+
+  // Build a load of the index and subtract it from the count.
+  Value *IdxInst = nullptr;
+  if (Idx) {
+    bool IdxSigned = Idx->getType()->isSignedIntegerType();
+    IdxInst = EmitAnyExprToTemp(Idx).getScalarVal();
+    IdxInst = IdxSigned ? Builder.CreateSExtOrTrunc(IdxInst, CountedByTy)
+                        : Builder.CreateZExtOrTrunc(IdxInst, CountedByTy);
+
+    // We go ahead with the calculation here. If the index turns out to be
+    // negative, we'll catch it at the end.
+    CountedByInst =
+        Builder.CreateSub(CountedByInst, IdxInst, "", !IsSigned, IsSigned);
+  }
+
+  // Calculate how large the flexible array member is in bytes.
+  const ArrayType *ArrayTy = Ctx.getAsArrayType(FAMDecl->getType());
+  CharUnits Size = Ctx.getTypeSizeInChars(ArrayTy->getElementType());
+  llvm::Constant *ElemSize =
+      llvm::ConstantInt::get(CountedByTy, Size.getQuantity(), IsSigned);
+  Value *FAMSize =
+      Builder.CreateMul(CountedByInst, ElemSize, "", !IsSigned, IsSigned);
+  FAMSize = IsSigned ? Builder.CreateSExtOrTrunc(FAMSize, ResType)
+                     : Builder.CreateZExtOrTrunc(FAMSize, ResType);
+  Value *Res = FAMSize;
+
+  if (const auto *DRE = dyn_cast<DeclRefExpr>(Base)) {
+    // The whole struct is specificed in the __bdos.
+    const RecordDecl *OuterRD =
+        CountedByFD->getDeclContext()->getOuterLexicalRecordContext();
+    const ASTRecordLayout &Layout = Ctx.getASTRecordLayout(OuterRD);
+
+    // Get the offset of the FAM.
+    CharUnits Offset = Ctx.toCharUnitsFromBits(Ctx.getFieldOffset(FAMDecl));
+    llvm::Constant *FAMOffset =
+        ConstantInt::get(ResType, Offset.getQuantity(), IsSigned);
+    Value *OffsetAndFAMSize =
+        Builder.CreateAdd(FAMOffset, Res, "", !IsSigned, IsSigned);
+
+    // Get the full size of the struct.
+    llvm::Constant *SizeofStruct =
+        ConstantInt::get(ResType, Layout.getSize().getQuantity(), IsSigned);
+
+    // max(sizeof(struct s),
+    //     offsetof(struct s, array) + p->count * sizeof(*p->array))
+    Res = IsSigned
+              ? Builder.CreateBinaryIntrinsic(llvm::Intrinsic::smax,
+                                              OffsetAndFAMSize, SizeofStruct)
+              : Builder.CreateBinaryIntrinsic(llvm::Intrinsic::umax,
+                                              OffsetAndFAMSize, SizeofStruct);
+  }
+
+  // A negative \p IdxInst or \p CountedByInst means that the index lands
+  // outside of the flexible array member. If that's the case, we want to
+  // return 0.
+  Value *Cmp = Builder.CreateIsNotNeg(CountedByInst);
+  if (IdxInst)
+    Cmp = Builder.CreateAnd(Builder.CreateIsNotNeg(IdxInst), Cmp);
+
+  return Builder.CreateSelect(Cmp, Res, ConstantInt::get(ResType, 0, IsSigned));
+}
+
 /// Returns a Value corresponding to the size of the given expression.
 /// This Value may be either of the following:
 ///   - A llvm::Argument (if E is a param with the pass_object_size attribute on
@@ -859,155 +1018,19 @@ CodeGenFunction::emitBuiltinObjectSize(const Expr *E, unsigned Type,
     }
   }
 
+  if (IsDynamic) {
+    // Emit special code for a flexible array member with the "counted_by"
+    // attribute.
+    if (Value *V = emitFlexibleArrayMemberSize(E, Type, ResType))
+      return V;
+  }
+
   // LLVM can't handle Type=3 appropriately, and __builtin_object_size shouldn't
   // evaluate E for side-effects. In either case, we shouldn't lower to
   // @llvm.objectsize.
   if (Type == 3 || (!EmittedE && E->HasSideEffects(getContext())))
     return getDefaultBuiltinObjectSizeResult(Type, ResType);
 
-  if (IsDynamic) {
-    // The code generated here calculates the size of a struct with a flexible
-    // array member that uses the counted_by attribute. There are two instances
-    // we handle:
-    //
-    //       struct s {
-    //         unsigned long flags;
-    //         int count;
-    //         int array[] __attribute__((counted_by(count)));
-    //       }
-    //
-    //   1) bdos of the flexible array itself:
-    //
-    //     __builtin_dynamic_object_size(p->array, 1) ==
-    //         p->count * sizeof(*p->array)
-    //
-    //   2) bdos of a pointer into the flexible array:
-    //
-    //     __builtin_dynamic_object_size(&p->array[42], 1) ==
-    //         (p->count - 42) * sizeof(*p->array)
-    //
-    //   2) bdos of the whole struct, including the flexible array:
-    //
-    //     __builtin_dynamic_object_size(p, 1) ==
-    //        max(sizeof(struct s),
-    //            offsetof(struct s, array) + p->count * sizeof(*p->array))
-    //
-    const Expr *Base = E->IgnoreParenImpCasts();
-    const Expr *Idx = nullptr;
-    if (const auto *UO = dyn_cast<UnaryOperator>(Base);
-        UO && UO->getOpcode() == UO_AddrOf) {
-      if (const auto *ASE =
-              dyn_cast<ArraySubscriptExpr>(UO->getSubExpr()->IgnoreParens())) {
-        Base = ASE->getBase();
-        Idx = ASE->getIdx()->IgnoreParenImpCasts();
-
-        if (const auto *IL = dyn_cast<IntegerLiteral>(Idx);
-            IL && !IL->getValue().getSExtValue())
-          Idx = nullptr;
-      }
-    }
-
-    if (const ValueDecl *CountedByFD = FindCountedByField(Base)) {
-      bool IsSigned = CountedByFD->getType()->isSignedIntegerType();
-      const RecordDecl *OuterRD =
-          CountedByFD->getDeclContext()->getOuterLexicalRecordContext();
-      ASTContext &Ctx = getContext();
-
-      // Load the counted_by field.
-      const Expr *CountedByExpr = BuildCountedByFieldExpr(Base, CountedByFD);
-      Value *CountedByInst = EmitAnyExprToTemp(CountedByExpr).getScalarVal();
-      llvm::Type *CountedByTy = CountedByInst->getType();
-
-      if (Idx) {
-        // There's an index into the array. Remove it from the count.
-        bool IdxSigned = Idx->getType()->isSignedIntegerType();
-        Value *IdxInst = EmitAnyExprToTemp(Idx).getScalarVal();
-        IdxInst = IdxSigned ? Builder.CreateSExtOrTrunc(IdxInst, CountedByTy)
-                            : Builder.CreateZExtOrTrunc(IdxInst, CountedByTy);
-
-        // If the index is negative, don't subtract it from the counted_by
-        // value. The pointer is pointing to something before the FAM.
-        IdxInst = Builder.CreateNeg(IdxInst, "", !IdxSigned, IdxSigned);
-        CountedByInst =
-            Builder.CreateAdd(CountedByInst, IdxInst, "", !IsSigned, IsSigned);
-      }
-
-      // Get the size of the flexible array member's base type.
-      const ValueDecl *FAMDecl = nullptr;
-      if (const auto *ME = dyn_cast<MemberExpr>(Base)) {
-        const LangOptions::StrictFlexArraysLevelKind StrictFlexArraysLevel =
-            getLangOpts().getStrictFlexArraysLevel();
-        if (const ValueDecl *MD = ME->getMemberDecl();
-            MD && Decl::isFlexibleArrayMemberLike(
-                      Ctx, MD, MD->getType(), StrictFlexArraysLevel,
-                      /*IgnoreTemplateOrMacroSubstitution=*/true))
-          // Base is referencing the FAM itself.
-          FAMDecl = MD;
-      }
-
-      if (!FAMDecl)
-        FAMDecl = FindFlexibleArrayMemberField(Ctx, OuterRD);
-
-      assert(FAMDecl && "Can't find the flexible array member field");
-
-      const ArrayType *ArrayTy = Ctx.getAsArrayType(FAMDecl->getType());
-      CharUnits Size = Ctx.getTypeSizeInChars(ArrayTy->getElementType());
-      llvm::Constant *ElemSize =
-          llvm::ConstantInt::get(CountedByTy, Size.getQuantity(), IsSigned);
-
-      // Calculate how large the flexible array member is in bytes.
-      Value *FAMSize =
-          Builder.CreateMul(CountedByInst, ElemSize, "", !IsSigned, IsSigned);
-      FAMSize = IsSigned ? Builder.CreateSExtOrTrunc(FAMSize, ResType)
-                         : Builder.CreateZExtOrTrunc(FAMSize, ResType);
-      Value *Res = FAMSize;
-
-      if (const auto *DRE = dyn_cast<DeclRefExpr>(Base)) {
-        // The whole struct is specificed in the __bdos.
-        const ASTRecordLayout &Layout = Ctx.getASTRecordLayout(OuterRD);
-
-        // Get the offset of the FAM.
-        CharUnits Offset = Ctx.toCharUnitsFromBits(Ctx.getFieldOffset(FAMDecl));
-        llvm::Constant *FAMOffset =
-            ConstantInt::get(ResType, Offset.getQuantity(), IsSigned);
-
-        // max(sizeof(struct s),
-        //     offsetof(struct s, array) + p->count * sizeof(*p->array))
-        Value *OffsetAndFAMSize =
-            Builder.CreateAdd(FAMOffset, Res, "", !IsSigned, IsSigned);
-
-        // Get the full size of the struct.
-        llvm::Constant *SizeofStruct =
-            ConstantInt::get(ResType, Layout.getSize().getQuantity(), IsSigned);
-
-        Res = IsSigned
-                  ? Builder.CreateBinaryIntrinsic(
-                        llvm::Intrinsic::smax, OffsetAndFAMSize, SizeofStruct)
-                  : Builder.CreateBinaryIntrinsic(
-                        llvm::Intrinsic::umax, OffsetAndFAMSize, SizeofStruct);
-      } else if (const auto *ME = dyn_cast<MemberExpr>(Base)) {
-        // Pointing to a place before the FAM. Add the difference to the FAM's
-        // size.
-        if (const ValueDecl *MD = ME->getMemberDecl(); MD != FAMDecl) {
-          CharUnits Offset = Ctx.toCharUnitsFromBits(Ctx.getFieldOffset(MD));
-          CharUnits FAMOffset =
-              Ctx.toCharUnitsFromBits(Ctx.getFieldOffset(FAMDecl));
-
-          Res = Builder.CreateAdd(
-              Res, ConstantInt::get(ResType, FAMOffset.getQuantity() -
-                                                 Offset.getQuantity()));
-        }
-      }
-
-      // A negative 'FAMSize' means that the index was greater than the count,
-      // or an improperly set count field. Return -1 (for types 0 and 1) or 0
-      // (for types 2 and 3).
-      return Builder.CreateSelect(
-          Builder.CreateIsNeg(FAMSize),
-          getDefaultBuiltinObjectSizeResult(Type, ResType), Res);
-    }
-  }
-
   Value *Ptr = EmittedE ? EmittedE : EmitScalarExpr(E);
   assert(Ptr->getType()->isPointerTy() &&
          "Non-pointer passed to __builtin_object_size?");
diff --git a/clang/lib/CodeGen/CodeGenFunction.h b/clang/lib/CodeGen/CodeGenFunction.h
index bb8c14401032b5d..275a227bea1c577 100644
--- a/clang/lib/CodeGen/CodeGenFunction.h
+++ b/clang/lib/CodeGen/CodeGenFunction.h
@@ -4825,6 +4825,9 @@ class CodeGenFunction : public CodeGenTypeCache {
                                      llvm::Value *EmittedE,
                                      bool IsDynamic);
 
+  llvm::Value *emitFlexibleArrayMemberSize(const Expr *E, unsigned Type,
+                                           llvm::IntegerType *ResType);
+
   void emitZeroOrPatternForAutoVarInit(QualType type, const VarDecl &D,
                                        Address Loc);
 
diff --git a/clang/test/CodeGen/attr-counted-by.c b/clang/test/CodeGen/attr-counted-by.c
index 3d3ce572ba62c00..5cefff0e6f1cd5c 100644
--- a/clang/test/CodeGen/attr-counted-by.c
+++ b/clang/test/CodeGen/attr-counted-by.c
@@ -115,9 +115,9 @@ void test1(struct annotated *p, int index, int val) {
 // SANITIZE-WITH-ATTR-NEXT:    unreachable, !nosanitize !6
 // SANITIZE-WITH-ATTR:       cont12:
 // SANITIZE-WITH-ATTR-NEXT:    [[ARRAYIDX:%.*]] = getelementptr inbounds [[STRUCT_ANNOTATED]], ptr [[P]], i64 0, i32 2, i64 [[INDEX]]
-// SANITIZE-WITH-ATTR-NEXT:    [[DOTINV:%.*]] = icmp sgt i32 [[TMP0]], -1
+// SANITIZE-WITH-ATTR-NEXT:    [[DOTINV:%.*]] = icmp slt i32 [[TMP0]], 0
 // SANITIZE-WITH-ATTR-NEXT:    [[TMP3:%.*]] = shl nsw i32 [[TMP0]], 2
-// SANITIZE-WITH-ATTR-NEXT:    [[NARROW:%.*]] = select i1 [[DOTINV]], i32 [[TMP3]], i32 -1
+// SANITIZE-WITH-ATTR-NEXT:    [[NARROW:%.*]] = select i1 [[DOTINV]], i32 0, i32 [[TMP3]]
 // SANITIZE-WITH-ATTR-NEXT:    store i32 [[NARROW]], ptr [[ARRAYIDX]], align 4, !tbaa [[TBAA2]]
 // SANITIZE-WITH-ATTR-NEXT:    ret void
 //
@@ -127,8 +127,8 @@ void test1(struct annotated *p, int index, int val) {
 // NO-SANITIZE-WITH-ATTR-NEXT:    [[COUNT:%.*]] = getelementptr inbounds [[STRUCT_ANNOTATED:%.*]], ptr [[P]], i64 0, i32 1
 // NO-SANITIZE-WITH-ATTR-NEXT:    [[TMP0:%.*]] = load i32, ptr [[COUNT]], align 8, !tbaa [[TBAA2]]
 // NO-SANITIZE-WITH-ATTR-NEXT:    [[TMP1:%.*]] = shl nsw i32 [[TMP0]], 2
-// NO-SANITIZE-WITH-ATTR-NEXT:    [[DOTINV:%.*]] = icmp sgt i32 [[TMP0]], -1
-// NO-SANITIZE-WITH-ATTR-NEXT:    [[NARROW:%.*]] = select i1 [[DOTINV]], i32 [[TMP1]], i32 -1
+// NO-SANITIZE-WITH-ATTR-NEXT:    [[DOTINV:%.*]] = icmp slt i32 [[TMP0]], 0
+// NO-SANITIZE-WITH-ATTR-NEXT:    [[NARROW:%.*]] = select i1 [[DOTINV]], i32 0, i32 [[TMP1]]
 // NO-SANITIZE-WITH-ATTR-NEXT:    [[ARRAYIDX:%.*]] = getelementptr inbounds [[STRUCT_ANNOTATED]], ptr [[P]], i64 0, i32 2, i64 [[INDEX]]
 // NO-SANITIZE-WITH-ATTR-NEXT:    store i32 [[NARROW]], ptr [[ARRAYIDX]], align 4, !tbaa [[TBAA2]]
 // NO-SANITIZE-WITH-ATTR-NEXT:    ret void
@@ -164,12 +164,12 @@ void test2(struct annotated *p, size_t index) {
 // SANITIZE-WITH-ATTR-NEXT:    unreachable, !nosanitize !6
 // SANITIZE-WITH-ATTR:       cont12:
 // SANITIZE-WITH-ATTR-NEXT:    [[ARRAYIDX:%.*]] = getelementptr inbounds [[STRUCT_ANNOTATED]], ptr [[P]], i64 0, i32 2, i64 [[INDEX]]
+// SANITIZE-WITH-ATTR-NEXT:    [[DOTINV:%.*]] = icmp slt i32 [[TMP0]], 0
 // SANITIZE-WITH-ATTR-NEXT:    [[TMP3:%.*]] = shl nsw i32 [[TMP0]], 2
 // SANITIZE-WITH-ATTR-NEXT:    [[TMP4:%.*]] = tail call i32 @llvm.smax.i32(i32 [[TMP3]], i32 4)
 // SANITIZE-WITH-ATTR-NEXT:    [[NARROW:%.*]] = add nuw i32 [[TMP4]], 12
-// SANITIZE-WITH-ATTR-NEXT:    [[DOTINV:%.*]] = icmp sgt i32 [[TMP0]], -1
-// SANITIZE-WITH-ATTR-NEXT:    [[CONV:%.*]] = select i1 [[DOTINV]], i32 [[NARROW]], i32 -1
-// SANITIZE-WITH-ATTR-NEXT:    store i32 [[CONV]], ptr [[ARRAYIDX]], align 4, !tbaa [[TBAA2]]
+// SANITIZE-WITH-ATTR-NEXT:    [[NARROW15:%.*]] = select i1 [[DOTINV]], i32 0, i32 [[NARROW]]
+// SANITIZE-WITH-ATTR-NEXT:    store i32 [[NARROW15]], ptr [[ARRAYIDX]], align 4, !tbaa [[TBAA2]]
 // SANITIZE-WITH-ATTR-NEXT:    ret void
 //
 // NO-SANITIZE-WITH-ATTR-LABEL: define dso_local void @test3(
@@ -180,10 +180,10 @@ void test2(struct annotated *p, size_t index) {
 // NO-SANITIZE-WITH-ATTR-NEXT:    [[TMP1:%.*]] = shl nsw i32 [[TMP0]], 2
 // NO-SANITIZE-WITH-ATTR-NEXT:    [[TMP2:%.*]] = tail call i32 @llvm.smax.i32(i32 [[TMP1]], i32 4)
 // NO-SANITIZE-WITH-ATTR-NEXT:    [[NARROW:%.*]] = add nuw i32 [[TMP2]], 12
-// NO-SANITIZE-WITH-ATTR-NEXT:    [[DOTINV:%.*]] = icmp sgt i32 [[TMP0]], -1
-// NO-SANITIZE-WITH-ATTR-NEXT:    [[CONV:%.*]] = select i1 [[DOTINV]], i32 [[NARROW]], i32 -1
+// NO-SANITIZE-WITH-ATTR-NEXT:    [[DOTINV:%.*]] = icmp slt i32 [[TMP0]], 0
+// NO-SANITIZE-WITH-ATTR-NEXT:    [[NARROW2:%.*]] = select i1 [[DOTINV]], i32 0, i32 [[NARROW]]
 // NO-SANITIZE-WITH-ATTR-NEXT:    [[ARRAYIDX:%.*]] = getelementptr inbounds [[STRUCT_ANNOTATED]], ptr [[P]], i64 0, i32 2, i64 [[INDEX]]
-// NO-SANITIZE-WITH-ATTR-NEXT:    store i32 [[CONV]], ptr [[ARRAYIDX]], align 4, !tbaa [[TBAA2]]
+// NO-SANITIZE-WITH-ATTR-NEXT:    store i32 [[NARROW2]], ptr [[ARRAYIDX]], align 4, !tbaa [[TBAA2]]
 // NO-SANITIZE-WITH-ATTR-NEXT:    ret void
 //
 // SANITIZE-WITHOUT-ATTR-LABEL: define dso_local void @test3(
@@ -220,76 +220,82 @@ void test3(struct annotated *p, size_t index) {
 // SANITIZE-WITH-ATTR-NEXT:    tail call void @__ubsan_handle_out_of_bounds_abort(ptr nonnull @[[GLOB6:[0-9]+]], i64 [[TMP4]]) #[[ATTR4]], !nosanitize !6
 // SANITIZE-WITH-ATTR-NEXT:    unreachable, !nosanitize !6
 // SANITIZE-WITH-ATTR:       cont13:
-// SANITIZE-WITH-ATTR-NEXT:    [[TMP5:%.*]] = shl i32 [[TMP0]], 2
-// SANITIZE-WITH-ATTR-NEXT:    [[TMP6:%.*]] = add i32 [[TMP5]], -12
-// SANITIZE-WITH-ATTR-NEXT:    [[NARROW:%.*]] = tail call i32 @llvm.smax.i32(i32 [[TMP6]], i32 -1)
-// SANITIZE-WITH-ATTR-NEXT:    [[CONV3:%.*]] = and i32 [[NARROW]], 255
+// SANITIZE-WITH-ATTR-NEXT:    [[TMP5:%.*]] = icmp sgt i32 [[TMP0]], 2
+// SANITIZE-WITH-ATTR-NEXT:    [[TMP6:%.*]] = shl i32 [[TMP0]], 2
+// SANITIZE-WITH-ATTR-NEXT:    [[TMP7:%.*]] = add i32 [[TMP6]], 244
+// SANITIZE-WITH-ATTR-NEXT:    [[TMP8:%.*]] = and i32 [[TMP7]], 252
+// SANITIZE-WITH-ATTR-NEXT:    [[CONV3:%.*]] = select i1 [[TMP5]], i32 [[TMP8]], i32 0
 // SANITIZE-WITH-ATTR-NEXT:    [[ARRAYIDX:%.*]] = getelementptr inbounds [[STRUCT_ANNOTATED]], ptr [[P]], i64 0, i32 2, i64 [[TMP1]]
 // SANITIZE-WITH-ATTR-NEXT:    store i32 [[CONV3]], ptr [[ARRAYIDX]], align 4, !tbaa [[TBAA2]]
-// SANITIZE-WITH-ATTR-NEXT:    [[TMP7:%.*]] = loa...
[truncated]

@bwendling
Copy link
Collaborator Author

/sub @rapidsna

@bwendling
Copy link
Collaborator Author

Friendly ping to @rapidsna and @nickdesaulniers

Copy link
Contributor

@kees kees left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the test cases, I wonder if it might be good to add __bdos() calls with type 0 as well. The results should always be the same, but we do want to check for that. i.e.:

p->array[index] = __builtin_dynamic_object_size(&p->count, 0);

}

if (!FAMDecl || !FAMDecl->hasAttr<CountedByAttr>())
// No flexible array member found or it doesn't have the "counted_by"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about the "alloc_size" attribute?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bwendling
Copy link
Collaborator Author

For the test cases, I wonder if it might be good to add __bdos() calls with type 0 as well. The results should always be the same, but we do want to check for that. i.e.:

p->array[index] = __builtin_dynamic_object_size(&p->count, 0);

I can do that in a separate commit. It'll be nicer than crowding changes to the test case in here. :-)

@bwendling
Copy link
Collaborator Author

@kees interesting difference between 0 and 1 in __bdos(). You probably know about it, but I was surprised: https://godbolt.org/z/KTeqanjKf

@kees
Copy link
Contributor

kees commented Nov 15, 2023

Yeah, this is the "compiler doesn't know if pointer points into an array of structs or not" that has driven me crazy for years. But we can now reliably disambiguate this for structs that end with a flexible array member (or future pointers marked with __single). It's been a long time frustration with FORTIFY coverage. :P

@@ -827,6 +827,165 @@ CodeGenFunction::evaluateOrEmitBuiltinObjectSize(const Expr *E, unsigned Type,
return ConstantInt::get(ResType, ObjectSize, /*isSigned=*/true);
}

llvm::Value *
CodeGenFunction::emitFlexibleArrayMemberSize(const Expr *E, unsigned Type,
llvm::IntegerType *ResType) {
Copy link
Member

@nickdesaulniers nickdesaulniers Nov 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's hard to tell what changed since this patch:

  1. creates CodeGenFunction::emitFlexibleArrayMemberSize moving existing logic to it
  2. changes logic within CodeGenFunction::emitFlexibleArrayMemberSize

Mind pre-committing 1 so that it's easier to see what logic was changed in isolation?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll create a NFC patch that just does the refactoring and rebase this onto it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been rebased with the NFC refactor. PTAL.

@bwendling bwendling merged commit d8447c7 into llvm:main Nov 20, 2023
@bwendling bwendling deleted the llvm-project/counted-by-offsets branch November 20, 2023 17:49
bwendling added a commit to bwendling/llvm-project that referenced this pull request Dec 18, 2023
There are many issues that popped up with the counted_by feature. The
patch has grown too large and approval is blocking Linux testing.

Includes reverts of:
commit 769bc11 ("[Clang] Implement the 'counted_by' attribute (llvm#68750)")
commit bc09ec6 ("[CodeGen] Revamp counted_by calculations (llvm#70606)")
commit 1a09cfb ("[Clang] counted_by attr can apply only to C99 flexible array members (llvm#72347)")
commit a76adfb ("[NFC][Clang] Refactor code to calculate flexible array member size (llvm#72790)")
commit d8447c7 ("[Clang] Correct handling of negative and out-of-bounds indices (llvm#71877)")
Partial commit b31cd07 ("[Clang] Regenerate test checks (NFC)")
bwendling added a commit that referenced this pull request Dec 18, 2023
There are many issues that popped up with the counted_by feature. The
patch #73730 has grown too large and approval is blocking Linux testing.

Includes reverts of:
commit 769bc11 ("[Clang] Implement the 'counted_by' attribute
(#68750)")
commit bc09ec6 ("[CodeGen] Revamp counted_by calculations
(#70606)")
commit 1a09cfb ("[Clang] counted_by attr can apply only to C99
flexible array members (#72347)")
commit a76adfb ("[NFC][Clang] Refactor code to calculate flexible
array member size (#72790)")
commit d8447c7 ("[Clang] Correct handling of negative and
out-of-bounds indices (#71877)")
Partial commit b31cd07 ("[Clang] Regenerate test checks (NFC)")

Closes #73168
Closes #75173
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:codegen IR generation bugs: mangling, exceptions, etc. clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants