Skip to content

Commit 7475156

Browse files
bwendlingisanbardAaronBallman
authored
[Clang] Add __builtin_counted_by_ref builtin (#114495)
The __builtin_counted_by_ref builtin is used on a flexible array pointer and returns a pointer to the "counted_by" attribute's COUNT argument, which is a field in the same non-anonymous struct as the flexible array member. This is useful for automatically setting the count field without needing the programmer's intervention. Otherwise it's possible to get this anti-pattern: ptr = alloc(<ty>, ..., COUNT); ptr->FAM[9] = 42; /* <<< Sanitizer will complain */ ptr->count = COUNT; To prevent this anti-pattern, the user can create an allocator that automatically performs the assignment: #define alloc(TY, FAM, COUNT) ({ \ TY __p = alloc(get_size(TY, COUNT)); \ if (__builtin_counted_by_ref(__p->FAM)) \ *__builtin_counted_by_ref(__p->FAM) = COUNT; \ __p; \ }) The builtin's behavior is heavily dependent upon the "counted_by" attribute existing. It's main utility is during allocation to avoid the above anti-pattern. If the flexible array member doesn't have that attribute, the builtin becomes a no-op. Therefore, if the flexible array member has a "count" field not referenced by "counted_by", it must be set explicitly after the allocation as this builtin will return a "nullptr" and the assignment will most likely be elided. --------- Co-authored-by: Bill Wendling <[email protected]> Co-authored-by: Aaron Ballman <[email protected]>
1 parent ae9d062 commit 7475156

15 files changed

+618
-12
lines changed

clang/docs/LanguageExtensions.rst

Lines changed: 68 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -3774,6 +3774,74 @@ type-generic alternative to the ``__builtin_clz{,l,ll}`` (respectively
37743774
``__builtin_ctz{,l,ll}``) builtins, with support for other integer types, such
37753775
as ``unsigned __int128`` and C23 ``unsigned _BitInt(N)``.
37763776
3777+
``__builtin_counted_by_ref``
3778+
----------------------------
3779+
3780+
``__builtin_counted_by_ref`` returns a pointer to the count field from the
3781+
``counted_by`` attribute.
3782+
3783+
The argument must be a flexible array member. If the argument isn't a flexible
3784+
array member or doesn't have the ``counted_by`` attribute, the builtin returns
3785+
``(void *)0``.
3786+
3787+
**Syntax**:
3788+
3789+
.. code-block:: c
3790+
3791+
T *__builtin_counted_by_ref(void *array)
3792+
3793+
**Examples**:
3794+
3795+
.. code-block:: c
3796+
3797+
#define alloc(P, FAM, COUNT) ({ \
3798+
size_t __ignored_assignment; \
3799+
typeof(P) __p = NULL; \
3800+
__p = malloc(MAX(sizeof(*__p), \
3801+
sizeof(*__p) + sizeof(*__p->FAM) * COUNT)); \
3802+
\
3803+
*_Generic( \
3804+
__builtin_counted_by_ref(__p->FAM), \
3805+
void *: &__ignored_assignment, \
3806+
default: __builtin_counted_by_ref(__p->FAM)) = COUNT; \
3807+
\
3808+
__p; \
3809+
})
3810+
3811+
**Description**:
3812+
3813+
The ``__builtin_counted_by_ref`` builtin allows the programmer to prevent a
3814+
common error associated with the ``counted_by`` attribute. When using the
3815+
``counted_by`` attribute, the ``count`` field **must** be set before the
3816+
flexible array member can be accessed. Otherwise, the sanitizers may view such
3817+
accesses as false positives. For instance, it's not uncommon for programmers to
3818+
initialize the flexible array before setting the ``count`` field:
3819+
3820+
.. code-block:: c
3821+
3822+
struct s {
3823+
int dummy;
3824+
short count;
3825+
long array[] __attribute__((counted_by(count)));
3826+
};
3827+
3828+
struct s *ptr = malloc(sizeof(struct s) + sizeof(long) * COUNT);
3829+
3830+
for (int i = 0; i < COUNT; ++i)
3831+
ptr->array[i] = i;
3832+
3833+
ptr->count = COUNT;
3834+
3835+
Enforcing the rule that ``ptr->count = COUNT;`` must occur after every
3836+
allocation of a struct with a flexible array member with the ``counted_by``
3837+
attribute is prone to failure in large code bases. This builtin mitigates this
3838+
for allocators (like in Linux) that are implemented in a way where the counter
3839+
assignment can happen automatically.
3840+
3841+
**Note:** The value returned by ``__builtin_counted_by_ref`` cannot be assigned
3842+
to a variable, have its address taken, or passed into or returned from a
3843+
function, because doing so violates bounds safety conventions.
3844+
37773845
Multiprecision Arithmetic Builtins
37783846
----------------------------------
37793847

clang/docs/ReleaseNotes.rst

Lines changed: 23 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -313,6 +313,29 @@ Non-comprehensive list of changes in this release
313313
as well as declarations.
314314
- ``__builtin_abs`` function can now be used in constant expressions.
315315

316+
- The new builtin ``__builtin_counted_by_ref`` was added. In contexts where the
317+
programmer needs access to the ``counted_by`` attribute's field, but it's not
318+
available --- e.g. in macros. For instace, it can be used to automatically
319+
set the counter during allocation in the Linux kernel:
320+
321+
.. code-block:: c
322+
323+
/* A simplified version of Linux allocation macros */
324+
#define alloc(PTR, FAM, COUNT) ({ \
325+
sizeof_t __ignored_assignment; \
326+
typeof(P) __p; \
327+
size_t __size = sizeof(*P) + sizeof(*P->FAM) * COUNT; \
328+
__p = malloc(__size); \
329+
*_Generic( \
330+
__builtin_counted_by_ref(__p->FAM), \
331+
void *: &__ignored_assignment, \
332+
default: __builtin_counted_by_ref(__p->FAM)) = COUNT; \
333+
__p; \
334+
})
335+
336+
The flexible array member (FAM) can now be accessed immediately without causing
337+
issues with the sanitizer because the counter is automatically set.
338+
316339
New Compiler Flags
317340
------------------
318341

clang/include/clang/Basic/Builtins.td

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -4932,3 +4932,9 @@ def ArithmeticFence : LangBuiltin<"ALL_LANGUAGES"> {
49324932
let Attributes = [CustomTypeChecking, Constexpr];
49334933
let Prototype = "void(...)";
49344934
}
4935+
4936+
def CountedByRef : Builtin {
4937+
let Spellings = ["__builtin_counted_by_ref"];
4938+
let Attributes = [NoThrow, CustomTypeChecking];
4939+
let Prototype = "int(...)";
4940+
}

clang/include/clang/Basic/DiagnosticSemaKinds.td

Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -6652,6 +6652,18 @@ def warn_counted_by_attr_elt_type_unknown_size :
66526652
Warning<err_counted_by_attr_pointee_unknown_size.Summary>,
66536653
InGroup<BoundsSafetyCountedByEltTyUnknownSize>;
66546654

6655+
// __builtin_counted_by_ref diagnostics:
6656+
def err_builtin_counted_by_ref_must_be_flex_array_member : Error<
6657+
"'__builtin_counted_by_ref' argument must reference a flexible array member">;
6658+
def err_builtin_counted_by_ref_cannot_leak_reference : Error<
6659+
"value returned by '__builtin_counted_by_ref' cannot be assigned to a "
6660+
"variable, have its address taken, or passed into or returned from a function">;
6661+
def err_builtin_counted_by_ref_invalid_lhs_use : Error<
6662+
"value returned by '__builtin_counted_by_ref' cannot be used in "
6663+
"%select{an array subscript|a binary}0 expression">;
6664+
def err_builtin_counted_by_ref_has_side_effects : Error<
6665+
"'__builtin_counted_by_ref' argument cannot have side-effects">;
6666+
66556667
let CategoryName = "ARC Semantic Issue" in {
66566668

66576669
// ARC-mode diagnostics.

clang/include/clang/Sema/Sema.h

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -2510,6 +2510,8 @@ class Sema final : public SemaBase {
25102510

25112511
bool BuiltinNonDeterministicValue(CallExpr *TheCall);
25122512

2513+
bool BuiltinCountedByRef(CallExpr *TheCall);
2514+
25132515
// Matrix builtin handling.
25142516
ExprResult BuiltinMatrixTranspose(CallExpr *TheCall, ExprResult CallResult);
25152517
ExprResult BuiltinMatrixColumnMajorLoad(CallExpr *TheCall,

clang/lib/AST/Decl.cpp

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -3657,6 +3657,10 @@ unsigned FunctionDecl::getBuiltinID(bool ConsiderWrapperFunctions) const {
36573657
(!hasAttr<ArmBuiltinAliasAttr>() && !hasAttr<BuiltinAliasAttr>()))
36583658
return 0;
36593659

3660+
if (getASTContext().getLangOpts().CPlusPlus &&
3661+
BuiltinID == Builtin::BI__builtin_counted_by_ref)
3662+
return 0;
3663+
36603664
const ASTContext &Context = getASTContext();
36613665
if (!Context.BuiltinInfo.isPredefinedLibFunction(BuiltinID))
36623666
return BuiltinID;

clang/lib/CodeGen/CGBuiltin.cpp

Lines changed: 29 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -3691,6 +3691,35 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID,
36913691
return RValue::get(emitBuiltinObjectSize(E->getArg(0), Type, ResType,
36923692
/*EmittedE=*/nullptr, IsDynamic));
36933693
}
3694+
case Builtin::BI__builtin_counted_by_ref: {
3695+
// Default to returning '(void *) 0'.
3696+
llvm::Value *Result = llvm::ConstantPointerNull::get(
3697+
llvm::PointerType::getUnqual(getLLVMContext()));
3698+
3699+
const Expr *Arg = E->getArg(0)->IgnoreParenImpCasts();
3700+
3701+
if (auto *UO = dyn_cast<UnaryOperator>(Arg);
3702+
UO && UO->getOpcode() == UO_AddrOf) {
3703+
Arg = UO->getSubExpr()->IgnoreParenImpCasts();
3704+
3705+
if (auto *ASE = dyn_cast<ArraySubscriptExpr>(Arg))
3706+
Arg = ASE->getBase()->IgnoreParenImpCasts();
3707+
}
3708+
3709+
if (const MemberExpr *ME = dyn_cast_if_present<MemberExpr>(Arg)) {
3710+
if (auto *CATy =
3711+
ME->getMemberDecl()->getType()->getAs<CountAttributedType>();
3712+
CATy && CATy->getKind() == CountAttributedType::CountedBy) {
3713+
const auto *FAMDecl = cast<FieldDecl>(ME->getMemberDecl());
3714+
if (const FieldDecl *CountFD = FAMDecl->findCountedByField())
3715+
Result = GetCountedByFieldExprGEP(Arg, FAMDecl, CountFD);
3716+
else
3717+
llvm::report_fatal_error("Cannot find the counted_by 'count' field");
3718+
}
3719+
}
3720+
3721+
return RValue::get(Result);
3722+
}
36943723
case Builtin::BI__builtin_prefetch: {
36953724
Value *Locality, *RW, *Address = EmitScalarExpr(E->getArg(0));
36963725
// FIXME: Technically these constants should of type 'int', yes?

clang/lib/CodeGen/CGExpr.cpp

Lines changed: 17 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -1145,15 +1145,7 @@ static bool getGEPIndicesToField(CodeGenFunction &CGF, const RecordDecl *RD,
11451145
return false;
11461146
}
11471147

1148-
/// This method is typically called in contexts where we can't generate
1149-
/// side-effects, like in __builtin_dynamic_object_size. When finding
1150-
/// expressions, only choose those that have either already been emitted or can
1151-
/// be loaded without side-effects.
1152-
///
1153-
/// - \p FAMDecl: the \p Decl for the flexible array member. It may not be
1154-
/// within the top-level struct.
1155-
/// - \p CountDecl: must be within the same non-anonymous struct as \p FAMDecl.
1156-
llvm::Value *CodeGenFunction::EmitLoadOfCountedByField(
1148+
llvm::Value *CodeGenFunction::GetCountedByFieldExprGEP(
11571149
const Expr *Base, const FieldDecl *FAMDecl, const FieldDecl *CountDecl) {
11581150
const RecordDecl *RD = CountDecl->getParent()->getOuterLexicalRecordContext();
11591151

@@ -1182,12 +1174,25 @@ llvm::Value *CodeGenFunction::EmitLoadOfCountedByField(
11821174
return nullptr;
11831175

11841176
Indices.push_back(Builder.getInt32(0));
1185-
Res = Builder.CreateInBoundsGEP(
1177+
return Builder.CreateInBoundsGEP(
11861178
ConvertType(QualType(RD->getTypeForDecl(), 0)), Res,
11871179
RecIndicesTy(llvm::reverse(Indices)), "..counted_by.gep");
1180+
}
11881181

1189-
return Builder.CreateAlignedLoad(ConvertType(CountDecl->getType()), Res,
1190-
getIntAlign(), "..counted_by.load");
1182+
/// This method is typically called in contexts where we can't generate
1183+
/// side-effects, like in __builtin_dynamic_object_size. When finding
1184+
/// expressions, only choose those that have either already been emitted or can
1185+
/// be loaded without side-effects.
1186+
///
1187+
/// - \p FAMDecl: the \p Decl for the flexible array member. It may not be
1188+
/// within the top-level struct.
1189+
/// - \p CountDecl: must be within the same non-anonymous struct as \p FAMDecl.
1190+
llvm::Value *CodeGenFunction::EmitLoadOfCountedByField(
1191+
const Expr *Base, const FieldDecl *FAMDecl, const FieldDecl *CountDecl) {
1192+
if (llvm::Value *GEP = GetCountedByFieldExprGEP(Base, FAMDecl, CountDecl))
1193+
return Builder.CreateAlignedLoad(ConvertType(CountDecl->getType()), GEP,
1194+
getIntAlign(), "..counted_by.load");
1195+
return nullptr;
11911196
}
11921197

11931198
void CodeGenFunction::EmitBoundsCheck(const Expr *E, const Expr *Base,

clang/lib/CodeGen/CodeGenFunction.h

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -3305,6 +3305,10 @@ class CodeGenFunction : public CodeGenTypeCache {
33053305
const FieldDecl *FAMDecl,
33063306
uint64_t &Offset);
33073307

3308+
llvm::Value *GetCountedByFieldExprGEP(const Expr *Base,
3309+
const FieldDecl *FAMDecl,
3310+
const FieldDecl *CountDecl);
3311+
33083312
/// Build an expression accessing the "counted_by" field.
33093313
llvm::Value *EmitLoadOfCountedByField(const Expr *Base,
33103314
const FieldDecl *FAMDecl,

clang/lib/Sema/SemaChecking.cpp

Lines changed: 53 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -2973,6 +2973,10 @@ Sema::CheckBuiltinFunctionCall(FunctionDecl *FDecl, unsigned BuiltinID,
29732973
}
29742974
break;
29752975
}
2976+
case Builtin::BI__builtin_counted_by_ref:
2977+
if (BuiltinCountedByRef(TheCall))
2978+
return ExprError();
2979+
break;
29762980
}
29772981

29782982
if (getLangOpts().HLSL && HLSL().CheckBuiltinFunctionCall(BuiltinID, TheCall))
@@ -5575,6 +5579,55 @@ bool Sema::BuiltinSetjmp(CallExpr *TheCall) {
55755579
return false;
55765580
}
55775581

5582+
bool Sema::BuiltinCountedByRef(CallExpr *TheCall) {
5583+
if (checkArgCount(TheCall, 1))
5584+
return true;
5585+
5586+
ExprResult ArgRes = UsualUnaryConversions(TheCall->getArg(0));
5587+
if (ArgRes.isInvalid())
5588+
return true;
5589+
5590+
// For simplicity, we support only limited expressions for the argument.
5591+
// Specifically a pointer to a flexible array member:'ptr->array'. This
5592+
// allows us to reject arguments with complex casting, which really shouldn't
5593+
// be a huge problem.
5594+
const Expr *Arg = ArgRes.get()->IgnoreParenImpCasts();
5595+
if (!isa<PointerType>(Arg->getType()) && !Arg->getType()->isArrayType())
5596+
return Diag(Arg->getBeginLoc(),
5597+
diag::err_builtin_counted_by_ref_must_be_flex_array_member)
5598+
<< Arg->getSourceRange();
5599+
5600+
if (Arg->HasSideEffects(Context))
5601+
return Diag(Arg->getBeginLoc(),
5602+
diag::err_builtin_counted_by_ref_has_side_effects)
5603+
<< Arg->getSourceRange();
5604+
5605+
if (const auto *ME = dyn_cast<MemberExpr>(Arg)) {
5606+
if (!ME->isFlexibleArrayMemberLike(
5607+
Context, getLangOpts().getStrictFlexArraysLevel()))
5608+
return Diag(Arg->getBeginLoc(),
5609+
diag::err_builtin_counted_by_ref_must_be_flex_array_member)
5610+
<< Arg->getSourceRange();
5611+
5612+
if (auto *CATy =
5613+
ME->getMemberDecl()->getType()->getAs<CountAttributedType>();
5614+
CATy && CATy->getKind() == CountAttributedType::CountedBy) {
5615+
const auto *FAMDecl = cast<FieldDecl>(ME->getMemberDecl());
5616+
if (const FieldDecl *CountFD = FAMDecl->findCountedByField()) {
5617+
TheCall->setType(Context.getPointerType(CountFD->getType()));
5618+
return false;
5619+
}
5620+
}
5621+
} else {
5622+
return Diag(Arg->getBeginLoc(),
5623+
diag::err_builtin_counted_by_ref_must_be_flex_array_member)
5624+
<< Arg->getSourceRange();
5625+
}
5626+
5627+
TheCall->setType(Context.getPointerType(Context.VoidTy));
5628+
return false;
5629+
}
5630+
55785631
namespace {
55795632

55805633
class UncoveredArgHandler {

clang/lib/Sema/SemaExpr.cpp

Lines changed: 69 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -9209,6 +9209,38 @@ Sema::CheckAssignmentConstraints(QualType LHSType, ExprResult &RHS,
92099209
LHSType = Context.getCanonicalType(LHSType).getUnqualifiedType();
92109210
RHSType = Context.getCanonicalType(RHSType).getUnqualifiedType();
92119211

9212+
// __builtin_counted_by_ref cannot be assigned to a variable, used in
9213+
// function call, or in a return.
9214+
auto FindBuiltinCountedByRefExpr = [&](Expr *E) -> CallExpr * {
9215+
struct BuiltinCountedByRefVisitor
9216+
: public RecursiveASTVisitor<BuiltinCountedByRefVisitor> {
9217+
CallExpr *TheCall = nullptr;
9218+
bool VisitCallExpr(CallExpr *CE) {
9219+
if (CE->getBuiltinCallee() == Builtin::BI__builtin_counted_by_ref) {
9220+
TheCall = CE;
9221+
return false;
9222+
}
9223+
return true;
9224+
}
9225+
bool VisitUnaryExprOrTypeTraitExpr(UnaryExprOrTypeTraitExpr *UE) {
9226+
// A UnaryExprOrTypeTraitExpr---e.g. sizeof, __alignof, etc.---isn't
9227+
// the same as a CallExpr, so if we find a __builtin_counted_by_ref()
9228+
// call in one, ignore it.
9229+
return false;
9230+
}
9231+
} V;
9232+
V.TraverseStmt(E);
9233+
return V.TheCall;
9234+
};
9235+
static llvm::SmallPtrSet<CallExpr *, 4> Diagnosed;
9236+
if (auto *CE = FindBuiltinCountedByRefExpr(RHS.get());
9237+
CE && !Diagnosed.count(CE)) {
9238+
Diagnosed.insert(CE);
9239+
Diag(CE->getExprLoc(),
9240+
diag::err_builtin_counted_by_ref_cannot_leak_reference)
9241+
<< CE->getSourceRange();
9242+
}
9243+
92129244
// Common case: no conversion required.
92139245
if (LHSType == RHSType) {
92149246
Kind = CK_NoOp;
@@ -13757,6 +13789,43 @@ QualType Sema::CheckAssignmentOperands(Expr *LHSExpr, ExprResult &RHS,
1375713789
ConvTy = CheckAssignmentConstraints(Loc, LHSType, RHSType);
1375813790
}
1375913791

13792+
// __builtin_counted_by_ref can't be used in a binary expression or array
13793+
// subscript on the LHS.
13794+
int DiagOption = -1;
13795+
auto FindInvalidUseOfBoundsSafetyCounter = [&](Expr *E) -> CallExpr * {
13796+
struct BuiltinCountedByRefVisitor
13797+
: public RecursiveASTVisitor<BuiltinCountedByRefVisitor> {
13798+
CallExpr *CE = nullptr;
13799+
bool InvalidUse = false;
13800+
int Option = -1;
13801+
13802+
bool VisitCallExpr(CallExpr *E) {
13803+
if (E->getBuiltinCallee() == Builtin::BI__builtin_counted_by_ref) {
13804+
CE = E;
13805+
return false;
13806+
}
13807+
return true;
13808+
}
13809+
13810+
bool VisitArraySubscriptExpr(ArraySubscriptExpr *E) {
13811+
InvalidUse = true;
13812+
Option = 0; // report 'array expression' in diagnostic.
13813+
return true;
13814+
}
13815+
bool VisitBinaryOperator(BinaryOperator *E) {
13816+
InvalidUse = true;
13817+
Option = 1; // report 'binary expression' in diagnostic.
13818+
return true;
13819+
}
13820+
} V;
13821+
V.TraverseStmt(E);
13822+
DiagOption = V.Option;
13823+
return V.InvalidUse ? V.CE : nullptr;
13824+
};
13825+
if (auto *CE = FindInvalidUseOfBoundsSafetyCounter(LHSExpr))
13826+
Diag(CE->getExprLoc(), diag::err_builtin_counted_by_ref_invalid_lhs_use)
13827+
<< DiagOption << CE->getSourceRange();
13828+
1376013829
if (DiagnoseAssignmentResult(ConvTy, Loc, LHSType, RHSType, RHS.get(),
1376113830
AssignmentAction::Assigning))
1376213831
return QualType();

0 commit comments

Comments
 (0)