Skip to content

[Clang] Improve Sema diagnostic performance for __builtin_counted_by_ref #116719

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Nov 27, 2024

Conversation

bwendling
Copy link
Collaborator

Implement the sema checks with a placeholder. We then check for that placeholder in all of the places we care to emit a diagnostic.

Fixes: #115520

Implement the sema checks with a placeholder. We then check for that
placeholder in all of the places we care to emit a diagnostic.
@llvmbot llvmbot added clang Clang issues not falling into any other category lldb clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:modules C++20 modules and Clang Header Modules labels Nov 19, 2024
@llvmbot
Copy link
Member

llvmbot commented Nov 19, 2024

@llvm/pr-subscribers-clang-modules

@llvm/pr-subscribers-clang

Author: Bill Wendling (bwendling)

Changes

Implement the sema checks with a placeholder. We then check for that placeholder in all of the places we care to emit a diagnostic.

Fixes: #115520


Patch is 29.47 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/116719.diff

17 Files Affected:

  • (modified) clang/include/clang/AST/ASTContext.h (+1)
  • (modified) clang/include/clang/AST/BuiltinTypes.def (+3)
  • (modified) clang/include/clang/Basic/DiagnosticSemaKinds.td (+1-1)
  • (modified) clang/include/clang/Serialization/ASTBitCodes.h (+4-1)
  • (modified) clang/lib/AST/ASTContext.cpp (+4)
  • (modified) clang/lib/AST/NSAPI.cpp (+1)
  • (modified) clang/lib/AST/Type.cpp (+3)
  • (modified) clang/lib/AST/TypeLoc.cpp (+1)
  • (modified) clang/lib/Sema/SemaChecking.cpp (+1)
  • (modified) clang/lib/Sema/SemaDecl.cpp (+11)
  • (modified) clang/lib/Sema/SemaExpr.cpp (+66-72)
  • (modified) clang/lib/Sema/SemaStmt.cpp (+10)
  • (modified) clang/lib/Serialization/ASTCommon.cpp (+3)
  • (modified) clang/lib/Serialization/ASTReader.cpp (+3)
  • (modified) clang/test/Modules/no-external-type-id.cppm (+1-1)
  • (modified) clang/test/Sema/builtin-counted-by-ref.c (+39-38)
  • (modified) lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp (+2)
diff --git a/clang/include/clang/AST/ASTContext.h b/clang/include/clang/AST/ASTContext.h
index 89fcb6789d880a..39cad95d911a33 100644
--- a/clang/include/clang/AST/ASTContext.h
+++ b/clang/include/clang/AST/ASTContext.h
@@ -1184,6 +1184,7 @@ class ASTContext : public RefCountedBase<ASTContext> {
   CanQualType DependentTy, OverloadTy, BoundMemberTy, UnresolvedTemplateTy,
       UnknownAnyTy;
   CanQualType BuiltinFnTy;
+  CanQualType BuiltinCountedByRefTy;
   CanQualType PseudoObjectTy, ARCUnbridgedCastTy;
   CanQualType ObjCBuiltinIdTy, ObjCBuiltinClassTy, ObjCBuiltinSelTy;
   CanQualType ObjCBuiltinBoolTy;
diff --git a/clang/include/clang/AST/BuiltinTypes.def b/clang/include/clang/AST/BuiltinTypes.def
index 444be4311a7431..4eae6962a46de6 100644
--- a/clang/include/clang/AST/BuiltinTypes.def
+++ b/clang/include/clang/AST/BuiltinTypes.def
@@ -314,6 +314,9 @@ PLACEHOLDER_TYPE(UnknownAny, UnknownAnyTy)
 
 PLACEHOLDER_TYPE(BuiltinFn, BuiltinFnTy)
 
+// A placeholder type for __builtin_counted_by_ref.
+PLACEHOLDER_TYPE(BuiltinCountedByRef, BuiltinCountedByRefTy)
+
 // The type of a cast which, in ARC, would normally require a
 // __bridge, but which might be okay depending on the immediate
 // context.
diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index 3caf471d3037f9..37fb44d4bf74cd 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -6673,7 +6673,7 @@ def err_builtin_counted_by_ref_must_be_flex_array_member : Error<
 def err_builtin_counted_by_ref_cannot_leak_reference : Error<
   "value returned by '__builtin_counted_by_ref' cannot be assigned to a "
   "variable, have its address taken, or passed into or returned from a function">;
-def err_builtin_counted_by_ref_invalid_lhs_use : Error<
+def err_builtin_counted_by_ref_invalid_use : Error<
   "value returned by '__builtin_counted_by_ref' cannot be used in "
   "%select{an array subscript|a binary}0 expression">;
 def err_builtin_counted_by_ref_has_side_effects : Error<
diff --git a/clang/include/clang/Serialization/ASTBitCodes.h b/clang/include/clang/Serialization/ASTBitCodes.h
index fd834c14ce790f..f4b71861968e77 100644
--- a/clang/include/clang/Serialization/ASTBitCodes.h
+++ b/clang/include/clang/Serialization/ASTBitCodes.h
@@ -1111,6 +1111,9 @@ enum PredefinedTypeIDs {
   /// \brief The '__ibm128' type
   PREDEF_TYPE_IBM128_ID = 74,
 
+  /// \brief The placeholder type for __builtin_counted_by_ref.
+  PREDEF_TYPE_BUILTIN_COUNTED_BY_REF = 75,
+
 /// OpenCL image types with auto numeration
 #define IMAGE_TYPE(ImgType, Id, SingletonId, Access, Suffix)                   \
   PREDEF_TYPE_##Id##_ID,
@@ -1148,7 +1151,7 @@ enum PredefinedTypeIDs {
 ///
 /// Type IDs for non-predefined types will start at
 /// NUM_PREDEF_TYPE_IDs.
-const unsigned NUM_PREDEF_TYPE_IDS = 513;
+const unsigned NUM_PREDEF_TYPE_IDS = 514;
 
 // Ensure we do not overrun the predefined types we reserved
 // in the enum PredefinedTypeIDs above.
diff --git a/clang/lib/AST/ASTContext.cpp b/clang/lib/AST/ASTContext.cpp
index 14fbadbc35ae5d..06226afaf23aab 100644
--- a/clang/lib/AST/ASTContext.cpp
+++ b/clang/lib/AST/ASTContext.cpp
@@ -1390,6 +1390,10 @@ void ASTContext::InitBuiltinTypes(const TargetInfo &Target,
   if (LangOpts.MatrixTypes)
     InitBuiltinType(IncompleteMatrixIdxTy, BuiltinType::IncompleteMatrixIdx);
 
+  // Placeholder for __builtin_counted_by_ref().
+  if (!LangOpts.CPlusPlus)
+    InitBuiltinType(BuiltinCountedByRefTy, BuiltinType::BuiltinCountedByRef);
+
   // Builtin types for 'id', 'Class', and 'SEL'.
   InitBuiltinType(ObjCBuiltinIdTy, BuiltinType::ObjCId);
   InitBuiltinType(ObjCBuiltinClassTy, BuiltinType::ObjCClass);
diff --git a/clang/lib/AST/NSAPI.cpp b/clang/lib/AST/NSAPI.cpp
index 311fec32bbfa90..6a722b89af6205 100644
--- a/clang/lib/AST/NSAPI.cpp
+++ b/clang/lib/AST/NSAPI.cpp
@@ -466,6 +466,7 @@ NSAPI::getNSNumberFactoryMethodKind(QualType T) const {
   case BuiltinType::Half:
   case BuiltinType::PseudoObject:
   case BuiltinType::BuiltinFn:
+  case BuiltinType::BuiltinCountedByRef:
   case BuiltinType::IncompleteMatrixIdx:
   case BuiltinType::ArraySection:
   case BuiltinType::OMPArrayShaping:
diff --git a/clang/lib/AST/Type.cpp b/clang/lib/AST/Type.cpp
index b70f86ef31442d..1e44c2bd234fbb 100644
--- a/clang/lib/AST/Type.cpp
+++ b/clang/lib/AST/Type.cpp
@@ -3444,6 +3444,8 @@ StringRef BuiltinType::getName(const PrintingPolicy &Policy) const {
     return "<ARC unbridged cast type>";
   case BuiltinFn:
     return "<builtin fn type>";
+  case BuiltinCountedByRef:
+    return "<builtin counted by ref type>";
   case ObjCId:
     return "id";
   case ObjCClass:
@@ -4861,6 +4863,7 @@ bool Type::canHaveNullability(bool ResultIfUnknown) const {
 #define HLSL_INTANGIBLE_TYPE(Name, Id, SingletonId) case BuiltinType::Id:
 #include "clang/Basic/HLSLIntangibleTypes.def"
     case BuiltinType::BuiltinFn:
+    case BuiltinType::BuiltinCountedByRef:
     case BuiltinType::NullPtr:
     case BuiltinType::IncompleteMatrixIdx:
     case BuiltinType::ArraySection:
diff --git a/clang/lib/AST/TypeLoc.cpp b/clang/lib/AST/TypeLoc.cpp
index fbb7fc5cd76902..088bd5c379d8c4 100644
--- a/clang/lib/AST/TypeLoc.cpp
+++ b/clang/lib/AST/TypeLoc.cpp
@@ -433,6 +433,7 @@ TypeSpecifierType BuiltinTypeLoc::getWrittenTypeSpec() const {
 #define HLSL_INTANGIBLE_TYPE(Name, Id, SingletonId) case BuiltinType::Id:
 #include "clang/Basic/HLSLIntangibleTypes.def"
   case BuiltinType::BuiltinFn:
+  case BuiltinType::BuiltinCountedByRef:
   case BuiltinType::IncompleteMatrixIdx:
   case BuiltinType::ArraySection:
   case BuiltinType::OMPArrayShaping:
diff --git a/clang/lib/Sema/SemaChecking.cpp b/clang/lib/Sema/SemaChecking.cpp
index 2d4a7cd287b70d..52b47a995624f4 100644
--- a/clang/lib/Sema/SemaChecking.cpp
+++ b/clang/lib/Sema/SemaChecking.cpp
@@ -755,6 +755,7 @@ static ExprResult BuiltinDumpStruct(Sema &S, CallExpr *TheCall) {
     case BuiltinType::PseudoObject:
     case BuiltinType::UnknownAny:
     case BuiltinType::BuiltinFn:
+    case BuiltinType::BuiltinCountedByRef:
       // This might be a callable.
       break;
 
diff --git a/clang/lib/Sema/SemaDecl.cpp b/clang/lib/Sema/SemaDecl.cpp
index a36ca61a1bef30..d4a90cf262b09f 100644
--- a/clang/lib/Sema/SemaDecl.cpp
+++ b/clang/lib/Sema/SemaDecl.cpp
@@ -14690,6 +14690,17 @@ void Sema::FinalizeDeclaration(Decl *ThisDecl) {
     }
   }
 
+  // The result of __builtin_counted_by_ref cannot be assigned to a variable.
+  // It allows leaking and modification of bounds safety information.
+  if (const auto *CE = dyn_cast_if_present<CallExpr>(VD->getInit())) {
+    const Expr *E = CE->getCallee()->IgnoreParenImpCasts();
+    if (const BuiltinType *PTy = E->getType()->getAsPlaceholderType();
+        PTy && PTy->getKind() == BuiltinType::BuiltinCountedByRef)
+      Diag(E->getExprLoc(),
+           diag::err_builtin_counted_by_ref_cannot_leak_reference)
+          << E->getSourceRange();
+  }
+
   checkAttributesAfterMerging(*this, *VD);
 
   if (VD->isStaticLocal())
diff --git a/clang/lib/Sema/SemaExpr.cpp b/clang/lib/Sema/SemaExpr.cpp
index dcf495b700540f..f1925bae9b91e3 100644
--- a/clang/lib/Sema/SemaExpr.cpp
+++ b/clang/lib/Sema/SemaExpr.cpp
@@ -3382,7 +3382,9 @@ ExprResult Sema::BuildDeclarationNameExpr(
   case Decl::Function: {
     if (unsigned BID = cast<FunctionDecl>(VD)->getBuiltinID()) {
       if (!Context.BuiltinInfo.isDirectlyAddressable(BID)) {
-        type = Context.BuiltinFnTy;
+        type = (BID == Builtin::BI__builtin_counted_by_ref)
+                   ? Context.BuiltinCountedByRefTy
+                   : Context.BuiltinFnTy;
         valueKind = VK_PRValue;
         break;
       }
@@ -4894,6 +4896,18 @@ ExprResult Sema::ActOnArraySubscriptExpr(Scope *S, Expr *base,
     return ExprError();
   }
 
+  // We cannot use __builtin_counted_by_ref in a binary expression. It's
+  // possible to leak the reference and violate bounds security.
+  Expr *E = base->IgnoreParenImpCasts();
+  if (auto *CE = dyn_cast<CallExpr>(E)) {
+    E = CE->getCallee()->IgnoreParenImpCasts();
+    if (const BuiltinType *PTy = E->getType()->getAsPlaceholderType();
+        PTy && PTy->getKind() == BuiltinType::BuiltinCountedByRef) {
+      Diag(E->getExprLoc(), diag::err_builtin_counted_by_ref_invalid_use)
+          << 0 << E->getSourceRange();
+    }
+  }
+
   // Handle any non-overload placeholder types in the base and index
   // expressions.  We can't handle overloads here because the other
   // operand might be an overloadable type, in which case the overload
@@ -6188,6 +6202,7 @@ static bool isPlaceholderToRemoveAsArg(QualType type) {
   // These are always invalid as call arguments and should be reported.
   case BuiltinType::BoundMember:
   case BuiltinType::BuiltinFn:
+  case BuiltinType::BuiltinCountedByRef:
   case BuiltinType::IncompleteMatrixIdx:
   case BuiltinType::ArraySection:
   case BuiltinType::OMPArrayShaping:
@@ -6205,10 +6220,27 @@ bool Sema::CheckArgsForPlaceholders(MultiExprArg args) {
   for (size_t i = 0, e = args.size(); i != e; i++) {
     if (isPlaceholderToRemoveAsArg(args[i]->getType())) {
       ExprResult result = CheckPlaceholderExpr(args[i]);
-      if (result.isInvalid()) hasInvalid = true;
-      else args[i] = result.get();
+      if (result.isInvalid())
+        hasInvalid = true;
+      else
+        args[i] = result.get();
+    }
+
+    // The result of __builtin_counted_by_ref cannot be used as a function
+    // argument. It allows leaking and modification of bounds safety
+    // information.
+    if (const auto *CE = dyn_cast<CallExpr>(args[i])) {
+      const Expr *E = CE->getCallee()->IgnoreParenImpCasts();
+      if (const BuiltinType *PTy = E->getType()->getAsPlaceholderType();
+          PTy && PTy->getKind() == BuiltinType::BuiltinCountedByRef) {
+        hasInvalid = true;
+        Diag(E->getExprLoc(),
+             diag::err_builtin_counted_by_ref_cannot_leak_reference)
+            << E->getSourceRange();
+      }
     }
   }
+
   return hasInvalid;
 }
 
@@ -6770,7 +6802,9 @@ ExprResult Sema::BuildResolvedCallExpr(Expr *Fn, NamedDecl *NDecl,
   ExprResult Result;
   QualType ResultTy;
   if (BuiltinID &&
-      Fn->getType()->isSpecificBuiltinType(BuiltinType::BuiltinFn)) {
+      (Fn->getType()->isSpecificBuiltinType(BuiltinType::BuiltinFn) ||
+       Fn->getType()->isSpecificBuiltinType(
+           BuiltinType::BuiltinCountedByRef))) {
     // Extract the return type from the (builtin) function pointer type.
     // FIXME Several builtins still have setType in
     // Sema::CheckBuiltinFunctionCall. One should review their definitions in
@@ -9196,38 +9230,6 @@ Sema::CheckAssignmentConstraints(QualType LHSType, ExprResult &RHS,
   LHSType = Context.getCanonicalType(LHSType).getUnqualifiedType();
   RHSType = Context.getCanonicalType(RHSType).getUnqualifiedType();
 
-  // __builtin_counted_by_ref cannot be assigned to a variable, used in
-  // function call, or in a return.
-  auto FindBuiltinCountedByRefExpr = [&](Expr *E) -> CallExpr * {
-    struct BuiltinCountedByRefVisitor : DynamicRecursiveASTVisitor {
-      CallExpr *TheCall = nullptr;
-      bool VisitCallExpr(CallExpr *CE) override {
-        if (CE->getBuiltinCallee() == Builtin::BI__builtin_counted_by_ref) {
-          TheCall = CE;
-          return false;
-        }
-        return true;
-      }
-      bool
-      VisitUnaryExprOrTypeTraitExpr(UnaryExprOrTypeTraitExpr *UE) override {
-        // A UnaryExprOrTypeTraitExpr---e.g. sizeof, __alignof, etc.---isn't
-        // the same as a CallExpr, so if we find a __builtin_counted_by_ref()
-        // call in one, ignore it.
-        return false;
-      }
-    } V;
-    V.TraverseStmt(E);
-    return V.TheCall;
-  };
-  static llvm::SmallPtrSet<CallExpr *, 4> Diagnosed;
-  if (auto *CE = FindBuiltinCountedByRefExpr(RHS.get());
-      CE && !Diagnosed.count(CE)) {
-    Diagnosed.insert(CE);
-    Diag(CE->getExprLoc(),
-         diag::err_builtin_counted_by_ref_cannot_leak_reference)
-        << CE->getSourceRange();
-  }
-
   // Common case: no conversion required.
   if (LHSType == RHSType) {
     Kind = CK_NoOp;
@@ -13778,42 +13780,6 @@ QualType Sema::CheckAssignmentOperands(Expr *LHSExpr, ExprResult &RHS,
     ConvTy = CheckAssignmentConstraints(Loc, LHSType, RHSType);
   }
 
-  // __builtin_counted_by_ref can't be used in a binary expression or array
-  // subscript on the LHS.
-  int DiagOption = -1;
-  auto FindInvalidUseOfBoundsSafetyCounter = [&](Expr *E) -> CallExpr * {
-    struct BuiltinCountedByRefVisitor : DynamicRecursiveASTVisitor {
-      CallExpr *CE = nullptr;
-      bool InvalidUse = false;
-      int Option = -1;
-
-      bool VisitCallExpr(CallExpr *E) override {
-        if (E->getBuiltinCallee() == Builtin::BI__builtin_counted_by_ref) {
-          CE = E;
-          return false;
-        }
-        return true;
-      }
-
-      bool VisitArraySubscriptExpr(ArraySubscriptExpr *E) override {
-        InvalidUse = true;
-        Option = 0; // report 'array expression' in diagnostic.
-        return true;
-      }
-      bool VisitBinaryOperator(BinaryOperator *E) override {
-        InvalidUse = true;
-        Option = 1; // report 'binary expression' in diagnostic.
-        return true;
-      }
-    } V;
-    V.TraverseStmt(E);
-    DiagOption = V.Option;
-    return V.InvalidUse ? V.CE : nullptr;
-  };
-  if (auto *CE = FindInvalidUseOfBoundsSafetyCounter(LHSExpr))
-    Diag(CE->getExprLoc(), diag::err_builtin_counted_by_ref_invalid_lhs_use)
-        << DiagOption << CE->getSourceRange();
-
   if (DiagnoseAssignmentResult(ConvTy, Loc, LHSType, RHSType, RHS.get(),
                                AssignmentAction::Assigning))
     return QualType();
@@ -15274,6 +15240,27 @@ ExprResult Sema::ActOnBinOp(Scope *S, SourceLocation TokLoc,
   if (Kind == tok::TokenKind::slash)
     DetectPrecisionLossInComplexDivision(*this, TokLoc, LHSExpr);
 
+  // We cannot use __builtin_counted_by_ref in a binary expression. It's
+  // possible to leak the reference and violate bounds security.
+  auto CheckBuiltinCountedByRefPlaceholder = [&](const Expr *E) {
+    if (const auto *CE = dyn_cast<CallExpr>(E->IgnoreParenImpCasts())) {
+      E = CE->getCallee()->IgnoreParenImpCasts();
+      if (const BuiltinType *PTy = E->getType()->getAsPlaceholderType();
+          PTy && PTy->getKind() == BuiltinType::BuiltinCountedByRef) {
+        if (BinaryOperator::isAssignmentOp(Opc))
+          Diag(E->getExprLoc(),
+               diag::err_builtin_counted_by_ref_cannot_leak_reference)
+              << E->getSourceRange();
+        else
+          Diag(E->getExprLoc(), diag::err_builtin_counted_by_ref_invalid_use)
+              << 1 << E->getSourceRange();
+      }
+    }
+  };
+
+  CheckBuiltinCountedByRefPlaceholder(LHSExpr);
+  CheckBuiltinCountedByRefPlaceholder(RHSExpr);
+
   return BuildBinOp(S, TokLoc, Opc, LHSExpr, RHSExpr);
 }
 
@@ -21074,6 +21061,13 @@ ExprResult Sema::CheckPlaceholderExpr(Expr *E) {
     return ExprError();
   }
 
+  case BuiltinType::BuiltinCountedByRef: {
+    Diag(E->IgnoreParens()->getExprLoc(),
+         diag::err_builtin_counted_by_ref_cannot_leak_reference)
+        << E->IgnoreParens()->getSourceRange();
+    return ExprError();
+  }
+
   case BuiltinType::IncompleteMatrixIdx:
     Diag(cast<MatrixSubscriptExpr>(E->IgnoreParens())
              ->getRowIdx()
diff --git a/clang/lib/Sema/SemaStmt.cpp b/clang/lib/Sema/SemaStmt.cpp
index f3ee5211acdd11..024dc05f6c5636 100644
--- a/clang/lib/Sema/SemaStmt.cpp
+++ b/clang/lib/Sema/SemaStmt.cpp
@@ -3765,6 +3765,16 @@ Sema::ActOnReturnStmt(SourceLocation ReturnLoc, Expr *RetValExp,
         << FSI->getFirstCoroutineStmtKeyword();
   }
 
+  if (const auto *CE = dyn_cast_if_present<CallExpr>(RetVal.get())) {
+    const Expr *E = CE->getCallee()->IgnoreParenImpCasts();
+    if (const BuiltinType *PTy = E->getType()->getAsPlaceholderType();
+        PTy && PTy->getKind() == BuiltinType::BuiltinCountedByRef) {
+      Diag(E->getExprLoc(),
+           diag::err_builtin_counted_by_ref_cannot_leak_reference)
+          << E->getSourceRange();
+    }
+  }
+
   StmtResult R =
       BuildReturnStmt(ReturnLoc, RetVal.get(), /*AllowRecovery=*/true);
   if (R.isInvalid() || ExprEvalContexts.back().isDiscardedStatementContext())
diff --git a/clang/lib/Serialization/ASTCommon.cpp b/clang/lib/Serialization/ASTCommon.cpp
index ec18e84255ca8e..e7f54cf8839c70 100644
--- a/clang/lib/Serialization/ASTCommon.cpp
+++ b/clang/lib/Serialization/ASTCommon.cpp
@@ -274,6 +274,9 @@ serialization::TypeIdxFromBuiltin(const BuiltinType *BT) {
   case BuiltinType::BuiltinFn:
     ID = PREDEF_TYPE_BUILTIN_FN;
     break;
+  case BuiltinType::BuiltinCountedByRef:
+    ID = PREDEF_TYPE_BUILTIN_COUNTED_BY_REF;
+    break;
   case BuiltinType::IncompleteMatrixIdx:
     ID = PREDEF_TYPE_INCOMPLETE_MATRIX_IDX;
     break;
diff --git a/clang/lib/Serialization/ASTReader.cpp b/clang/lib/Serialization/ASTReader.cpp
index ec85fad3389a1c..76dfaa5481e424 100644
--- a/clang/lib/Serialization/ASTReader.cpp
+++ b/clang/lib/Serialization/ASTReader.cpp
@@ -7458,6 +7458,9 @@ QualType ASTReader::GetType(TypeID ID) {
     case PREDEF_TYPE_BUILTIN_FN:
       T = Context.BuiltinFnTy;
       break;
+    case PREDEF_TYPE_BUILTIN_COUNTED_BY_REF:
+      T = Context.BuiltinCountedByRefTy;
+      break;
     case PREDEF_TYPE_INCOMPLETE_MATRIX_IDX:
       T = Context.IncompleteMatrixIdxTy;
       break;
diff --git a/clang/test/Modules/no-external-type-id.cppm b/clang/test/Modules/no-external-type-id.cppm
index d067e574e72e37..162300478632db 100644
--- a/clang/test/Modules/no-external-type-id.cppm
+++ b/clang/test/Modules/no-external-type-id.cppm
@@ -23,7 +23,7 @@ export module b;
 import a;
 export int b();
 
-// CHECK: <DECL_FUNCTION {{.*}} op8=4120
+// CHECK: <DECL_FUNCTION {{.*}} op8=4128
 // CHECK: <TYPE_FUNCTION_PROTO
 
 //--- a.v1.cppm
diff --git a/clang/test/Sema/builtin-counted-by-ref.c b/clang/test/Sema/builtin-counted-by-ref.c
index 5a7ecefcb78976..c21c7093468dde 100644
--- a/clang/test/Sema/builtin-counted-by-ref.c
+++ b/clang/test/Sema/builtin-counted-by-ref.c
@@ -11,62 +11,63 @@ struct fam_struct {
   int array[] __attribute__((counted_by(count)));
 };
 
-void test1(struct fam_struct *ptr, int size, int idx) {
-  size_t size_of = sizeof(__builtin_counted_by_ref(ptr->array)); // ok
-
-  *__builtin_counted_by_ref(ptr->array) = size;             // ok
+void g(char *);
+void h(char);
 
-  {
-      size_t __ignored_assignment;
-      *_Generic(__builtin_counted_by_ref(ptr->array),
-               void *: &__ignored_assignment,
-               default: __builtin_counted_by_ref(ptr->array)) = 42; // ok
-  }
+void test1(struct fam_struct *ptr, int size, int idx) {
+  size_t size_of = sizeof(__builtin_counted_by_ref(ptr->array));    // ok
+  int align_of = __alignof(__builtin_counted_by_ref(ptr->array));   // ok
+  size_t __ignored_assignment;
+
+  *__builtin_counted_by_ref(ptr->array) = size;                     // ok
+  *_Generic(__builtin_counted_by_ref(ptr->array),
+           void *: &__ignored_assignment,
+           default: __builtin_counted_by_ref(ptr->array)) = 42;     // ok
+  h(*__builtin_counted_by_ref(ptr->array));                         // ok
 }
 
 void test2(struct fam_struct *ptr, int idx) {
-  __builtin_counted_by_ref();                               // expected-error {{too few arguments to function call, expected 1, have 0}}
-  __builtin_counted_by_ref(ptr->array, ptr->x, ptr->count); // expected-error {{too many arguments to function call, expected 1, have 3}}
+  __builtin_counted_by_ref();                                       // expected-error {{too few arguments to function call, expected 1, have 0}}
+  __builtin_counted_by_ref(ptr->array, ptr->x, ptr->count);         // expected-error {{too many arguments to function call, expected 1, have 3}}
 }
 
 void test3(struct fam_struct *ptr, int idx) {
-  __builtin_counted_by_ref(&ptr->array[0]);                 // expected-error {{'__builtin_counted_by_ref' argument must reference a flexible array member}}
-  __builtin_counted_by_ref(&ptr->array[idx]);               // expected-error {{'__builtin_counted_by_ref' argument must reference a flexible array member}}
-  __builtin_counted_by_ref(&ptr->array);                    // expected-error {{'__builtin_counted_by_ref' argument must reference a flexible array member}}
-  __builtin_counted_by_re...
[truncated]

@llvmbot
Copy link
Member

llvmbot commented Nov 19, 2024

@llvm/pr-subscribers-lldb

Author: Bill Wendling (bwendling)

Changes

Implement the sema checks with a placeholder. We then check for that placeholder in all of the places we care to emit a diagnostic.

Fixes: #115520


Patch is 29.47 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/116719.diff

17 Files Affected:

  • (modified) clang/include/clang/AST/ASTContext.h (+1)
  • (modified) clang/include/clang/AST/BuiltinTypes.def (+3)
  • (modified) clang/include/clang/Basic/DiagnosticSemaKinds.td (+1-1)
  • (modified) clang/include/clang/Serialization/ASTBitCodes.h (+4-1)
  • (modified) clang/lib/AST/ASTContext.cpp (+4)
  • (modified) clang/lib/AST/NSAPI.cpp (+1)
  • (modified) clang/lib/AST/Type.cpp (+3)
  • (modified) clang/lib/AST/TypeLoc.cpp (+1)
  • (modified) clang/lib/Sema/SemaChecking.cpp (+1)
  • (modified) clang/lib/Sema/SemaDecl.cpp (+11)
  • (modified) clang/lib/Sema/SemaExpr.cpp (+66-72)
  • (modified) clang/lib/Sema/SemaStmt.cpp (+10)
  • (modified) clang/lib/Serialization/ASTCommon.cpp (+3)
  • (modified) clang/lib/Serialization/ASTReader.cpp (+3)
  • (modified) clang/test/Modules/no-external-type-id.cppm (+1-1)
  • (modified) clang/test/Sema/builtin-counted-by-ref.c (+39-38)
  • (modified) lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp (+2)
diff --git a/clang/include/clang/AST/ASTContext.h b/clang/include/clang/AST/ASTContext.h
index 89fcb6789d880a..39cad95d911a33 100644
--- a/clang/include/clang/AST/ASTContext.h
+++ b/clang/include/clang/AST/ASTContext.h
@@ -1184,6 +1184,7 @@ class ASTContext : public RefCountedBase<ASTContext> {
   CanQualType DependentTy, OverloadTy, BoundMemberTy, UnresolvedTemplateTy,
       UnknownAnyTy;
   CanQualType BuiltinFnTy;
+  CanQualType BuiltinCountedByRefTy;
   CanQualType PseudoObjectTy, ARCUnbridgedCastTy;
   CanQualType ObjCBuiltinIdTy, ObjCBuiltinClassTy, ObjCBuiltinSelTy;
   CanQualType ObjCBuiltinBoolTy;
diff --git a/clang/include/clang/AST/BuiltinTypes.def b/clang/include/clang/AST/BuiltinTypes.def
index 444be4311a7431..4eae6962a46de6 100644
--- a/clang/include/clang/AST/BuiltinTypes.def
+++ b/clang/include/clang/AST/BuiltinTypes.def
@@ -314,6 +314,9 @@ PLACEHOLDER_TYPE(UnknownAny, UnknownAnyTy)
 
 PLACEHOLDER_TYPE(BuiltinFn, BuiltinFnTy)
 
+// A placeholder type for __builtin_counted_by_ref.
+PLACEHOLDER_TYPE(BuiltinCountedByRef, BuiltinCountedByRefTy)
+
 // The type of a cast which, in ARC, would normally require a
 // __bridge, but which might be okay depending on the immediate
 // context.
diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index 3caf471d3037f9..37fb44d4bf74cd 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -6673,7 +6673,7 @@ def err_builtin_counted_by_ref_must_be_flex_array_member : Error<
 def err_builtin_counted_by_ref_cannot_leak_reference : Error<
   "value returned by '__builtin_counted_by_ref' cannot be assigned to a "
   "variable, have its address taken, or passed into or returned from a function">;
-def err_builtin_counted_by_ref_invalid_lhs_use : Error<
+def err_builtin_counted_by_ref_invalid_use : Error<
   "value returned by '__builtin_counted_by_ref' cannot be used in "
   "%select{an array subscript|a binary}0 expression">;
 def err_builtin_counted_by_ref_has_side_effects : Error<
diff --git a/clang/include/clang/Serialization/ASTBitCodes.h b/clang/include/clang/Serialization/ASTBitCodes.h
index fd834c14ce790f..f4b71861968e77 100644
--- a/clang/include/clang/Serialization/ASTBitCodes.h
+++ b/clang/include/clang/Serialization/ASTBitCodes.h
@@ -1111,6 +1111,9 @@ enum PredefinedTypeIDs {
   /// \brief The '__ibm128' type
   PREDEF_TYPE_IBM128_ID = 74,
 
+  /// \brief The placeholder type for __builtin_counted_by_ref.
+  PREDEF_TYPE_BUILTIN_COUNTED_BY_REF = 75,
+
 /// OpenCL image types with auto numeration
 #define IMAGE_TYPE(ImgType, Id, SingletonId, Access, Suffix)                   \
   PREDEF_TYPE_##Id##_ID,
@@ -1148,7 +1151,7 @@ enum PredefinedTypeIDs {
 ///
 /// Type IDs for non-predefined types will start at
 /// NUM_PREDEF_TYPE_IDs.
-const unsigned NUM_PREDEF_TYPE_IDS = 513;
+const unsigned NUM_PREDEF_TYPE_IDS = 514;
 
 // Ensure we do not overrun the predefined types we reserved
 // in the enum PredefinedTypeIDs above.
diff --git a/clang/lib/AST/ASTContext.cpp b/clang/lib/AST/ASTContext.cpp
index 14fbadbc35ae5d..06226afaf23aab 100644
--- a/clang/lib/AST/ASTContext.cpp
+++ b/clang/lib/AST/ASTContext.cpp
@@ -1390,6 +1390,10 @@ void ASTContext::InitBuiltinTypes(const TargetInfo &Target,
   if (LangOpts.MatrixTypes)
     InitBuiltinType(IncompleteMatrixIdxTy, BuiltinType::IncompleteMatrixIdx);
 
+  // Placeholder for __builtin_counted_by_ref().
+  if (!LangOpts.CPlusPlus)
+    InitBuiltinType(BuiltinCountedByRefTy, BuiltinType::BuiltinCountedByRef);
+
   // Builtin types for 'id', 'Class', and 'SEL'.
   InitBuiltinType(ObjCBuiltinIdTy, BuiltinType::ObjCId);
   InitBuiltinType(ObjCBuiltinClassTy, BuiltinType::ObjCClass);
diff --git a/clang/lib/AST/NSAPI.cpp b/clang/lib/AST/NSAPI.cpp
index 311fec32bbfa90..6a722b89af6205 100644
--- a/clang/lib/AST/NSAPI.cpp
+++ b/clang/lib/AST/NSAPI.cpp
@@ -466,6 +466,7 @@ NSAPI::getNSNumberFactoryMethodKind(QualType T) const {
   case BuiltinType::Half:
   case BuiltinType::PseudoObject:
   case BuiltinType::BuiltinFn:
+  case BuiltinType::BuiltinCountedByRef:
   case BuiltinType::IncompleteMatrixIdx:
   case BuiltinType::ArraySection:
   case BuiltinType::OMPArrayShaping:
diff --git a/clang/lib/AST/Type.cpp b/clang/lib/AST/Type.cpp
index b70f86ef31442d..1e44c2bd234fbb 100644
--- a/clang/lib/AST/Type.cpp
+++ b/clang/lib/AST/Type.cpp
@@ -3444,6 +3444,8 @@ StringRef BuiltinType::getName(const PrintingPolicy &Policy) const {
     return "<ARC unbridged cast type>";
   case BuiltinFn:
     return "<builtin fn type>";
+  case BuiltinCountedByRef:
+    return "<builtin counted by ref type>";
   case ObjCId:
     return "id";
   case ObjCClass:
@@ -4861,6 +4863,7 @@ bool Type::canHaveNullability(bool ResultIfUnknown) const {
 #define HLSL_INTANGIBLE_TYPE(Name, Id, SingletonId) case BuiltinType::Id:
 #include "clang/Basic/HLSLIntangibleTypes.def"
     case BuiltinType::BuiltinFn:
+    case BuiltinType::BuiltinCountedByRef:
     case BuiltinType::NullPtr:
     case BuiltinType::IncompleteMatrixIdx:
     case BuiltinType::ArraySection:
diff --git a/clang/lib/AST/TypeLoc.cpp b/clang/lib/AST/TypeLoc.cpp
index fbb7fc5cd76902..088bd5c379d8c4 100644
--- a/clang/lib/AST/TypeLoc.cpp
+++ b/clang/lib/AST/TypeLoc.cpp
@@ -433,6 +433,7 @@ TypeSpecifierType BuiltinTypeLoc::getWrittenTypeSpec() const {
 #define HLSL_INTANGIBLE_TYPE(Name, Id, SingletonId) case BuiltinType::Id:
 #include "clang/Basic/HLSLIntangibleTypes.def"
   case BuiltinType::BuiltinFn:
+  case BuiltinType::BuiltinCountedByRef:
   case BuiltinType::IncompleteMatrixIdx:
   case BuiltinType::ArraySection:
   case BuiltinType::OMPArrayShaping:
diff --git a/clang/lib/Sema/SemaChecking.cpp b/clang/lib/Sema/SemaChecking.cpp
index 2d4a7cd287b70d..52b47a995624f4 100644
--- a/clang/lib/Sema/SemaChecking.cpp
+++ b/clang/lib/Sema/SemaChecking.cpp
@@ -755,6 +755,7 @@ static ExprResult BuiltinDumpStruct(Sema &S, CallExpr *TheCall) {
     case BuiltinType::PseudoObject:
     case BuiltinType::UnknownAny:
     case BuiltinType::BuiltinFn:
+    case BuiltinType::BuiltinCountedByRef:
       // This might be a callable.
       break;
 
diff --git a/clang/lib/Sema/SemaDecl.cpp b/clang/lib/Sema/SemaDecl.cpp
index a36ca61a1bef30..d4a90cf262b09f 100644
--- a/clang/lib/Sema/SemaDecl.cpp
+++ b/clang/lib/Sema/SemaDecl.cpp
@@ -14690,6 +14690,17 @@ void Sema::FinalizeDeclaration(Decl *ThisDecl) {
     }
   }
 
+  // The result of __builtin_counted_by_ref cannot be assigned to a variable.
+  // It allows leaking and modification of bounds safety information.
+  if (const auto *CE = dyn_cast_if_present<CallExpr>(VD->getInit())) {
+    const Expr *E = CE->getCallee()->IgnoreParenImpCasts();
+    if (const BuiltinType *PTy = E->getType()->getAsPlaceholderType();
+        PTy && PTy->getKind() == BuiltinType::BuiltinCountedByRef)
+      Diag(E->getExprLoc(),
+           diag::err_builtin_counted_by_ref_cannot_leak_reference)
+          << E->getSourceRange();
+  }
+
   checkAttributesAfterMerging(*this, *VD);
 
   if (VD->isStaticLocal())
diff --git a/clang/lib/Sema/SemaExpr.cpp b/clang/lib/Sema/SemaExpr.cpp
index dcf495b700540f..f1925bae9b91e3 100644
--- a/clang/lib/Sema/SemaExpr.cpp
+++ b/clang/lib/Sema/SemaExpr.cpp
@@ -3382,7 +3382,9 @@ ExprResult Sema::BuildDeclarationNameExpr(
   case Decl::Function: {
     if (unsigned BID = cast<FunctionDecl>(VD)->getBuiltinID()) {
       if (!Context.BuiltinInfo.isDirectlyAddressable(BID)) {
-        type = Context.BuiltinFnTy;
+        type = (BID == Builtin::BI__builtin_counted_by_ref)
+                   ? Context.BuiltinCountedByRefTy
+                   : Context.BuiltinFnTy;
         valueKind = VK_PRValue;
         break;
       }
@@ -4894,6 +4896,18 @@ ExprResult Sema::ActOnArraySubscriptExpr(Scope *S, Expr *base,
     return ExprError();
   }
 
+  // We cannot use __builtin_counted_by_ref in a binary expression. It's
+  // possible to leak the reference and violate bounds security.
+  Expr *E = base->IgnoreParenImpCasts();
+  if (auto *CE = dyn_cast<CallExpr>(E)) {
+    E = CE->getCallee()->IgnoreParenImpCasts();
+    if (const BuiltinType *PTy = E->getType()->getAsPlaceholderType();
+        PTy && PTy->getKind() == BuiltinType::BuiltinCountedByRef) {
+      Diag(E->getExprLoc(), diag::err_builtin_counted_by_ref_invalid_use)
+          << 0 << E->getSourceRange();
+    }
+  }
+
   // Handle any non-overload placeholder types in the base and index
   // expressions.  We can't handle overloads here because the other
   // operand might be an overloadable type, in which case the overload
@@ -6188,6 +6202,7 @@ static bool isPlaceholderToRemoveAsArg(QualType type) {
   // These are always invalid as call arguments and should be reported.
   case BuiltinType::BoundMember:
   case BuiltinType::BuiltinFn:
+  case BuiltinType::BuiltinCountedByRef:
   case BuiltinType::IncompleteMatrixIdx:
   case BuiltinType::ArraySection:
   case BuiltinType::OMPArrayShaping:
@@ -6205,10 +6220,27 @@ bool Sema::CheckArgsForPlaceholders(MultiExprArg args) {
   for (size_t i = 0, e = args.size(); i != e; i++) {
     if (isPlaceholderToRemoveAsArg(args[i]->getType())) {
       ExprResult result = CheckPlaceholderExpr(args[i]);
-      if (result.isInvalid()) hasInvalid = true;
-      else args[i] = result.get();
+      if (result.isInvalid())
+        hasInvalid = true;
+      else
+        args[i] = result.get();
+    }
+
+    // The result of __builtin_counted_by_ref cannot be used as a function
+    // argument. It allows leaking and modification of bounds safety
+    // information.
+    if (const auto *CE = dyn_cast<CallExpr>(args[i])) {
+      const Expr *E = CE->getCallee()->IgnoreParenImpCasts();
+      if (const BuiltinType *PTy = E->getType()->getAsPlaceholderType();
+          PTy && PTy->getKind() == BuiltinType::BuiltinCountedByRef) {
+        hasInvalid = true;
+        Diag(E->getExprLoc(),
+             diag::err_builtin_counted_by_ref_cannot_leak_reference)
+            << E->getSourceRange();
+      }
     }
   }
+
   return hasInvalid;
 }
 
@@ -6770,7 +6802,9 @@ ExprResult Sema::BuildResolvedCallExpr(Expr *Fn, NamedDecl *NDecl,
   ExprResult Result;
   QualType ResultTy;
   if (BuiltinID &&
-      Fn->getType()->isSpecificBuiltinType(BuiltinType::BuiltinFn)) {
+      (Fn->getType()->isSpecificBuiltinType(BuiltinType::BuiltinFn) ||
+       Fn->getType()->isSpecificBuiltinType(
+           BuiltinType::BuiltinCountedByRef))) {
     // Extract the return type from the (builtin) function pointer type.
     // FIXME Several builtins still have setType in
     // Sema::CheckBuiltinFunctionCall. One should review their definitions in
@@ -9196,38 +9230,6 @@ Sema::CheckAssignmentConstraints(QualType LHSType, ExprResult &RHS,
   LHSType = Context.getCanonicalType(LHSType).getUnqualifiedType();
   RHSType = Context.getCanonicalType(RHSType).getUnqualifiedType();
 
-  // __builtin_counted_by_ref cannot be assigned to a variable, used in
-  // function call, or in a return.
-  auto FindBuiltinCountedByRefExpr = [&](Expr *E) -> CallExpr * {
-    struct BuiltinCountedByRefVisitor : DynamicRecursiveASTVisitor {
-      CallExpr *TheCall = nullptr;
-      bool VisitCallExpr(CallExpr *CE) override {
-        if (CE->getBuiltinCallee() == Builtin::BI__builtin_counted_by_ref) {
-          TheCall = CE;
-          return false;
-        }
-        return true;
-      }
-      bool
-      VisitUnaryExprOrTypeTraitExpr(UnaryExprOrTypeTraitExpr *UE) override {
-        // A UnaryExprOrTypeTraitExpr---e.g. sizeof, __alignof, etc.---isn't
-        // the same as a CallExpr, so if we find a __builtin_counted_by_ref()
-        // call in one, ignore it.
-        return false;
-      }
-    } V;
-    V.TraverseStmt(E);
-    return V.TheCall;
-  };
-  static llvm::SmallPtrSet<CallExpr *, 4> Diagnosed;
-  if (auto *CE = FindBuiltinCountedByRefExpr(RHS.get());
-      CE && !Diagnosed.count(CE)) {
-    Diagnosed.insert(CE);
-    Diag(CE->getExprLoc(),
-         diag::err_builtin_counted_by_ref_cannot_leak_reference)
-        << CE->getSourceRange();
-  }
-
   // Common case: no conversion required.
   if (LHSType == RHSType) {
     Kind = CK_NoOp;
@@ -13778,42 +13780,6 @@ QualType Sema::CheckAssignmentOperands(Expr *LHSExpr, ExprResult &RHS,
     ConvTy = CheckAssignmentConstraints(Loc, LHSType, RHSType);
   }
 
-  // __builtin_counted_by_ref can't be used in a binary expression or array
-  // subscript on the LHS.
-  int DiagOption = -1;
-  auto FindInvalidUseOfBoundsSafetyCounter = [&](Expr *E) -> CallExpr * {
-    struct BuiltinCountedByRefVisitor : DynamicRecursiveASTVisitor {
-      CallExpr *CE = nullptr;
-      bool InvalidUse = false;
-      int Option = -1;
-
-      bool VisitCallExpr(CallExpr *E) override {
-        if (E->getBuiltinCallee() == Builtin::BI__builtin_counted_by_ref) {
-          CE = E;
-          return false;
-        }
-        return true;
-      }
-
-      bool VisitArraySubscriptExpr(ArraySubscriptExpr *E) override {
-        InvalidUse = true;
-        Option = 0; // report 'array expression' in diagnostic.
-        return true;
-      }
-      bool VisitBinaryOperator(BinaryOperator *E) override {
-        InvalidUse = true;
-        Option = 1; // report 'binary expression' in diagnostic.
-        return true;
-      }
-    } V;
-    V.TraverseStmt(E);
-    DiagOption = V.Option;
-    return V.InvalidUse ? V.CE : nullptr;
-  };
-  if (auto *CE = FindInvalidUseOfBoundsSafetyCounter(LHSExpr))
-    Diag(CE->getExprLoc(), diag::err_builtin_counted_by_ref_invalid_lhs_use)
-        << DiagOption << CE->getSourceRange();
-
   if (DiagnoseAssignmentResult(ConvTy, Loc, LHSType, RHSType, RHS.get(),
                                AssignmentAction::Assigning))
     return QualType();
@@ -15274,6 +15240,27 @@ ExprResult Sema::ActOnBinOp(Scope *S, SourceLocation TokLoc,
   if (Kind == tok::TokenKind::slash)
     DetectPrecisionLossInComplexDivision(*this, TokLoc, LHSExpr);
 
+  // We cannot use __builtin_counted_by_ref in a binary expression. It's
+  // possible to leak the reference and violate bounds security.
+  auto CheckBuiltinCountedByRefPlaceholder = [&](const Expr *E) {
+    if (const auto *CE = dyn_cast<CallExpr>(E->IgnoreParenImpCasts())) {
+      E = CE->getCallee()->IgnoreParenImpCasts();
+      if (const BuiltinType *PTy = E->getType()->getAsPlaceholderType();
+          PTy && PTy->getKind() == BuiltinType::BuiltinCountedByRef) {
+        if (BinaryOperator::isAssignmentOp(Opc))
+          Diag(E->getExprLoc(),
+               diag::err_builtin_counted_by_ref_cannot_leak_reference)
+              << E->getSourceRange();
+        else
+          Diag(E->getExprLoc(), diag::err_builtin_counted_by_ref_invalid_use)
+              << 1 << E->getSourceRange();
+      }
+    }
+  };
+
+  CheckBuiltinCountedByRefPlaceholder(LHSExpr);
+  CheckBuiltinCountedByRefPlaceholder(RHSExpr);
+
   return BuildBinOp(S, TokLoc, Opc, LHSExpr, RHSExpr);
 }
 
@@ -21074,6 +21061,13 @@ ExprResult Sema::CheckPlaceholderExpr(Expr *E) {
     return ExprError();
   }
 
+  case BuiltinType::BuiltinCountedByRef: {
+    Diag(E->IgnoreParens()->getExprLoc(),
+         diag::err_builtin_counted_by_ref_cannot_leak_reference)
+        << E->IgnoreParens()->getSourceRange();
+    return ExprError();
+  }
+
   case BuiltinType::IncompleteMatrixIdx:
     Diag(cast<MatrixSubscriptExpr>(E->IgnoreParens())
              ->getRowIdx()
diff --git a/clang/lib/Sema/SemaStmt.cpp b/clang/lib/Sema/SemaStmt.cpp
index f3ee5211acdd11..024dc05f6c5636 100644
--- a/clang/lib/Sema/SemaStmt.cpp
+++ b/clang/lib/Sema/SemaStmt.cpp
@@ -3765,6 +3765,16 @@ Sema::ActOnReturnStmt(SourceLocation ReturnLoc, Expr *RetValExp,
         << FSI->getFirstCoroutineStmtKeyword();
   }
 
+  if (const auto *CE = dyn_cast_if_present<CallExpr>(RetVal.get())) {
+    const Expr *E = CE->getCallee()->IgnoreParenImpCasts();
+    if (const BuiltinType *PTy = E->getType()->getAsPlaceholderType();
+        PTy && PTy->getKind() == BuiltinType::BuiltinCountedByRef) {
+      Diag(E->getExprLoc(),
+           diag::err_builtin_counted_by_ref_cannot_leak_reference)
+          << E->getSourceRange();
+    }
+  }
+
   StmtResult R =
       BuildReturnStmt(ReturnLoc, RetVal.get(), /*AllowRecovery=*/true);
   if (R.isInvalid() || ExprEvalContexts.back().isDiscardedStatementContext())
diff --git a/clang/lib/Serialization/ASTCommon.cpp b/clang/lib/Serialization/ASTCommon.cpp
index ec18e84255ca8e..e7f54cf8839c70 100644
--- a/clang/lib/Serialization/ASTCommon.cpp
+++ b/clang/lib/Serialization/ASTCommon.cpp
@@ -274,6 +274,9 @@ serialization::TypeIdxFromBuiltin(const BuiltinType *BT) {
   case BuiltinType::BuiltinFn:
     ID = PREDEF_TYPE_BUILTIN_FN;
     break;
+  case BuiltinType::BuiltinCountedByRef:
+    ID = PREDEF_TYPE_BUILTIN_COUNTED_BY_REF;
+    break;
   case BuiltinType::IncompleteMatrixIdx:
     ID = PREDEF_TYPE_INCOMPLETE_MATRIX_IDX;
     break;
diff --git a/clang/lib/Serialization/ASTReader.cpp b/clang/lib/Serialization/ASTReader.cpp
index ec85fad3389a1c..76dfaa5481e424 100644
--- a/clang/lib/Serialization/ASTReader.cpp
+++ b/clang/lib/Serialization/ASTReader.cpp
@@ -7458,6 +7458,9 @@ QualType ASTReader::GetType(TypeID ID) {
     case PREDEF_TYPE_BUILTIN_FN:
       T = Context.BuiltinFnTy;
       break;
+    case PREDEF_TYPE_BUILTIN_COUNTED_BY_REF:
+      T = Context.BuiltinCountedByRefTy;
+      break;
     case PREDEF_TYPE_INCOMPLETE_MATRIX_IDX:
       T = Context.IncompleteMatrixIdxTy;
       break;
diff --git a/clang/test/Modules/no-external-type-id.cppm b/clang/test/Modules/no-external-type-id.cppm
index d067e574e72e37..162300478632db 100644
--- a/clang/test/Modules/no-external-type-id.cppm
+++ b/clang/test/Modules/no-external-type-id.cppm
@@ -23,7 +23,7 @@ export module b;
 import a;
 export int b();
 
-// CHECK: <DECL_FUNCTION {{.*}} op8=4120
+// CHECK: <DECL_FUNCTION {{.*}} op8=4128
 // CHECK: <TYPE_FUNCTION_PROTO
 
 //--- a.v1.cppm
diff --git a/clang/test/Sema/builtin-counted-by-ref.c b/clang/test/Sema/builtin-counted-by-ref.c
index 5a7ecefcb78976..c21c7093468dde 100644
--- a/clang/test/Sema/builtin-counted-by-ref.c
+++ b/clang/test/Sema/builtin-counted-by-ref.c
@@ -11,62 +11,63 @@ struct fam_struct {
   int array[] __attribute__((counted_by(count)));
 };
 
-void test1(struct fam_struct *ptr, int size, int idx) {
-  size_t size_of = sizeof(__builtin_counted_by_ref(ptr->array)); // ok
-
-  *__builtin_counted_by_ref(ptr->array) = size;             // ok
+void g(char *);
+void h(char);
 
-  {
-      size_t __ignored_assignment;
-      *_Generic(__builtin_counted_by_ref(ptr->array),
-               void *: &__ignored_assignment,
-               default: __builtin_counted_by_ref(ptr->array)) = 42; // ok
-  }
+void test1(struct fam_struct *ptr, int size, int idx) {
+  size_t size_of = sizeof(__builtin_counted_by_ref(ptr->array));    // ok
+  int align_of = __alignof(__builtin_counted_by_ref(ptr->array));   // ok
+  size_t __ignored_assignment;
+
+  *__builtin_counted_by_ref(ptr->array) = size;                     // ok
+  *_Generic(__builtin_counted_by_ref(ptr->array),
+           void *: &__ignored_assignment,
+           default: __builtin_counted_by_ref(ptr->array)) = 42;     // ok
+  h(*__builtin_counted_by_ref(ptr->array));                         // ok
 }
 
 void test2(struct fam_struct *ptr, int idx) {
-  __builtin_counted_by_ref();                               // expected-error {{too few arguments to function call, expected 1, have 0}}
-  __builtin_counted_by_ref(ptr->array, ptr->x, ptr->count); // expected-error {{too many arguments to function call, expected 1, have 3}}
+  __builtin_counted_by_ref();                                       // expected-error {{too few arguments to function call, expected 1, have 0}}
+  __builtin_counted_by_ref(ptr->array, ptr->x, ptr->count);         // expected-error {{too many arguments to function call, expected 1, have 3}}
 }
 
 void test3(struct fam_struct *ptr, int idx) {
-  __builtin_counted_by_ref(&ptr->array[0]);                 // expected-error {{'__builtin_counted_by_ref' argument must reference a flexible array member}}
-  __builtin_counted_by_ref(&ptr->array[idx]);               // expected-error {{'__builtin_counted_by_ref' argument must reference a flexible array member}}
-  __builtin_counted_by_ref(&ptr->array);                    // expected-error {{'__builtin_counted_by_ref' argument must reference a flexible array member}}
-  __builtin_counted_by_re...
[truncated]

Copy link
Member

@Sirraide Sirraide left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I don’t think this is the right way of going about it—copy-pasting the check across 6 or so different places seems like it’s missing the point of using a placeholder in the first place.

Maybe I didn’t describe this accurately enough, but the idea was that the call to the builtin, not the builtin itself, should get placeholder type. Then, you need to update CheckPlaceholderExpr only (and maybe the initialisation code if it’s required there; I don’t remember if that ever checks for placeholders off the top of my head because initialisers are weird, but I hope it does), and then update any places where you want to allow this as a subexpression to check for an expression w/ that placeholder type (and change the placeholder type to be whatever the actual type of the expression is supposed to be) before it calls CheckPlaceholderExpr/CheckArgsForPlaceholders.

... that said, that was my (and I think Aaron’s) idea—if you’ve discovered a reason why that doesn’t work, then please let us know so we can think of a way around that. ;Þ

@bwendling
Copy link
Collaborator Author

Hmm, I don’t think this is the right way of going about it—copy-pasting the check across 6 or so different places seems like it’s missing the point of using a placeholder in the first place.

Could you point to a place in the code where it creates a placeholder? I'm unable to find any others that what I modified in SemaExpr.cpp. Many of the uses you mentioned would be invalid to perform on other __builtin functions didn't produce an error when I tried them out (except for using the AddrOf operator).

Maybe I didn’t describe this accurately enough, but the idea was that the call to the builtin, not the builtin itself, should get placeholder type. Then, you need to update CheckPlaceholderExpr only (and maybe the initialisation code if it’s required there; I don’t remember if that ever checks for placeholders off the top of my head because initialisers are weird, but I hope it does), and then update any places where you want to allow this as a subexpression to check for an expression w/ that placeholder type (and change the placeholder type to be whatever the actual type of the expression is supposed to be) before it calls CheckPlaceholderExpr/CheckArgsForPlaceholders.

Given that I did this by checking the CallExpr, perhaps we should just abandon using the placeholder scheme and go with what I have here, but without the placeholder stuff.

@bwendling
Copy link
Collaborator Author

I removed the Placeholder code. It's a much smaller patch and has the better timing profile. PTAL.

Copy link
Member

@Sirraide Sirraide left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main thing I’m concerned about here is that I feel like there ought to be a better way of doing this than checking for and disallowing it in every place where we can have a subexpression in C.

CC @AaronBallman for opinions on this.

Also, if we’re doing it this way, then the check should definitely be factored out into a function in Sema.

@Sirraide
Copy link
Member

Could you point to a place in the code where it creates a placeholder?

I mean, e.g. CheckPointerToMemberOperands() can return BoundMemberTy as the type of a .* expression, and CreateBuiltinMatrixSubscriptExpr() creates a MatrixSubscriptExpr with type IncompleteMatrixIdxTy if that’s what you’re asking.

@bwendling
Copy link
Collaborator Author

The main thing I’m concerned about here is that I feel like there ought to be a better way of doing this than checking for and disallowing it in every place where we can have a subexpression in C.

Yeah, but we don't have such a method, partially due to the languages Clang supports. How is this any different from the vast array of other features that are checked in Sema?

Also, if we’re doing it this way, then the check should definitely be factored out into a function in Sema.

Okay.

@bwendling
Copy link
Collaborator Author

Could you point to a place in the code where it creates a placeholder?

I mean, e.g. CheckPointerToMemberOperands() can return BoundMemberTy as the type of a .* expression, and CreateBuiltinMatrixSubscriptExpr() creates a MatrixSubscriptExpr with type IncompleteMatrixIdxTy if that’s what you’re asking.

CheckPointerToMemberOperands only returns the placeholder type. The matrix one isn't a CallExpr, so it doesn't help me much. Anyway, this is irrelevant now as I'm going another way.

@bwendling bwendling requested a review from Endilll as a code owner November 20, 2024 21:40
@bwendling bwendling requested a review from Sirraide November 20, 2024 22:13
Copy link
Member

@Sirraide Sirraide left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this might honestly be fine the way it is now, but I’d like for e.g. @AaronBallman to take a look at this too before approving it since he’s more familiar w/ C than me.

@bwendling
Copy link
Collaborator Author

I think this might honestly be fine the way it is now

Yeah. It's going kind of the opposite way from what you suggested with the placeholder, just instead of allowing for different uses at various places in Sema we disallow without using the placeholder.

@bwendling
Copy link
Collaborator Author

I image @AaronBallman is on vacation this week. @Sirraide would you be willing to LGTM this so that I can move it along? It's certainly better than what we have and I can address any concerns Aaron has afterwards.

@Sirraide
Copy link
Member

I image @AaronBallman is on vacation this week. @Sirraide would you be willing to LGTM this so that I can move it along?

I mean, this is stricly a performance issue, so I’m not sure why we’d want to merge it this urgently...

@bwendling
Copy link
Collaborator Author

Yeah, but I'd like to get this off my plate. :-)

Copy link
Member

@Sirraide Sirraide left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes seem fine to me now; not sure we’re handling this in all places where it needs to be handled though. I’d say maybe wait a day or two before merging in case anyone else wants to comment on that.

@Endilll
Copy link
Contributor

Endilll commented Nov 25, 2024

For the record, Aaron is not on vacation. So if you need his sign-off, you should wait.

@bwendling
Copy link
Collaborator Author

Ah, okay. Thanks.

Copy link
Collaborator

@AaronBallman AaronBallman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a branch up for this on https://llvm-compile-time-tracker.com/ so that we've verified that this actually improves performance?

Comment on lines 14695 to 14698
if (IsBuiltinCountedByRef(VD->getInit()))
Diag(VD->getInit()->getExprLoc(),
diag::err_builtin_counted_by_ref_cannot_leak_reference)
<< VD->getInit()->getSourceRange();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we split this off into a helper function like bool CheckInvalidBuiltinCountedByRef(const Expr *E); ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was going to comment on that too, but ended up forgetting about it; it’s used in like 3–4 places, so I personally would prefer making this a helper function.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I agree. The diagnostic isn't the same in all places. And one of them takes arguments not available from the Expr. I would say that if we were adding several more diagnostics then a separate function would make sense. But for four, it's probably a bit more trouble than it's worth.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is the same in every case. Each time, the logic is:

if (IsBuiltinCountedByRef(SomeExpr))
  Diag(SomeExpr->getExprLoc(), diag::err_builtin_counted_by_ref_cannot_leak_reference) << SomeExpr->getSourceRange();

This would require changing some later to this equivalent:

  auto CheckBuiltinCountedByRef = [&](const Expr *E) {
    if (BinaryOperator::isAssignmentOp(Opc))
      CheckInvalidBuiltinCountedByRef(E);
    else if (IsBuiltinCountedByRef(E))
        Diag(E->getExprLoc(), diag::err_builtin_counted_by_ref_invalid_use)
            << 1 << E->getSourceRange();
  };

@nikic
Copy link
Contributor

nikic commented Nov 26, 2024

Is there a branch up for this on https://llvm-compile-time-tracker.com/ so that we've verified that this actually improves performance?

Compile-time: https://llvm-compile-time-tracker.com/compare.php?from=105b7803ea22823a2fca2a82ee843d0884e9cbf3&to=dcb3e2b6b4333556fbfa3d7a5a45c145972b466f&stat=instructions:u Yes, this does recover the regression.

Copy link
Collaborator

@AaronBallman AaronBallman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM modulo the request for a helper function (that two reviewers asked for the same thing is a pretty reasonable indication that we should probably have the helper, but if you feel strongly about it, I won't block the PR over it).

@bwendling
Copy link
Collaborator Author

Fine.

@bwendling bwendling merged commit b185b85 into llvm:main Nov 27, 2024
5 of 7 checks passed
@bwendling bwendling deleted the counted-by-ref-performance branch November 27, 2024 17:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:modules C++20 modules and Clang Header Modules clang Clang issues not falling into any other category lldb
Projects
None yet
Development

Successfully merging this pull request may close these issues.

compile time regression introduced by Sema checking for __builtin_counted_by_ref
7 participants