Skip to content

Respect the [[clang::unsafe_buffer_usage]] attribute for field and constructor initializers #91991

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Oct 30, 2024

Conversation

danakj
Copy link
Contributor

@danakj danakj commented May 13, 2024

This is built on top of #91777, and includes the commits there for now.

Issue #80482

@danakj danakj requested a review from haoNoQ May 13, 2024 16:25
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:analysis labels May 13, 2024
@llvmbot
Copy link
Member

llvmbot commented May 13, 2024

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-analysis

Author: Dana Jansens (danakj)

Changes

This is built on top of #91777, and includes the commits there for now.


Patch is 28.06 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/91991.diff

5 Files Affected:

  • (modified) clang/include/clang/Analysis/Analyses/UnsafeBufferUsage.h (+5)
  • (modified) clang/include/clang/Analysis/Analyses/UnsafeBufferUsageGadgets.def (+1)
  • (modified) clang/lib/Analysis/UnsafeBufferUsage.cpp (+183-91)
  • (modified) clang/lib/Sema/AnalysisBasedWarnings.cpp (+26-4)
  • (modified) clang/test/SemaCXX/warn-unsafe-buffer-usage-function-attr.cpp (+105)
diff --git a/clang/include/clang/Analysis/Analyses/UnsafeBufferUsage.h b/clang/include/clang/Analysis/Analyses/UnsafeBufferUsage.h
index 5d16dcc824c50..228b4ae1e3e11 100644
--- a/clang/include/clang/Analysis/Analyses/UnsafeBufferUsage.h
+++ b/clang/include/clang/Analysis/Analyses/UnsafeBufferUsage.h
@@ -106,6 +106,11 @@ class UnsafeBufferUsageHandler {
   virtual void handleUnsafeOperation(const Stmt *Operation,
                                      bool IsRelatedToDecl, ASTContext &Ctx) = 0;
 
+  /// Invoked when an unsafe operation with a std container is found.
+  virtual void handleUnsafeOperationInContainer(const Stmt *Operation,
+                                                bool IsRelatedToDecl,
+                                                ASTContext &Ctx) = 0;
+
   /// Invoked when a fix is suggested against a variable. This function groups
   /// all variables that must be fixed together (i.e their types must be changed
   /// to the same target type to prevent type mismatches) into a single fixit.
diff --git a/clang/include/clang/Analysis/Analyses/UnsafeBufferUsageGadgets.def b/clang/include/clang/Analysis/Analyses/UnsafeBufferUsageGadgets.def
index 3273c642eed51..242ad763ba62b 100644
--- a/clang/include/clang/Analysis/Analyses/UnsafeBufferUsageGadgets.def
+++ b/clang/include/clang/Analysis/Analyses/UnsafeBufferUsageGadgets.def
@@ -36,6 +36,7 @@ WARNING_GADGET(Decrement)
 WARNING_GADGET(ArraySubscript)
 WARNING_GADGET(PointerArithmetic)
 WARNING_GADGET(UnsafeBufferUsageAttr)
+WARNING_GADGET(UnsafeBufferUsageCtorAttr)
 WARNING_GADGET(DataInvocation)
 WARNING_CONTAINER_GADGET(SpanTwoParamConstructor) // Uses of `std::span(arg0, arg1)`
 FIXABLE_GADGET(ULCArraySubscript)          // `DRE[any]` in an Unspecified Lvalue Context
diff --git a/clang/lib/Analysis/UnsafeBufferUsage.cpp b/clang/lib/Analysis/UnsafeBufferUsage.cpp
index c42e70d5b95ac..c7b4fcdce2132 100644
--- a/clang/lib/Analysis/UnsafeBufferUsage.cpp
+++ b/clang/lib/Analysis/UnsafeBufferUsage.cpp
@@ -492,7 +492,7 @@ class Gadget {
 #endif
 
   virtual bool isWarningGadget() const = 0;
-  virtual const Stmt *getBaseStmt() const = 0;
+  virtual SourceLocation getSourceLoc() const = 0;
 
   /// Returns the list of pointer-type variables on which this gadget performs
   /// its operation. Typically, there's only one variable. This isn't a list
@@ -513,6 +513,10 @@ class WarningGadget : public Gadget {
 
   static bool classof(const Gadget *G) { return G->isWarningGadget(); }
   bool isWarningGadget() const final { return true; }
+
+  virtual void handleUnsafeOperation(UnsafeBufferUsageHandler &Handler,
+                                     bool IsRelatedToDecl,
+                                     ASTContext &Ctx) const = 0;
 };
 
 /// Fixable gadgets correspond to code patterns that aren't always unsafe but
@@ -572,7 +576,12 @@ class IncrementGadget : public WarningGadget {
             .bind(OpTag));
   }
 
-  const UnaryOperator *getBaseStmt() const override { return Op; }
+  void handleUnsafeOperation(UnsafeBufferUsageHandler &Handler,
+                             bool IsRelatedToDecl,
+                             ASTContext &Ctx) const override {
+    Handler.handleUnsafeOperation(Op, IsRelatedToDecl, Ctx);
+  }
+  SourceLocation getSourceLoc() const override { return Op->getBeginLoc(); }
 
   DeclUseList getClaimedVarUseSites() const override {
     SmallVector<const DeclRefExpr *, 2> Uses;
@@ -607,7 +616,12 @@ class DecrementGadget : public WarningGadget {
             .bind(OpTag));
   }
 
-  const UnaryOperator *getBaseStmt() const override { return Op; }
+  void handleUnsafeOperation(UnsafeBufferUsageHandler &Handler,
+                             bool IsRelatedToDecl,
+                             ASTContext &Ctx) const override {
+    Handler.handleUnsafeOperation(Op, IsRelatedToDecl, Ctx);
+  }
+  SourceLocation getSourceLoc() const override { return Op->getBeginLoc(); }
 
   DeclUseList getClaimedVarUseSites() const override {
     if (const auto *DRE =
@@ -648,7 +662,12 @@ class ArraySubscriptGadget : public WarningGadget {
     // clang-format on
   }
 
-  const ArraySubscriptExpr *getBaseStmt() const override { return ASE; }
+  void handleUnsafeOperation(UnsafeBufferUsageHandler &Handler,
+                             bool IsRelatedToDecl,
+                             ASTContext &Ctx) const override {
+    Handler.handleUnsafeOperation(ASE, IsRelatedToDecl, Ctx);
+  }
+  SourceLocation getSourceLoc() const override { return ASE->getBeginLoc(); }
 
   DeclUseList getClaimedVarUseSites() const override {
     if (const auto *DRE =
@@ -696,7 +715,12 @@ class PointerArithmeticGadget : public WarningGadget {
                     .bind(PointerArithmeticTag));
   }
 
-  const Stmt *getBaseStmt() const override { return PA; }
+  void handleUnsafeOperation(UnsafeBufferUsageHandler &Handler,
+                             bool IsRelatedToDecl,
+                             ASTContext &Ctx) const override {
+    Handler.handleUnsafeOperation(PA, IsRelatedToDecl, Ctx);
+  }
+  SourceLocation getSourceLoc() const override { return PA->getBeginLoc(); }
 
   DeclUseList getClaimedVarUseSites() const override {
     if (const auto *DRE = dyn_cast<DeclRefExpr>(Ptr->IgnoreParenImpCasts())) {
@@ -734,7 +758,12 @@ class SpanTwoParamConstructorGadget : public WarningGadget {
                     .bind(SpanTwoParamConstructorTag));
   }
 
-  const Stmt *getBaseStmt() const override { return Ctor; }
+  void handleUnsafeOperation(UnsafeBufferUsageHandler &Handler,
+                             bool IsRelatedToDecl,
+                             ASTContext &Ctx) const override {
+    Handler.handleUnsafeOperationInContainer(Ctor, IsRelatedToDecl, Ctx);
+  }
+  SourceLocation getSourceLoc() const override { return Ctor->getBeginLoc(); }
 
   DeclUseList getClaimedVarUseSites() const override {
     // If the constructor call is of the form `std::span{var, n}`, `var` is
@@ -780,11 +809,8 @@ class PointerInitGadget : public FixableGadget {
 
   virtual std::optional<FixItList>
   getFixits(const FixitStrategy &S) const override;
-
-  virtual const Stmt *getBaseStmt() const override {
-    // FIXME: This needs to be the entire DeclStmt, assuming that this method
-    // makes sense at all on a FixableGadget.
-    return PtrInitRHS;
+  SourceLocation getSourceLoc() const override {
+    return PtrInitRHS->getBeginLoc();
   }
 
   virtual DeclUseList getClaimedVarUseSites() const override {
@@ -833,12 +859,7 @@ class PtrToPtrAssignmentGadget : public FixableGadget {
 
   virtual std::optional<FixItList>
   getFixits(const FixitStrategy &S) const override;
-
-  virtual const Stmt *getBaseStmt() const override {
-    // FIXME: This should be the binary operator, assuming that this method
-    // makes sense at all on a FixableGadget.
-    return PtrLHS;
-  }
+  SourceLocation getSourceLoc() const override { return PtrLHS->getBeginLoc(); }
 
   virtual DeclUseList getClaimedVarUseSites() const override {
     return DeclUseList{PtrLHS, PtrRHS};
@@ -888,12 +909,7 @@ class CArrayToPtrAssignmentGadget : public FixableGadget {
 
   virtual std::optional<FixItList>
   getFixits(const FixitStrategy &S) const override;
-
-  virtual const Stmt *getBaseStmt() const override {
-    // FIXME: This should be the binary operator, assuming that this method
-    // makes sense at all on a FixableGadget.
-    return PtrLHS;
-  }
+  SourceLocation getSourceLoc() const override { return PtrLHS->getBeginLoc(); }
 
   virtual DeclUseList getClaimedVarUseSites() const override {
     return DeclUseList{PtrLHS, PtrRHS};
@@ -921,10 +937,55 @@ class UnsafeBufferUsageAttrGadget : public WarningGadget {
   }
 
   static Matcher matcher() {
-    return stmt(callExpr(callee(functionDecl(hasAttr(attr::UnsafeBufferUsage))))
-                    .bind(OpTag));
+    auto HasUnsafeFnDecl =
+        callee(functionDecl(hasAttr(attr::UnsafeBufferUsage)));
+    return stmt(callExpr(HasUnsafeFnDecl).bind(OpTag));
+  }
+
+  void handleUnsafeOperation(UnsafeBufferUsageHandler &Handler,
+                             bool IsRelatedToDecl,
+                             ASTContext &Ctx) const override {
+    Handler.handleUnsafeOperation(Op, IsRelatedToDecl, Ctx);
+  }
+  SourceLocation getSourceLoc() const override { return Op->getBeginLoc(); }
+
+  DeclUseList getClaimedVarUseSites() const override { return {}; }
+};
+
+/// A call of a constructor that performs unchecked buffer operations
+/// over one of its pointer parameters, or constructs a class object that will
+/// perform buffer operations that depend on the correctness of the parameters.
+class UnsafeBufferUsageCtorAttrGadget : public WarningGadget {
+  constexpr static const char *const OpTag = "cxx_construct_expr";
+  const CXXConstructExpr *Op;
+
+public:
+  UnsafeBufferUsageCtorAttrGadget(const MatchFinder::MatchResult &Result)
+      : WarningGadget(Kind::UnsafeBufferUsageCtorAttr),
+        Op(Result.Nodes.getNodeAs<CXXConstructExpr>(OpTag)) {}
+
+  static bool classof(const Gadget *G) {
+    return G->getKind() == Kind::UnsafeBufferUsageCtorAttr;
+  }
+
+  static Matcher matcher() {
+    auto HasUnsafeCtorDecl =
+        hasDeclaration(cxxConstructorDecl(hasAttr(attr::UnsafeBufferUsage)));
+    // std::span(ptr, size) ctor is handled by SpanTwoParamConstructorGadget.
+    auto HasTwoParamSpanCtorDecl = hasDeclaration(
+        cxxConstructorDecl(hasDeclContext(isInStdNamespace()), hasName("span"),
+                           parameterCountIs(2)));
+    return stmt(
+        cxxConstructExpr(HasUnsafeCtorDecl, unless(HasTwoParamSpanCtorDecl))
+            .bind(OpTag));
+  }
+
+  void handleUnsafeOperation(UnsafeBufferUsageHandler &Handler,
+                             bool IsRelatedToDecl,
+                             ASTContext &Ctx) const override {
+    Handler.handleUnsafeOperation(Op, IsRelatedToDecl, Ctx);
   }
-  const Stmt *getBaseStmt() const override { return Op; }
+  SourceLocation getSourceLoc() const override { return Op->getBeginLoc(); }
 
   DeclUseList getClaimedVarUseSites() const override { return {}; }
 };
@@ -953,7 +1014,13 @@ class DataInvocationGadget : public WarningGadget {
         explicitCastExpr(anyOf(has(callExpr), has(parenExpr(has(callExpr)))))
             .bind(OpTag));
   }
-  const Stmt *getBaseStmt() const override { return Op; }
+
+  void handleUnsafeOperation(UnsafeBufferUsageHandler &Handler,
+                             bool IsRelatedToDecl,
+                             ASTContext &Ctx) const override {
+    Handler.handleUnsafeOperation(Op, IsRelatedToDecl, Ctx);
+  }
+  SourceLocation getSourceLoc() const override { return Op->getBeginLoc(); }
 
   DeclUseList getClaimedVarUseSites() const override { return {}; }
 };
@@ -990,8 +1057,7 @@ class ULCArraySubscriptGadget : public FixableGadget {
 
   virtual std::optional<FixItList>
   getFixits(const FixitStrategy &S) const override;
-
-  virtual const Stmt *getBaseStmt() const override { return Node; }
+  SourceLocation getSourceLoc() const override { return Node->getBeginLoc(); }
 
   virtual DeclUseList getClaimedVarUseSites() const override {
     if (const auto *DRE =
@@ -1031,8 +1097,7 @@ class UPCStandalonePointerGadget : public FixableGadget {
 
   virtual std::optional<FixItList>
   getFixits(const FixitStrategy &S) const override;
-
-  virtual const Stmt *getBaseStmt() const override { return Node; }
+  SourceLocation getSourceLoc() const override { return Node->getBeginLoc(); }
 
   virtual DeclUseList getClaimedVarUseSites() const override { return {Node}; }
 };
@@ -1070,10 +1135,9 @@ class PointerDereferenceGadget : public FixableGadget {
     return {BaseDeclRefExpr};
   }
 
-  virtual const Stmt *getBaseStmt() const final { return Op; }
-
   virtual std::optional<FixItList>
   getFixits(const FixitStrategy &S) const override;
+  SourceLocation getSourceLoc() const override { return Op->getBeginLoc(); }
 };
 
 // Represents expressions of the form `&DRE[any]` in the Unspecified Pointer
@@ -1108,8 +1172,7 @@ class UPCAddressofArraySubscriptGadget : public FixableGadget {
 
   virtual std::optional<FixItList>
   getFixits(const FixitStrategy &) const override;
-
-  virtual const Stmt *getBaseStmt() const override { return Node; }
+  SourceLocation getSourceLoc() const override { return Node->getBeginLoc(); }
 
   virtual DeclUseList getClaimedVarUseSites() const override {
     const auto *ArraySubst = cast<ArraySubscriptExpr>(Node->getSubExpr());
@@ -1218,8 +1281,7 @@ class UPCPreIncrementGadget : public FixableGadget {
 
   virtual std::optional<FixItList>
   getFixits(const FixitStrategy &S) const override;
-
-  virtual const Stmt *getBaseStmt() const override { return Node; }
+  SourceLocation getSourceLoc() const override { return Node->getBeginLoc(); }
 
   virtual DeclUseList getClaimedVarUseSites() const override {
     return {dyn_cast<DeclRefExpr>(Node->getSubExpr())};
@@ -1264,8 +1326,7 @@ class UUCAddAssignGadget : public FixableGadget {
 
   virtual std::optional<FixItList>
   getFixits(const FixitStrategy &S) const override;
-
-  virtual const Stmt *getBaseStmt() const override { return Node; }
+  SourceLocation getSourceLoc() const override { return Node->getBeginLoc(); }
 
   virtual DeclUseList getClaimedVarUseSites() const override {
     return {dyn_cast<DeclRefExpr>(Node->getLHS())};
@@ -1315,9 +1376,9 @@ class DerefSimplePtrArithFixableGadget : public FixableGadget {
 
   virtual std::optional<FixItList>
   getFixits(const FixitStrategy &s) const final;
-
-  // TODO remove this method from FixableGadget interface
-  virtual const Stmt *getBaseStmt() const final { return nullptr; }
+  SourceLocation getSourceLoc() const override {
+    return DerefOp->getBeginLoc();
+  }
 
   virtual DeclUseList getClaimedVarUseSites() const final {
     return {BaseDeclRefExpr};
@@ -1326,8 +1387,8 @@ class DerefSimplePtrArithFixableGadget : public FixableGadget {
 
 /// Scan the function and return a list of gadgets found with provided kits.
 static std::tuple<FixableGadgetList, WarningGadgetList, DeclUseTracker>
-findGadgets(const Decl *D, const UnsafeBufferUsageHandler &Handler,
-            bool EmitSuggestions) {
+findGadgets(const Stmt *S, ASTContext &Ctx,
+            const UnsafeBufferUsageHandler &Handler, bool EmitSuggestions) {
 
   struct GadgetFinderCallback : MatchFinder::MatchCallback {
     FixableGadgetList FixableGadgets;
@@ -1422,7 +1483,7 @@ findGadgets(const Decl *D, const UnsafeBufferUsageHandler &Handler,
     // clang-format on
   }
 
-  M.match(*D->getBody(), D->getASTContext());
+  M.match(*S, Ctx);
   return {std::move(CB.FixableGadgets), std::move(CB.WarningGadgets),
           std::move(CB.Tracker)};
 }
@@ -2070,7 +2131,7 @@ UUCAddAssignGadget::getFixits(const FixitStrategy &S) const {
     if (S.lookup(VD) == FixitStrategy::Kind::Span) {
       FixItList Fixes;
 
-      const Stmt *AddAssignNode = getBaseStmt();
+      const Stmt *AddAssignNode = Node;
       StringRef varName = VD->getName();
       const ASTContext &Ctx = VD->getASTContext();
 
@@ -2112,7 +2173,6 @@ UPCPreIncrementGadget::getFixits(const FixitStrategy &S) const {
     if (S.lookup(VD) == FixitStrategy::Kind::Span) {
       FixItList Fixes;
       std::stringstream SS;
-      const Stmt *PreIncNode = getBaseStmt();
       StringRef varName = VD->getName();
       const ASTContext &Ctx = VD->getASTContext();
 
@@ -2120,12 +2180,12 @@ UPCPreIncrementGadget::getFixits(const FixitStrategy &S) const {
       SS << "(" << varName.data() << " = " << varName.data()
          << ".subspan(1)).data()";
       std::optional<SourceLocation> PreIncLocation =
-          getEndCharLoc(PreIncNode, Ctx.getSourceManager(), Ctx.getLangOpts());
+          getEndCharLoc(Node, Ctx.getSourceManager(), Ctx.getLangOpts());
       if (!PreIncLocation)
         return std::nullopt;
 
       Fixes.push_back(FixItHint::CreateReplacement(
-          SourceRange(PreIncNode->getBeginLoc(), *PreIncLocation), SS.str()));
+          SourceRange(Node->getBeginLoc(), *PreIncLocation), SS.str()));
       return Fixes;
     }
   }
@@ -2856,7 +2916,7 @@ getFixIts(FixableGadgetSets &FixablesForAllVars, const FixitStrategy &S,
       }
 #ifndef NDEBUG
       Handler.addDebugNoteForVar(
-          VD, F->getBaseStmt()->getBeginLoc(),
+          VD, F->getSourceLoc(),
           ("gadget '" + F->getDebugName() + "' refused to produce a fix")
               .str());
 #endif
@@ -2970,46 +3030,16 @@ class VariableGroupsManagerImpl : public VariableGroupsManager {
   }
 };
 
-void clang::checkUnsafeBufferUsage(const Decl *D,
-                                   UnsafeBufferUsageHandler &Handler,
-                                   bool EmitSuggestions) {
-#ifndef NDEBUG
-  Handler.clearDebugNotes();
-#endif
-
-  assert(D && D->getBody());
-  // We do not want to visit a Lambda expression defined inside a method
-  // independently. Instead, it should be visited along with the outer method.
-  // FIXME: do we want to do the same thing for `BlockDecl`s?
-  if (const auto *fd = dyn_cast<CXXMethodDecl>(D)) {
-    if (fd->getParent()->isLambda() && fd->getParent()->isLocalClass())
-      return;
-  }
-
-  // Do not emit fixit suggestions for functions declared in an
-  // extern "C" block.
-  if (const auto *FD = dyn_cast<FunctionDecl>(D)) {
-    for (FunctionDecl *FReDecl : FD->redecls()) {
-      if (FReDecl->isExternC()) {
-        EmitSuggestions = false;
-        break;
-      }
-    }
-  }
-
-  WarningGadgetSets UnsafeOps;
-  FixableGadgetSets FixablesForAllVars;
-
-  auto [FixableGadgets, WarningGadgets, Tracker] =
-      findGadgets(D, Handler, EmitSuggestions);
-
+void applyGadgets(const Decl *D, FixableGadgetList FixableGadgets,
+                  WarningGadgetList WarningGadgets, DeclUseTracker Tracker,
+                  UnsafeBufferUsageHandler &Handler, bool EmitSuggestions) {
   if (!EmitSuggestions) {
     // Our job is very easy without suggestions. Just warn about
     // every problematic operation and consider it done. No need to deal
     // with fixable gadgets, no need to group operations by variable.
     for (const auto &G : WarningGadgets) {
-      Handler.handleUnsafeOperation(G->getBaseStmt(), /*IsRelatedToDecl=*/false,
-                                    D->getASTContext());
+      G->handleUnsafeOperation(Handler, /*IsRelatedToDecl=*/false,
+                               D->getASTContext());
     }
 
     // This return guarantees that most of the machine doesn't run when
@@ -3046,8 +3076,10 @@ void clang::checkUnsafeBufferUsage(const Decl *D,
   if (WarningGadgets.empty())
     return;
 
-  UnsafeOps = groupWarningGadgetsByVar(std::move(WarningGadgets));
-  FixablesForAllVars = groupFixablesByVar(std::move(FixableGadgets));
+  WarningGadgetSets UnsafeOps =
+      groupWarningGadgetsByVar(std::move(WarningGadgets));
+  FixableGadgetSets FixablesForAllVars =
+      groupFixablesByVar(std::move(FixableGadgets));
 
   std::map<const VarDecl *, FixItList> FixItsForVariableGroup;
 
@@ -3251,8 +3283,8 @@ void clang::checkUnsafeBufferUsage(const Decl *D,
                   Tracker, Handler, VarGrpMgr);
 
   for (const auto &G : UnsafeOps.noVar) {
-    Handler.handleUnsafeOperation(G->getBaseStmt(), /*IsRelatedToDecl=*/false,
-                                  D->getASTContext());
+    G->handleUnsafeOperation(Handler, /*IsRelatedToDecl=*/false,
+                             D->getASTContext());
   }
 
   for (const auto &[VD, WarningGadgets] : UnsafeOps.byVar) {
@@ -3263,8 +3295,68 @@ void clang::checkUnsafeBufferUsage(const Decl *D,
                                           : FixItList{},
                                       D, NaiveStrategy);
     for (const auto &G : WarningGadgets) {
-      Handler.handleUnsafeOperation(G->getBaseStmt(), /*IsRelatedToDecl=*/true,
-                                    D->getASTContext());
+      G->handleUnsafeOperation(Handler, /*IsRelatedToDecl=*/true,
+                               D->getASTContext());
+    }
+  }
+}
+
+void clang::checkUnsafeBufferUsage(const Decl *D,
+                                   UnsafeBufferUsageHandler &Handler,
+                                   bool EmitSuggestions) {
+#ifndef NDEBUG
+  Handler.clearDebugNotes();
+#endif
+
+  assert(D);
+
+  SmallVector<Stmt*> Stmts;
+
+  // We do not want to visit a Lambda expression defined inside a method
+  // independently. Instead, it should be visited along with the outer method.
+  // FIXME: do we want to do the same ...
[truncated]

Copy link

github-actions bot commented May 13, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@danakj
Copy link
Contributor Author

danakj commented May 14, 2024

These lit test failures are indeed from this PR's last commit, I will dig into why and correct what's wrong.

Copy link
Collaborator

@haoNoQ haoNoQ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha makes sense! Thanks for catching these false negatives!!

for (Stmt *S : Stmts) {
auto [FixableGadgets, WarningGadgets, Tracker] =
findGadgets(S, D->getASTContext(), Handler, EmitSuggestions);
applyGadgets(D, std::move(FixableGadgets), std::move(WarningGadgets),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think applyGadgets() should be invoked only once per function. For instance, when we're emitting a fixit for a function parameter, we emit a single fixit that covers every use of that parameter. We can't do that separately for the use of that parameter in a CXXCtorInitializer and then independently do the same for the rest of the function body. That'd result in conflicting fixes.

The tracker should also be re-used for the entire function. We need to know if we can fix the use inside the CXXCtorInitializer at all before emitting a fixit for all other uses.

(Deep inside their hearts these data structures dream of becoming global for the entire translation unit.)

So probably [FixableGadgets, WarningGadgets, Tracker] should be passed by reference, to push_back() gadgets directly into them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, we'd still collect a list of gadgets+trackers in the same way, but we'd pass them all at once to applyGadgets() so it could do some kind of global analysis over the full set if it wanted to. Though it will just need to iterate over them for now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or, I guess we'd have lists of gadget, but the difference here is we'd have only a single tracker? So applyGadgets actually just walks the global list of gadgets (it already does this for a list of gadgets) and uses the one tracker... I will try this out.

}
}

if (const auto *FD = dyn_cast<FieldDecl>(D)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the rest of the machine may be super unprepared for the scenario when there's no complete "function" under analysis. We obviously want to find unsafe operations in such code but I think it's a good idea to temporarily suppress suggestion/fixit generation in this scenario (like hard-assign EmitSuggestions to false here), until we carefully confirm that the suggestion/fixit machine actually works. It's unlikely to produce useful suggestions or fixits anyway, unless the initializer somehow contains an entire compound statement with local variables or nested functions or classes inside. But I think it's very likely to step on some assertions and crash because we haven't considered this scenario at all until recently.

Global variable initializers are probably in the same situation.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, do we even need to handle default field initializers explicitly? I think the AST copy-pastes them into CXXCtorInitializers in every constructor with the help of CXXDefaultInitExpr. So as long as you handle CXXCtorInitializer you get coverage of default initializers too. Unless there are zero constructors that respect that initializer. But then it's just dead code so we might as well leave it be.

Copy link
Contributor Author

@danakj danakj May 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, CXXCtorInitializers do not catch everything. This test fails:

struct S {
    S()
        : FromCtor(3),  // expected-warning{{function introduces unsafe buffer manipulation}}
          FromCtor2{3}  // expected-warning{{function introduces unsafe buffer manipulation}}
    {}

    UnsafeMembers FromCtor;
    UnsafeMembers FromCtor2;
    UnsafeMembers FromField{3};  // expected-warning{{function introduces unsafe buffer manipulation}}
};

Where the int ctor of UnsafeMembers is [[clang::unsafe_buffer_usage]].

This also fails:

struct AggregateUnsafeMembers {
    UnsafeMembers f1;
    UnsafeMembers f2{3};  // expected-warning{{function introduces unsafe buffer manipulation}}
};

Edit: On a whim I tried turning on CallableVisitor::shouldVisitImplicitCode() but it did not change anything here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I debugged this a little bit and it looks like you're blaming the wrong visitor. The CXXCtorInitializer is there and it gets correctly visited. However, MatchDescendantVisitor gets stuck because it considers CXXDefaultInitExpr implicit code. Enabling MatchDescendantVisitor::shouldVisitImplicitCode() takes care of your issue!

Not sure it's a good solution though, might be a good idea to add an explicit TraverseCXXDefaultInitExpr() because I'm not sure we want to visit other kinds of implicit code. (Probably deserves a FIXME.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I will have a look at that on Monday then.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In your new test the entire constructor is missing. The class is an aggregate, it has no constructor: https://godbolt.org/z/aa869TKsf - so I hope you'll see the warning whenever you aggregate-initialize an instance of such class. Which may or may not mean that there's still no direct need to warn at the definition of the field.

https://godbolt.org/z/qjqaW3Eze here is an example of the 3 ways to call the aggregate ctor.

    |-DeclStmt <line:12:5, col:26>
    | `-VarDecl <col:5, col:24> col:24 m1 'HoldsUnsafeMembers' callinit
    |   `-CXXConstructExpr <col:24> 'HoldsUnsafeMembers' 'void () noexcept(false)'
    |-DeclStmt <line:13:5, col:35>
    | `-VarDecl <col:5, col:34> col:10 m2 'HoldsUnsafeMembers' cinit
    |   `-CXXTemporaryObjectExpr <col:15, col:34> 'HoldsUnsafeMembers' 'void () noexcept(false)' zeroing
    `-DeclStmt <line:14:5, col:35>
      `-VarDecl <col:5, col:34> col:10 m3 'HoldsUnsafeMembers' cinit
        `-CXXFunctionalCastExpr <col:15, col:34> 'HoldsUnsafeMembers' functional cast to HoldsUnsafeMembers <NoOp>
          `-InitListExpr <col:33, col:34> 'HoldsUnsafeMembers'
            `-CXXDefaultInitExpr <col:34> 'UnsafeMembers' has rewritten init
              `-CXXConstructExpr <line:8:28, col:30> 'UnsafeMembers' 'void (int)' list
                `-IntegerLiteral <col:29> 'int' 3

We do get a CXXDefaultInitExpr if we construct with {} but not otherwise.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICT we are required to visit FieldDecl to handle the 0-ctor class use case.

I really want us to momentarily stop being preoccupied with "can" and focus on "should". This isn't a technical problem, this is a UI/UX problem.

There's no constructor. There's no binary code for constructor. The unsafe function is not invoked from any binary code generated from your example. It's the responsibility of the code that aggregate-initializes your aggregate to generate the call to the default initializer, so naturally it doesn't show up just yet.

The class is safe to use whenever the field is initialized explicitly. If they remove the field initializer, that'll break backwards compatibility. Maybe it's ok to keep the class as-is and use the attribute only as a reminder to the callers to call a better constructor.

Ultimately I think you're right and in most of such cases the best solution is to change the class to use a different, safe default initializer for the field. Which presumably exists, given that the attribute was placed on the currently used constructor. I think it's true that more users would complaint about "false negative" in the aggregate definition, or an awkwardly placed / hard-to-read warning at the curly braces (especially if the unsafe initializer is layered into several other layers of aggregates) than complain about false positive under "the class is intended to be safe as long as the field initializer is passed explicitly".

But this is something we should occasionally think about, and treat this as a very very special cornercase. Which doesn't even necessarily require an immediate solution - given that all unsafe code will eventually be flagged regardless. So I think if we go for explicitly supporting fields, we should probably exclude all non-aggregate classes from the callback, and leave an irritated comment explaining why we eventually agreed that it's a good idea 😅

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no constructor. There's no binary code for constructor. The unsafe function is not invoked from any binary code generated from your example.

Ah yes, this helped me figure out what's going on with the 3 different initialization paths.

If you construct the aggregate like A a; or auto a = A(); then we do see a call to a constructor. So where is it? Now it appears in the AST for the class:

| |-CXXRecordDecl <col:1, col:8> col:8 implicit struct HoldsUnsafeMembers
| |-FieldDecl <line:8:5, col:30> col:19 FromField 'UnsafeMembers'
| | `-CXXConstructExpr <col:28, col:30> 'UnsafeMembers' 'void (int)' list
| |   `-IntegerLiteral <col:29> 'int' 3
| |-CXXConstructorDecl <line:7:8> col:8 implicit used constexpr HoldsUnsafeMembers 'void () noexcept(false)' inline default
| | |-CXXCtorInitializer Field 0xbc32670 'FromField' 'UnsafeMembers'
| | | `-CXXDefaultInitExpr <col:8> 'UnsafeMembers' has rewritten init
| | |   `-CXXConstructExpr <line:8:28, col:30> 'UnsafeMembers' 'void (int)' list
| | |     `-IntegerLiteral <col:29> 'int' 3

If you construct it as auto a = A{}; then yes there is no ctor as you say, but it does appear at the call site. So okay I agree there's nothing to do for an aggregate that is never used. And maybe I can get TraverseCXXDefaultInitExpr to work with that information.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok here's how I think this should go:

diff --git a/clang/lib/Analysis/UnsafeBufferUsage.cpp b/clang/lib/Analysis/UnsafeBufferUsage.cpp
index bfbf84a03b2b..cf0194d29f8c 100644
--- a/clang/lib/Analysis/UnsafeBufferUsage.cpp
+++ b/clang/lib/Analysis/UnsafeBufferUsage.cpp
@@ -170,6 +170,12 @@ public:
     return VisitorBase::TraverseCXXTypeidExpr(Node);
   }
 
+  bool TraverseCXXDefaultInitExpr(CXXDefaultInitExpr *Node) {
+    if (!TraverseStmt(Node->getExpr()))
+      return false;
+    return VisitorBase::TraverseCXXDefaultInitExpr(Node);
+  }
+
   bool TraverseStmt(Stmt *Node, DataRecursionQueue *Queue = nullptr) {
     if (!Node)
       return true;
@@ -3343,16 +3349,6 @@ void clang::checkUnsafeBufferUsage(const Decl *D,
         Stmts.push_back(CI->getInit());
       }
     }
-  } else if (const auto *FlD = dyn_cast<FieldDecl>(D)) {
-    // Visit in-class initializers for fields.
-    if (!FlD->hasInClassInitializer())
-      return;
-    Stmts.push_back(FlD->getInClassInitializer());
-    // In a FieldDecl there is no function body, there is only a single
-    // statement, and the suggestions machinery is not set up to handle that
-    // kind of structure yet and would give poor suggestions or likely even hit
-    // asserts.
-    EmitSuggestions = false;
   } else if (isa<BlockDecl>(D) || isa<ObjCMethodDecl>(D)) {
     Stmts.push_back(D->getBody());
   }
diff --git a/clang/lib/Sema/AnalysisBasedWarnings.cpp b/clang/lib/Sema/AnalysisBasedWarnings.cpp
index e356c06ac8c4..b9d0b59ef1db 100644
--- a/clang/lib/Sema/AnalysisBasedWarnings.cpp
+++ b/clang/lib/Sema/AnalysisBasedWarnings.cpp
@@ -2471,38 +2471,6 @@ public:
   CallableVisitor(llvm::function_ref<void(const Decl *)> Callback)
       : Callback(Callback) {}
 
-  bool VisitFieldDecl(FieldDecl *Node) {
-    DeclContext *DeclCtx;
-    if (auto *ID = dyn_cast<ObjCIvarDecl>(Node)) {
-      DeclCtx = cast<DeclContext>(ID->getContainingInterface());
-    } else {
-      RecordDecl *RD = Node->getParent();
-      if (auto *CCRD = dyn_cast<CXXRecordDecl>(RD);
-          CCRD && CCRD->isAggregate()) {
-        // Aggregates have implicitly generated constructors which are not
-        // traversed by our AST visitors at this time. The field initializers in
-        // an aggregate may never be used if the callers always explicitly
-        // initialize them, in which case we do not need to warn on unsafe
-        // buffer usage in the initializer.
-        //
-        // FIXME: We should go through the implicit aggregate initialization
-        // (either an implicit CXXConstructExpr for value-init or an implicit
-        // ListInitExpr for aggregate-init) to determine if a field's default
-        // initializer (CXXDefaultInitExpr) is actually used. If it is, then we
-        // should visit it, while retaining a reference to the caller for
-        // showing the path to the use of the default initializer in the
-        // warning.
-        return true;
-      }
-      DeclCtx = cast<DeclContext>(RD);
-    }
-    if (DeclCtx->isDependentContext())
-      return true; // Not to analyze dependent decl
-    if (Node->hasInClassInitializer())
-      Callback(Node);
-    return true;
-  }
-
   bool VisitFunctionDecl(FunctionDecl *Node) {
     if (cast<DeclContext>(Node)->isDependentContext())
       return true; // Not to analyze dependent decl
diff --git a/clang/test/SemaCXX/warn-unsafe-buffer-usage-function-attr.cpp b/clang/test/SemaCXX/warn-unsafe-buffer-usage-function-attr.cpp
index b9e517a35de7..724d444638b5 100644
--- a/clang/test/SemaCXX/warn-unsafe-buffer-usage-function-attr.cpp
+++ b/clang/test/SemaCXX/warn-unsafe-buffer-usage-function-attr.cpp
@@ -128,7 +128,7 @@ struct HoldsUnsafeMembers {
 
     UnsafeMembers FromCtor;
     UnsafeMembers FromCtor2;
-    UnsafeMembers FromField{3};  // expected-warning{{function introduces unsafe buffer manipulation}}
+    UnsafeMembers FromField{3};  // expected-warning 2{{function introduces unsafe buffer manipulation}}
 };
 
 struct SubclassUnsafeMembers : public UnsafeMembers {
diff --git a/clang/test/SemaCXX/warn-unsafe-buffer-usage-in-container-span-construct.cpp b/clang/test/SemaCXX/warn-unsafe-buffer-usage-in-container-span-construct.cpp
index 945a4e06d6af..22daac83b879 100644
--- a/clang/test/SemaCXX/warn-unsafe-buffer-usage-in-container-span-construct.cpp
+++ b/clang/test/SemaCXX/warn-unsafe-buffer-usage-in-container-span-construct.cpp
@@ -158,9 +158,19 @@ namespace test_flag {
 struct HoldsStdSpan {
   char* Ptr;
   unsigned Size;
-  std::span<char> Span{Ptr, Size};  // expected-warning{{the two-parameter std::span construction is unsafe as it can introduce mismatch between buffer size and the bound information}}
+  std::span<char> Span{Ptr, Size}; // no-warning (this code is unreachable)
 
   HoldsStdSpan(char* P, unsigned S)
       : Span(P, S)  // expected-warning{{the two-parameter std::span construction is unsafe as it can introduce mismatch between buffer size and the bound information}}
   {}
 };
+
+struct HoldsStdSpan2 {
+  char* Ptr;
+  unsigned Size;
+  std::span<char> Span{Ptr, Size}; // expected-warning{{the two-parameter std::span construction is unsafe as it can introduce mismatch between buffer size and the bound information}}
+
+  HoldsStdSpan2(char* P, unsigned S)
+      : Ptr(P), Size(S)
+  {}
+};

This passes all tests except two. In one of those we aren't getting a warning but that's ok because the code is obviously dead. I added another test that demonstrates that we do get a warning when it actually matters.

In the other failure we're getting a duplicate warning because we're visiting the same field twice (from two different constructors). This isn't ideal and ultimately we'd need to either get rid of one of them or at least add notes to connect them back to the constructor, but that's not a deal breaker. We may also occasionally emit conflicting fixits because of that (attached to separate warnings) but that's probably still ok; eventually we'll get them deduplicated.

So I basically think that all of this is still better than analyzing field initializers separately; we'd never want to analyze them separately. Also more concise.

The visitor is a bit confusing; the match(*Node) invocation should have been put into the Visit() methods, not into Traverse() methods. Traverse methods are hard to override because their default implementation has tons of logic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I have (finally) applied that delta, and got the same test results. I also didn't know the 2 trick for duplicate warnings, that's helpful!

@danakj danakj requested a review from haoNoQ May 31, 2024 17:20
@danakj
Copy link
Contributor Author

danakj commented May 31, 2024

I have rebased (which dropped the previous PR's commits) and applied the requested changes in a second commit. I also fixed the test failures in that second commit.

@danakj danakj force-pushed the unsafe-inits branch 2 times, most recently from 862251b to c0e634d Compare May 31, 2024 17:35
@danakj
Copy link
Contributor Author

danakj commented Jun 6, 2024

bump for review?

@jkorous-apple
Copy link
Contributor

@haoNoQ Do you intend to review the patch?

@haoNoQ
Copy link
Collaborator

haoNoQ commented Oct 21, 2024

I did. The latest comment is #91991 (comment)

@jkorous-apple
Copy link
Contributor

Sorry, I should've asked - do you intend to review the changes that followed your comment?

@haoNoQ
Copy link
Collaborator

haoNoQ commented Oct 22, 2024

Please look at the dates.

@danakj
Copy link
Contributor Author

danakj commented Oct 22, 2024

Sorry I just haven't gotten back to making the fixes yet. If someone else wants to apply them and land it I won't be upset, otherwise I will get to this eventually..

@haoNoQ
Copy link
Collaborator

haoNoQ commented Oct 22, 2024

No worries! I've just seen folks independently rediscover some of your work so I wanted them to see if they want to deduplicate the efforts.

@ziqingluo-90
Copy link
Contributor

Hi @danakj , we are really looking forward to this change! Will you finish it soon? Or I can pick it up and land it on behalf of you?

Field initializers must be found by visiting FieldDecl and then
running the warning checks/matchers against the in-class initializer,
which is itself a Stmt. This catches warnings in places such as:

struct C {
  P* Ptr;
  AnUnsafeCtor U{Ptr};
};

CXXCtorInitializers are not statements either, but they point to an
initializer expression which is. When visiting a FunctionDecl, also walk
through any constructor initializers and run the warning checks/matchers
against their initializer expressions. This catches warnings for
initializing fields and calling other constructors, such as:

struct C {
  C(P* Ptr) : AnUnsafeCtor(Ptr) {}
}

We add tests for explicit construction, for field initialization, base
class constructor calls, delegated constructor calls, and aggregate
initialization.

Note that aggregate initialization through `()` is treated differently
today by the AST matchers than `{}`. The former is not considered as
calling an implicit constructor, while the latter is.
MatchDescendantVisitor::shouldVisitImplicitCode() returns false with a
TODO, which means we do not catch warnings of the form:

struct AggregateInitType { AnUnsafeCtor U; }
AggregateInitType{Ptr};

But we do already catch them when written as (in C++20):

struct AggregateInitType { AnUnsafeCtor U; }
AggregateInitType(Ptr);

Returning true from MatchDescendantVisitor::shouldVisitImplicitCode(),
however, breaks expectations for field in-class initializers by moving
the SourceLocation, possibly to inside the implicit ctor instead of on
the line where the field initialization happens.

struct C {
  P* Ptr;
  AnUnsafeCtor U{Ptr};  // expected-warning{{this is never seen then}}
};

Tests are also added for std::span(ptr, size) constructor being called
from a field initializer and a constructor initializer.
findGadgets() collects all gadgets into a single list, using a
single DeclUseTracker. Then applyGadgets() is called once with the
full list for a method (including for a ctor with initializers).

Handle ObjC interface variables as they return null from getParent().
Instead go through getContainingInterface() to get the DeclContext.

Disable fixits in field initializers as the machinery will not expect
a statement that is not a function body at this time.
The aggregate test case generates an AST without any ctor and
thus without any ctor initializers, inlcuding no CXXDefaultInitExpr.
In aggregates, the field initializer may be never invoked in which
case we do not need to visit and warn on it.

FIXMEs are left in place for aggregates where the field initializer
is used, but there's currently no warning.
We can visit CXXDefaultInitExprs to find construction through
[[unsafe_buffer_usage]] constructors of field initializers.
@danakj
Copy link
Contributor Author

danakj commented Oct 29, 2024

The suggested fixes were applied, thanks for them. It was a bit of work to page all this back in after a while.

@ziqingluo-90 ziqingluo-90 self-requested a review October 30, 2024 00:22

Stmts.push_back(FD->getBody());

if (const auto *ID = dyn_cast<CXXConstructorDecl>(D)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this equivalent to override traverseCXXCtorInitializer in MatchDescendantVisitor above (just like how you override TraverseCXXDefaultInitExpr)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we have that option, because they are not statements:

def CXXDefaultArgExpr : StmtNode<Expr>;
def CXXDefaultInitExpr : StmtNode<Expr>;

That's why this code has to pull the getInit() out of it to find a statement.

Copy link
Contributor

@ziqingluo-90 ziqingluo-90 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thank you @danakj for the continuous effort!

@danakj danakj merged commit a518ed2 into llvm:main Oct 30, 2024
8 checks passed
@danakj
Copy link
Contributor Author

danakj commented Oct 30, 2024

Thanks for reviewing!

smallp-o-p pushed a commit to smallp-o-p/llvm-project that referenced this pull request Nov 3, 2024
…nstructor initializers (llvm#91991)

CXXCtorInitializers are not statements , but they point to an
initializer expression which is. When visiting a FunctionDecl, also
walk through any constructor initializers and run the warning
checks/matchers against their initializer expressions. This catches
warnings for initializing fields and calling other constructors, such
as:
    
struct C {
  C(P* Ptr) : AnUnsafeCtor(Ptr) {}
}

Field initializers can be found by traversing CXXDefaultInitExprs. This
catches warnings in places such as:
    
struct C {
  P* Ptr;
  AnUnsafeCtor U{Ptr};
};

We add tests for explicit construction, for field initialization, base
class constructor calls, delegated constructor calls, and aggregate
initialization.

Note that aggregate initialization is not fully covered where a field
specifies an initializer and it's not overridden in the aggregate initialization,
such as in:

struct AggregateViaValueInit {
    UnsafeMembers f1;
    // FIXME: A construction of this class does initialize the field
    // through this initializer, so it should warn. Ideally it should
    // also point to where the site of the construction is in
    // testAggregateViaValueInit().
    UnsafeMembers f2{3};
};

void testAggregateViaValueInit() {
    auto A = AggregateViaValueInit();
};

There are 3 tests for different types of aggregate initialization with
FIXMEs documenting this future work.

One attempt to fix this involved returning true from
MatchDescendantVisitor::shouldVisitImplicitCode(), however, it breaks expectations
for field in-class initializers by moving the SourceLocation, possibly
to inside the implicit ctor instead of on the line where the field
initialization happens.

struct C {
  P* Ptr;
  AnUnsafeCtor U{Ptr};  // expected-warning{{this is never seen then}}
};

Tests are also added for std::span(ptr, size) constructor being called
from a field initializer and a constructor initializer.

Issue llvm#80482
NoumanAmir657 pushed a commit to NoumanAmir657/llvm-project that referenced this pull request Nov 4, 2024
…nstructor initializers (llvm#91991)

CXXCtorInitializers are not statements , but they point to an
initializer expression which is. When visiting a FunctionDecl, also
walk through any constructor initializers and run the warning
checks/matchers against their initializer expressions. This catches
warnings for initializing fields and calling other constructors, such
as:
    
struct C {
  C(P* Ptr) : AnUnsafeCtor(Ptr) {}
}

Field initializers can be found by traversing CXXDefaultInitExprs. This
catches warnings in places such as:
    
struct C {
  P* Ptr;
  AnUnsafeCtor U{Ptr};
};

We add tests for explicit construction, for field initialization, base
class constructor calls, delegated constructor calls, and aggregate
initialization.

Note that aggregate initialization is not fully covered where a field
specifies an initializer and it's not overridden in the aggregate initialization,
such as in:

struct AggregateViaValueInit {
    UnsafeMembers f1;
    // FIXME: A construction of this class does initialize the field
    // through this initializer, so it should warn. Ideally it should
    // also point to where the site of the construction is in
    // testAggregateViaValueInit().
    UnsafeMembers f2{3};
};

void testAggregateViaValueInit() {
    auto A = AggregateViaValueInit();
};

There are 3 tests for different types of aggregate initialization with
FIXMEs documenting this future work.

One attempt to fix this involved returning true from
MatchDescendantVisitor::shouldVisitImplicitCode(), however, it breaks expectations
for field in-class initializers by moving the SourceLocation, possibly
to inside the implicit ctor instead of on the line where the field
initialization happens.

struct C {
  P* Ptr;
  AnUnsafeCtor U{Ptr};  // expected-warning{{this is never seen then}}
};

Tests are also added for std::span(ptr, size) constructor being called
from a field initializer and a constructor initializer.

Issue llvm#80482
ziqingluo-90 pushed a commit to ziqingluo-90/apple-llvm-project that referenced this pull request Nov 7, 2024
…nstructor initializers (llvm#91991)

CXXCtorInitializers are not statements , but they point to an
initializer expression which is. When visiting a FunctionDecl, also
walk through any constructor initializers and run the warning
checks/matchers against their initializer expressions. This catches
warnings for initializing fields and calling other constructors, such
as:

struct C {
  C(P* Ptr) : AnUnsafeCtor(Ptr) {}
}

Field initializers can be found by traversing CXXDefaultInitExprs. This
catches warnings in places such as:

struct C {
  P* Ptr;
  AnUnsafeCtor U{Ptr};
};

We add tests for explicit construction, for field initialization, base
class constructor calls, delegated constructor calls, and aggregate
initialization.

Note that aggregate initialization is not fully covered where a field
specifies an initializer and it's not overridden in the aggregate initialization,
such as in:

struct AggregateViaValueInit {
    UnsafeMembers f1;
    // FIXME: A construction of this class does initialize the field
    // through this initializer, so it should warn. Ideally it should
    // also point to where the site of the construction is in
    // testAggregateViaValueInit().
    UnsafeMembers f2{3};
};

void testAggregateViaValueInit() {
    auto A = AggregateViaValueInit();
};

There are 3 tests for different types of aggregate initialization with
FIXMEs documenting this future work.

One attempt to fix this involved returning true from
MatchDescendantVisitor::shouldVisitImplicitCode(), however, it breaks expectations
for field in-class initializers by moving the SourceLocation, possibly
to inside the implicit ctor instead of on the line where the field
initialization happens.

struct C {
  P* Ptr;
  AnUnsafeCtor U{Ptr};  // expected-warning{{this is never seen then}}
};

Tests are also added for std::span(ptr, size) constructor being called
from a field initializer and a constructor initializer.

Issue llvm#80482

(cherry picked from commit a518ed2)
aarongable pushed a commit to chromium/chromium that referenced this pull request Nov 8, 2024
An upcoming Clang change
(llvm/llvm-project#91991) makes
-Wunsafe-buffer-usage warn on member initializers and member initializer
list, so this patch silences those warnings.

Bug: 377326291
Change-Id: I74206f5675f64f4f754793cff2bdb388ef8965cf
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/6002984
Code-Coverage: [email protected] <[email protected]>
Reviewed-by: Nico Weber <[email protected]>
Commit-Queue: Nico Weber <[email protected]>
Owners-Override: Nico Weber <[email protected]>
Cr-Commit-Position: refs/heads/main@{#1380096}
ziqingluo-90 added a commit to swiftlang/llvm-project that referenced this pull request Nov 8, 2024
…or initializers (llvm#91991) (#9537)

CXXCtorInitializers are not statements , but they point to an
initializer expression which is. When visiting a FunctionDecl, also
walk through any constructor initializers and run the warning
checks/matchers against their initializer expressions. This catches
warnings for initializing fields and calling other constructors, such
as:

struct C {
  C(P* Ptr) : AnUnsafeCtor(Ptr) {}
}

Field initializers can be found by traversing CXXDefaultInitExprs. This
catches warnings in places such as:

struct C {
  P* Ptr;
  AnUnsafeCtor U{Ptr};
};

We add tests for explicit construction, for field initialization, base
class constructor calls, delegated constructor calls, and aggregate
initialization.

Note that aggregate initialization is not fully covered where a field
specifies an initializer and it's not overridden in the aggregate initialization,
such as in:

struct AggregateViaValueInit {
    UnsafeMembers f1;
    // FIXME: A construction of this class does initialize the field
    // through this initializer, so it should warn. Ideally it should
    // also point to where the site of the construction is in
    // testAggregateViaValueInit().
    UnsafeMembers f2{3};
};

void testAggregateViaValueInit() {
    auto A = AggregateViaValueInit();
};

There are 3 tests for different types of aggregate initialization with
FIXMEs documenting this future work.

One attempt to fix this involved returning true from
MatchDescendantVisitor::shouldVisitImplicitCode(), however, it breaks expectations
for field in-class initializers by moving the SourceLocation, possibly
to inside the implicit ctor instead of on the line where the field
initialization happens.

struct C {
  P* Ptr;
  AnUnsafeCtor U{Ptr};  // expected-warning{{this is never seen then}}
};

Tests are also added for std::span(ptr, size) constructor being called
from a field initializer and a constructor initializer.

Issue llvm#80482

(cherry picked from commit a518ed2)

Co-authored-by: Dana Jansens <[email protected]>

(rdar://137999201)
@ZequanWu
Copy link
Contributor

Hi, we have found this causing clang to crash with -Wunsafe-buffer-usage: https://godbolt.org/z/nvWxW43fK

@ZequanWu
Copy link
Contributor

It's actually an existing crash not caused by this one.

@ZequanWu
Copy link
Contributor

Confirmed that this change causes crash with -Wunsafe-buffer-usage: https://godbolt.org/z/vcv8d6sYr.

My previously manual-reduced repro in #91991 (comment) happens to crash before this change.

@ziqingluo-90
Copy link
Contributor

@ZequanWu Thanks for the bug-catching! I will look into it!

@danakj danakj deleted the unsafe-inits branch November 18, 2024 16:48
@zmodem
Copy link
Collaborator

zmodem commented Nov 27, 2024

@ZequanWu Based on the discussion on #116286 it sounds like the crashes are now fixed? Can you double check, and re-land this if so?

@ZequanWu
Copy link
Contributor

Verified that it no longer crashes.

ZequanWu added a commit that referenced this pull request Nov 27, 2024
…ld and constructor initializers (#91991)"

It was originally reverted due to an [existing bug](#116286). Relanding it as the bug was fixed by #116433.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:analysis clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants