Skip to content

[analyzer] Explicitly register NoStoreFuncVisitor from alpha.unix.cst… #108373

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Sep 19, 2024

Conversation

Szelethus
Copy link
Contributor

@Szelethus Szelethus commented Sep 12, 2024

…ring.UninitRead

This is a drastic simplification of #106982. If you read that patch, this is the same thing with all BugReporterVisitors.cpp and SValBuilder.cpp changes removed! (since all replies came regarding changed to those files, I felt the new PR was justified)

The patch was inspired by a pretty poor bug report on FFMpeg:

image

In this bug report, block is uninitialized, hence the bug report that it should not have been passed to memcpy. The confusing part is in line 93, where block was passed as a non-const pointer to seq_unpack_rle_block, which was obviously meant to initialize block. As developers, we know that clang likely didn't skip this function and found a path of execution on which this initialization failed, but NoStoreFuncVisitor failed to attach the usual "returning without writing to block" message.

I fixed this by instead of tracking the entire array, I tracked the actual element which was found to be uninitialized (Remember, we heuristically only check if the first and last-to-access element is initialized, not the entire array). This is how the bug report looks now, with 'seq_unpack_rle_block' having notes describing the path of execution and lack of a value change:

image

image

Since NoStoreFuncVisitor was a TU-local class, I moved it back to BugReporterVisitors.h, and registered it manually in CStringChecker.cpp. This was done because we don't have a good trackRegionValue() function, only a trackExpressionValue() function. We have an expression for the array, but not for its first (or last-to-access) element, so I only had a MemRegion on hand.

…ring.UninitRead

This is a drastic simplification of llvm#106982.

The patch was inspired by a pretty poor bug report on FFMpeg:

https://codechecker-demo.eastus.cloudapp.azure.com/Default/report-detail?is-unique=on&run=curl_curl-7_66_0_cstring_uninit_baseline&run=ffmpeg_n4.3.1_cstring_uninit_baseline&run=memcached_1.6.8_cstring_uninit_baseline&run=openssl_openssl-3.0.0-alpha7_cstring_uninit_baseline&run=sqlite_version-3.33.0_cstring_uninit_baseline&run=tmux_2.6_cstring_uninit_baseline&run=twin_v0.8.1_cstring_uninit_baseline&run=vim_v8.2.1920_cstring_uninit_baseline&newcheck=curl_curl-7_66_0_cstring_uninit_final&newcheck=ffmpeg_n4.3.1_cstring_uninit_final&newcheck=memcached_1.6.8_cstring_uninit_final&newcheck=openssl_openssl-3.0.0-alpha7_cstring_uninit_final&newcheck=sqlite_version-3.33.0_cstring_uninit_final&newcheck=tmux_2.6_cstring_uninit_final&newcheck=twin_v0.8.1_cstring_uninit_final&newcheck=vim_v8.2.1920_cstring_uninit_final&diff-type=Unresolved&items-per-page=100&checker-name=alpha.unix.cstring.UninitializedRead&report-id=5766433&report-hash=82c63868ba782bfa216ee3f2b5734d6b&report-filepath=%2atiertexseqv.c

In this bug report, block is uninitialized, hence the bug report that it
should not have been passed to memcpy. The confusing part is in line 93,
where block was passed as a non-const pointer to seq_unpack_rle_block,
which was obviously meant to initialize block. As developers, we know
that clang likely didn't skip this function and found a path of
execution on which this initialization failed, but NoStoreFuncVisitor
failed to attach the usual "returning without writing to block" message.

I fixed this by instead of tracking the entire array, I tracked the
actual element which was found to be uninitialized (Remember, we
heuristically only check if the first and last-to-access element is
initialized, not the entire array). This is how the bug report looks
now:

https://codechecker-demo.eastus.cloudapp.azure.com/Default/report-detail?is-unique=on&run=curl_curl-7_66_0_cstring_uninit_baseline&run=ffmpeg_n4.3.1_cstring_uninit_baseline&run=memcached_1.6.8_cstring_uninit_baseline&run=openssl_openssl-3.0.0-alpha7_cstring_uninit_baseline&run=sqlite_version-3.33.0_cstring_uninit_baseline&run=tmux_2.6_cstring_uninit_baseline&run=twin_v0.8.1_cstring_uninit_baseline&run=vim_v8.2.1920_cstring_uninit_baseline&newcheck=curl_curl-7_66_0_cstring_uninit_final&newcheck=ffmpeg_n4.3.1_cstring_uninit_final&newcheck=memcached_1.6.8_cstring_uninit_final&newcheck=openssl_openssl-3.0.0-alpha7_cstring_uninit_final&newcheck=sqlite_version-3.33.0_cstring_uninit_final&newcheck=tmux_2.6_cstring_uninit_final&newcheck=twin_v0.8.1_cstring_uninit_final&newcheck=vim_v8.2.1920_cstring_uninit_final&diff-type=Unresolved&items-per-page=100&checker-name=alpha.unix.cstring.UninitializedRead&report-id=5768860&report-hash=82c63868ba782bfa216ee3f2b5734d6b&report-filepath=%2atiertexseqv.c

Since NoStoreFuncVisitor was a TU-local class, I moved it back to
BugReporterVisitors.h, and registered it manually in CStringChecker.cpp.
This was done because we don't have a good trackRegionValue() function,
only a trackExpressionValue() function. We have an expression for the
array, but not for its first (or last-to-access) element, so I only had
a MemRegion on hand.
@llvmbot llvmbot added the clang Clang issues not falling into any other category label Sep 12, 2024
@llvmbot
Copy link
Member

llvmbot commented Sep 12, 2024

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-static-analyzer-1

Author: Kristóf Umann (Szelethus)

Changes

…ring.UninitRead

This is a drastic simplification of #106982. If you read that patch, this is the same thing with all BugReporterVisitors.cpp and SValBuilder.cpp changes removed! (since all replies came regarding changed to those files, I felt the new PR was justified)

The patch was inspired by a pretty poor bug report on FFMpeg:

https://codechecker-demo.eastus.cloudapp.azure.com/Default/report-detail?is-unique=on&run=curl_curl-7_66_0_cstring_uninit_baseline&run=ffmpeg_n4.3.1_cstring_uninit_baseline&run=memcached_1.6.8_cstring_uninit_baseline&run=openssl_openssl-3.0.0-alpha7_cstring_uninit_baseline&run=sqlite_version-3.33.0_cstring_uninit_baseline&run=tmux_2.6_cstring_uninit_baseline&run=twin_v0.8.1_cstring_uninit_baseline&run=vim_v8.2.1920_cstring_uninit_baseline&newcheck=curl_curl-7_66_0_cstring_uninit_final&newcheck=ffmpeg_n4.3.1_cstring_uninit_final&newcheck=memcached_1.6.8_cstring_uninit_final&newcheck=openssl_openssl-3.0.0-alpha7_cstring_uninit_final&newcheck=sqlite_version-3.33.0_cstring_uninit_final&newcheck=tmux_2.6_cstring_uninit_final&newcheck=twin_v0.8.1_cstring_uninit_final&newcheck=vim_v8.2.1920_cstring_uninit_final&diff-type=Unresolved&items-per-page=100&checker-name=alpha.unix.cstring.UninitializedRead&report-id=5766433&report-hash=82c63868ba782bfa216ee3f2b5734d6b&report-filepath=%2atiertexseqv.c

In this bug report, block is uninitialized, hence the bug report that it should not have been passed to memcpy. The confusing part is in line 93, where block was passed as a non-const pointer to seq_unpack_rle_block, which was obviously meant to initialize block. As developers, we know that clang likely didn't skip this function and found a path of execution on which this initialization failed, but NoStoreFuncVisitor failed to attach the usual "returning without writing to block" message.

I fixed this by instead of tracking the entire array, I tracked the actual element which was found to be uninitialized (Remember, we heuristically only check if the first and last-to-access element is initialized, not the entire array). This is how the bug report looks now:

https://codechecker-demo.eastus.cloudapp.azure.com/Default/report-detail?is-unique=on&run=curl_curl-7_66_0_cstring_uninit_baseline&run=ffmpeg_n4.3.1_cstring_uninit_baseline&run=memcached_1.6.8_cstring_uninit_baseline&run=openssl_openssl-3.0.0-alpha7_cstring_uninit_baseline&run=sqlite_version-3.33.0_cstring_uninit_baseline&run=tmux_2.6_cstring_uninit_baseline&run=twin_v0.8.1_cstring_uninit_baseline&run=vim_v8.2.1920_cstring_uninit_baseline&newcheck=curl_curl-7_66_0_cstring_uninit_final&newcheck=ffmpeg_n4.3.1_cstring_uninit_final&newcheck=memcached_1.6.8_cstring_uninit_final&newcheck=openssl_openssl-3.0.0-alpha7_cstring_uninit_final&newcheck=sqlite_version-3.33.0_cstring_uninit_final&newcheck=tmux_2.6_cstring_uninit_final&newcheck=twin_v0.8.1_cstring_uninit_final&newcheck=vim_v8.2.1920_cstring_uninit_final&diff-type=Unresolved&items-per-page=100&checker-name=alpha.unix.cstring.UninitializedRead&report-id=5768860&report-hash=82c63868ba782bfa216ee3f2b5734d6b&report-filepath=%2atiertexseqv.c

Since NoStoreFuncVisitor was a TU-local class, I moved it back to BugReporterVisitors.h, and registered it manually in CStringChecker.cpp. This was done because we don't have a good trackRegionValue() function, only a trackExpressionValue() function. We have an expression for the array, but not for its first (or last-to-access) element, so I only had a MemRegion on hand.


Full diff: https://github.com/llvm/llvm-project/pull/108373.diff

7 Files Affected:

  • (modified) clang/include/clang/StaticAnalyzer/Core/BugReporter/BugReporterVisitors.h (+85)
  • (modified) clang/include/clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h (+1)
  • (modified) clang/include/clang/StaticAnalyzer/Core/PathSensitive/SVals.h (+2)
  • (modified) clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp (+9-4)
  • (modified) clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp (-89)
  • (modified) clang/lib/StaticAnalyzer/Core/SVals.cpp (+17)
  • (added) clang/test/Analysis/cstring-uninitread-notes.c (+25)
diff --git a/clang/include/clang/StaticAnalyzer/Core/BugReporter/BugReporterVisitors.h b/clang/include/clang/StaticAnalyzer/Core/BugReporter/BugReporterVisitors.h
index f97514955a5913..305622d81dc3e7 100644
--- a/clang/include/clang/StaticAnalyzer/Core/BugReporter/BugReporterVisitors.h
+++ b/clang/include/clang/StaticAnalyzer/Core/BugReporter/BugReporterVisitors.h
@@ -718,6 +718,91 @@ class NoStateChangeFuncVisitor : public BugReporterVisitor {
                                    PathSensitiveBugReport &R) final;
 };
 
+/// Put a diagnostic on return statement of all inlined functions
+/// for which  the region of interest \p RegionOfInterest was passed into,
+/// but not written inside, and it has caused an undefined read or a null
+/// pointer dereference outside.
+class NoStoreFuncVisitor final : public NoStateChangeFuncVisitor {
+  const SubRegion *RegionOfInterest;
+  MemRegionManager &MmrMgr;
+  const SourceManager &SM;
+  const PrintingPolicy &PP;
+
+  /// Recursion limit for dereferencing fields when looking for the
+  /// region of interest.
+  /// The limit of two indicates that we will dereference fields only once.
+  static const unsigned DEREFERENCE_LIMIT = 2;
+
+  using RegionVector = SmallVector<const MemRegion *, 5>;
+
+public:
+  NoStoreFuncVisitor(
+      const SubRegion *R,
+      bugreporter::TrackingKind TKind = bugreporter::TrackingKind::Thorough)
+      : NoStateChangeFuncVisitor(TKind), RegionOfInterest(R),
+        MmrMgr(R->getMemRegionManager()),
+        SM(MmrMgr.getContext().getSourceManager()),
+        PP(MmrMgr.getContext().getPrintingPolicy()) {}
+
+  void Profile(llvm::FoldingSetNodeID &ID) const override {
+    static int Tag = 0;
+    ID.AddPointer(&Tag);
+    ID.AddPointer(RegionOfInterest);
+  }
+
+private:
+  /// \return Whether \c RegionOfInterest was modified at \p CurrN compared to
+  /// the value it holds in \p CallExitBeginN.
+  bool wasModifiedBeforeCallExit(const ExplodedNode *CurrN,
+                                 const ExplodedNode *CallExitBeginN) override;
+
+  /// Attempts to find the region of interest in a given record decl,
+  /// by either following the base classes or fields.
+  /// Dereferences fields up to a given recursion limit.
+  /// Note that \p Vec is passed by value, leading to quadratic copying cost,
+  /// but it's OK in practice since its length is limited to DEREFERENCE_LIMIT.
+  /// \return A chain fields leading to the region of interest or std::nullopt.
+  const std::optional<RegionVector>
+  findRegionOfInterestInRecord(const RecordDecl *RD, ProgramStateRef State,
+                               const MemRegion *R, const RegionVector &Vec = {},
+                               int depth = 0);
+
+  // Region of interest corresponds to an IVar, exiting a method
+  // which could have written into that IVar, but did not.
+  PathDiagnosticPieceRef maybeEmitNoteForObjCSelf(PathSensitiveBugReport &R,
+                                                  const ObjCMethodCall &Call,
+                                                  const ExplodedNode *N) final;
+
+  PathDiagnosticPieceRef maybeEmitNoteForCXXThis(PathSensitiveBugReport &R,
+                                                 const CXXConstructorCall &Call,
+                                                 const ExplodedNode *N) final;
+
+  PathDiagnosticPieceRef
+  maybeEmitNoteForParameters(PathSensitiveBugReport &R, const CallEvent &Call,
+                             const ExplodedNode *N) final;
+
+  /// Consume the information on the no-store stack frame in order to
+  /// either emit a note or suppress the report enirely.
+  /// \return Diagnostics piece for region not modified in the current function,
+  /// if it decides to emit one.
+  PathDiagnosticPieceRef
+  maybeEmitNote(PathSensitiveBugReport &R, const CallEvent &Call,
+                const ExplodedNode *N, const RegionVector &FieldChain,
+                const MemRegion *MatchedRegion, StringRef FirstElement,
+                bool FirstIsReferenceType, unsigned IndirectionLevel);
+
+  bool prettyPrintRegionName(const RegionVector &FieldChain,
+                             const MemRegion *MatchedRegion,
+                             StringRef FirstElement, bool FirstIsReferenceType,
+                             unsigned IndirectionLevel,
+                             llvm::raw_svector_ostream &os);
+
+  StringRef prettyPrintFirstElement(StringRef FirstElement,
+                                    bool MoreItemsExpected,
+                                    int IndirectionLevel,
+                                    llvm::raw_svector_ostream &os);
+};
+
 } // namespace ento
 } // namespace clang
 
diff --git a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h
index 0365f9e41312df..168983fd5cb686 100644
--- a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h
+++ b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h
@@ -69,6 +69,7 @@ class CheckerContext {
   /// the state of the program before the checker ran. Note, checkers should
   /// not retain the node in their state since the nodes might get invalidated.
   ExplodedNode *getPredecessor() { return Pred; }
+  const ProgramPoint getLocation() const { return Location; }
   const ProgramStateRef &getState() const { return Pred->getState(); }
 
   /// Check if the checker changed the state of the execution; ex: added
diff --git a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SVals.h b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SVals.h
index def2970d448d48..a054a819a15a85 100644
--- a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SVals.h
+++ b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SVals.h
@@ -89,6 +89,8 @@ class SVal {
 
   SValKind getKind() const { return Kind; }
 
+  StringRef getKindStr() const;
+
   // This method is required for using SVal in a FoldingSetNode.  It
   // extracts a unique signature for this SVal object.
   void Profile(llvm::FoldingSetNodeID &ID) const {
diff --git a/clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
index 8dd08f14b2728b..d8b818eead26be 100644
--- a/clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
@@ -16,6 +16,7 @@
 #include "clang/Basic/Builtins.h"
 #include "clang/Basic/CharInfo.h"
 #include "clang/StaticAnalyzer/Checkers/BuiltinCheckerRegistration.h"
+#include "clang/StaticAnalyzer/Core/BugReporter/BugReporterVisitors.h"
 #include "clang/StaticAnalyzer/Core/BugReporter/BugType.h"
 #include "clang/StaticAnalyzer/Core/Checker.h"
 #include "clang/StaticAnalyzer/Core/CheckerManager.h"
@@ -337,7 +338,8 @@ class CStringChecker : public Checker< eval::Call,
                          const Stmt *S, StringRef WarningMsg) const;
   void emitAdditionOverflowBug(CheckerContext &C, ProgramStateRef State) const;
   void emitUninitializedReadBug(CheckerContext &C, ProgramStateRef State,
-                                const Expr *E, StringRef Msg) const;
+                                const Expr *E, const MemRegion *R,
+                                StringRef Msg) const;
   ProgramStateRef checkAdditionOverflow(CheckerContext &C,
                                             ProgramStateRef state,
                                             NonLoc left,
@@ -474,7 +476,8 @@ ProgramStateRef CStringChecker::checkInit(CheckerContext &C,
     OS << "The first element of the ";
     printIdxWithOrdinalSuffix(OS, Buffer.ArgumentIndex + 1);
     OS << " argument is undefined";
-    emitUninitializedReadBug(C, State, Buffer.Expression, OS.str());
+    emitUninitializedReadBug(C, State, Buffer.Expression,
+                             FirstElementVal->getAsRegion(), OS.str());
     return nullptr;
   }
 
@@ -538,7 +541,8 @@ ProgramStateRef CStringChecker::checkInit(CheckerContext &C,
     OS << ") in the ";
     printIdxWithOrdinalSuffix(OS, Buffer.ArgumentIndex + 1);
     OS << " argument is undefined";
-    emitUninitializedReadBug(C, State, Buffer.Expression, OS.str());
+    emitUninitializedReadBug(C, State, Buffer.Expression,
+                             FirstElementVal->getAsRegion(), OS.str());
     return nullptr;
   }
   return State;
@@ -818,7 +822,7 @@ void CStringChecker::emitNullArgBug(CheckerContext &C, ProgramStateRef State,
 
 void CStringChecker::emitUninitializedReadBug(CheckerContext &C,
                                               ProgramStateRef State,
-                                              const Expr *E,
+                                              const Expr *E, const MemRegion *R,
                                               StringRef Msg) const {
   if (ExplodedNode *N = C.generateErrorNode(State)) {
     if (!BT_UninitRead)
@@ -831,6 +835,7 @@ void CStringChecker::emitUninitializedReadBug(CheckerContext &C,
                     Report->getLocation());
     Report->addRange(E->getSourceRange());
     bugreporter::trackExpressionValue(N, E, *Report);
+    Report->addVisitor<NoStoreFuncVisitor>(R->castAs<SubRegion>());
     C.emitReport(std::move(Report));
   }
 }
diff --git a/clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp b/clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
index 7102bf51a57e8b..68c8a8dc682507 100644
--- a/clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
+++ b/clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
@@ -522,95 +522,6 @@ PathDiagnosticPieceRef NoStateChangeFuncVisitor::VisitNode(
   return maybeEmitNoteForParameters(R, *Call, N);
 }
 
-//===----------------------------------------------------------------------===//
-// Implementation of NoStoreFuncVisitor.
-//===----------------------------------------------------------------------===//
-
-namespace {
-/// Put a diagnostic on return statement of all inlined functions
-/// for which  the region of interest \p RegionOfInterest was passed into,
-/// but not written inside, and it has caused an undefined read or a null
-/// pointer dereference outside.
-class NoStoreFuncVisitor final : public NoStateChangeFuncVisitor {
-  const SubRegion *RegionOfInterest;
-  MemRegionManager &MmrMgr;
-  const SourceManager &SM;
-  const PrintingPolicy &PP;
-
-  /// Recursion limit for dereferencing fields when looking for the
-  /// region of interest.
-  /// The limit of two indicates that we will dereference fields only once.
-  static const unsigned DEREFERENCE_LIMIT = 2;
-
-  using RegionVector = SmallVector<const MemRegion *, 5>;
-
-public:
-  NoStoreFuncVisitor(const SubRegion *R, bugreporter::TrackingKind TKind)
-      : NoStateChangeFuncVisitor(TKind), RegionOfInterest(R),
-        MmrMgr(R->getMemRegionManager()),
-        SM(MmrMgr.getContext().getSourceManager()),
-        PP(MmrMgr.getContext().getPrintingPolicy()) {}
-
-  void Profile(llvm::FoldingSetNodeID &ID) const override {
-    static int Tag = 0;
-    ID.AddPointer(&Tag);
-    ID.AddPointer(RegionOfInterest);
-  }
-
-private:
-  /// \return Whether \c RegionOfInterest was modified at \p CurrN compared to
-  /// the value it holds in \p CallExitBeginN.
-  bool wasModifiedBeforeCallExit(const ExplodedNode *CurrN,
-                                 const ExplodedNode *CallExitBeginN) override;
-
-  /// Attempts to find the region of interest in a given record decl,
-  /// by either following the base classes or fields.
-  /// Dereferences fields up to a given recursion limit.
-  /// Note that \p Vec is passed by value, leading to quadratic copying cost,
-  /// but it's OK in practice since its length is limited to DEREFERENCE_LIMIT.
-  /// \return A chain fields leading to the region of interest or std::nullopt.
-  const std::optional<RegionVector>
-  findRegionOfInterestInRecord(const RecordDecl *RD, ProgramStateRef State,
-                               const MemRegion *R, const RegionVector &Vec = {},
-                               int depth = 0);
-
-  // Region of interest corresponds to an IVar, exiting a method
-  // which could have written into that IVar, but did not.
-  PathDiagnosticPieceRef maybeEmitNoteForObjCSelf(PathSensitiveBugReport &R,
-                                                  const ObjCMethodCall &Call,
-                                                  const ExplodedNode *N) final;
-
-  PathDiagnosticPieceRef maybeEmitNoteForCXXThis(PathSensitiveBugReport &R,
-                                                 const CXXConstructorCall &Call,
-                                                 const ExplodedNode *N) final;
-
-  PathDiagnosticPieceRef
-  maybeEmitNoteForParameters(PathSensitiveBugReport &R, const CallEvent &Call,
-                             const ExplodedNode *N) final;
-
-  /// Consume the information on the no-store stack frame in order to
-  /// either emit a note or suppress the report enirely.
-  /// \return Diagnostics piece for region not modified in the current function,
-  /// if it decides to emit one.
-  PathDiagnosticPieceRef
-  maybeEmitNote(PathSensitiveBugReport &R, const CallEvent &Call,
-                const ExplodedNode *N, const RegionVector &FieldChain,
-                const MemRegion *MatchedRegion, StringRef FirstElement,
-                bool FirstIsReferenceType, unsigned IndirectionLevel);
-
-  bool prettyPrintRegionName(const RegionVector &FieldChain,
-                             const MemRegion *MatchedRegion,
-                             StringRef FirstElement, bool FirstIsReferenceType,
-                             unsigned IndirectionLevel,
-                             llvm::raw_svector_ostream &os);
-
-  StringRef prettyPrintFirstElement(StringRef FirstElement,
-                                    bool MoreItemsExpected,
-                                    int IndirectionLevel,
-                                    llvm::raw_svector_ostream &os);
-};
-} // namespace
-
 /// \return Whether the method declaration \p Parent
 /// syntactically has a binary operation writing into the ivar \p Ivar.
 static bool potentiallyWritesIntoIvar(const Decl *Parent,
diff --git a/clang/lib/StaticAnalyzer/Core/SVals.cpp b/clang/lib/StaticAnalyzer/Core/SVals.cpp
index 291e4fa752a8f7..84e7e033404c03 100644
--- a/clang/lib/StaticAnalyzer/Core/SVals.cpp
+++ b/clang/lib/StaticAnalyzer/Core/SVals.cpp
@@ -263,6 +263,23 @@ bool SVal::isZeroConstant() const {
 // Pretty-Printing.
 //===----------------------------------------------------------------------===//
 
+StringRef SVal::getKindStr() const {
+  switch (getKind()) {
+#define BASIC_SVAL(Id, Parent)                                                 \
+  case Id##Kind:                                                               \
+    return #Id;
+#define LOC_SVAL(Id, Parent)                                                   \
+  case Loc##Id##Kind:                                                          \
+    return #Id;
+#define NONLOC_SVAL(Id, Parent)                                                \
+  case NonLoc##Id##Kind:                                                       \
+    return #Id;
+#include "clang/StaticAnalyzer/Core/PathSensitive/SVals.def"
+#undef REGION
+  }
+  llvm_unreachable("Unkown kind!");
+}
+
 LLVM_DUMP_METHOD void SVal::dump() const { dumpToStream(llvm::errs()); }
 
 void SVal::printJson(raw_ostream &Out, bool AddQuotes) const {
diff --git a/clang/test/Analysis/cstring-uninitread-notes.c b/clang/test/Analysis/cstring-uninitread-notes.c
new file mode 100644
index 00000000000000..b62519a85c8cc9
--- /dev/null
+++ b/clang/test/Analysis/cstring-uninitread-notes.c
@@ -0,0 +1,25 @@
+// RUN: %clang_analyze_cc1 -verify %s \
+// RUN:   -analyzer-checker=core,alpha.unix.cstring \
+// RUN:   -analyzer-output=text
+
+#include "Inputs/system-header-simulator.h"
+
+// Inspired by a report on ffmpeg, libavcodec/tiertexseqv.c, seq_decode_op1().
+int coin();
+
+void maybeWrite(const char *src, unsigned size, int *dst) {
+  if (coin()) // expected-note{{Assuming the condition is false}}
+              // expected-note@-1{{Taking false branch}}
+    memcpy(dst, src, size);
+} // expected-note{{Returning without writing to '*dst'}}
+
+void returning_without_writing_to_memcpy(const char *src, unsigned size) {
+  int block[8 * 8]; // expected-note{{'block' initialized here}}
+                                // expected-note@+1{{Calling 'maybeWrite'}}
+  maybeWrite(src, size, block); // expected-note{{Returning from 'maybeWrite'}}
+
+  int buf[8 * 8];
+  memcpy(buf, &block[0], 8); // expected-warning{{The first element of the 2nd argument is undefined [alpha.unix.cstring.UninitializedRead]}}
+                             // expected-note@-1{{The first element of the 2nd argument is undefined}}
+                             // expected-note@-2{{Other elements might also be undefined}}
+}

Copy link
Contributor

@NagyDonat NagyDonat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this simplified variant, it's really straightforward now. I added a few inline comments (mostly copied from #106982), but after that I'm satisfied with the change.

@Szelethus Szelethus requested a review from NagyDonat September 16, 2024 12:03
Copy link
Contributor

@NagyDonat NagyDonat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change looks good to me and as it is fairly innocent, I think we can merge it soon (assuming that we don't get additional remarks from somebody else).

When you're merging this change please write a commit message that is self-contained and doesn't reference our codechecker server (because those links will eventually succumb to link rot, while the git commit log is eternal). I know that the graphical representation of the execution path is more readable than plain-text alternatives but you could explain the goals of the change e.g. by referencing the test file that you added.

@Szelethus Szelethus merged commit 752e103 into llvm:main Sep 19, 2024
8 checks passed
tmsri pushed a commit to tmsri/llvm-project that referenced this pull request Sep 19, 2024
llvm#108373)

…ring.UninitRead

This is a drastic simplification of llvm#106982. If you read that patch,
this is the same thing with all BugReporterVisitors.cpp and
SValBuilder.cpp changes removed! (since all replies came regarding
changed to those files, I felt the new PR was justified)

The patch was inspired by a pretty poor bug report on FFMpeg:

![image](https://github.com/user-attachments/assets/8f4e03d8-45a4-4ea2-a63d-3ab78d097be9)

In this bug report, block is uninitialized, hence the bug report that it
should not have been passed to memcpy. The confusing part is in line 93,
where block was passed as a non-const pointer to seq_unpack_rle_block,
which was obviously meant to initialize block. As developers, we know
that clang likely didn't skip this function and found a path of
execution on which this initialization failed, but NoStoreFuncVisitor
failed to attach the usual "returning without writing to block" message.

I fixed this by instead of tracking the entire array, I tracked the
actual element which was found to be uninitialized (Remember, we
heuristically only check if the first and last-to-access element is
initialized, not the entire array). This is how the bug report looks
now, with 'seq_unpack_rle_block' having notes describing the path of
execution and lack of a value change:

![image](https://github.com/user-attachments/assets/8de5d101-052e-4ecb-9cd9-7c29724333d2)

![image](https://github.com/user-attachments/assets/8bf52a95-62de-44e7-aef8-03a46a3fa08e)

Since NoStoreFuncVisitor was a TU-local class, I moved it back to
BugReporterVisitors.h, and registered it manually in CStringChecker.cpp.
This was done because we don't have a good trackRegionValue() function,
only a trackExpressionValue() function. We have an expression for the
array, but not for its first (or last-to-access) element, so I only had
a MemRegion on hand.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:static analyzer clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants