Skip to content

[analyzer] Implement binary operations on LazyCompoundVals #106982

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

Szelethus
Copy link
Contributor

@Szelethus Szelethus commented Sep 2, 2024

SValBuilder used to early exit when we did binary operations when one of the operands was a LazyCompoundVal. Riding on the new docs patch (#97407), this patch extracts the actual value behind the LazyCompoundVal, and proceeds with the operation.

This patch is unfortunately much larger than that, and initially, I'm only looking for feedback if I should split it up a bit more.

The goal of this patch was the following:

  1. Implement binary operations on LazyCompoundVals,
  2. Use this to be able to check if two LazyCompoundVals are equal,
  3. Which can be used in NoStoreFuncVisitor to attach "Returning without writing to x" notes even when the tracked variable is a LazyCompoundVal,
  4. Which is needed for alpha.unix.UninitializedRead, as elements of an array are often LazyCompoundVals.

This final point was inspired by a pretty poor bug report on FFMpeg:

https://codechecker-demo.eastus.cloudapp.azure.com/Default/report-detail?is-unique=on&run=curl_curl-7_66_0_cstring_uninit_baseline&run=ffmpeg_n4.3.1_cstring_uninit_baseline&run=memcached_1.6.8_cstring_uninit_baseline&run=openssl_openssl-3.0.0-alpha7_cstring_uninit_baseline&run=sqlite_version-3.33.0_cstring_uninit_baseline&run=tmux_2.6_cstring_uninit_baseline&run=twin_v0.8.1_cstring_uninit_baseline&run=vim_v8.2.1920_cstring_uninit_baseline&newcheck=curl_curl-7_66_0_cstring_uninit_final&newcheck=ffmpeg_n4.3.1_cstring_uninit_final&newcheck=memcached_1.6.8_cstring_uninit_final&newcheck=openssl_openssl-3.0.0-alpha7_cstring_uninit_final&newcheck=sqlite_version-3.33.0_cstring_uninit_final&newcheck=tmux_2.6_cstring_uninit_final&newcheck=twin_v0.8.1_cstring_uninit_final&newcheck=vim_v8.2.1920_cstring_uninit_final&diff-type=Unresolved&items-per-page=100&checker-name=alpha.unix.cstring.UninitializedRead&report-id=5766433&report-hash=82c63868ba782bfa216ee3f2b5734d6b&report-filepath=%2atiertexseqv.c

image

In this bug report, block is uninitialized, hence the bug report that it should not have been passed to memcpy. The confusing part is in line 93, where block was passed as a non-const pointer to seq_unpack_rle_block, which was obviously meant to initialize block. As developers, we know that clang likely didn't skip this function and found a path of execution on which this initialization failed, but NoStoreFuncVisitor failed to attach the usual "returning without writing to block" message.

I fixed this by instead of tracking the entire array, I tracked the actual element which was found to be uninitialized (Remember, we heuristically only check if the first and last-to-access element is initialized, not the entire array). This is how the bug report looks now:

https://codechecker-demo.eastus.cloudapp.azure.com/Default/report-detail?is-unique=on&run=curl_curl-7_66_0_cstring_uninit_baseline&run=ffmpeg_n4.3.1_cstring_uninit_baseline&run=memcached_1.6.8_cstring_uninit_baseline&run=openssl_openssl-3.0.0-alpha7_cstring_uninit_baseline&run=sqlite_version-3.33.0_cstring_uninit_baseline&run=tmux_2.6_cstring_uninit_baseline&run=twin_v0.8.1_cstring_uninit_baseline&run=vim_v8.2.1920_cstring_uninit_baseline&newcheck=curl_curl-7_66_0_cstring_uninit_final&newcheck=ffmpeg_n4.3.1_cstring_uninit_final&newcheck=memcached_1.6.8_cstring_uninit_final&newcheck=openssl_openssl-3.0.0-alpha7_cstring_uninit_final&newcheck=sqlite_version-3.33.0_cstring_uninit_final&newcheck=tmux_2.6_cstring_uninit_final&newcheck=twin_v0.8.1_cstring_uninit_final&newcheck=vim_v8.2.1920_cstring_uninit_final&diff-type=Unresolved&items-per-page=100&checker-name=alpha.unix.cstring.UninitializedRead&report-id=5768860&report-hash=82c63868ba782bfa216ee3f2b5734d6b&report-filepath=%2atiertexseqv.c

image

image

Since NoStoreFuncVisitor was a TU-local class, I moved it back to BugReporterVisitors.h, and registered it manually in CStringChecker.cpp. This was done because we don't have a good trackRegionValue() function, only a trackExpressionValue() function. We have an expression for the array, but not for its first (or last-to-access) element, so I only had a MemRegion on hand.

NoStoreFuncVisitor works by checking if the value of a region-of-interest changed in a function; if it wasn't, it emits a helpful diagnostic message. This relied on SValBuilder comparing values. Since it didn't support the comparison of LazyCompoundVals, which are often formed around C++ objects, it wasn't really prepped to deal with them. After my fix, I needed to iron out some kinds around constructors, hence the non-trivial changes made in BugReporterVisitors.cpp.

SValBuilder used to early exit when we did binary operations when one of
the operands was a LazyCompoundVal. Riding on the new docs patch
(llvm#97407), this patch extracts the actual value behind the
LazyCompoundVal, and proceeds with the operation.

This patch is unfortunately much larger than that, and initially, I'm
only looking for feedback if I should split it up a bit more.

The goal of this patch was the following:
1. Implement binary operations on LazyCompoundVals,
2. Use this to be able to check if two LazyCompoundVals are equal,
3. Which can be used in NoStoreFuncVisitor to attach "Returning without
   writing to x" notes even when the tracked variable is a
   LazyCompoundVal,
4. Which is needed for alpha.unix.UninitializedRead, as elements of an
   array are often LazyCompoundVals.

This final point was inspired by a pretty poor bug report on FFMpeg:

https://codechecker-demo.eastus.cloudapp.azure.com/Default/report-detail?is-unique=on&run=curl_curl-7_66_0_cstring_uninit_baseline&run=ffmpeg_n4.3.1_cstring_uninit_baseline&run=memcached_1.6.8_cstring_uninit_baseline&run=openssl_openssl-3.0.0-alpha7_cstring_uninit_baseline&run=sqlite_version-3.33.0_cstring_uninit_baseline&run=tmux_2.6_cstring_uninit_baseline&run=twin_v0.8.1_cstring_uninit_baseline&run=vim_v8.2.1920_cstring_uninit_baseline&newcheck=curl_curl-7_66_0_cstring_uninit_final&newcheck=ffmpeg_n4.3.1_cstring_uninit_final&newcheck=memcached_1.6.8_cstring_uninit_final&newcheck=openssl_openssl-3.0.0-alpha7_cstring_uninit_final&newcheck=sqlite_version-3.33.0_cstring_uninit_final&newcheck=tmux_2.6_cstring_uninit_final&newcheck=twin_v0.8.1_cstring_uninit_final&newcheck=vim_v8.2.1920_cstring_uninit_final&diff-type=Unresolved&items-per-page=100&checker-name=alpha.unix.cstring.UninitializedRead&report-id=5766433&report-hash=82c63868ba782bfa216ee3f2b5734d6b&report-filepath=%2atiertexseqv.c

In this bug report, block is uninitialized, hence the bug report that it
should not have been passed to memcpy. The confusing part is in line 93,
where block was passed as a non-const pointer to seq_unpack_rle_block,
which was obviously meant to initialize block. As developers, we know
that clang likely didn't skip this function and found a path of
execution on which this initialization failed, but NoStoreFuncVisitor
failed to attach the usual "returning without writing to block" message.

I fixed this by instead of tracking the entire array, I tracked the
actual element which was found to be uninitialized (Remember, we
heuristically only check if the first and last-to-access element is
initialized, not the entire array). This is how the bug report looks
now:

https://codechecker-demo.eastus.cloudapp.azure.com/Default/report-detail?is-unique=on&run=curl_curl-7_66_0_cstring_uninit_baseline&run=ffmpeg_n4.3.1_cstring_uninit_baseline&run=memcached_1.6.8_cstring_uninit_baseline&run=openssl_openssl-3.0.0-alpha7_cstring_uninit_baseline&run=sqlite_version-3.33.0_cstring_uninit_baseline&run=tmux_2.6_cstring_uninit_baseline&run=twin_v0.8.1_cstring_uninit_baseline&run=vim_v8.2.1920_cstring_uninit_baseline&newcheck=curl_curl-7_66_0_cstring_uninit_final&newcheck=ffmpeg_n4.3.1_cstring_uninit_final&newcheck=memcached_1.6.8_cstring_uninit_final&newcheck=openssl_openssl-3.0.0-alpha7_cstring_uninit_final&newcheck=sqlite_version-3.33.0_cstring_uninit_final&newcheck=tmux_2.6_cstring_uninit_final&newcheck=twin_v0.8.1_cstring_uninit_final&newcheck=vim_v8.2.1920_cstring_uninit_final&diff-type=Unresolved&items-per-page=100&checker-name=alpha.unix.cstring.UninitializedRead&report-id=5768860&report-hash=82c63868ba782bfa216ee3f2b5734d6b&report-filepath=%2atiertexseqv.c

Since NoStoreFuncVisitor was a TU-local class, I moved it back to
BugReporterVisitors.h, and registered it manually in CStringChecker.cpp.
This was done because we don't have a good trackRegionValue() function,
only a trackExpressionValue() function. We have an expression for the
array, but not for its first (or last-to-access) element, so I only had
a MemRegion on hand.

NoStoreFuncVisitor works by checking if the value of a
region-of-interest changed in a function; if it wasn't, it emits a
helpful diagnostic message. This relied on SValBuilder comparing values.
Since it didn't support the comparison of LazyCompoundVals, which are
often formed around C++ objects, it wasn't really prepped to deal with
them. After my fix, I needed to iron out some kinds around constructors,
hence the non-trivial changes made in BugReporterVisitors.cpp.
@llvmbot llvmbot added the clang Clang issues not falling into any other category label Sep 2, 2024
@llvmbot
Copy link
Member

llvmbot commented Sep 2, 2024

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-static-analyzer-1

Author: Kristóf Umann (Szelethus)

Changes

SValBuilder used to early exit when we did binary operations when one of the operands was a LazyCompoundVal. Riding on the new docs patch (#97407), this patch extracts the actual value behind the LazyCompoundVal, and proceeds with the operation.

This patch is unfortunately much larger than that, and initially, I'm only looking for feedback if I should split it up a bit more.

The goal of this patch was the following:

  1. Implement binary operations on LazyCompoundVals,
  2. Use this to be able to check if two LazyCompoundVals are equal,
  3. Which can be used in NoStoreFuncVisitor to attach "Returning without writing to x" notes even when the tracked variable is a LazyCompoundVal,
  4. Which is needed for alpha.unix.UninitializedRead, as elements of an array are often LazyCompoundVals.

This final point was inspired by a pretty poor bug report on FFMpeg:

https://codechecker-demo.eastus.cloudapp.azure.com/Default/report-detail?is-unique=on&run=curl_curl-7_66_0_cstring_uninit_baseline&run=ffmpeg_n4.3.1_cstring_uninit_baseline&run=memcached_1.6.8_cstring_uninit_baseline&run=openssl_openssl-3.0.0-alpha7_cstring_uninit_baseline&run=sqlite_version-3.33.0_cstring_uninit_baseline&run=tmux_2.6_cstring_uninit_baseline&run=twin_v0.8.1_cstring_uninit_baseline&run=vim_v8.2.1920_cstring_uninit_baseline&newcheck=curl_curl-7_66_0_cstring_uninit_final&newcheck=ffmpeg_n4.3.1_cstring_uninit_final&newcheck=memcached_1.6.8_cstring_uninit_final&newcheck=openssl_openssl-3.0.0-alpha7_cstring_uninit_final&newcheck=sqlite_version-3.33.0_cstring_uninit_final&newcheck=tmux_2.6_cstring_uninit_final&newcheck=twin_v0.8.1_cstring_uninit_final&newcheck=vim_v8.2.1920_cstring_uninit_final&diff-type=Unresolved&items-per-page=100&checker-name=alpha.unix.cstring.UninitializedRead&report-id=5766433&report-hash=82c63868ba782bfa216ee3f2b5734d6b&report-filepath=%2atiertexseqv.c

In this bug report, block is uninitialized, hence the bug report that it should not have been passed to memcpy. The confusing part is in line 93, where block was passed as a non-const pointer to seq_unpack_rle_block, which was obviously meant to initialize block. As developers, we know that clang likely didn't skip this function and found a path of execution on which this initialization failed, but NoStoreFuncVisitor failed to attach the usual "returning without writing to block" message.

I fixed this by instead of tracking the entire array, I tracked the actual element which was found to be uninitialized (Remember, we heuristically only check if the first and last-to-access element is initialized, not the entire array). This is how the bug report looks now:

https://codechecker-demo.eastus.cloudapp.azure.com/Default/report-detail?is-unique=on&run=curl_curl-7_66_0_cstring_uninit_baseline&run=ffmpeg_n4.3.1_cstring_uninit_baseline&run=memcached_1.6.8_cstring_uninit_baseline&run=openssl_openssl-3.0.0-alpha7_cstring_uninit_baseline&run=sqlite_version-3.33.0_cstring_uninit_baseline&run=tmux_2.6_cstring_uninit_baseline&run=twin_v0.8.1_cstring_uninit_baseline&run=vim_v8.2.1920_cstring_uninit_baseline&newcheck=curl_curl-7_66_0_cstring_uninit_final&newcheck=ffmpeg_n4.3.1_cstring_uninit_final&newcheck=memcached_1.6.8_cstring_uninit_final&newcheck=openssl_openssl-3.0.0-alpha7_cstring_uninit_final&newcheck=sqlite_version-3.33.0_cstring_uninit_final&newcheck=tmux_2.6_cstring_uninit_final&newcheck=twin_v0.8.1_cstring_uninit_final&newcheck=vim_v8.2.1920_cstring_uninit_final&diff-type=Unresolved&items-per-page=100&checker-name=alpha.unix.cstring.UninitializedRead&report-id=5768860&report-hash=82c63868ba782bfa216ee3f2b5734d6b&report-filepath=%2atiertexseqv.c

Since NoStoreFuncVisitor was a TU-local class, I moved it back to BugReporterVisitors.h, and registered it manually in CStringChecker.cpp. This was done because we don't have a good trackRegionValue() function, only a trackExpressionValue() function. We have an expression for the array, but not for its first (or last-to-access) element, so I only had a MemRegion on hand.

NoStoreFuncVisitor works by checking if the value of a region-of-interest changed in a function; if it wasn't, it emits a helpful diagnostic message. This relied on SValBuilder comparing values. Since it didn't support the comparison of LazyCompoundVals, which are often formed around C++ objects, it wasn't really prepped to deal with them. After my fix, I needed to iron out some kinds around constructors, hence the non-trivial changes made in BugReporterVisitors.cpp.


Patch is 23.08 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/106982.diff

8 Files Affected:

  • (modified) clang/include/clang/StaticAnalyzer/Core/BugReporter/BugReporterVisitors.h (+85)
  • (modified) clang/include/clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h (+1)
  • (modified) clang/include/clang/StaticAnalyzer/Core/PathSensitive/SVals.h (+2)
  • (modified) clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp (+9-4)
  • (modified) clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp (+73-86)
  • (modified) clang/lib/StaticAnalyzer/Core/SValBuilder.cpp (+56-2)
  • (modified) clang/lib/StaticAnalyzer/Core/SVals.cpp (+17)
  • (added) clang/test/Analysis/cstring-uninitread-notes.c (+25)
diff --git a/clang/include/clang/StaticAnalyzer/Core/BugReporter/BugReporterVisitors.h b/clang/include/clang/StaticAnalyzer/Core/BugReporter/BugReporterVisitors.h
index f97514955a5913..305622d81dc3e7 100644
--- a/clang/include/clang/StaticAnalyzer/Core/BugReporter/BugReporterVisitors.h
+++ b/clang/include/clang/StaticAnalyzer/Core/BugReporter/BugReporterVisitors.h
@@ -718,6 +718,91 @@ class NoStateChangeFuncVisitor : public BugReporterVisitor {
                                    PathSensitiveBugReport &R) final;
 };
 
+/// Put a diagnostic on return statement of all inlined functions
+/// for which  the region of interest \p RegionOfInterest was passed into,
+/// but not written inside, and it has caused an undefined read or a null
+/// pointer dereference outside.
+class NoStoreFuncVisitor final : public NoStateChangeFuncVisitor {
+  const SubRegion *RegionOfInterest;
+  MemRegionManager &MmrMgr;
+  const SourceManager &SM;
+  const PrintingPolicy &PP;
+
+  /// Recursion limit for dereferencing fields when looking for the
+  /// region of interest.
+  /// The limit of two indicates that we will dereference fields only once.
+  static const unsigned DEREFERENCE_LIMIT = 2;
+
+  using RegionVector = SmallVector<const MemRegion *, 5>;
+
+public:
+  NoStoreFuncVisitor(
+      const SubRegion *R,
+      bugreporter::TrackingKind TKind = bugreporter::TrackingKind::Thorough)
+      : NoStateChangeFuncVisitor(TKind), RegionOfInterest(R),
+        MmrMgr(R->getMemRegionManager()),
+        SM(MmrMgr.getContext().getSourceManager()),
+        PP(MmrMgr.getContext().getPrintingPolicy()) {}
+
+  void Profile(llvm::FoldingSetNodeID &ID) const override {
+    static int Tag = 0;
+    ID.AddPointer(&Tag);
+    ID.AddPointer(RegionOfInterest);
+  }
+
+private:
+  /// \return Whether \c RegionOfInterest was modified at \p CurrN compared to
+  /// the value it holds in \p CallExitBeginN.
+  bool wasModifiedBeforeCallExit(const ExplodedNode *CurrN,
+                                 const ExplodedNode *CallExitBeginN) override;
+
+  /// Attempts to find the region of interest in a given record decl,
+  /// by either following the base classes or fields.
+  /// Dereferences fields up to a given recursion limit.
+  /// Note that \p Vec is passed by value, leading to quadratic copying cost,
+  /// but it's OK in practice since its length is limited to DEREFERENCE_LIMIT.
+  /// \return A chain fields leading to the region of interest or std::nullopt.
+  const std::optional<RegionVector>
+  findRegionOfInterestInRecord(const RecordDecl *RD, ProgramStateRef State,
+                               const MemRegion *R, const RegionVector &Vec = {},
+                               int depth = 0);
+
+  // Region of interest corresponds to an IVar, exiting a method
+  // which could have written into that IVar, but did not.
+  PathDiagnosticPieceRef maybeEmitNoteForObjCSelf(PathSensitiveBugReport &R,
+                                                  const ObjCMethodCall &Call,
+                                                  const ExplodedNode *N) final;
+
+  PathDiagnosticPieceRef maybeEmitNoteForCXXThis(PathSensitiveBugReport &R,
+                                                 const CXXConstructorCall &Call,
+                                                 const ExplodedNode *N) final;
+
+  PathDiagnosticPieceRef
+  maybeEmitNoteForParameters(PathSensitiveBugReport &R, const CallEvent &Call,
+                             const ExplodedNode *N) final;
+
+  /// Consume the information on the no-store stack frame in order to
+  /// either emit a note or suppress the report enirely.
+  /// \return Diagnostics piece for region not modified in the current function,
+  /// if it decides to emit one.
+  PathDiagnosticPieceRef
+  maybeEmitNote(PathSensitiveBugReport &R, const CallEvent &Call,
+                const ExplodedNode *N, const RegionVector &FieldChain,
+                const MemRegion *MatchedRegion, StringRef FirstElement,
+                bool FirstIsReferenceType, unsigned IndirectionLevel);
+
+  bool prettyPrintRegionName(const RegionVector &FieldChain,
+                             const MemRegion *MatchedRegion,
+                             StringRef FirstElement, bool FirstIsReferenceType,
+                             unsigned IndirectionLevel,
+                             llvm::raw_svector_ostream &os);
+
+  StringRef prettyPrintFirstElement(StringRef FirstElement,
+                                    bool MoreItemsExpected,
+                                    int IndirectionLevel,
+                                    llvm::raw_svector_ostream &os);
+};
+
 } // namespace ento
 } // namespace clang
 
diff --git a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h
index 0365f9e41312df..168983fd5cb686 100644
--- a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h
+++ b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h
@@ -69,6 +69,7 @@ class CheckerContext {
   /// the state of the program before the checker ran. Note, checkers should
   /// not retain the node in their state since the nodes might get invalidated.
   ExplodedNode *getPredecessor() { return Pred; }
+  const ProgramPoint getLocation() const { return Location; }
   const ProgramStateRef &getState() const { return Pred->getState(); }
 
   /// Check if the checker changed the state of the execution; ex: added
diff --git a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SVals.h b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SVals.h
index def2970d448d48..a054a819a15a85 100644
--- a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SVals.h
+++ b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SVals.h
@@ -89,6 +89,8 @@ class SVal {
 
   SValKind getKind() const { return Kind; }
 
+  StringRef getKindStr() const;
+
   // This method is required for using SVal in a FoldingSetNode.  It
   // extracts a unique signature for this SVal object.
   void Profile(llvm::FoldingSetNodeID &ID) const {
diff --git a/clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
index 8dd08f14b2728b..d8b818eead26be 100644
--- a/clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
@@ -16,6 +16,7 @@
 #include "clang/Basic/Builtins.h"
 #include "clang/Basic/CharInfo.h"
 #include "clang/StaticAnalyzer/Checkers/BuiltinCheckerRegistration.h"
+#include "clang/StaticAnalyzer/Core/BugReporter/BugReporterVisitors.h"
 #include "clang/StaticAnalyzer/Core/BugReporter/BugType.h"
 #include "clang/StaticAnalyzer/Core/Checker.h"
 #include "clang/StaticAnalyzer/Core/CheckerManager.h"
@@ -337,7 +338,8 @@ class CStringChecker : public Checker< eval::Call,
                          const Stmt *S, StringRef WarningMsg) const;
   void emitAdditionOverflowBug(CheckerContext &C, ProgramStateRef State) const;
   void emitUninitializedReadBug(CheckerContext &C, ProgramStateRef State,
-                                const Expr *E, StringRef Msg) const;
+                                const Expr *E, const MemRegion *R,
+                                StringRef Msg) const;
   ProgramStateRef checkAdditionOverflow(CheckerContext &C,
                                             ProgramStateRef state,
                                             NonLoc left,
@@ -474,7 +476,8 @@ ProgramStateRef CStringChecker::checkInit(CheckerContext &C,
     OS << "The first element of the ";
     printIdxWithOrdinalSuffix(OS, Buffer.ArgumentIndex + 1);
     OS << " argument is undefined";
-    emitUninitializedReadBug(C, State, Buffer.Expression, OS.str());
+    emitUninitializedReadBug(C, State, Buffer.Expression,
+                             FirstElementVal->getAsRegion(), OS.str());
     return nullptr;
   }
 
@@ -538,7 +541,8 @@ ProgramStateRef CStringChecker::checkInit(CheckerContext &C,
     OS << ") in the ";
     printIdxWithOrdinalSuffix(OS, Buffer.ArgumentIndex + 1);
     OS << " argument is undefined";
-    emitUninitializedReadBug(C, State, Buffer.Expression, OS.str());
+    emitUninitializedReadBug(C, State, Buffer.Expression,
+                             FirstElementVal->getAsRegion(), OS.str());
     return nullptr;
   }
   return State;
@@ -818,7 +822,7 @@ void CStringChecker::emitNullArgBug(CheckerContext &C, ProgramStateRef State,
 
 void CStringChecker::emitUninitializedReadBug(CheckerContext &C,
                                               ProgramStateRef State,
-                                              const Expr *E,
+                                              const Expr *E, const MemRegion *R,
                                               StringRef Msg) const {
   if (ExplodedNode *N = C.generateErrorNode(State)) {
     if (!BT_UninitRead)
@@ -831,6 +835,7 @@ void CStringChecker::emitUninitializedReadBug(CheckerContext &C,
                     Report->getLocation());
     Report->addRange(E->getSourceRange());
     bugreporter::trackExpressionValue(N, E, *Report);
+    Report->addVisitor<NoStoreFuncVisitor>(R->castAs<SubRegion>());
     C.emitReport(std::move(Report));
   }
 }
diff --git a/clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp b/clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
index 7102bf51a57e8b..9ec287d9044eac 100644
--- a/clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
+++ b/clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
@@ -315,6 +315,62 @@ static bool isFunctionMacroExpansion(SourceLocation Loc,
   return EInfo.isFunctionMacroExpansion();
 }
 
+static const LocationContext *getFirstNonCtorCall(const LocationContext *LCtx) {
+  while (llvm::isa_and_nonnull<CXXConstructorDecl>(LCtx->getDecl()))
+    LCtx = LCtx->getParent();
+  return LCtx;
+}
+
+static const MemRegion *getInitializerRegion(const PostInitializer &PI) {
+  return reinterpret_cast<const MemRegion *>(PI.getLocationValue());
+}
+
+/// Based largely on isInitializationOfVar(). Checks if N initializes FR.
+static bool isInitializationOfField(const ExplodedNode *N,
+                                    const FieldRegion *FR) {
+  std::optional<PostInitializer> P = N->getLocationAs<PostInitializer>();
+  if (!P)
+    return false;
+
+  const MemRegion *InitR = getInitializerRegion(*P);
+
+  if (FR != InitR)
+    return false;
+
+  const MemSpaceRegion *FRSpace = FR->getMemorySpace();
+  const auto *FRCtx = dyn_cast<StackSpaceRegion>(FRSpace);
+
+  if (!FRCtx)
+    return true;
+
+  // The stack frame of N is the constructor, but the FieldRegion is not local
+  // to the constructor, but rather to the caller function.
+  const LocationContext *CallerCtx =
+      getFirstNonCtorCall(N->getLocationContext());
+
+  return FRCtx->getStackFrame() == CallerCtx->getStackFrame();
+}
+
+static bool isConstructorCallFor(const SubRegion *RegionOfInterest,
+                                 const ExplodedNode *N) {
+
+  if (N->getStackFrame()->inTopFrame())
+    return false;
+
+  ProgramStateRef State = N->getState();
+
+  CallEventRef<> Call =
+      State->getStateManager().getCallEventManager().getCaller(
+          N->getStackFrame(), State);
+
+  if (const CXXConstructorCall *CC = dyn_cast<CXXConstructorCall>(Call)) {
+    if (CC->getCXXThisVal().getAsRegion() == RegionOfInterest) {
+      return true;
+    }
+  }
+  return false;
+}
+
 /// \return Whether \c RegionOfInterest was modified at \p N,
 /// where \p ValueAfter is \c RegionOfInterest's value at the end of the
 /// stack frame.
@@ -324,10 +380,22 @@ static bool wasRegionOfInterestModifiedAt(const SubRegion *RegionOfInterest,
   ProgramStateRef State = N->getState();
   ProgramStateManager &Mgr = N->getState()->getStateManager();
 
+  // If the region of interest is constructed right here, it is obviously
+  // modifying. This gets rid of notes like "Retuning without writing to *this",
+  // which makes no sense right in the constructor call.
+  if (isConstructorCallFor(RegionOfInterest, N))
+    return true;
+
   if (!N->getLocationAs<PostStore>() && !N->getLocationAs<PostInitializer>() &&
       !N->getLocationAs<PostStmt>())
     return false;
 
+  // If we are tracking a field, check if we reached its initialization point
+  // in an initializer list.
+  if (const FieldRegion *FR = RegionOfInterest->getAs<FieldRegion>())
+    if (isInitializationOfField(N, FR))
+      return true;
+
   // Writing into region of interest.
   if (auto PS = N->getLocationAs<PostStmt>())
     if (auto *BO = PS->getStmtAs<BinaryOperator>())
@@ -335,8 +403,12 @@ static bool wasRegionOfInterestModifiedAt(const SubRegion *RegionOfInterest,
                                       N->getSVal(BO->getLHS()).getAsRegion()))
         return true;
 
-  // SVal after the state is possibly different.
   SVal ValueAtN = N->getState()->getSVal(RegionOfInterest);
+
+  // If ValueAtN and ValueAfter are different, then RegionOfInterest was
+  // modified. We also need to handle the special case of undefined values,
+  // we need to check that the other value is not undefined, only then was
+  // the region modified.
   if (!Mgr.getSValBuilder()
            .areEqual(State, ValueAtN, ValueAfter)
            .isConstrainedTrue() &&
@@ -526,91 +598,6 @@ PathDiagnosticPieceRef NoStateChangeFuncVisitor::VisitNode(
 // Implementation of NoStoreFuncVisitor.
 //===----------------------------------------------------------------------===//
 
-namespace {
-/// Put a diagnostic on return statement of all inlined functions
-/// for which  the region of interest \p RegionOfInterest was passed into,
-/// but not written inside, and it has caused an undefined read or a null
-/// pointer dereference outside.
-class NoStoreFuncVisitor final : public NoStateChangeFuncVisitor {
-  const SubRegion *RegionOfInterest;
-  MemRegionManager &MmrMgr;
-  const SourceManager &SM;
-  const PrintingPolicy &PP;
-
-  /// Recursion limit for dereferencing fields when looking for the
-  /// region of interest.
-  /// The limit of two indicates that we will dereference fields only once.
-  static const unsigned DEREFERENCE_LIMIT = 2;
-
-  using RegionVector = SmallVector<const MemRegion *, 5>;
-
-public:
-  NoStoreFuncVisitor(const SubRegion *R, bugreporter::TrackingKind TKind)
-      : NoStateChangeFuncVisitor(TKind), RegionOfInterest(R),
-        MmrMgr(R->getMemRegionManager()),
-        SM(MmrMgr.getContext().getSourceManager()),
-        PP(MmrMgr.getContext().getPrintingPolicy()) {}
-
-  void Profile(llvm::FoldingSetNodeID &ID) const override {
-    static int Tag = 0;
-    ID.AddPointer(&Tag);
-    ID.AddPointer(RegionOfInterest);
-  }
-
-private:
-  /// \return Whether \c RegionOfInterest was modified at \p CurrN compared to
-  /// the value it holds in \p CallExitBeginN.
-  bool wasModifiedBeforeCallExit(const ExplodedNode *CurrN,
-                                 const ExplodedNode *CallExitBeginN) override;
-
-  /// Attempts to find the region of interest in a given record decl,
-  /// by either following the base classes or fields.
-  /// Dereferences fields up to a given recursion limit.
-  /// Note that \p Vec is passed by value, leading to quadratic copying cost,
-  /// but it's OK in practice since its length is limited to DEREFERENCE_LIMIT.
-  /// \return A chain fields leading to the region of interest or std::nullopt.
-  const std::optional<RegionVector>
-  findRegionOfInterestInRecord(const RecordDecl *RD, ProgramStateRef State,
-                               const MemRegion *R, const RegionVector &Vec = {},
-                               int depth = 0);
-
-  // Region of interest corresponds to an IVar, exiting a method
-  // which could have written into that IVar, but did not.
-  PathDiagnosticPieceRef maybeEmitNoteForObjCSelf(PathSensitiveBugReport &R,
-                                                  const ObjCMethodCall &Call,
-                                                  const ExplodedNode *N) final;
-
-  PathDiagnosticPieceRef maybeEmitNoteForCXXThis(PathSensitiveBugReport &R,
-                                                 const CXXConstructorCall &Call,
-                                                 const ExplodedNode *N) final;
-
-  PathDiagnosticPieceRef
-  maybeEmitNoteForParameters(PathSensitiveBugReport &R, const CallEvent &Call,
-                             const ExplodedNode *N) final;
-
-  /// Consume the information on the no-store stack frame in order to
-  /// either emit a note or suppress the report enirely.
-  /// \return Diagnostics piece for region not modified in the current function,
-  /// if it decides to emit one.
-  PathDiagnosticPieceRef
-  maybeEmitNote(PathSensitiveBugReport &R, const CallEvent &Call,
-                const ExplodedNode *N, const RegionVector &FieldChain,
-                const MemRegion *MatchedRegion, StringRef FirstElement,
-                bool FirstIsReferenceType, unsigned IndirectionLevel);
-
-  bool prettyPrintRegionName(const RegionVector &FieldChain,
-                             const MemRegion *MatchedRegion,
-                             StringRef FirstElement, bool FirstIsReferenceType,
-                             unsigned IndirectionLevel,
-                             llvm::raw_svector_ostream &os);
-
-  StringRef prettyPrintFirstElement(StringRef FirstElement,
-                                    bool MoreItemsExpected,
-                                    int IndirectionLevel,
-                                    llvm::raw_svector_ostream &os);
-};
-} // namespace
-
 /// \return Whether the method declaration \p Parent
 /// syntactically has a binary operation writing into the ivar \p Ivar.
 static bool potentiallyWritesIntoIvar(const Decl *Parent,
diff --git a/clang/lib/StaticAnalyzer/Core/SValBuilder.cpp b/clang/lib/StaticAnalyzer/Core/SValBuilder.cpp
index eb9cde5c8918fc..c40f686818db8e 100644
--- a/clang/lib/StaticAnalyzer/Core/SValBuilder.cpp
+++ b/clang/lib/StaticAnalyzer/Core/SValBuilder.cpp
@@ -36,6 +36,7 @@
 #include "llvm/ADT/APSInt.h"
 #include "llvm/Support/Casting.h"
 #include "llvm/Support/Compiler.h"
+#include "llvm/Support/ErrorHandling.h"
 #include <cassert>
 #include <optional>
 #include <tuple>
@@ -490,6 +491,47 @@ SVal SValBuilder::evalUnaryOp(ProgramStateRef state, UnaryOperator::Opcode opc,
   llvm_unreachable("Unexpected unary operator");
 }
 
+namespace {
+/// Iterate through to store to find the actual value this LazyCompoundVal
+/// corresponds to. Further reading is in LazyCompoundVal's docs.
+struct LazyHandler : public StoreManager::BindingsHandler {
+  nonloc::LazyCompoundVal l;
+  std::optional<SVal> &Ret;
+
+  LazyHandler(nonloc::LazyCompoundVal l, std::optional<SVal> &Ret)
+      : l(l), Ret(Ret) {}
+
+  virtual bool HandleBinding(StoreManager &SMgr, Store Store,
+                             const MemRegion *Region, SVal Val) override {
+    if (const MemRegion *MR = Val.getAsRegion()) {
+      if (MR->isSubRegionOf(l.getRegion()) && MR != l.getAsRegion()) {
+        Ret = Val;
+        return false;
+      }
+    }
+    return true;
+  }
+};
+} // namespace
+
+/// Find the actual value behind a LazyCompoundVal. May return none if the
+/// query fails, or only finds another LazyCompoundVal (e.g. circular
+/// reference).
+static std::optional<SVal> extractActualValueFrom(ProgramStateRef State,
+                                                  nonloc::LazyCompoundVal L) {
+  std::optional<SVal> Ret;
+  LazyHandler handler{L, Ret};
+  State->getStateManager().getStoreManager().iterBindings(State->getStore(),
+                                                          handler);
+  if (Ret) {
+    std::optional<SVal> RVal = Ret = State->getSVal(Ret->getAsRegion());
+    // If our best efforts lead us back to another LazyCompoundValue, give up.
+    if (Ret == RVal)
+      return {};
+  }
+  return Ret;
+}
+
 SVal SValBuilder::evalBinOp(ProgramStateRef state, BinaryOperator::Opcode op,
                             SVal lhs, SVal rhs, QualType type) {
   if (lhs.isUndef() || rhs.isUndef())
@@ -498,8 +540,20 @@ SVal SValBuilder::evalBinOp(ProgramStateRef state, BinaryOperator::Opcode op,
   if (lhs.isUnknown() || rhs.isUnknown())
     return UnknownVal();
 
-  if (isa<nonloc::LazyCompoundVal>(lhs) || isa<nonloc::LazyCompoundVal>(rhs)) {
-    return UnknownVal();
+  if (isa<nonloc::LazyCompoundVal>(lhs)) {
+    std::optional<SVal> LhsVal =
+        extractActualValueFrom(state, lhs.castAs<nonloc::La...
[truncated]

Copy link
Contributor

@NagyDonat NagyDonat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First part of my review, I'll continue tomorrow.

Comment on lines +733 to +734
/// The limit of two indicates that we will dereference fields only once.
static const unsigned DEREFERENCE_LIMIT = 2;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider replacing this with MAX_DEREFERENCE_COUNT = 1 which would be mostly self-documenting. (However, the current name is also OK if you prefer it.)

EDIT: Now I see that this is not new code, just moved from elsewhere. Feel free to ignore this if you're not interested.

const ExplodedNode *N) final;

/// Consume the information on the no-store stack frame in order to
/// either emit a note or suppress the report enirely.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// either emit a note or suppress the report enirely.
/// either emit a note or suppress the report entirely.

Just a typo.

@@ -538,7 +541,8 @@ ProgramStateRef CStringChecker::checkInit(CheckerContext &C,
OS << ") in the ";
printIdxWithOrdinalSuffix(OS, Buffer.ArgumentIndex + 1);
OS << " argument is undefined";
emitUninitializedReadBug(C, State, Buffer.Expression, OS.str());
emitUninitializedReadBug(C, State, Buffer.Expression,
FirstElementVal->getAsRegion(), OS.str());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
FirstElementVal->getAsRegion(), OS.str());
LastElementVal->getAsRegion(), OS.str());

}

static const MemRegion *getInitializerRegion(const PostInitializer &PI) {
return reinterpret_cast<const MemRegion *>(PI.getLocationValue());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This reinterpret cast is really scary, but I see that this is apparently the intended way of using PostInitializer objects (which is a bit crazy). Perhaps add an explanation comment.

It would be good to have a templated PostInitializer::getLocationValue<T>() const which returns const T * instead of const void *. However, that class is defined outside of CSA so I'm not sure that this is worth the effort.

const LocationContext *CallerCtx =
getFirstNonCtorCall(N->getLocationContext());

return FRCtx->getStackFrame() == CallerCtx->getStackFrame();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like you are calling the same method on two different objects of the same type, but IIUC their types are unrelated: FRCtx is a StackSpaceRegion while CallerCtx is a LocationContext.

I think it would be helpful to rename FRCtx to FRStackSpace or something similar to emphasize that it's not a (location) context object.

Copy link
Contributor

@NagyDonat NagyDonat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I continued the review and unfortunately I found several fundamental issues within the implementation of extractActualValueFrom() -- see inline comments for details.

Moreover I'm not convinced that this "replace LazyCompoundVals with actual values" logic is always the right approach for your goals.

When we discussed this yesterday, you told me that this logic was intended to handle the situations where a LazyCompoundVal represents a non-aggregate type (like an integer) -- but I think that we should catch that kind of non-canonical representation at an earlier point (e.g. within some getSVal method). Hiding an integer behind this kind of lazy wrapper is a serious defect of our store model which inhibits many operations -- not just evalBinOp.

On the other hand, this "replace LazyCompoundVals with a better representation" approach could be a good idea in the cases when the LazyCompoundVal represents an aggregate. However in this case I don't think that you can get a non-lazy representation by simply selecting the right binding (either via iterBindings or via a direct getBinding call) -- you will need to find or write an explicit "convert this LazyCompoundVal into a non-lazy CompoundVal" function.

Comment on lines +527 to +530
std::optional<SVal> RVal = Ret = State->getSVal(Ret->getAsRegion());
// If our best efforts lead us back to another LazyCompoundValue, give up.
if (Ret == RVal)
return {};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something is fishy here -- after RVal = Ret the condition Ret == RVal should always be true.

Comment on lines +506 to +507
if (const MemRegion *MR = Val.getAsRegion()) {
if (MR->isSubRegionOf(l.getRegion()) && MR != l.getAsRegion()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of introducing this MR, you should check Region->isSubRegionOf(l.getRegion()) or something similar!

When iterBindings() calls this callback, it will pass [Region, value bound to that region] pairs (as the third and fourth arguments), and if you want to only consider the values stored within the region l.getRegion(), you need to check that the key (i.e. Region i.e. the third argument) is a subregion of the region whose contents were "frozen" as a LazyCompoundVal.

Your current code looks for values stored anywhere within the frozen old state that are pointer-typed and point to a subregion of l.getRegion().

State->getStateManager().getStoreManager().iterBindings(State->getStore(),
handler);
if (Ret) {
std::optional<SVal> RVal = Ret = State->getSVal(Ret->getAsRegion());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you calling State->getSval(<some region>) within this function?

This method queries the value currently stored within that region which is as far as I see completely irrelevant for the goals of this function.

nonloc::LazyCompoundVal L) {
std::optional<SVal> Ret;
LazyHandler handler{L, Ret};
State->getStateManager().getStoreManager().iterBindings(State->getStore(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked the implementation of iterBindings and the other methods of StoreManager and I still think that iterBindings is not the right tool for this job and you should use SVal StoreManager::getBinding(Store S, Loc L, QualType T) instead.

I know that the documentation of LazyCompoundVal suggests using iterBindings and that those docs were blessed by several reviewers (including me), but that doesn't change the fact that there are other more practical methods and iterBindings doesn't always provide a complete answer, because it only iterates over direct bindings and skips the so-called default bindings.

nonloc::LazyCompoundVal L) {
std::optional<SVal> Ret;
LazyHandler handler{L, Ret};
State->getStateManager().getStoreManager().iterBindings(State->getStore(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of State->getStore() (the current state of the memory) you must use L->getStore() (the old memory snapshot which defines the value represented by the LazyCompoundVal L).

Szelethus added a commit to Szelethus/llvm-project that referenced this pull request Sep 12, 2024
…ring.UninitRead

This is a drastic simplification of llvm#106982.

The patch was inspired by a pretty poor bug report on FFMpeg:

https://codechecker-demo.eastus.cloudapp.azure.com/Default/report-detail?is-unique=on&run=curl_curl-7_66_0_cstring_uninit_baseline&run=ffmpeg_n4.3.1_cstring_uninit_baseline&run=memcached_1.6.8_cstring_uninit_baseline&run=openssl_openssl-3.0.0-alpha7_cstring_uninit_baseline&run=sqlite_version-3.33.0_cstring_uninit_baseline&run=tmux_2.6_cstring_uninit_baseline&run=twin_v0.8.1_cstring_uninit_baseline&run=vim_v8.2.1920_cstring_uninit_baseline&newcheck=curl_curl-7_66_0_cstring_uninit_final&newcheck=ffmpeg_n4.3.1_cstring_uninit_final&newcheck=memcached_1.6.8_cstring_uninit_final&newcheck=openssl_openssl-3.0.0-alpha7_cstring_uninit_final&newcheck=sqlite_version-3.33.0_cstring_uninit_final&newcheck=tmux_2.6_cstring_uninit_final&newcheck=twin_v0.8.1_cstring_uninit_final&newcheck=vim_v8.2.1920_cstring_uninit_final&diff-type=Unresolved&items-per-page=100&checker-name=alpha.unix.cstring.UninitializedRead&report-id=5766433&report-hash=82c63868ba782bfa216ee3f2b5734d6b&report-filepath=%2atiertexseqv.c

In this bug report, block is uninitialized, hence the bug report that it
should not have been passed to memcpy. The confusing part is in line 93,
where block was passed as a non-const pointer to seq_unpack_rle_block,
which was obviously meant to initialize block. As developers, we know
that clang likely didn't skip this function and found a path of
execution on which this initialization failed, but NoStoreFuncVisitor
failed to attach the usual "returning without writing to block" message.

I fixed this by instead of tracking the entire array, I tracked the
actual element which was found to be uninitialized (Remember, we
heuristically only check if the first and last-to-access element is
initialized, not the entire array). This is how the bug report looks
now:

https://codechecker-demo.eastus.cloudapp.azure.com/Default/report-detail?is-unique=on&run=curl_curl-7_66_0_cstring_uninit_baseline&run=ffmpeg_n4.3.1_cstring_uninit_baseline&run=memcached_1.6.8_cstring_uninit_baseline&run=openssl_openssl-3.0.0-alpha7_cstring_uninit_baseline&run=sqlite_version-3.33.0_cstring_uninit_baseline&run=tmux_2.6_cstring_uninit_baseline&run=twin_v0.8.1_cstring_uninit_baseline&run=vim_v8.2.1920_cstring_uninit_baseline&newcheck=curl_curl-7_66_0_cstring_uninit_final&newcheck=ffmpeg_n4.3.1_cstring_uninit_final&newcheck=memcached_1.6.8_cstring_uninit_final&newcheck=openssl_openssl-3.0.0-alpha7_cstring_uninit_final&newcheck=sqlite_version-3.33.0_cstring_uninit_final&newcheck=tmux_2.6_cstring_uninit_final&newcheck=twin_v0.8.1_cstring_uninit_final&newcheck=vim_v8.2.1920_cstring_uninit_final&diff-type=Unresolved&items-per-page=100&checker-name=alpha.unix.cstring.UninitializedRead&report-id=5768860&report-hash=82c63868ba782bfa216ee3f2b5734d6b&report-filepath=%2atiertexseqv.c

Since NoStoreFuncVisitor was a TU-local class, I moved it back to
BugReporterVisitors.h, and registered it manually in CStringChecker.cpp.
This was done because we don't have a good trackRegionValue() function,
only a trackExpressionValue() function. We have an expression for the
array, but not for its first (or last-to-access) element, so I only had
a MemRegion on hand.
@Szelethus
Copy link
Contributor Author

Superceding this patch to 108373.

@Szelethus Szelethus closed this Sep 12, 2024
Szelethus added a commit that referenced this pull request Sep 19, 2024
#108373)

…ring.UninitRead

This is a drastic simplification of #106982. If you read that patch,
this is the same thing with all BugReporterVisitors.cpp and
SValBuilder.cpp changes removed! (since all replies came regarding
changed to those files, I felt the new PR was justified)

The patch was inspired by a pretty poor bug report on FFMpeg:

![image](https://github.com/user-attachments/assets/8f4e03d8-45a4-4ea2-a63d-3ab78d097be9)

In this bug report, block is uninitialized, hence the bug report that it
should not have been passed to memcpy. The confusing part is in line 93,
where block was passed as a non-const pointer to seq_unpack_rle_block,
which was obviously meant to initialize block. As developers, we know
that clang likely didn't skip this function and found a path of
execution on which this initialization failed, but NoStoreFuncVisitor
failed to attach the usual "returning without writing to block" message.

I fixed this by instead of tracking the entire array, I tracked the
actual element which was found to be uninitialized (Remember, we
heuristically only check if the first and last-to-access element is
initialized, not the entire array). This is how the bug report looks
now, with 'seq_unpack_rle_block' having notes describing the path of
execution and lack of a value change:

![image](https://github.com/user-attachments/assets/8de5d101-052e-4ecb-9cd9-7c29724333d2)

![image](https://github.com/user-attachments/assets/8bf52a95-62de-44e7-aef8-03a46a3fa08e)

Since NoStoreFuncVisitor was a TU-local class, I moved it back to
BugReporterVisitors.h, and registered it manually in CStringChecker.cpp.
This was done because we don't have a good trackRegionValue() function,
only a trackExpressionValue() function. We have an expression for the
array, but not for its first (or last-to-access) element, so I only had
a MemRegion on hand.
tmsri pushed a commit to tmsri/llvm-project that referenced this pull request Sep 19, 2024
llvm#108373)

…ring.UninitRead

This is a drastic simplification of llvm#106982. If you read that patch,
this is the same thing with all BugReporterVisitors.cpp and
SValBuilder.cpp changes removed! (since all replies came regarding
changed to those files, I felt the new PR was justified)

The patch was inspired by a pretty poor bug report on FFMpeg:

![image](https://github.com/user-attachments/assets/8f4e03d8-45a4-4ea2-a63d-3ab78d097be9)

In this bug report, block is uninitialized, hence the bug report that it
should not have been passed to memcpy. The confusing part is in line 93,
where block was passed as a non-const pointer to seq_unpack_rle_block,
which was obviously meant to initialize block. As developers, we know
that clang likely didn't skip this function and found a path of
execution on which this initialization failed, but NoStoreFuncVisitor
failed to attach the usual "returning without writing to block" message.

I fixed this by instead of tracking the entire array, I tracked the
actual element which was found to be uninitialized (Remember, we
heuristically only check if the first and last-to-access element is
initialized, not the entire array). This is how the bug report looks
now, with 'seq_unpack_rle_block' having notes describing the path of
execution and lack of a value change:

![image](https://github.com/user-attachments/assets/8de5d101-052e-4ecb-9cd9-7c29724333d2)

![image](https://github.com/user-attachments/assets/8bf52a95-62de-44e7-aef8-03a46a3fa08e)

Since NoStoreFuncVisitor was a TU-local class, I moved it back to
BugReporterVisitors.h, and registered it manually in CStringChecker.cpp.
This was done because we don't have a good trackRegionValue() function,
only a trackExpressionValue() function. We have an expression for the
array, but not for its first (or last-to-access) element, so I only had
a MemRegion on hand.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:static analyzer clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants