Skip to content

[clang][analyzer] Model allocation behavior or getdelim/geline #83138

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Mar 6, 2024

Conversation

alejandro-alvarez-sonarsource
Copy link
Contributor

getdelim and getline may free, allocate, or re-allocate the input buffer, ensuring its size is enough to hold the incoming line, the delimiter, and the null terminator.

*lineptr must be a valid argument to free, which means it can be either

  1. NULL, in which case these functions perform an allocation equivalent to a call to malloc even on failure.
  2. A pointer returned by the malloc family of functions. Other pointers are UB (alloca, a pointer to a static, to a stack variable, etc.)

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:static analyzer labels Feb 27, 2024
@llvmbot
Copy link
Member

llvmbot commented Feb 27, 2024

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-static-analyzer-1

Author: Alejandro Álvarez Ayllón (alejandro-alvarez-sonarsource)

Changes

getdelim and getline may free, allocate, or re-allocate the input buffer, ensuring its size is enough to hold the incoming line, the delimiter, and the null terminator.

*lineptr must be a valid argument to free, which means it can be either

  1. NULL, in which case these functions perform an allocation equivalent to a call to malloc even on failure.
  2. A pointer returned by the malloc family of functions. Other pointers are UB (alloca, a pointer to a static, to a stack variable, etc.)

Full diff: https://github.com/llvm/llvm-project/pull/83138.diff

5 Files Affected:

  • (modified) clang/include/clang/StaticAnalyzer/Core/PathSensitive/CheckerHelpers.h (+6)
  • (modified) clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp (+67-6)
  • (modified) clang/lib/StaticAnalyzer/Core/CheckerHelpers.cpp (+9)
  • (modified) clang/test/Analysis/Inputs/system-header-simulator-for-malloc.h (+1)
  • (added) clang/test/Analysis/getline-alloc.c (+71)
diff --git a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/CheckerHelpers.h b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/CheckerHelpers.h
index 65982457ad8393..60421e5437d82f 100644
--- a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/CheckerHelpers.h
+++ b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/CheckerHelpers.h
@@ -13,6 +13,9 @@
 #ifndef LLVM_CLANG_STATICANALYZER_CORE_PATHSENSITIVE_CHECKERHELPERS_H
 #define LLVM_CLANG_STATICANALYZER_CORE_PATHSENSITIVE_CHECKERHELPERS_H
 
+#include "ProgramState_Fwd.h"
+#include "SVals.h"
+
 #include "clang/AST/OperationKinds.h"
 #include "clang/AST/Stmt.h"
 #include "clang/Basic/OperatorKinds.h"
@@ -110,6 +113,9 @@ class OperatorKind {
 OperatorKind operationKindFromOverloadedOperator(OverloadedOperatorKind OOK,
                                                  bool IsBinary);
 
+std::optional<DefinedSVal> getPointeeDefVal(SVal PtrSVal,
+                                            ProgramStateRef State);
+
 } // namespace ento
 
 } // namespace clang
diff --git a/clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp
index 79ab05f2c7866a..c5c382634c981c 100644
--- a/clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp
@@ -382,6 +382,8 @@ class MallocChecker
   CHECK_FN(checkGMemdup)
   CHECK_FN(checkGMallocN)
   CHECK_FN(checkGMallocN0)
+  CHECK_FN(preGetdelim)
+  CHECK_FN(checkGetdelim)
   CHECK_FN(checkReallocN)
   CHECK_FN(checkOwnershipAttr)
 
@@ -391,6 +393,11 @@ class MallocChecker
   using CheckFn = std::function<void(const MallocChecker *,
                                      const CallEvent &Call, CheckerContext &C)>;
 
+  const CallDescriptionMap<CheckFn> PreFnMap{
+      {{{"getline"}, 3}, &MallocChecker::preGetdelim},
+      {{{"getdelim"}, 4}, &MallocChecker::preGetdelim},
+  };
+
   const CallDescriptionMap<CheckFn> FreeingMemFnMap{
       {{{"free"}, 1}, &MallocChecker::checkFree},
       {{{"if_freenameindex"}, 1}, &MallocChecker::checkIfFreeNameIndex},
@@ -439,6 +446,8 @@ class MallocChecker
        std::bind(&MallocChecker::checkRealloc, _1, _2, _3, false)},
       {{{"g_realloc_n"}, 3}, &MallocChecker::checkReallocN},
       {{{"g_try_realloc_n"}, 3}, &MallocChecker::checkReallocN},
+      {{{"getline"}, 3}, &MallocChecker::checkGetdelim},
+      {{{"getdelim"}, 4}, &MallocChecker::checkGetdelim},
   };
 
   bool isMemCall(const CallEvent &Call) const;
@@ -588,11 +597,14 @@ class MallocChecker
   ///      }
   /// \param [in] ReturnsNullOnFailure Whether the memory deallocation function
   ///   we're modeling returns with Null on failure.
+  /// \param [in] ArgValOpt Optional value to use for the argument instead of
+  /// the one obtained from ArgExpr.
   /// \returns The ProgramState right after deallocation.
   [[nodiscard]] ProgramStateRef
   FreeMemAux(CheckerContext &C, const Expr *ArgExpr, const CallEvent &Call,
              ProgramStateRef State, bool Hold, bool &IsKnownToBeAllocated,
-             AllocationFamily Family, bool ReturnsNullOnFailure = false) const;
+             AllocationFamily Family, bool ReturnsNullOnFailure = false,
+             std::optional<SVal> ArgValOpt = {}) const;
 
   // TODO: Needs some refactoring, as all other deallocation modeling
   // functions are suffering from out parameters and messy code due to how
@@ -1423,6 +1435,46 @@ void MallocChecker::checkGMallocN0(const CallEvent &Call,
   C.addTransition(State);
 }
 
+void MallocChecker::preGetdelim(const CallEvent &Call,
+                                CheckerContext &C) const {
+  if (!Call.isGlobalCFunction())
+    return;
+
+  ProgramStateRef State = C.getState();
+  const auto LinePtr = getPointeeDefVal(Call.getArgSVal(0), State);
+  if (!LinePtr)
+    return;
+
+  bool IsKnownToBeAllocated = false;
+  State = FreeMemAux(C, Call.getArgExpr(0), Call, State, false,
+                     IsKnownToBeAllocated, AF_Malloc, false, LinePtr);
+  if (State)
+    C.addTransition(State);
+}
+
+void MallocChecker::checkGetdelim(const CallEvent &Call,
+                                  CheckerContext &C) const {
+  if (!Call.isGlobalCFunction())
+    return;
+
+  ProgramStateRef State = C.getState();
+  // Handle the post-conditions of getline and getdelim:
+  // Register the new conjured value as an allocated buffer.
+  const CallExpr *CE = dyn_cast_or_null<CallExpr>(Call.getOriginExpr());
+  if (!CE)
+    return;
+
+  SValBuilder &svalBuilder = C.getSValBuilder();
+
+  const auto LinePtr = getPointeeDefVal(Call.getArgSVal(0), State);
+  const auto Size = getPointeeDefVal(Call.getArgSVal(1), State);
+  if (!LinePtr || !Size || !LinePtr->getAsRegion())
+    return;
+
+  State = setDynamicExtent(State, LinePtr->getAsRegion(), *Size, svalBuilder);
+  C.addTransition(MallocUpdateRefState(C, CE, State, AF_Malloc, *LinePtr));
+}
+
 void MallocChecker::checkReallocN(const CallEvent &Call,
                                   CheckerContext &C) const {
   ProgramStateRef State = C.getState();
@@ -1895,15 +1947,17 @@ static void printExpectedDeallocName(raw_ostream &os, AllocationFamily Family) {
   }
 }
 
-ProgramStateRef MallocChecker::FreeMemAux(
-    CheckerContext &C, const Expr *ArgExpr, const CallEvent &Call,
-    ProgramStateRef State, bool Hold, bool &IsKnownToBeAllocated,
-    AllocationFamily Family, bool ReturnsNullOnFailure) const {
+ProgramStateRef
+MallocChecker::FreeMemAux(CheckerContext &C, const Expr *ArgExpr,
+                          const CallEvent &Call, ProgramStateRef State,
+                          bool Hold, bool &IsKnownToBeAllocated,
+                          AllocationFamily Family, bool ReturnsNullOnFailure,
+                          std::optional<SVal> ArgValOpt) const {
 
   if (!State)
     return nullptr;
 
-  SVal ArgVal = C.getSVal(ArgExpr);
+  SVal ArgVal = ArgValOpt.value_or(C.getSVal(ArgExpr));
   if (!isa<DefinedOrUnknownSVal>(ArgVal))
     return nullptr;
   DefinedOrUnknownSVal location = ArgVal.castAs<DefinedOrUnknownSVal>();
@@ -2881,6 +2935,13 @@ void MallocChecker::checkPreCall(const CallEvent &Call,
       return;
   }
 
+  // We need to handle getline pre-conditions here before the pointed region
+  // gets invalidated by StreamChecker
+  if (const auto *PreFN = PreFnMap.lookup(Call)) {
+    (*PreFN)(this, Call, C);
+    return;
+  }
+
   // We will check for double free in the post visit.
   if (const AnyFunctionCall *FC = dyn_cast<AnyFunctionCall>(&Call)) {
     const FunctionDecl *FD = FC->getDecl();
diff --git a/clang/lib/StaticAnalyzer/Core/CheckerHelpers.cpp b/clang/lib/StaticAnalyzer/Core/CheckerHelpers.cpp
index 84ad20a5480792..364c87e910b7b5 100644
--- a/clang/lib/StaticAnalyzer/Core/CheckerHelpers.cpp
+++ b/clang/lib/StaticAnalyzer/Core/CheckerHelpers.cpp
@@ -14,6 +14,7 @@
 #include "clang/AST/Decl.h"
 #include "clang/AST/Expr.h"
 #include "clang/Lex/Preprocessor.h"
+#include "clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h"
 #include <optional>
 
 namespace clang {
@@ -182,5 +183,13 @@ OperatorKind operationKindFromOverloadedOperator(OverloadedOperatorKind OOK,
   }
 }
 
+std::optional<DefinedSVal> getPointeeDefVal(SVal PtrSVal,
+                                            ProgramStateRef State) {
+  if (const auto *Ptr = PtrSVal.getAsRegion()) {
+    return State->getSVal(Ptr).getAs<DefinedSVal>();
+  }
+  return std::nullopt;
+}
+
 } // namespace ento
 } // namespace clang
diff --git a/clang/test/Analysis/Inputs/system-header-simulator-for-malloc.h b/clang/test/Analysis/Inputs/system-header-simulator-for-malloc.h
index e76455655e9e2e..bc7009eb0d1bea 100644
--- a/clang/test/Analysis/Inputs/system-header-simulator-for-malloc.h
+++ b/clang/test/Analysis/Inputs/system-header-simulator-for-malloc.h
@@ -9,6 +9,7 @@ typedef __typeof(sizeof(int)) size_t;
 void *malloc(size_t);
 void *calloc(size_t, size_t);
 void free(void *);
+void *alloca(size_t);
 
 
 #if __OBJC__
diff --git a/clang/test/Analysis/getline-alloc.c b/clang/test/Analysis/getline-alloc.c
new file mode 100644
index 00000000000000..eb2002af1219d1
--- /dev/null
+++ b/clang/test/Analysis/getline-alloc.c
@@ -0,0 +1,71 @@
+// RUN: %clang_analyze_cc1 -analyzer-checker=core,unix,debug.ExprInspection -verify %s
+
+// RUN: %clang_analyze_cc1 -analyzer-checker=core,unix,alpha.unix,debug.ExprInspection -verify %s
+
+#include "Inputs/system-header-simulator.h"
+#include "Inputs/system-header-simulator-for-malloc.h"
+
+void test_getline_null_buffer() {
+  FILE *F1 = tmpfile();
+  if (!F1)
+    return;
+  char *buffer = NULL;
+  size_t n = 0;
+  if (getline(&buffer, &n, F1) > 0) {
+    char c = buffer[0]; // ok
+  }
+  free(buffer);
+  fclose(F1);
+}
+
+void test_getline_malloc_buffer() {
+  FILE *F1 = tmpfile();
+  if (!F1)
+    return;
+
+  size_t n = 10;
+  char *buffer = malloc(n);
+  char *ptr = buffer;
+
+  ssize_t r = getdelim(&buffer, &n, '\r', F1);
+  // ptr may be dangling
+  free(ptr);    // expected-warning {{Attempt to free released memory}}
+  free(buffer); // ok
+  fclose(F1);
+}
+
+void test_getline_alloca() {
+  FILE *F1 = tmpfile();
+  if (!F1)
+    return;
+  size_t n = 10;
+  char *buffer = alloca(n);
+  getline(&buffer, &n, F1); // expected-warning {{Memory allocated by alloca() should not be deallocated}}
+  fclose(F1);
+}
+
+void test_getline_invalid_ptr() {
+  FILE *F1 = tmpfile();
+  if (!F1)
+    return;
+  size_t n = 10;
+  char *buffer = (char*)test_getline_invalid_ptr;
+  getline(&buffer, &n, F1); // expected-warning {{Argument to getline() is the address of the function 'test_getline_invalid_ptr', which is not memory allocated by malloc()}}
+  fclose(F1);
+}
+
+void test_getline_leak() {
+  FILE *F1 = tmpfile();
+  if (!F1)
+    return;
+
+  char *buffer = NULL;
+  size_t n = 0;
+  ssize_t read;
+
+  while ((read = getline(&buffer, &n, F1)) != -1) {
+    printf("%s\n", buffer);
+  }
+
+  fclose(F1); // expected-warning {{Potential memory leak}}
+}

@steakhal steakhal requested review from balazske and NagyDonat March 5, 2024 20:12
Copy link
Contributor

@NagyDonat NagyDonat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this seems to be a good improvement!

I added some minor remarks, but I didn't notice any serious issues.

Comment on lines +1440 to +1441
if (!Call.isGlobalCFunction())
return;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm in the middle of refactoring CallDescription::matches() and after that there will be a cleaner alternative instead of this kind of late check. (No action needed from you, I'll update this code along with all the other checkers.)

if (!LinePtr)
return;

bool IsKnownToBeAllocated = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if I understand it correctly, you declare this bool because FreeMemAux needs a reference that'll be use as an out parameter; but you don't actually use the value that's returned in it.

Consider adding a comment that explains this, because out-parameters are not that common and the fact that this acts as an out-parameter is not marked (unlike the older style when out-parameters are implemented with pointers and when &TargetVariable is passed to a function you can guess that it's an out-parameter). Originally I thought that this is an in-parameter and you put it into a named variable to explain its purpose (as an alternative to the comment /*IsKnownToBeAllocated=*/).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a comment to clarify the intent of this variable.

if (!CE)
return;

SValBuilder &svalBuilder = C.getSValBuilder();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For SValBuilder objects I prefer the other commonly used name SVB, which follows the convention that variable names start with a capital letter. However, this is also OK if you wish to be consistent with other functions in this file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed. SVB is also used in this file.

Copy link
Contributor

@NagyDonat NagyDonat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates, the change LGTM now.

@steakhal steakhal merged commit 67c6ad6 into llvm:main Mar 6, 2024
@alejandro-alvarez-sonarsource alejandro-alvarez-sonarsource deleted the aa/getline_alloc branch September 10, 2024 12:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:static analyzer clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants