Skip to content

[clang][analyzer] Support fgets in the SteamChecker #73638

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Nov 29, 2023
Merged

[clang][analyzer] Support fgets in the SteamChecker #73638

merged 1 commit into from
Nov 29, 2023

Conversation

benshi001
Copy link
Member

@benshi001 benshi001 commented Nov 28, 2023

No description provided.

@benshi001 benshi001 requested review from balazske and steakhal and removed request for balazske November 28, 2023 11:37
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:static analyzer labels Nov 28, 2023
@llvmbot
Copy link
Member

llvmbot commented Nov 28, 2023

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-static-analyzer-1

Author: Ben Shi (benshi001)

Changes

This PR contains two commits, the first one is identical to #73335, please just review the second one about fgets.


Full diff: https://github.com/llvm/llvm-project/pull/73638.diff

4 Files Affected:

  • (modified) clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp (+116-47)
  • (modified) clang/test/Analysis/Inputs/system-header-simulator.h (+2)
  • (modified) clang/test/Analysis/stream-error.c (+61-4)
  • (modified) clang/test/Analysis/stream.c (+13)
diff --git a/clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
index 8eca989d7bcdea4..a4799b5f762caee 100644
--- a/clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
@@ -252,10 +252,16 @@ class StreamChecker : public Checker<check::PreCall, eval::Call,
         std::bind(&StreamChecker::evalFreadFwrite, _1, _2, _3, _4, false), 3}},
       {{{"fgetc"}, 1},
        {std::bind(&StreamChecker::preReadWrite, _1, _2, _3, _4, true),
-        std::bind(&StreamChecker::evalFgetcFputc, _1, _2, _3, _4, true), 0}},
+        std::bind(&StreamChecker::evalFgetx, _1, _2, _3, _4, true), 0}},
+      {{{"fgets"}, 3},
+       {std::bind(&StreamChecker::preReadWrite, _1, _2, _3, _4, true),
+        std::bind(&StreamChecker::evalFgetx, _1, _2, _3, _4, false), 2}},
       {{{"fputc"}, 2},
        {std::bind(&StreamChecker::preReadWrite, _1, _2, _3, _4, false),
-        std::bind(&StreamChecker::evalFgetcFputc, _1, _2, _3, _4, false), 1}},
+        std::bind(&StreamChecker::evalFputx, _1, _2, _3, _4, true), 1}},
+      {{{"fputs"}, 2},
+       {std::bind(&StreamChecker::preReadWrite, _1, _2, _3, _4, false),
+        std::bind(&StreamChecker::evalFputx, _1, _2, _3, _4, false), 1}},
       {{{"fseek"}, 3},
        {&StreamChecker::preFseek, &StreamChecker::evalFseek, 0}},
       {{{"ftell"}, 1},
@@ -317,8 +323,11 @@ class StreamChecker : public Checker<check::PreCall, eval::Call,
   void evalFreadFwrite(const FnDescription *Desc, const CallEvent &Call,
                        CheckerContext &C, bool IsFread) const;
 
-  void evalFgetcFputc(const FnDescription *Desc, const CallEvent &Call,
-                      CheckerContext &C, bool IsRead) const;
+  void evalFgetx(const FnDescription *Desc, const CallEvent &Call,
+                 CheckerContext &C, bool SingleChar) const;
+
+  void evalFputx(const FnDescription *Desc, const CallEvent &Call,
+                 CheckerContext &C, bool IsSingleChar) const;
 
   void preFseek(const FnDescription *Desc, const CallEvent &Call,
                 CheckerContext &C) const;
@@ -754,9 +763,8 @@ void StreamChecker::evalFreadFwrite(const FnDescription *Desc,
     C.addTransition(StateFailed);
 }
 
-void StreamChecker::evalFgetcFputc(const FnDescription *Desc,
-                                   const CallEvent &Call, CheckerContext &C,
-                                   bool IsRead) const {
+void StreamChecker::evalFgetx(const FnDescription *Desc, const CallEvent &Call,
+                              CheckerContext &C, bool SingleChar) const {
   ProgramStateRef State = C.getState();
   SymbolRef StreamSym = getStreamArg(Desc, Call).getAsSymbol();
   if (!StreamSym)
@@ -773,10 +781,96 @@ void StreamChecker::evalFgetcFputc(const FnDescription *Desc,
   assertStreamStateOpened(OldSS);
 
   // `fgetc` returns the read character on success, otherwise returns EOF.
+  // `fgets` returns the read buffer address on success, otherwise returns NULL.
+
+  if (OldSS->ErrorState != ErrorFEof) {
+    if (SingleChar) {
+      // Generate a transition for the success state of `fgetc`.
+      NonLoc RetVal = makeRetVal(C, CE).castAs<NonLoc>();
+      ProgramStateRef StateNotFailed =
+          State->BindExpr(CE, C.getLocationContext(), RetVal);
+      SValBuilder &SVB = C.getSValBuilder();
+      ASTContext &ASTC = C.getASTContext();
+      // The returned 'unsigned char' of `fgetc` is converted to 'int',
+      // so we need to check if it is in range [0, 255].
+      auto CondLow =
+          SVB.evalBinOp(State, BO_GE, RetVal, SVB.makeZeroVal(ASTC.IntTy),
+                        SVB.getConditionType())
+              .getAs<DefinedOrUnknownSVal>();
+      auto CondHigh =
+          SVB.evalBinOp(State, BO_LE, RetVal,
+                        SVB.makeIntVal(SVB.getBasicValueFactory()
+                                           .getMaxValue(ASTC.UnsignedCharTy)
+                                           .getLimitedValue(),
+                                       ASTC.IntTy),
+                        SVB.getConditionType())
+              .getAs<DefinedOrUnknownSVal>();
+      if (!CondLow || !CondHigh)
+        return;
+      StateNotFailed = StateNotFailed->assume(*CondLow, true);
+      if (!StateNotFailed)
+        return;
+      StateNotFailed = StateNotFailed->assume(*CondHigh, true);
+      if (!StateNotFailed)
+        return;
+      C.addTransition(StateNotFailed);
+    } else {
+      // Generate a transition for the success state of `fgets`.
+      std::optional<DefinedSVal> GetBuf =
+          Call.getArgSVal(0).getAs<DefinedSVal>();
+      if (!GetBuf)
+        return;
+      ProgramStateRef StateNotFailed =
+          State->BindExpr(CE, C.getLocationContext(), *GetBuf);
+      StateNotFailed = StateNotFailed->set<StreamMap>(
+          StreamSym, StreamState::getOpened(Desc));
+      C.addTransition(StateNotFailed);
+    }
+  }
+
+  // Add transition for the failed state.
+  ProgramStateRef StateFailed;
+  if (SingleChar)
+    StateFailed = bindInt(*EofVal, State, C, CE);
+  else
+    StateFailed =
+        State->BindExpr(CE, C.getLocationContext(),
+                        C.getSValBuilder().makeNullWithType(CE->getType()));
+
+  // If a (non-EOF) error occurs, the resulting value of the file position
+  // indicator for the stream is indeterminate.
+  StreamErrorState NewES =
+      OldSS->ErrorState == ErrorFEof ? ErrorFEof : ErrorFEof | ErrorFError;
+  StreamState NewSS = StreamState::getOpened(Desc, NewES, !NewES.isFEof());
+  StateFailed = StateFailed->set<StreamMap>(StreamSym, NewSS);
+  if (OldSS->ErrorState != ErrorFEof)
+    C.addTransition(StateFailed, constructSetEofNoteTag(C, StreamSym));
+  else
+    C.addTransition(StateFailed);
+}
+
+void StreamChecker::evalFputx(const FnDescription *Desc, const CallEvent &Call,
+                              CheckerContext &C, bool IsSingleChar) const {
+  ProgramStateRef State = C.getState();
+  SymbolRef StreamSym = getStreamArg(Desc, Call).getAsSymbol();
+  if (!StreamSym)
+    return;
+
+  const CallExpr *CE = dyn_cast_or_null<CallExpr>(Call.getOriginExpr());
+  if (!CE)
+    return;
+
+  const StreamState *OldSS = State->get<StreamMap>(StreamSym);
+  if (!OldSS)
+    return;
+
+  assertStreamStateOpened(OldSS);
+
   // `fputc` returns the written character on success, otherwise returns EOF.
+  // `fputs` returns a non negative value on sucecess, otherwise returns EOF.
 
-  // Generate a transition for the success state of fputc.
-  if (!IsRead) {
+  if (IsSingleChar) {
+    // Generate a transition for the success state of `fputc`.
     std::optional<NonLoc> PutVal = Call.getArgSVal(0).getAs<NonLoc>();
     if (!PutVal)
       return;
@@ -785,57 +879,32 @@ void StreamChecker::evalFgetcFputc(const FnDescription *Desc,
     StateNotFailed =
         StateNotFailed->set<StreamMap>(StreamSym, StreamState::getOpened(Desc));
     C.addTransition(StateNotFailed);
-  }
-  // Generate a transition for the success state of fgetc.
-  // If we know the state to be FEOF at fgetc, do not add a success state.
-  else if (OldSS->ErrorState != ErrorFEof) {
+  } else {
+    // Generate a transition for the success state of `fputs`.
     NonLoc RetVal = makeRetVal(C, CE).castAs<NonLoc>();
     ProgramStateRef StateNotFailed =
         State->BindExpr(CE, C.getLocationContext(), RetVal);
     SValBuilder &SVB = C.getSValBuilder();
     auto &ASTC = C.getASTContext();
-    // The returned 'unsigned char' of `fgetc` is converted to 'int',
-    // so we need to check if it is in range [0, 255].
-    auto CondLow =
-        SVB.evalBinOp(State, BO_GE, RetVal, SVB.makeZeroVal(ASTC.IntTy),
-                      SVB.getConditionType())
-            .getAs<DefinedOrUnknownSVal>();
-    auto CondHigh =
-        SVB.evalBinOp(State, BO_LE, RetVal,
-                      SVB.makeIntVal(SVB.getBasicValueFactory()
-                                         .getMaxValue(ASTC.UnsignedCharTy)
-                                         .getLimitedValue(),
-                                     ASTC.IntTy),
-                      SVB.getConditionType())
-            .getAs<DefinedOrUnknownSVal>();
-    if (!CondLow || !CondHigh)
+    auto Cond = SVB.evalBinOp(State, BO_GE, RetVal, SVB.makeZeroVal(ASTC.IntTy),
+                              SVB.getConditionType())
+                    .getAs<DefinedOrUnknownSVal>();
+    if (!Cond)
       return;
-    StateNotFailed = StateNotFailed->assume(*CondLow, true);
-    if (!StateNotFailed)
-      return;
-    StateNotFailed = StateNotFailed->assume(*CondHigh, true);
+    StateNotFailed = StateNotFailed->assume(*Cond, true);
     if (!StateNotFailed)
       return;
+    StateNotFailed =
+        StateNotFailed->set<StreamMap>(StreamSym, StreamState::getOpened(Desc));
     C.addTransition(StateNotFailed);
   }
 
-  // Add transition for the failed state.
+  // Add transition for the failed state. The resulting value of the file
+  // position indicator for the stream is indeterminate.
   ProgramStateRef StateFailed = bindInt(*EofVal, State, C, CE);
-
-  // If a (non-EOF) error occurs, the resulting value of the file position
-  // indicator for the stream is indeterminate.
-  StreamErrorState NewES;
-  if (IsRead)
-    NewES =
-        OldSS->ErrorState == ErrorFEof ? ErrorFEof : ErrorFEof | ErrorFError;
-  else
-    NewES = ErrorFError;
-  StreamState NewSS = StreamState::getOpened(Desc, NewES, !NewES.isFEof());
+  StreamState NewSS = StreamState::getOpened(Desc, ErrorFError, true);
   StateFailed = StateFailed->set<StreamMap>(StreamSym, NewSS);
-  if (IsRead && OldSS->ErrorState != ErrorFEof)
-    C.addTransition(StateFailed, constructSetEofNoteTag(C, StreamSym));
-  else
-    C.addTransition(StateFailed);
+  C.addTransition(StateFailed);
 }
 
 void StreamChecker::preFseek(const FnDescription *Desc, const CallEvent &Call,
diff --git a/clang/test/Analysis/Inputs/system-header-simulator.h b/clang/test/Analysis/Inputs/system-header-simulator.h
index fc57e8bdc3d30c3..7089bd8bfc9d983 100644
--- a/clang/test/Analysis/Inputs/system-header-simulator.h
+++ b/clang/test/Analysis/Inputs/system-header-simulator.h
@@ -49,7 +49,9 @@ int fclose(FILE *fp);
 size_t fread(void *restrict, size_t, size_t, FILE *restrict);
 size_t fwrite(const void *restrict, size_t, size_t, FILE *restrict);
 int fgetc(FILE *stream);
+char *fgets(char *restrict str, int count, FILE *restrict stream);
 int fputc(int ch, FILE *stream);
+int fputs(const char *restrict s, FILE *restrict stream);
 int fseek(FILE *__stream, long int __off, int __whence);
 long int ftell(FILE *__stream);
 void rewind(FILE *__stream);
diff --git a/clang/test/Analysis/stream-error.c b/clang/test/Analysis/stream-error.c
index 38e6b77b9bb5053..c8332bcbfa8ca72 100644
--- a/clang/test/Analysis/stream-error.c
+++ b/clang/test/Analysis/stream-error.c
@@ -123,6 +123,29 @@ void error_fgetc(void) {
   fgetc(F); // expected-warning {{Stream might be already closed}}
 }
 
+void error_fgets(void) {
+  FILE *F = tmpfile();
+  char Buf[256];
+  if (!F)
+    return;
+  char *Ret = fgets(Buf, sizeof(Buf), F);
+  if (Ret == Buf) {
+    clang_analyzer_eval(feof(F) || ferror(F)); // expected-warning {{FALSE}}
+  } else {
+    clang_analyzer_eval(Ret == NULL);          // expected-warning {{TRUE}}
+    clang_analyzer_eval(feof(F) || ferror(F)); // expected-warning {{TRUE}}
+    if (feof(F)) {
+      clang_analyzer_eval(ferror(F)); // expected-warning {{FALSE}}
+      fgets(Buf, sizeof(Buf), F);     // expected-warning {{Read function called when stream is in EOF state}}
+    } else {
+      clang_analyzer_eval(ferror(F)); // expected-warning {{TRUE}}
+      fgets(Buf, sizeof(Buf), F);     // expected-warning {{might be 'indeterminate'}}
+    }
+  }
+  fclose(F);
+  fgets(Buf, sizeof(Buf), F);         // expected-warning {{Stream might be already closed}}
+}
+
 void error_fputc(void) {
   FILE *F = tmpfile();
   if (!F)
@@ -131,16 +154,50 @@ void error_fputc(void) {
   if (Ret == EOF) {
     clang_analyzer_eval(ferror(F)); // expected-warning {{TRUE}}
     clang_analyzer_eval(feof(F));   // expected-warning {{FALSE}}
-    fputc('Y', F); // expected-warning {{might be 'indeterminate'}}
+    fputc('Y', F);                  // expected-warning {{might be 'indeterminate'}}
   } else {
-    clang_analyzer_eval(Ret == 'X'); // expected-warning {{TRUE}}
-    clang_analyzer_eval(feof(F) || ferror(F)); // expected-warning {{FALSE}}
-    fputc('Y', F); // no-warning
+    clang_analyzer_eval(Ret == 'X');             // expected-warning {{TRUE}}
+    clang_analyzer_eval(feof(F) || ferror(F));   // expected-warning {{FALSE}}
+    fputc('Y', F);                               // no-warning
   }
   fclose(F);
   fputc('A', F); // expected-warning {{Stream might be already closed}}
 }
 
+void error_fputs(void) {
+  FILE *F = tmpfile();
+  if (!F)
+    return;
+  int Ret = fputs("XYZ", F);
+  if (Ret >= 0) {
+    clang_analyzer_eval(feof(F) || ferror(F));   // expected-warning {{FALSE}}
+    fputs("QWD", F);                             // no-warning
+  } else {
+    clang_analyzer_eval(Ret == EOF); // expected-warning {{TRUE}}
+    clang_analyzer_eval(ferror(F));  // expected-warning {{TRUE}}
+    clang_analyzer_eval(feof(F));    // expected-warning {{FALSE}}
+    fputs("QWD", F);                 // expected-warning {{might be 'indeterminate'}}
+  }
+  fclose(F);
+  fputs("ABC", F); // expected-warning {{Stream might be already closed}}
+}
+
+void write_after_eof_is_allowed(void) {
+  FILE *F = tmpfile();
+  if (!F)
+    return;
+  StreamTesterChecker_make_feof_stream(F);
+  if (fputs("QWD", F) >= 0)                    // no-warning
+    clang_analyzer_eval(feof(F) || ferror(F)); // expected-warning {{FALSE}}
+  StreamTesterChecker_make_feof_stream(F);
+  if (fputc('Q', F) == 'Q')                    // no-warning
+    clang_analyzer_eval(feof(F) || ferror(F)); // expected-warning {{FALSE}}
+  StreamTesterChecker_make_feof_stream(F);
+  if (fwrite("012345678", 1, 10, F) == 10)     // no-warning
+    clang_analyzer_eval(feof(F) || ferror(F)); // expected-warning {{FALSE}}
+  fclose(F);
+}
+
 void freadwrite_zerosize(FILE *F) {
   size_t Ret;
   Ret = fwrite(0, 1, 0, F);
diff --git a/clang/test/Analysis/stream.c b/clang/test/Analysis/stream.c
index a7ee9982478bb96..0b655dbdc571fe2 100644
--- a/clang/test/Analysis/stream.c
+++ b/clang/test/Analysis/stream.c
@@ -20,12 +20,25 @@ void check_fgetc(void) {
   fclose(fp);
 }
 
+void check_fgets(void) {
+  FILE *fp = tmpfile();
+  char buf[256];
+  fgets(buf, sizeof(buf), fp); // expected-warning {{Stream pointer might be NULL}}
+  fclose(fp);
+}
+
 void check_fputc(void) {
   FILE *fp = tmpfile();
   fputc('A', fp); // expected-warning {{Stream pointer might be NULL}}
   fclose(fp);
 }
 
+void check_fputs(void) {
+  FILE *fp = tmpfile();
+  fputs("ABC", fp); // expected-warning {{Stream pointer might be NULL}}
+  fclose(fp);
+}
+
 void check_fseek(void) {
   FILE *fp = tmpfile();
   fseek(fp, 0, 0); // expected-warning {{Stream pointer might be NULL}}

@benshi001 benshi001 merged commit 47df664 into llvm:main Nov 29, 2023
@benshi001 benshi001 deleted the csa-ssr branch November 29, 2023 11:20
@@ -778,42 +781,61 @@ void StreamChecker::evalFgetc(const FnDescription *Desc, const CallEvent &Call,
assertStreamStateOpened(OldSS);

Copy link
Contributor

@steakhal steakhal Jan 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This patch caused a downstream test failure.
Here is the fix:

Suggested change
// We don't model the buffer, thus they should escape.
State = Call.invalidateRegions(C.blockCount(), State);

Test where it currently breaks:

void test_double_switch_ok() {
  char buffer[10] = {10};
  FILE *F1 = tmpfile();
  if (!F1)
    return;

  char* s = fgets(buffer, sizeof(buffer), F1); // ok
  if (s) {
    clang_analyzer_dump_char(buffer[0]); // expected-warning {{conj_$}} Should not preserve the previous value, thus should not be 10.
  }
  // Hmm, what do we know of the buffer on failure? IDK.

  fclose(F1);
}

I'd suggest to backport the fix for this to llvm-18, as this is a regression to not modeling "fgets".

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this invalidate only the buffer, not F1?

Copy link
Contributor

@steakhal steakhal Jan 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At first glance you should be right. However, when I tried it, it didn't break any tests but this one.
I presume there must be something else going on that prevents escaping the stream pointer.
I've not checked.

Alternatively what I did was this:

[[nodiscard]] static ProgramStateRef
escapeArgsAfterIndex(ProgramStateRef State, CheckerContext &C,
                     const CallEvent &Call, unsigned FirstEscapingArgIndex) {
  const auto *CE = Call.getOriginExpr();
  assert(CE);

  if (Call.getNumArgs() <= FirstEscapingArgIndex)
    return State;

  SmallVector<SVal> EscapingArgs;
  EscapingArgs.reserve(Call.getNumArgs() - FirstEscapingArgIndex);
  for (auto EscArgIdx :
       llvm::seq<int>(FirstEscapingArgIndex, Call.getNumArgs()))
    EscapingArgs.push_back(Call.getArgSVal(EscArgIdx));
  State = State->invalidateRegions(EscapingArgs, CE, C.blockCount(),
                                   C.getLocationContext(),
                                   /*CausesPointerEscape=*/false);
  return State;
}

// at the callsite:
State = escapeArgsAfterIndex(State, C, Call, /*FirstEscapingArgIndex=*/2);

This approach similarly to the previous one, only broke this single test if I recall.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be a same kind of problem with fread, or not? (That function was not changed in the recent commits.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem should apply to all APIs that potentially write to the passed buffer; including fread too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:static analyzer clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants