Static analyzer cherrypicks 24 #3681

Merged

@haoNoQ haoNoQ commented Dec 14, 2021

Clang Static Analyzer is traditionally kept reasonably fresh on stable branches through continuous cherry-picking.

ASDenysPetrov and others added 30 commits December 13, 2021 20:48
… test file.

Summary: Move the test case to the existing test file and remove the duplicate test file. The file was mistakenly added due to concerns about a hidden bug (see https://reviews.llvm.org/D104381). Since it turned out that the bug had already been fixed by another revision (https://reviews.llvm.org/D85817), which also added a corresponding test, we can remove this file.

Differential Revision: https://reviews.llvm.org/D106152

(cherry picked from commit 497b1b9)
(cherry picked from commit d3bae387bbc4406711c5c59072151e73df996691)
This change follows up on a FIXME submitted with D105974. This change simply lets the reference case fall through to return a concrete 'true'
instead of a nonloc pointer of appropriate length set to NULL.

Reviewed By: NoQ

Differential Revision: https://reviews.llvm.org/D107720

(cherry picked from commit d39ebda)
(cherry picked from commit 75f9657516e19e6c993c3fe410415459cf157adf)
Add some notes and track the bad return value.

Reviewed By: steakhal

Differential Revision: https://reviews.llvm.org/D107051

(cherry picked from commit 9f517fd)
(cherry picked from commit a8e407f65d6904983bc63cf066befb7802f50528)
(cherry picked from commit 027c5a6)
(cherry picked from commit 93117a9b047cb8878236914dc69380526b2a6a1f)
…ract NoStateChangeVisitor class

Preceding discussion on cfe-dev: https://lists.llvm.org/pipermail/cfe-dev/2021-June/068450.html

NoStoreFuncVisitor is a rather unique visitor. As VisitNode is invoked on most
other visitors, they are looking for the point where something changed -- a change
in a value, some checker-specific GDM trait, a new constraint.
NoStoreFuncVisitor, however, looks specifically for functions that *didn't*
write to a MemRegion of interest. Quoting from its comments:

/// Put a diagnostic on return statement of all inlined functions
/// for which  the region of interest \p RegionOfInterest was passed into,
/// but not written inside, and it has caused an undefined read or a null
/// pointer dereference outside.

It so happens that there are a number of other similar properties that are
worth checking. For instance, if some memory leaks, it might be interesting why
a function didn't take ownership of said memory:

void sink(int *P) {} // no notes

void f() {
  sink(new int(5)); // note: Memory is allocated
                    // Well hold on, sink() was supposed to deal with
                    // that, this must be a false positive...
} // warning: Potential memory leak [cplusplus.NewDeleteLeaks]

In here, the entity of interest isn't a MemRegion, but a symbol. The property
that changed here isn't a change of value, but rather liveness and GDM traits
managed by MallocChecker.

This patch moves some of the logic of NoStoreFuncVisitor to a new abstract
class, NoStateChangeFuncVisitor. This is mostly calculating and caching the
stack frames in which the entity of interest wasn't changed.

Descendants of this interface have to define 3 things:

* What constitutes a change to an entity (this is done by overriding
wasModifiedBeforeCallExit)
* What the diagnostic message should be (this is done by overriding
maybeEmitNoteFor.*)
* What constitutes the entity of interest being passed into the function (this
is also done by overriding maybeEmitNoteFor.*)
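
The split between the base class and its descendants can be sketched with a toy model. This is only an illustration under simplifying assumptions: `ToyCall`, `ToyNoStateChangeVisitor` and `ToyOwnershipVisitor` are hypothetical names, not Clang classes, and the real visitor works on ExplodedNodes and stack frames rather than on a flat list of calls. Only the two hook names are borrowed from the list above.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for one inlined function call; not a Clang type.
struct ToyCall {
  std::string Callee;
  bool ReceivedEntity; // was the entity of interest passed into the call?
  bool ChangedEntity;  // did the call change the entity's state?
};

// Toy analogue of the abstract visitor: it owns the traversal and leaves
// the entity-specific questions to its descendants.
class ToyNoStateChangeVisitor {
public:
  virtual ~ToyNoStateChangeVisitor() = default;

  // Collect one note per call that received the entity but left it untouched.
  std::vector<std::string> visit(const std::vector<ToyCall> &Calls) const {
    std::vector<std::string> Notes;
    for (const ToyCall &C : Calls) {
      if (wasModifiedBeforeCallExit(C))
        continue; // the entity changed here, so there is nothing to explain
      std::string Note = maybeEmitNoteFor(C);
      if (!Note.empty())
        Notes.push_back(Note);
    }
    return Notes;
  }

protected:
  // What counts as a change to the entity.
  virtual bool wasModifiedBeforeCallExit(const ToyCall &C) const = 0;
  // What the note says; an empty string means the entity was not passed in.
  virtual std::string maybeEmitNoteFor(const ToyCall &C) const = 0;
};

// Toy descendant that cares about ownership of a leaked allocation.
class ToyOwnershipVisitor final : public ToyNoStateChangeVisitor {
protected:
  bool wasModifiedBeforeCallExit(const ToyCall &C) const override {
    return C.ChangedEntity;
  }
  std::string maybeEmitNoteFor(const ToyCall &C) const override {
    if (!C.ReceivedEntity)
      return std::string();
    return "Returning from '" + C.Callee + "' without changing ownership";
  }
};
```

In this sketch the base class never needs to know what the entity is; a descendant tracking a symbol and one tracking a region would differ only in the two overrides.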

Differential Revision: https://reviews.llvm.org/D105553

(cherry picked from commit c019142)
(cherry picked from commit eb7fbf5d3986c19a7d6a3c964f9ac869b9cd65e6)
…that could have, but did not change ownership on leaked memory

This is a rather common piece of feedback we get about our leak checkers: bug reports are
really short, and contain barely any usable information on what the analyzer
did to conclude that a leak actually happened.

This happens because of our bug report minimizing effort. We construct bug
reports by inspecting the ExplodedNodes that lead to the error from the bottom
up (from the error node all the way to the root of the exploded graph), and mark
entities that were the cause of a bug, or have interacted with it as
interesting. In order to make the bug report a bit less verbose, whenever we
find an entire function call (from CallEnter to CallExitEnd) that didn't talk
about any interesting entity, we prune it (click here for more info on bug
report generation), even if the event to highlight is exactly this lack of
interaction with interesting entities.

D105553 generalized the visitor that creates notes for these cases. This patch
adds a new kind of NoStateChangeVisitor that leaves notes in functions that
took a piece of dynamically allocated memory that later leaked as parameter,
and didn't change its ownership status.

Differential Revision: https://reviews.llvm.org/D105553

(cherry picked from commit 2d3668c)
(cherry picked from commit 2393aca2c6ab95ac0f0db9af453d4a8ead14826f)
This patch adds the flag `extra-checkers` to the sub-command `build` for
passing a comma-separated list of additional checkers to include.

Differential Revision: https://reviews.llvm.org/D106739

(cherry picked from commit 198e677)
(cherry picked from commit 6146abadae08954cb92aed452ced6a568a485012)
Summary: Change and replace some functions which IE does not support. This patch is made as a continuation of D92928 revision. Also improve hot keys behavior.

Differential Revision: https://reviews.llvm.org/D107366

(cherry picked from commit 9dabacd)
(cherry picked from commit 990b1e7c98fc944150ed1816040f4305eab9aab1)
It fails on ubuntu bionic otherwise with:
```
scan-build-py-14: Run 'scan-view /tmp/scan-build-2021-08-09-09-14-36-765350-nx9s888s' to examine bug reports.
scan-build-py-14: Internal error.
Traceback (most recent call last):
  File "/usr/lib/llvm-14/lib/libscanbuild/__init__.py", line 125, in wrapper
    return function(*args, **kwargs)
  File "/usr/lib/llvm-14/lib/libscanbuild/analyze.py", line 72, in scan_build
    number_of_bugs = document(args)
  File "/usr/lib/llvm-14/lib/libscanbuild/report.py", line 35, in document
    for bug in read_bugs(args.output, html_reports_available):
  File "/usr/lib/llvm-14/lib/libscanbuild/report.py", line 282, in read_bugs
    for bug in parser(bug_file):
  File "/usr/lib/llvm-14/lib/libscanbuild/report.py", line 421, in parse_bug_html
    for line in handler.readlines():
  File "/usr/lib/python3.6/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 3360: ordinal not in range(128)
scan-build-py-14: Please run this command again and turn on verbose mode (add '-vvvv' as argument).
```

I guess it is caused by a problem in Python 3.6

Reviewed By: phosek, isthismyaccount

Differential Revision: https://reviews.llvm.org/D107887

(cherry picked from commit 70b06fe)
(cherry picked from commit 8635e36542b21f6907ee8990add79d1a66e685e8)
…ze() for FAMs""

This reverts commit df1f4e0.

Now the test case explicitly specifies the target triple.
I decided to use x86_64 for that matter, to have a fixed
bitwidth for `size_t`.

Aside from that, relanding the original changes of:
https://reviews.llvm.org/D105184

(cherry picked from commit e5646b9)
(cherry picked from commit 7477e2449d06e2198a6d51e6cdce35135b5e7a31)
Not only global variables can hold references to dead stack variables.
Consider this example:

  void write_stack_address_to(char **q) {
    char local;
    *q = &local;
  }

  void test_stack() {
    char *p;
    write_stack_address_to(&p);
  }

The address of 'local' is assigned to 'p', which becomes a dangling
pointer after 'write_stack_address_to()' returns.

The StackAddrEscapeChecker was looking for bindings in the store which
referred to variables of the popped stack frame, but it only considered
global variables in this regard. This patch relaxes this, catching
stack variable bindings as well.

---

This patch also works for temporary objects like:

  struct Bar {
    const int &ref;
    explicit Bar(int y) : ref(y) {
      // Okay.
    } // End of the constructor call, `ref` is dangling now. Warning!
  };

  void test() {
    Bar{33}; // Temporary object, so the corresponding memregion is
             // *not* a VarRegion.
  }

---

The return value optimization aka. copy-elision might kick in but that
is modeled by passing an imaginary CXXThisRegion which refers to the
parent stack frame which is supposed to be the 'return slot'.
Objects residing in the 'return slot' outlive the scope of the inner
call, thus we should expect no warning about them - except if we
explicitly disable copy-elision.

Reviewed By: NoQ, martong

Differential Revision: https://reviews.llvm.org/D107078

(cherry picked from commit 6ad47e1)
(cherry picked from commit 4d7887b4188e62b8e6f486a5aca4382a660537d8)
Quoting https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html:
> In the absence of the zero-length array extension, in ISO C90 the contents
> array in the example above would typically be declared to have a single
> element.

We should not assume that a //flexible array member// field declared with a
single element really holds just one element, because some code uses that
declaration as a fallback when the //zero-length array// language extension is
unavailable.
In this case, the analyzer should return `Unknown` as the extent of the field
instead.
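
The idiom in question can be sketched as follows. This is a minimal illustration, not code from the patch or its tests: `Line`, `payloadOf` and `makeLine` are hypothetical names, and the payload is accessed through a byte pointer computed with `offsetof` rather than by indexing past the declared bound.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// C90-style fallback: the trailing array is declared with one element,
// but the object is over-allocated so that the payload can be longer.
struct Line {
  std::size_t Length;
  char Contents[1];
};

// Pointer to the first payload byte, computed from the start of the object.
char *payloadOf(Line *L) {
  return reinterpret_cast<char *>(L) + offsetof(Line, Contents);
}

// Hypothetical helper: allocates a Line holding a copy of Text.
Line *makeLine(const char *Text) {
  std::size_t N = std::strlen(Text);
  // sizeof(Line) already covers one payload byte (used for the NUL);
  // the N extra bytes hold the characters themselves.
  void *Raw = std::malloc(sizeof(Line) + N);
  if (Raw == nullptr)
    return nullptr;
  Line *L = static_cast<Line *>(Raw);
  L->Length = N;
  std::memcpy(payloadOf(L), Text, N + 1);
  return L;
}
```

Here the declared type says `Contents` has one element while the allocation holds `N + 1` payload bytes, which is why treating the field's extent as exactly one element would be misleading.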

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D108230

(cherry picked from commit 91c07eb)
(cherry picked from commit 007660ec9a45e8aed7160230e176b8f12018b9d6)
`SVB.getStateManager().getOwningEngine().getAnalysisManager().getAnalyzerOptions()`
is quite a mouthful and might involve a few pointer indirections to get
such a simple thing like an analyzer option.

This patch introduces an `AnalyzerOptions` reference to the `SValBuilder`
abstract class, and refactors a few cases to use this /simpler/ accessor.

Reviewed By: martong, Szelethus

Differential Revision: https://reviews.llvm.org/D108824

(cherry picked from commit b97a964)
(cherry picked from commit fc9ce3b08a48b76e53b139c7575b7e742c894aa6)
Once installed, scan-build-py doesn't know anything about its auxiliary
executables and can't find them.
Use a path relative to the scan-build-py script.

Differential Revision: https://reviews.llvm.org/D109659

(cherry picked from commit c84755a)
(cherry picked from commit 9b27152d2c32ae902bfa1abe643033154a34e78e)
…ntire function calls, rather than each ExplodedNode in it

Fix a compilation error due to a missing 'template' keyword.

Differential Revision: https://reviews.llvm.org/D108695

(cherry picked from commit 0213d7e)
(cherry picked from commit 869b603c9b1d8111fa6cffd3dace26e35aa722fd)
…y when a function "intents", but doesn't change ownership, enable by default

D105819 added NoOwnershipChangeVisitor, but it is only registered when an
off-by-default, hidden checker option is enabled. The reason behind this was
that it grossly overestimated the set of functions that really needed a note:

std::string getTrainName(const Train *T) {
  return T->name;
} // note: Returning without changing the ownership of or deallocating memory
// Umm... I mean duh? Nor would I expect this function to do anything like that...

void foo() {
  Train *T = new Train("Land Plane");
  print(getTrainName(T)); // note: calling getTrainName / returning from getTrainName
} // warn: Memory leak

This patch adds a heuristic that guesses that any function that has an explicit
operator delete call could have been responsible for deallocating the memory that
ended up leaking. This is waaaay too conservative (see the TODOs in the new
function), but it is safer to err on the side of too little than too much, and
would allow us to enable the option by default *now*, and add refinements
one-by-one.
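
To make the heuristic concrete, here is a small self-contained sketch. The names `retire`, `isAlive` and `leakyCaller` are made up for illustration, and the comments about where a note would appear reflect my reading of the description above, not verified analyzer output.

```cpp
struct Train {
  static int Live; // number of Train objects currently alive
  Train() { ++Live; }
  ~Train() { --Live; }
};
int Train::Live = 0;

// Contains an explicit delete, so under the heuristic it plausibly
// "intends" to manage ownership and would be a candidate for a note.
void retire(Train *T, bool Scrapped) {
  if (Scrapped)
    delete T;
}

// Contains no delete: nobody would expect it to free anything,
// so a note here would only add noise.
bool isAlive(const Train *T) { return T != nullptr; }

void leakyCaller() {
  Train *T = new Train();
  (void)isAlive(T);
  retire(T, false); // the delete is not reached, so T leaks on return
}
```

After `leakyCaller()` returns, `Train::Live` is still 1, which is the leak a report would describe; the interesting question for the reader is why `retire` did not release it.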

Differential Revision: https://reviews.llvm.org/D108753

(cherry picked from commit 9d359f6)
(cherry picked from commit 056b7b982308758603af67389ac424af2283898e)
See PR51842.

This fixes an assert firing in the static analyzer, triggered by implicit moves
in blocks in C mode.

This also simplifies the AST a little bit when compiling non C++ code,
as the xvalue implicit casts are not inserted.

We keep and test that the nrvo flag is still being set on the VarDecls,
as that is still a bit beneficial while not really making anything
more complicated.

Signed-off-by: Matheus Izvekov <[email protected]>

Reviewed By: NoQ

Differential Revision: https://reviews.llvm.org/D109654

(cherry picked from commit 2d6829b)
(cherry picked from commit 551d751b7c2c2e3945d3f52a11b0b19af89f8116)
Adding trackExpressionValue to the checker so it tracks the value of the
implicit cast's DeclRefExpression up to initialization/assignment. This
way the report becomes cleaner.

Differential Revision: https://reviews.llvm.org/D109836

(cherry picked from commit 96ec9b6)
(cherry picked from commit c0db73f7d52b968b68eef9812b15a0160202e864)
Differential Revision: https://reviews.llvm.org/D109349

(cherry picked from commit b588f5d)
(cherry picked from commit dcf575c5dea9152149f33f33e03405f3168f7ca5)
Since https://reviews.llvm.org/D87118, the StaticAnalyzer directory is
added unconditionally. In theory this should not cause the static analyzer
sources to be built unless they are referenced by another target. However,
the clang-cpp target (defined in clang/tools/clang-shlib) uses the
CLANG_STATIC_LIBS global property to determine which libraries need to
be included. To solve this issue, this patch avoids adding libraries to
that property if EXCLUDE_FROM_ALL is set.

In case something like this comes up again: `cmake --graphviz=targets.dot`
is quite useful to see why a target is included as part of `ninja all`.

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D109611

(cherry picked from commit 6d7b3d6)
(cherry picked from commit 8c28ce80538ff278c68868110b754f9a46173353)
…y declaration in a global scope.

Summary: Fix the case where we didn't take the array's dimensions into account. Retrieve the value of a global constant array by iterating through its initializer list.

Differential Revision: https://reviews.llvm.org/D104285

Fixes: https://bugs.llvm.org/show_bug.cgi?id=50604
(cherry picked from commit 98a95d4)
(cherry picked from commit ad3fc5175cfe66287b366eb1227e3d51b27e7ddf)
It replaces the usage of the readPlist and writePlist functions with load and dump
from the plistlib package.

This fixes deprecation issues when analyzer reports are being generated
outside of docker.

Patch by Manas!

Reviewed By: steakhal

Differential Revision: https://reviews.llvm.org/D107312

(cherry picked from commit a3d0b58)
(cherry picked from commit f6a17c8c8baaaaef4615d8d01c8eed4c079de22d)
This patch introduces a new checker: `alpha.security.cert.env.InvalidPtr`

The checker finds uses of invalidated pointers related to the environment.

Based on the following SEI CERT Rules:
ENV34-C: https://wiki.sei.cmu.edu/confluence/x/8tYxBQ
ENV31-C: https://wiki.sei.cmu.edu/confluence/x/5NUxBQ
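
As a rough illustration of the kind of code ENV34-C is about: the pointer returned by `getenv()` may be invalidated by a later call, so a common remedy is to copy the value before calling again. The sketch below shows that pattern; `snapshotEnv` and `readTwo` are hypothetical helpers, and the code is not taken from the checker's tests.

```cpp
#include <stdlib.h>
#include <string>
#include <utility>

// Copy the value out immediately instead of keeping the returned pointer.
// An unset variable is represented by an empty string in this sketch.
std::string snapshotEnv(const char *Name) {
  const char *Value = getenv(Name);
  return Value ? std::string(Value) : std::string();
}

// Reads two variables without holding on to a pointer across getenv() calls.
std::pair<std::string, std::string> readTwo(const char *First,
                                            const char *Second) {
  std::string A = snapshotEnv(First); // copied before the next getenv() call
  std::string B = snapshotEnv(Second);
  return std::make_pair(A, B);
}
```

The non-compliant variant would store the first `const char *` and dereference it after the second `getenv()` call, which is the pattern the checker is meant to flag.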

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D97699

(cherry picked from commit 811b173)
(cherry picked from commit fb9ef91bc74db2d65bb31313fde865f9d248573f)
This simple change addresses a special case of structure/pointer
aliasing that produced different symbolvals, leading to false positives
during analysis.

The reproducer is as simple as this.

```lang=C++
struct s {
  int v;
};

void foo(struct s *ps) {
  struct s ss = *ps;
  clang_analyzer_dump(ss.v); // reg_$1<int Element{SymRegion{reg_$0<struct s *ps>},0 S64b,struct s}.v>
  clang_analyzer_dump(ps->v); //reg_$3<int SymRegion{reg_$0<struct s *ps>}.v>
  clang_analyzer_eval(ss.v == ps->v); // UNKNOWN
}
```

Acks: Many thanks to @steakhal and @martong for the group debug session.

Reviewed By: steakhal, martong

Differential Revision: https://reviews.llvm.org/D110625

(cherry picked from commit b29186c)
(cherry picked from commit 6ab6444d01a67b70e3160e5001f1c43ed7552316)
There is an error in the implementation of the logic of reaching the `Unknown` tristate in CmpOpTable.

```
void cmp_op_table_unknownX2(int x, int y, int z) {
  if (x >= y) {
                    // x >= y    [1, 1]
    if (x + z < y)
      return;
                    // x + z < y [0, 0]
    if (z != 0)
      return;
                    // x < y     [0, 0]
    clang_analyzer_eval(x > y);  // expected-warning{{TRUE}} expected-warning{{FALSE}}
  }
}
```
We miss the `FALSE` warning because the false branch is infeasible.

We have to exploit simplification to discover the bug. If we had `x < y`
as the second condition then the analyzer would return the parent state
on the false path and the new constraint would not be part of the State.
But adding `z` to the condition makes both paths feasible.

The root cause of the bug is that we reach the `Unknown` tristate
twice, but on both occasions we reach the same `Op`, which is `>=` in the
test case. So, we reached `>=` twice, but we never reached `!=`, thus
querying the `UnknownX2` column with `getCmpOpStateForUnknownX2` is
wrong.

The solution is to ensure that we reached both **different** `Op`s once.

Differential Revision: https://reviews.llvm.org/D110910

(cherry picked from commit 792be5d)
(cherry picked from commit e4baf546f0a76ff839386996115e8f08e1a773c8)
This tiny change improves the debugging experience of the solver a lot!

Differential Revision: https://reviews.llvm.org/D110911

(cherry picked from commit b8f6c85)
(cherry picked from commit de344130bccad48b4f9add25e05bd770716c9736)
…mory.

Clarify the message provided when the analyzer catches the use of memory
that is allocated with size zero.

Differential Revision: https://reviews.llvm.org/D111655

(cherry picked from commit f3ec9d8)
(cherry picked from commit 988236dff9c16ba364fcede4ea94d4b84c4028b8)
The `getenv()` function might return `NULL` just like any other function.
However, in case of `getenv()` a state-split seems justified since the
programmer should expect the failure of this function.

`secure_getenv(const char *name)` behaves the same way but is not handled
right now.
Note that `std::getenv()` is also not handled.
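
For reference, the defensive pattern that this state split encourages looks roughly like the following. It is a minimal sketch with a hypothetical helper name (`envOr`); it calls the global `getenv()` from `<stdlib.h>`, since `std::getenv()` is noted above as not handled.

```cpp
#include <stdlib.h>
#include <string>

// Returns the variable's value, or Fallback when getenv() reports failure.
std::string envOr(const char *Name, const std::string &Fallback) {
  const char *Value = getenv(Name); // may be a null pointer
  if (Value == nullptr)
    return Fallback;
  return std::string(Value);
}
```

Without the null check, constructing a `std::string` from the result would dereference a null pointer on the failure path, which is the path the state split makes the analyzer explore.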

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D111245

(cherry picked from commit 7fc1503)
(cherry picked from commit 2e680db50c66eda7bff370307c4f98a3eef65c5a)
If the `assume-controlled-environment` is `true`, we should expect `getenv()`
to succeed, and the result should not be considered tainted.
By default, the option will be `false`.

Reviewed By: NoQ, martong

Differential Revision: https://reviews.llvm.org/D111296

(cherry picked from commit edde4ef)
(cherry picked from commit 405bd5166cfc5dd9771ea5d651c9b2a76c6551bc)
'(self.prop)' produces a surprising AST where ParenExpr
resides inside `PseudoObjectExpr`.

This breaks ObjCMethodCall::getMessageKind() which in turn causes us
to perform unnecessary dynamic dispatch bifurcation when evaluating
body-farmed property accessors, which in turn causes us
to explore infeasible paths.

(cherry picked from commit 12cbc8c)
(cherry picked from commit d733df79d0f187bc465dd341984d494e600ec0f9)
This NFC change accomplishes three things:
1) Splits up the single unittest into reasonable segments.
2) Extends the test infra using a template to select the AST-node
   from which it is supposed to construct a `CallEvent`.
3) Adds a *lot* of different tests, documenting the current
   capabilities of the `CallDescription`. The corresponding tests are
   marked with `FIXME`s, where the current behavior should be different.

Both `CXXMemberCallExpr` and `CXXOperatorCallExpr` are derived from
`CallExpr`, so they are matched by using the default template parameter.
On the other hand, `CXXConstructExpr` is not derived from `CallExpr`.
In case we want to match for them, we need to pass the type explicitly
to the `CallDescriptionAction`.

About destructors:
They have no AST-node, but they are generated in the CFG machinery in
the analyzer. Thus, to be able to match against them, we would need to
construct a CFG and walk on that instead of simply walking the AST.

I'm also relaxing the `EXPECT`ation in the
`CallDescriptionConsumer::performTest()`, to check the `LookupResult`
only if we matched for the `CallDescription`.
This is necessary to allow tests in which we expect *no* matches at all.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D111794

(cherry picked from commit 5644d15)
(cherry picked from commit f9aad16d8e923d3902147222a9542087c25df39b)
@haoNoQ haoNoQ force-pushed the static-analyzer-cherrypicks-24 branch from f9aad16 to effbeb8 Compare December 14, 2021 05:31
haoNoQ commented Dec 14, 2021

@swift-ci test

@haoNoQ haoNoQ merged commit 0aea8ec into swiftlang:stable/20210726 Dec 14, 2021