Static analyzer cherrypicks 25 #3982

Merged
50 commits
2ae8060
[clang] Make loop in CFGBuilder::VisitCXXTryStmt() more canonical
nico Oct 26, 2021
47bf973
[clang] Implement CFG construction for @try and @catch
nico Oct 21, 2021
12dea04
[clang] Implement CFG construction for @try and @catch
nico Oct 21, 2021
0689b6c
[analyzer] Fix StringChecker for Unknown params
Oct 26, 2021
50c3fdb
[Analyzer][solver] Simplification: reorganize equalities with adjustment
Oct 25, 2021
b09115f
[Analyzer][solver] Handle adjustments in constraint assignor remainder
Oct 22, 2021
e2a199e
[analyzer] sprintf is a taint propagator not a source
Oct 28, 2021
01e48a2
[analyzer] Retrieve a character from StringLiteral as an initializer …
ASDenysPetrov Oct 20, 2021
f898e0d
[clang][scan-build] Use uname -s to detect the operating system.
fcambus Oct 14, 2021
30ec4f6
[analyzer] Dump checker name if multiple checkers evaluate the same call
Nov 2, 2021
591ebb6
[AST, Analysis] Use llvm::reverse (NFC)
kazutakahirata Nov 7, 2021
18bb28c
[analyzer] Retrieve a value from list initialization of multi-dimensi…
ASDenysPetrov Oct 22, 2021
25de534
[analyzer][solver] Iterate to a fixpoint during symbol simplification…
Jul 26, 2021
f376401
[analyzer][solver] Remove reference to RangedConstraintManager
Nov 5, 2021
694bdb9
[analyzer] Fix region cast between the same types with different qual…
ASDenysPetrov Nov 9, 2021
b55599e
[analyzer][NFC] Separate CallDescription from CallEvent
Nov 15, 2021
baaf4c2
[analyzer][NFC] Make the API of CallDescription safer slightly
Nov 17, 2021
0d2c253
[analyzer][NFC] Introduce CallDescriptionSets
Nov 19, 2021
048dd82
[analyzer][NFC] Introduce CallDescription::matches() in addition to i…
Nov 19, 2021
0998a67
[analyzer][NFC] Switch to using CallDescription::matches() instead of…
Nov 19, 2021
3f4869e
[analyzer][NFC] Demonstrate the use of CallDescriptionSet
Nov 19, 2021
9871de8
[analyzer][NFC] CallDescription should own the qualified name parts
Nov 19, 2021
c83a53e
[analyzer][NFC] Consolidate the inner representation of CallDescriptions
Nov 19, 2021
8addc77
[analyzer][NFC] Use enum for CallDescription flags
Nov 19, 2021
ef4e0d1
[analyzer][NFC] MaybeUInt -> MaybeCount
Nov 19, 2021
d3dfa01
[Analyzer][Core] Simplify IntSym in SValBuilder
Nov 11, 2021
66981d0
[Analyzer][Core] Better simplification in SimpleSValBuilder::evalBinOpNN
Nov 11, 2021
ddebabd
[analyzer][NFC] Refactor AnalysisConsumer::getModeForDecl()
Nov 29, 2021
9fd65af
[Analyzer][solver] Do not remove the simplified symbol from the eq class
Nov 26, 2021
607f147
[Analyzer][Core] Make SValBuilder to better simplify svals with 3 sym…
Nov 11, 2021
bd4f579
[Analyzer][solver] Simplification: Do a fixpoint iteration before the…
Dec 1, 2021
85ddb29
[analyzer] Ignore flex generated files
Dec 6, 2021
965bd40
[analyzer][solver] Fix assertion on (NonLoc, Op, Loc) expressions
Dec 6, 2021
6284bc5
[Analyzer] SValBuilder: Simlify a SymExpr to the absolute simplest form
Dec 1, 2021
e971d64
[Analysis] Ignore casts and unary ops for uninitialized values
isanbard Dec 7, 2021
e9bd5cd
[NFC][analyzer] Return underlying strings directly instead of OS.str()
kepler-5 Dec 9, 2021
29ef2f0
[NFC][testing] Return underlying strings directly instead of OS.str()
kepler-5 Dec 9, 2021
989e905
[analyzer] Implemented RangeSet::Factory::unite function to handle in…
ASDenysPetrov Nov 18, 2021
0a89706
[analyzer] Expand conversion check to check more expressions for over…
Dec 14, 2021
f474fd2
[analyzer][NFC] Change return value of StoreManager::attemptDownCast …
ASDenysPetrov Dec 16, 2021
7613623
[analyzer] Enable move semantics for CallDescriptionMap
Dec 19, 2021
a8a9a24
[analyzer] Add range constructor to CallDescriptionMap
Dec 19, 2021
9931dde
[StaticAnalyzer] Remove redundant declaration isStdSmartPtr (NFC)
kazutakahirata Dec 25, 2021
cfe3dec
[Clang][CFG] check children statements of asm goto
nickdesaulniers Jan 7, 2022
649be9e
[analyzer] Produce SymbolCast symbols for integral types in SValBuild…
ASDenysPetrov Jul 2, 2021
6a7d75c
[analyzer][NFC] Refactor GenericTaintChecker to use CallDescriptionMap
Jan 12, 2022
140e681
[Analyzer] Add docs to StdCLibraryFunctionArgsChecker
Jan 18, 2022
f584e92
[analyzer] Restrict CallDescription fuzzy builtin matching
Feb 11, 2022
a07342b
[analyzer][NFCi] Use the correct BugType in CStringChecker.
phyBrackets Feb 14, 2022
3e8146f
[analyzer] Fix a crash in NoStateChangeVisitor with body-farmed stack…
haoNoQ Feb 17, 2022
82 changes: 82 additions & 0 deletions clang/docs/analyzer/checkers.rst
@@ -2333,6 +2333,88 @@ A data is tainted when it comes from an unreliable source.
alpha.unix
^^^^^^^^^^^

.. _alpha-unix-StdCLibraryFunctionArgs:

alpha.unix.StdCLibraryFunctionArgs (C)
""""""""""""""""""""""""""""""""""""""
Check for calls of standard library functions that violate predefined argument
constraints. For example, it is stated in the C standard that for the ``int
isalnum(int ch)`` function the behavior is undefined if the value of ``ch`` is
not representable as unsigned char and is not equal to ``EOF``.

.. code-block:: c

void test_alnum_concrete(int v) {
int ret = isalnum(256); // \
// warning: Function argument constraint is not satisfied
(void)ret;
}

If the argument's value is unknown, it is assumed to be within the valid range after the call.

.. code-block:: c

#define EOF -1
int test_alnum_symbolic(int x) {
int ret = isalnum(x);
// after the call, x is assumed to be in the range [-1, 255]

if (x > 255) // impossible (infeasible branch)
if (x == 0)
return ret / x; // division by zero is not reported
return ret;
}
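
The range constraint on ``isalnum`` can also be respected explicitly at the call site. The following is a minimal sketch, not part of the checker: `in_isalnum_domain` and `is_alnum_char` are hypothetical helper names introduced here for illustration. It assumes only the constraint stated above, namely that the argument must be `EOF` or representable as `unsigned char`.

```cpp
#include <cassert>
#include <cctype>
#include <climits>
#include <cstdio>

// Hypothetical helper, not part of the checker: true if `v` satisfies the
// range constraint on the argument of isalnum (EOF or an unsigned char value).
bool in_isalnum_domain(int v) {
  return v == EOF || (v >= 0 && v <= UCHAR_MAX);
}

// Hypothetical wrapper: converting through unsigned char keeps the argument
// inside the valid range even when plain char is signed.
bool is_alnum_char(char c) {
  const int v = static_cast<unsigned char>(c);
  assert(in_isalnum_domain(v));
  return std::isalnum(v) != 0;
}
```

Converting through `unsigned char` before the call is a common way to avoid passing a negative `char` value to the `<cctype>` functions.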

If the user disables the checker then the argument violation warning is
suppressed. However, the assumption about the argument is still modeled. This
is because exploring an execution path that already contains undefined behavior
is not valuable.

There are different kinds of constraints modeled: range constraints, not-null
constraints, and buffer size constraints. A **range constraint** requires the
argument's value to be in a specific range, see ``isalnum`` as an example above.
A **not null constraint** requires the pointer argument to be non-null.

A **buffer size** constraint specifies the minimum size of the buffer
argument. The size might be a known constant. For example, ``asctime_r`` requires
that the buffer argument's size must be greater than or equal to ``26`` bytes. In
other cases, the size is denoted by another argument or as a multiplication of
two arguments.
For instance, consider ``size_t fread(void *ptr, size_t size, size_t nmemb, FILE *stream)``.
Here, ``ptr`` is the buffer, and its minimum size is ``size * nmemb``.

.. code-block:: c

void buffer_size_constraint_violation(FILE *file) {
enum { BUFFER_SIZE = 1024 };
wchar_t wbuf[BUFFER_SIZE];

const size_t size = sizeof(*wbuf); // 4
const size_t nitems = sizeof(wbuf); // 4096

// Below we receive a warning because the 3rd parameter should be the
// number of elements to read, not the size in bytes. This case is a known
// vulnerability described by the ARR38-C SEI-CERT rule.
fread(wbuf, size, nitems, file);
}
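
For contrast, here is a sketch of a call that satisfies the buffer size constraint, with the element count passed as the third argument. The names `fread_request_fits` and `read_wide` are hypothetical and exist only for this illustration; the check mirrors the stated rule that the buffer must hold at least `size * nmemb` bytes, and is not how the checker itself is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical helper: true if a request for `nmemb` elements of `size` bytes
// fits into a buffer of `buffer_bytes` bytes. Dividing instead of multiplying
// avoids overflow in size * nmemb.
constexpr bool fread_request_fits(std::size_t buffer_bytes, std::size_t size,
                                  std::size_t nmemb) {
  return size == 0 || nmemb <= buffer_bytes / size;
}

// Hypothetical corrected variant of the example above: the third argument is
// the number of elements, not the size of the buffer in bytes.
std::size_t read_wide(std::FILE *file, wchar_t (&wbuf)[1024]) {
  const std::size_t size = sizeof(*wbuf);
  const std::size_t nitems = sizeof(wbuf) / sizeof(*wbuf);
  static_assert(fread_request_fits(sizeof(wbuf), sizeof(*wbuf),
                                   sizeof(wbuf) / sizeof(*wbuf)),
                "element count must fit into the buffer");
  return std::fread(wbuf, size, nitems, file);
}
```

With `nitems` computed as `sizeof(wbuf) / sizeof(*wbuf)`, the product `size * nitems` equals the buffer size exactly, so the constraint holds.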

**Limitations**

The checker is in alpha because the reports cannot provide notes about the
values of the arguments. Without this information it is hard to confirm if the
constraint is indeed violated. For example, consider the above case for
``fread``. We display in the warning message that the size of the 1st arg
should be equal to or less than the value of the 2nd arg times the 3rd arg.
However, we fail to display the concrete values (``4`` and ``4096``) for those
arguments.

**Parameters**

The checker models functions (and emits diagnostics) from the C standard by
default. The ``ModelPOSIX`` option enables the checker to model (and emit
diagnostics) for functions that are defined in the POSIX standard. This option
is disabled by default.

.. _alpha-unix-BlockInCriticalSection:

alpha.unix.BlockInCriticalSection (C)
2 changes: 1 addition & 1 deletion clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
@@ -557,7 +557,7 @@ def StdCLibraryFunctionArgsChecker : Checker<"StdCLibraryFunctionArgs">,
"or is EOF.">,
Dependencies<[StdCLibraryFunctionsChecker]>,
WeakDependencies<[CallAndMessageChecker, NonNullParamChecker, StreamChecker]>,
-Documentation<NotDocumented>;
+Documentation<HasAlphaDocumentation>;

} // end "alpha.unix"

10 changes: 5 additions & 5 deletions clang/include/clang/StaticAnalyzer/Checkers/SValExplainer.h
@@ -32,7 +32,7 @@ class SValExplainer : public FullSValVisitor<SValExplainer, std::string> {
std::string Str;
llvm::raw_string_ostream OS(Str);
S->printPretty(OS, nullptr, PrintingPolicy(ACtx.getLangOpts()));
-return OS.str();
+return Str;
}

bool isThisObject(const SymbolicRegion *R) {
@@ -69,7 +69,7 @@ class SValExplainer : public FullSValVisitor<SValExplainer, std::string> {
std::string Str;
llvm::raw_string_ostream OS(Str);
OS << "concrete memory address '" << I << "'";
-return OS.str();
+return Str;
}

std::string VisitNonLocSymbolVal(nonloc::SymbolVal V) {
@@ -82,7 +82,7 @@ class SValExplainer : public FullSValVisitor<SValExplainer, std::string> {
llvm::raw_string_ostream OS(Str);
OS << (I.isSigned() ? "signed " : "unsigned ") << I.getBitWidth()
<< "-bit integer '" << I << "'";
-return OS.str();
+return Str;
}

std::string VisitNonLocLazyCompoundVal(nonloc::LazyCompoundVal V) {
@@ -123,7 +123,7 @@ class SValExplainer : public FullSValVisitor<SValExplainer, std::string> {
OS << "(" << Visit(S->getLHS()) << ") "
<< std::string(BinaryOperator::getOpcodeStr(S->getOpcode())) << " "
<< S->getRHS();
-return OS.str();
+return Str;
}

// TODO: IntSymExpr doesn't appear in practice.
@@ -177,7 +177,7 @@ class SValExplainer : public FullSValVisitor<SValExplainer, std::string> {
else
OS << "'" << Visit(R->getIndex()) << "'";
OS << " of " + Visit(R->getSuperRegion());
-return OS.str();
+return Str;
}

std::string VisitNonParamVarRegion(const NonParamVarRegion *R) {
17 changes: 17 additions & 0 deletions clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def
@@ -320,6 +320,11 @@ ANALYZER_OPTION(bool, ShouldDisplayCheckerNameForText, "display-checker-name",
"Display the checker name for textual outputs",
true)

ANALYZER_OPTION(bool, ShouldSupportSymbolicIntegerCasts,
"support-symbolic-integer-casts",
"Produce cast symbols for integral types.",
false)

ANALYZER_OPTION(
bool, ShouldConsiderSingleElementArraysAsFlexibleArrayMembers,
"consider-single-element-arrays-as-flexible-array-members",
@@ -336,6 +341,18 @@ ANALYZER_OPTION(
"might be modeled by the analyzer to never return NULL.",
false)

ANALYZER_OPTION(
bool, ShouldIgnoreBisonGeneratedFiles, "ignore-bison-generated-files",
"If enabled, any files containing the \"/* A Bison parser, made by\" "
"won't be analyzed.",
true)

ANALYZER_OPTION(
bool, ShouldIgnoreFlexGeneratedFiles, "ignore-flex-generated-files",
"If enabled, any files containing the \"/* A lexical scanner generated by "
"flex\" won't be analyzed.",
true)

//===----------------------------------------------------------------------===//
// Unsigned analyzer options.
//===----------------------------------------------------------------------===//
@@ -0,0 +1,179 @@
//===- CallDescription.h - function/method call matching --*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
/// \file This file defines a generic mechanism for matching for function and
/// method calls of C, C++, and Objective-C languages. Instances of these
/// classes are frequently used together with the CallEvent classes.
//
//===----------------------------------------------------------------------===//

#ifndef LLVM_CLANG_STATICANALYZER_CORE_PATHSENSITIVE_CALLDESCRIPTION_H
#define LLVM_CLANG_STATICANALYZER_CORE_PATHSENSITIVE_CALLDESCRIPTION_H

#include "clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h"
#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/Optional.h"
#include "llvm/Support/Compiler.h"
#include <vector>

namespace clang {
class IdentifierInfo;
} // namespace clang

namespace clang {
namespace ento {

enum CallDescriptionFlags : unsigned {
CDF_None = 0,

/// Describes a C standard function that is sometimes implemented as a macro
/// that expands to a compiler builtin with some __builtin prefix.
/// The builtin may as well have a few extra arguments on top of the requested
/// number of arguments.
CDF_MaybeBuiltin = 1 << 0,
};

/// This class represents a description of a function call using the number of
/// arguments and the name of the function.
class CallDescription {
friend class CallEvent;
using MaybeCount = Optional<unsigned>;

mutable Optional<const IdentifierInfo *> II;
// The list of the qualified names used to identify the specified CallEvent,
// e.g. "{a, b}" represent the qualified names, like "a::b".
std::vector<std::string> QualifiedName;
MaybeCount RequiredArgs;
MaybeCount RequiredParams;
int Flags;

public:
/// Constructs a CallDescription object.
///
/// @param QualifiedName The list of the name qualifiers of the function that
/// will be matched. The user is allowed to skip any of the qualifiers.
/// For example, {"std", "basic_string", "c_str"} would match both
/// std::basic_string<...>::c_str() and std::__1::basic_string<...>::c_str().
///
/// @param RequiredArgs The number of arguments that is expected to match a
/// call. Omit this parameter to match every occurrence of call with a given
/// name regardless of the number of arguments.
CallDescription(CallDescriptionFlags Flags,
ArrayRef<const char *> QualifiedName,
MaybeCount RequiredArgs = None,
MaybeCount RequiredParams = None);

/// Construct a CallDescription with default flags.
CallDescription(ArrayRef<const char *> QualifiedName,
MaybeCount RequiredArgs = None,
MaybeCount RequiredParams = None);

CallDescription(std::nullptr_t) = delete;

/// Get the name of the function that this object matches.
StringRef getFunctionName() const { return QualifiedName.back(); }

/// Get the qualified name parts in reversed order.
/// E.g. { "std", "vector", "data" } -> "vector", "std"
auto begin_qualified_name_parts() const {
return std::next(QualifiedName.rbegin());
}
auto end_qualified_name_parts() const { return QualifiedName.rend(); }

/// It's false, if and only if we expect a single identifier, such as
/// `getenv`. It's true for `std::swap`, or `my::detail::container::data`.
bool hasQualifiedNameParts() const { return QualifiedName.size() > 1; }

/// @name Matching CallDescriptions against a CallEvent
/// @{

/// Returns true if the CallEvent is a call to a function that matches
/// the CallDescription.
///
/// \note This function is not intended to be used to match Obj-C method
/// calls.
bool matches(const CallEvent &Call) const;

/// Returns true if the CallEvent matches any of the CallDescriptions
/// supplied.
///
/// \note This function is not intended to be used to match Obj-C method
/// calls.
friend bool matchesAny(const CallEvent &Call, const CallDescription &CD1) {
return CD1.matches(Call);
}

/// \copydoc clang::ento::matchesAny(const CallEvent &, const CallDescription &)
template <typename... Ts>
friend bool matchesAny(const CallEvent &Call, const CallDescription &CD1,
const Ts &...CDs) {
return CD1.matches(Call) || matchesAny(Call, CDs...);
}
/// @}
};

/// An immutable map from CallDescriptions to arbitrary data. Provides a unified
/// way for checkers to react to function calls.
template <typename T> class CallDescriptionMap {
friend class CallDescriptionSet;

// Some call descriptions aren't easily hashable (e.g., the ones with qualified
// names in which some sections are omitted), so let's put them
// in a simple vector and use linear lookup.
// TODO: Implement an actual map for fast lookup for "hashable" call
// descriptions (e.g., the ones for C functions that just match the name).
std::vector<std::pair<CallDescription, T>> LinearMap;

public:
CallDescriptionMap(
std::initializer_list<std::pair<CallDescription, T>> &&List)
: LinearMap(List) {}

template <typename InputIt>
CallDescriptionMap(InputIt First, InputIt Last) : LinearMap(First, Last) {}

~CallDescriptionMap() = default;

// These maps are usually stored once per checker, so let's make sure
// we don't do redundant copies.
CallDescriptionMap(const CallDescriptionMap &) = delete;
CallDescriptionMap &operator=(const CallDescriptionMap &) = delete;

CallDescriptionMap(CallDescriptionMap &&) = default;
CallDescriptionMap &operator=(CallDescriptionMap &&) = default;

LLVM_NODISCARD const T *lookup(const CallEvent &Call) const {
// Slow path: linear lookup.
// TODO: Implement some sort of fast path.
for (const std::pair<CallDescription, T> &I : LinearMap)
if (I.first.matches(Call))
return &I.second;

return nullptr;
}
};

/// An immutable set of CallDescriptions.
/// Checkers can efficiently decide if a given CallEvent matches any
/// CallDescription in the set.
class CallDescriptionSet {
CallDescriptionMap<bool /*unused*/> Impl = {};

public:
CallDescriptionSet(std::initializer_list<CallDescription> &&List);

CallDescriptionSet(const CallDescriptionSet &) = delete;
CallDescriptionSet &operator=(const CallDescriptionSet &) = delete;

LLVM_NODISCARD bool contains(const CallEvent &Call) const;
};

} // namespace ento
} // namespace clang

#endif // LLVM_CLANG_STATICANALYZER_CORE_PATHSENSITIVE_CALLDESCRIPTION_H
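
To make the matching and lookup logic in this header easier to follow, here is a simplified, self-contained sketch that does not depend on Clang. `SimpleCall`, `SimpleCallDescription`, and `SimpleCallDescriptionMap` are hypothetical stand-ins for `CallEvent`, `CallDescription`, and `CallDescriptionMap`. The qualifier matching treats the description's qualifiers as an ordered subsequence of the call's qualifiers, which is an assumption derived from the comment that "the user is allowed to skip any of the qualifiers"; the real implementation may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <initializer_list>
#include <iterator>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for CallEvent: the callee's fully qualified name
// parts plus the number of arguments at the call site.
struct SimpleCall {
  std::vector<std::string> QualifiedName;
  unsigned NumArgs;
};

// Hypothetical stand-in for CallDescription.
struct SimpleCallDescription {
  std::vector<std::string> QualifiedName;
  std::optional<unsigned> RequiredArgs;

  bool matches(const SimpleCall &Call) const {
    if (QualifiedName.empty() || Call.QualifiedName.empty())
      return false;
    if (RequiredArgs && *RequiredArgs != Call.NumArgs)
      return false;
    // The function name (last part) must match exactly.
    if (QualifiedName.back() != Call.QualifiedName.back())
      return false;
    // The remaining qualifiers must appear in order in the call's qualifiers,
    // but the call may carry extra ones (such as an inline namespace).
    auto It = Call.QualifiedName.begin();
    const auto End = std::prev(Call.QualifiedName.end());
    const auto QEnd = std::prev(QualifiedName.end());
    for (auto Q = QualifiedName.begin(); Q != QEnd; ++Q) {
      It = std::find(It, End, *Q);
      if (It == End)
        return false;
      ++It;
    }
    return true;
  }
};

// Same recursion shape as matchesAny in the header: one base case and one
// variadic overload that peels off the first description.
inline bool matchesAny(const SimpleCall &Call,
                       const SimpleCallDescription &CD1) {
  return CD1.matches(Call);
}
template <typename... Ts>
bool matchesAny(const SimpleCall &Call, const SimpleCallDescription &CD1,
                const Ts &...CDs) {
  return CD1.matches(Call) || matchesAny(Call, CDs...);
}

// Linear-lookup map, mirroring the "slow path" described in the header.
template <typename T> class SimpleCallDescriptionMap {
  std::vector<std::pair<SimpleCallDescription, T>> LinearMap;

public:
  SimpleCallDescriptionMap(
      std::initializer_list<std::pair<SimpleCallDescription, T>> List)
      : LinearMap(List) {}

  // Returns a pointer to the first matching entry's value, or nullptr.
  const T *lookup(const SimpleCall &Call) const {
    for (const auto &Entry : LinearMap)
      if (Entry.first.matches(Call))
        return &Entry.second;
    return nullptr;
  }
};
```

Returning a pointer from `lookup` lets a caller distinguish "no matching description" (`nullptr`) from a match whose stored value happens to be falsy, which is the same convention the header's `lookup` uses.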