Skip to content

[clang][analyzer] Move checker alpha.unix.StdCLibraryFunctions out of alpha. #66207

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Oct 16, 2023
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions clang/docs/ReleaseNotes.rst
Original file line number Diff line number Diff line change
Expand Up @@ -408,6 +408,8 @@ Static Analyzer
- Added a new checker ``core.BitwiseShift`` which reports situations where
bitwise shift operators produce undefined behavior (because some operand is
negative or too large).
- Move checker ``alpha.unix.StdCLibraryFunctions`` out of the ``alpha`` package
to ``unix.StdCLibraryFunctions``.

.. _release-notes-sanitizers:

Expand Down
188 changes: 94 additions & 94 deletions clang/docs/analyzer/checkers.rst
Original file line number Diff line number Diff line change
Expand Up @@ -1026,6 +1026,100 @@ Check for null pointers being passed as arguments to C string functions:
return strlen(0); // warn
}

.. _unix-StdCLibraryFunctions:

unix.StdCLibraryFunctions (C)
"""""""""""""""""""""""""""""""""""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Align this with the section title.

Check for calls of standard library functions that violate predefined argument
constraints. For example, it is stated in the C standard that for the ``int
isalnum(int ch)`` function the behavior is undefined if the value of ``ch`` is
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The grammar feels odd here.

not representable as unsigned char and is not equal to ``EOF``.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Char and unsigned are not escaped like we usually do for code.


.. code-block:: c

#define EOF -1
void test_alnum_concrete(int v) {
int ret = isalnum(256); // \
// warning: Function argument outside of allowed range
(void)ret;
}

void buffer_size_violation(FILE *file) {
enum { BUFFER_SIZE = 1024 };
wchar_t wbuf[BUFFER_SIZE];

const size_t size = sizeof(*wbuf); // 4
const size_t nitems = sizeof(wbuf); // 4096

// Below we receive a warning because the 3rd parameter should be the
// number of elements to read, not the size in bytes. This case is a known
// vulnerability described by the ARR38-C SEI-CERT rule.
fread(wbuf, size, nitems, file);
}

You can think of this checker as defining restrictions (pre- and postconditions)
on standard library functions. Preconditions are checked, and when they are
violated, a warning is emitted. Post conditions are added to the analysis, e.g.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here "post condition" is spelled as separate words, while previously there was a hyphen between the words.

that the return value must be no greater than 255.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"the return value" of what? I think a function name is missing here.


For example if an argument to a function must be in between 0 and 255, but the
value of the argument is unknown, the analyzer will conservatively assume that
it is in this interval. Similarly, if a function mustn't be called with a null
pointer and the null value of the argument can not be proven, the analyzer will
assume that it is non-null.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
pointer and the null value of the argument can not be proven, the analyzer will
assume that it is non-null.
pointer and the analyzer cannot prove that it is null, then it will assume that
it is non-null.

"can not be proven" is stronger than "the analyzer cannot prove it"


These are the possible checks on the values passed as function arguments:
- The argument has an allowed range (or multiple ranges) of values. The checker
can detect if a passed value is outside of the allowed range and show the
actual and allowed values.
- The argument has pointer type and is not allowed to be null pointer. Many
(but not all) standard functions can produce undefined behavior if a null
pointer is passed, these cases can be detected by the checker.
- The argument is a pointer to a memory block and the minimal size of this
buffer is determined by another argument to the function, or by
multiplication of two arguments (like at function ``fread``), or is a fixed
value (for example ``asctime_r`` requires at least a buffer of size 26). The
checker can detect if the buffer size is too small and in optimal case show
the size of the buffer and the values of the corresponding arguments.

.. code-block:: c

int test_alnum_symbolic(int x) {
int ret = isalnum(x);
// after the call, ret is assumed to be in the range [-1, 255]

if (ret > 255) // impossible (infeasible branch)
if (x == 0)
return ret / x; // division by zero is not reported
return ret;
}

Additionally to the argument and return value conditions, this checker also adds
state of the value ``errno`` if applicable to the analysis. Many system
functions set the ``errno`` value only if an error occurs (together with a
specific return value of the function), otherwise it becomes undefined. This
checker changes the analysis state to contain such information. This data is
used by other checkers, for example :ref:`alpha-unix-Errno`.

**Limitations**

The checker can not always provide notes about the values of the arguments.
Without this information it is hard to confirm if the constraint is indeed
violated. The argument values are shown if they are known constants or the value
is determined by previous (not too complicated) assumptions.

The checker can produce false positives in cases such as if the program has
invariants not known to the analyzer engine or the bug report path contains
calls to unknown functions. In these cases the analyzer fails to detect the real
range of the argument.

**Parameters**

The checker models functions (and emits diagnostics) from the C standard by
default. The ``ModelPOSIX`` option enables modeling (and emit diagnostics) of
additional functions that are defined in the POSIX standard. This option is
disabled by default.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think we should have an exhaustive list of the modeled functions here, or that wouldn't be useful?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I remember correctly the list of functions was already rejected by reviewers (because maintenance problems). Otherwise I think it is good to have an exact list of modeled functions.

.. _osx-checkers:

osx
Expand Down Expand Up @@ -2651,100 +2745,6 @@ For a more detailed description of configuration options, please see the
alpha.unix
^^^^^^^^^^^
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we are here, could you align this with its section title as well?


.. _alpha-unix-StdCLibraryFunctions:

alpha.unix.StdCLibraryFunctions (C)
"""""""""""""""""""""""""""""""""""
Check for calls of standard library functions that violate predefined argument
constraints. For example, it is stated in the C standard that for the ``int
isalnum(int ch)`` function the behavior is undefined if the value of ``ch`` is
not representable as unsigned char and is not equal to ``EOF``.

.. code-block:: c

#define EOF -1
void test_alnum_concrete(int v) {
int ret = isalnum(256); // \
// warning: Function argument outside of allowed range
(void)ret;
}

void buffer_size_violation(FILE *file) {
enum { BUFFER_SIZE = 1024 };
wchar_t wbuf[BUFFER_SIZE];

const size_t size = sizeof(*wbuf); // 4
const size_t nitems = sizeof(wbuf); // 4096

// Below we receive a warning because the 3rd parameter should be the
// number of elements to read, not the size in bytes. This case is a known
// vulnerability described by the ARR38-C SEI-CERT rule.
fread(wbuf, size, nitems, file);
}

You can think of this checker as defining restrictions (pre- and postconditions)
on standard library functions. Preconditions are checked, and when they are
violated, a warning is emitted. Post conditions are added to the analysis, e.g.
that the return value must be no greater than 255.

For example if an argument to a function must be in between 0 and 255, but the
value of the argument is unknown, the analyzer will conservatively assume that
it is in this interval. Similarly, if a function mustn't be called with a null
pointer and the null value of the argument can not be proven, the analyzer will
assume that it is non-null.

These are the possible checks on the values passed as function arguments:
- The argument has an allowed range (or multiple ranges) of values. The checker
can detect if a passed value is outside of the allowed range and show the
actual and allowed values.
- The argument has pointer type and is not allowed to be null pointer. Many
(but not all) standard functions can produce undefined behavior if a null
pointer is passed, these cases can be detected by the checker.
- The argument is a pointer to a memory block and the minimal size of this
buffer is determined by another argument to the function, or by
multiplication of two arguments (like at function ``fread``), or is a fixed
value (for example ``asctime_r`` requires at least a buffer of size 26). The
checker can detect if the buffer size is too small and in optimal case show
the size of the buffer and the values of the corresponding arguments.

.. code-block:: c

int test_alnum_symbolic(int x) {
int ret = isalnum(x);
// after the call, ret is assumed to be in the range [-1, 255]

if (ret > 255) // impossible (infeasible branch)
if (x == 0)
return ret / x; // division by zero is not reported
return ret;
}

Additionally to the argument and return value conditions, this checker also adds
state of the value ``errno`` if applicable to the analysis. Many system
functions set the ``errno`` value only if an error occurs (together with a
specific return value of the function), otherwise it becomes undefined. This
checker changes the analysis state to contain such information. This data is
used by other checkers, for example :ref:`alpha-unix-Errno`.

**Limitations**

The checker can not always provide notes about the values of the arguments.
Without this information it is hard to confirm if the constraint is indeed
violated. The argument values are shown if they are known constants or the value
is determined by previous (not too complicated) assumptions.

The checker can produce false positives in cases such as if the program has
invariants not known to the analyzer engine or the bug report path contains
calls to unknown functions. In these cases the analyzer fails to detect the real
range of the argument.

**Parameters**

The checker models functions (and emits diagnostics) from the C standard by
default. The ``ModelPOSIX`` option enables modeling (and emit diagnostics) of
additional functions that are defined in the POSIX standard. This option is
disabled by default.

.. _alpha-unix-BlockInCriticalSection:

alpha.unix.BlockInCriticalSection (C)
Expand Down
43 changes: 22 additions & 21 deletions clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
Original file line number Diff line number Diff line change
Expand Up @@ -532,6 +532,27 @@ def MismatchedDeallocatorChecker : Checker<"MismatchedDeallocator">,
Dependencies<[DynamicMemoryModeling]>,
Documentation<HasDocumentation>;

def StdCLibraryFunctionsChecker : Checker<"StdCLibraryFunctions">,
HelpText<"Check for invalid arguments of C standard library functions, "
"and apply relations between arguments and return value">,
CheckerOptions<[
CmdLineOption<Boolean,
"DisplayLoadedSummaries",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you confirm that this option is not advertised in the user facing docs.

"If set to true, the checker displays the found summaries "
"for the given translation unit.",
"false",
Released,
Hide>,
CmdLineOption<Boolean,
"ModelPOSIX",
"If set to true, the checker models additional functions "
"from the POSIX standard.",
"false",
InAlpha>
]>,
WeakDependencies<[CallAndMessageChecker, NonNullParamChecker]>,
Documentation<HasDocumentation>;

def VforkChecker : Checker<"Vfork">,
HelpText<"Check for proper usage of vfork">,
Documentation<HasDocumentation>;
Expand Down Expand Up @@ -574,27 +595,6 @@ def BlockInCriticalSectionChecker : Checker<"BlockInCriticalSection">,
HelpText<"Check for calls to blocking functions inside a critical section">,
Documentation<HasDocumentation>;

def StdCLibraryFunctionsChecker : Checker<"StdCLibraryFunctions">,
HelpText<"Check for invalid arguments of C standard library functions, "
"and apply relations between arguments and return value">,
CheckerOptions<[
CmdLineOption<Boolean,
"DisplayLoadedSummaries",
"If set to true, the checker displays the found summaries "
"for the given translation unit.",
"false",
Released,
Hide>,
CmdLineOption<Boolean,
"ModelPOSIX",
"If set to true, the checker models additional functions "
"from the POSIX standard.",
"false",
InAlpha>
]>,
WeakDependencies<[CallAndMessageChecker, NonNullParamChecker, StreamChecker]>,
Documentation<HasDocumentation>;

} // end "alpha.unix"

//===----------------------------------------------------------------------===//
Expand Down Expand Up @@ -1622,6 +1622,7 @@ def DebugIteratorModeling : Checker<"DebugIteratorModeling">,
def StdCLibraryFunctionsTesterChecker : Checker<"StdCLibraryFunctionsTester">,
HelpText<"Add test functions to the summary map, so testing of individual "
"summary constituents becomes possible.">,
WeakDependencies<[StdCLibraryFunctionsChecker]>,
Documentation<NotDocumented>;

} // end "debug"
Expand Down
2 changes: 1 addition & 1 deletion clang/test/Analysis/PR49642.c
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
// RUN: %clang_analyze_cc1 -Wno-implicit-function-declaration -Wno-implicit-int -w -verify %s \
// RUN: -analyzer-checker=core \
// RUN: -analyzer-checker=alpha.unix.StdCLibraryFunctions
// RUN: -analyzer-checker=unix.StdCLibraryFunctions

// expected-no-diagnostics

Expand Down
4 changes: 2 additions & 2 deletions clang/test/Analysis/analyzer-config.c
Original file line number Diff line number Diff line change
Expand Up @@ -13,8 +13,6 @@
// CHECK-NEXT: alpha.security.MmapWriteExec:MmapProtRead = 0x01
// CHECK-NEXT: alpha.security.taint.TaintPropagation:Config = ""
// CHECK-NEXT: alpha.unix.Errno:AllowErrnoReadOutsideConditionExpressions = true
// CHECK-NEXT: alpha.unix.StdCLibraryFunctions:DisplayLoadedSummaries = false
// CHECK-NEXT: alpha.unix.StdCLibraryFunctions:ModelPOSIX = false
// CHECK-NEXT: apply-fixits = false
// CHECK-NEXT: assume-controlled-environment = false
// CHECK-NEXT: avoid-suppressing-null-argument-paths = false
Expand Down Expand Up @@ -129,6 +127,8 @@
// CHECK-NEXT: track-conditions-debug = false
// CHECK-NEXT: unix.DynamicMemoryModeling:AddNoOwnershipChangeNotes = true
// CHECK-NEXT: unix.DynamicMemoryModeling:Optimistic = false
// CHECK-NEXT: unix.StdCLibraryFunctions:DisplayLoadedSummaries = false
// CHECK-NEXT: unix.StdCLibraryFunctions:ModelPOSIX = false
// CHECK-NEXT: unroll-loops = false
// CHECK-NEXT: verbose-report-filename = false
// CHECK-NEXT: widen-loops = false
1 change: 1 addition & 0 deletions clang/test/Analysis/analyzer-enabled-checkers.c
Original file line number Diff line number Diff line change
Expand Up @@ -47,6 +47,7 @@
// CHECK-NEXT: unix.Malloc
// CHECK-NEXT: unix.MallocSizeof
// CHECK-NEXT: unix.MismatchedDeallocator
// CHECK-NEXT: unix.StdCLibraryFunctions
// CHECK-NEXT: unix.Vfork
// CHECK-NEXT: unix.cstring.BadSizeArg
// CHECK-NEXT: unix.cstring.NullArg
Expand Down
4 changes: 2 additions & 2 deletions clang/test/Analysis/conversion.c
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
// RUN: %clang_analyze_cc1 %s \
// RUN: -Wno-conversion -Wno-tautological-constant-compare \
// RUN: -analyzer-checker=core,apiModeling,alpha.unix.StdCLibraryFunctions,alpha.core.Conversion \
// RUN: -analyzer-checker=core,apiModeling,unix.StdCLibraryFunctions,alpha.core.Conversion \
// RUN: -verify

unsigned char U8;
Expand Down Expand Up @@ -187,7 +187,7 @@ char dontwarn10(long long x) {
}


// C library functions, handled via alpha.unix.StdCLibraryFunctions
// C library functions, handled via unix.StdCLibraryFunctions

int isascii(int c);
void libraryFunction1(void) {
Expand Down
4 changes: 2 additions & 2 deletions clang/test/Analysis/errno-stdlibraryfunctions-notes.c
Original file line number Diff line number Diff line change
@@ -1,10 +1,10 @@
// RUN: %clang_analyze_cc1 -verify -analyzer-output text %s \
// RUN: -analyzer-checker=core \
// RUN: -analyzer-checker=debug.ExprInspection \
// RUN: -analyzer-checker=alpha.unix.StdCLibraryFunctions \
// RUN: -analyzer-checker=unix.StdCLibraryFunctions \
// RUN: -analyzer-checker=apiModeling.Errno \
// RUN: -analyzer-checker=alpha.unix.Errno \
// RUN: -analyzer-config alpha.unix.StdCLibraryFunctions:ModelPOSIX=true
// RUN: -analyzer-config unix.StdCLibraryFunctions:ModelPOSIX=true

#include "Inputs/errno_var.h"

Expand Down
4 changes: 2 additions & 2 deletions clang/test/Analysis/errno-stdlibraryfunctions.c
Original file line number Diff line number Diff line change
@@ -1,10 +1,10 @@
// RUN: %clang_analyze_cc1 -verify %s \
// RUN: -analyzer-checker=core \
// RUN: -analyzer-checker=debug.ExprInspection \
// RUN: -analyzer-checker=alpha.unix.StdCLibraryFunctions \
// RUN: -analyzer-checker=unix.StdCLibraryFunctions \
// RUN: -analyzer-checker=apiModeling.Errno \
// RUN: -analyzer-checker=alpha.unix.Errno \
// RUN: -analyzer-config alpha.unix.StdCLibraryFunctions:ModelPOSIX=true
// RUN: -analyzer-config unix.StdCLibraryFunctions:ModelPOSIX=true

#include "Inputs/errno_var.h"

Expand Down
6 changes: 3 additions & 3 deletions clang/test/Analysis/std-c-library-functions-POSIX-lookup.c
Original file line number Diff line number Diff line change
@@ -1,8 +1,8 @@
// RUN: %clang_analyze_cc1 %s \
// RUN: -analyzer-checker=core \
// RUN: -analyzer-checker=alpha.unix.StdCLibraryFunctions \
// RUN: -analyzer-config alpha.unix.StdCLibraryFunctions:ModelPOSIX=true \
// RUN: -analyzer-config alpha.unix.StdCLibraryFunctions:DisplayLoadedSummaries=true \
// RUN: -analyzer-checker=unix.StdCLibraryFunctions \
// RUN: -analyzer-config unix.StdCLibraryFunctions:ModelPOSIX=true \
// RUN: -analyzer-config unix.StdCLibraryFunctions:DisplayLoadedSummaries=true \
// RUN: -analyzer-checker=debug.ExprInspection \
// RUN: -analyzer-config eagerly-assume=false \
// RUN: -triple i686-unknown-linux 2>&1 | FileCheck %s --allow-empty
Expand Down
Original file line number Diff line number Diff line change
@@ -1,8 +1,8 @@
// RUN: %clang_analyze_cc1 %s \
// RUN: -analyzer-checker=core \
// RUN: -analyzer-checker=alpha.unix.StdCLibraryFunctions \
// RUN: -analyzer-config alpha.unix.StdCLibraryFunctions:ModelPOSIX=true \
// RUN: -analyzer-config alpha.unix.StdCLibraryFunctions:DisplayLoadedSummaries=true \
// RUN: -analyzer-checker=unix.StdCLibraryFunctions \
// RUN: -analyzer-config unix.StdCLibraryFunctions:ModelPOSIX=true \
// RUN: -analyzer-config unix.StdCLibraryFunctions:DisplayLoadedSummaries=true \
// RUN: -analyzer-checker=debug.ExprInspection \
// RUN: -analyzer-config eagerly-assume=false \
// RUN: -triple i686-unknown-linux 2>&1 | FileCheck %s
Expand Down
12 changes: 6 additions & 6 deletions clang/test/Analysis/std-c-library-functions-POSIX.c
Original file line number Diff line number Diff line change
@@ -1,17 +1,17 @@
// RUN: %clang_analyze_cc1 %s \
// RUN: -analyzer-checker=core \
// RUN: -analyzer-checker=alpha.unix.StdCLibraryFunctions \
// RUN: -analyzer-config alpha.unix.StdCLibraryFunctions:ModelPOSIX=true \
// RUN: -analyzer-config alpha.unix.StdCLibraryFunctions:DisplayLoadedSummaries=true \
// RUN: -analyzer-checker=unix.StdCLibraryFunctions \
// RUN: -analyzer-config unix.StdCLibraryFunctions:ModelPOSIX=true \
// RUN: -analyzer-config unix.StdCLibraryFunctions:DisplayLoadedSummaries=true \
// RUN: -analyzer-checker=debug.ExprInspection \
// RUN: -analyzer-config eagerly-assume=false \
// RUN: -triple i686-unknown-linux -verify

// RUN: %clang_analyze_cc1 %s \
// RUN: -analyzer-checker=core \
// RUN: -analyzer-checker=alpha.unix.StdCLibraryFunctions \
// RUN: -analyzer-config alpha.unix.StdCLibraryFunctions:ModelPOSIX=true \
// RUN: -analyzer-config alpha.unix.StdCLibraryFunctions:DisplayLoadedSummaries=true \
// RUN: -analyzer-checker=unix.StdCLibraryFunctions \
// RUN: -analyzer-config unix.StdCLibraryFunctions:ModelPOSIX=true \
// RUN: -analyzer-config unix.StdCLibraryFunctions:DisplayLoadedSummaries=true \
// RUN: -analyzer-checker=debug.ExprInspection \
// RUN: -analyzer-config eagerly-assume=false \
// RUN: -triple i686-unknown-linux 2>&1 | FileCheck %s
Expand Down
Loading