Skip to content

[clang][analyzer] Remove array bounds check from PointerSubChecker #102580

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Aug 12, 2024

Conversation

balazske
Copy link
Collaborator

@balazske balazske commented Aug 9, 2024

At pointer subtraction only pointers are allowed that point into an array (or one after the end), this fact was checker by the checker. This check is now removed because it is a special case of array indexing error that is handled by a different checker (ArrayBoundsV2). At least theoretically the array bounds checker (when finalized) should find the same cases that were detected by the PointerSubChecker.

At pointer subtraction only pointers are allowed that point into
an array (or one after the end), this fact was checker by the checker.
This check is now removed because it is a special case of array
indexing error that is handled by a different checker (ArrayBoundsV2).
At least theoretically the array bounds checker (when finalized)
should find the same cases that were detected by the PointerSubChecker.
@balazske balazske requested review from gamesh411 and NagyDonat August 9, 2024 07:54
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:static analyzer labels Aug 9, 2024
@llvmbot
Copy link
Member

llvmbot commented Aug 9, 2024

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-static-analyzer-1

Author: Balázs Kéri (balazske)

Changes

At pointer subtraction only pointers are allowed that point into an array (or one after the end), this fact was checker by the checker. This check is now removed because it is a special case of array indexing error that is handled by a different checker (ArrayBoundsV2). At least theoretically the array bounds checker (when finalized) should find the same cases that were detected by the PointerSubChecker.


Full diff: https://github.com/llvm/llvm-project/pull/102580.diff

3 Files Affected:

  • (modified) clang/docs/analyzer/checkers.rst (+8-9)
  • (modified) clang/lib/StaticAnalyzer/Checkers/PointerSubChecker.cpp (+4-89)
  • (modified) clang/test/Analysis/pointer-sub.c (-9)
diff --git a/clang/docs/analyzer/checkers.rst b/clang/docs/analyzer/checkers.rst
index 55832d20bd27a..df3a013ce269c 100644
--- a/clang/docs/analyzer/checkers.rst
+++ b/clang/docs/analyzer/checkers.rst
@@ -2501,7 +2501,14 @@ alpha.core.PointerSub (C)
 Check for pointer subtractions on two pointers pointing to different memory
 chunks. According to the C standard §6.5.6 only subtraction of pointers that
 point into (or one past the end) the same array object is valid (for this
-purpose non-array variables are like arrays of size 1).
+purpose non-array variables are like arrays of size 1). This checker only
+searches for different memory objects at subtraction, but does not check if the
+array index is correct (
+:ref:`alpha.security.ArrayBoundsV2 <alpha-security-ArrayBoundsV2>` checks the
+index to some extent).
+
+Furthermore, only cases are reported where stack-allocated objects are involved
+(no warnings on pointers to memory allocated by `malloc`).
 
 .. code-block:: c
 
@@ -2511,11 +2518,6 @@ purpose non-array variables are like arrays of size 1).
    x = &d[4] - &c[1]; // warn: 'c' and 'd' are different arrays
    x = (&a + 1) - &a;
    x = &b - &a; // warn: 'a' and 'b' are different variables
-   x = (&a + 2) - &a; // warn: for a variable it is only valid to have a pointer
-                      // to one past the address of it
-   x = &c[10] - &c[0];
-   x = &c[11] - &c[0]; // warn: index larger than one past the end
-   x = &c[-1] - &c[0]; // warn: negative index
  }
 
  struct S {
@@ -2538,9 +2540,6 @@ offsets of members in a structure, using pointer subtractions. This is still
 undefined behavior according to the standard and code like this can be replaced
 with the `offsetof` macro.
 
-The checker only reports cases where stack-allocated objects are involved (no
-warnings on pointers to memory allocated by `malloc`).
-
 .. _alpha-core-StackAddressAsyncEscape:
 
 alpha.core.StackAddressAsyncEscape (C)
diff --git a/clang/lib/StaticAnalyzer/Checkers/PointerSubChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/PointerSubChecker.cpp
index b856b0edc6151..f0dc5efd75f7d 100644
--- a/clang/lib/StaticAnalyzer/Checkers/PointerSubChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/PointerSubChecker.cpp
@@ -31,99 +31,12 @@ class PointerSubChecker
   const llvm::StringLiteral Msg_MemRegionDifferent =
       "Subtraction of two pointers that do not point into the same array "
       "is undefined behavior.";
-  const llvm::StringLiteral Msg_LargeArrayIndex =
-      "Using an array index greater than the array size at pointer subtraction "
-      "is undefined behavior.";
-  const llvm::StringLiteral Msg_NegativeArrayIndex =
-      "Using a negative array index at pointer subtraction "
-      "is undefined behavior.";
-  const llvm::StringLiteral Msg_BadVarIndex =
-      "Indexing the address of a variable with other than 1 at this place "
-      "is undefined behavior.";
-
-  /// Check that an array is indexed in the allowed range that is 0 to "one
-  /// after the end". The "array" can be address of a non-array variable.
-  /// @param E Expression of the pointer subtraction.
-  /// @param ElemReg An indexed region in the subtraction expression.
-  /// @param Reg Region of the other side of the expression.
-  bool checkArrayBounds(CheckerContext &C, const Expr *E,
-                        const ElementRegion *ElemReg,
-                        const MemRegion *Reg) const;
 
 public:
   void checkPreStmt(const BinaryOperator *B, CheckerContext &C) const;
 };
 }
 
-static bool isArrayVar(const MemRegion *R) {
-  while (R) {
-    if (isa<VarRegion>(R))
-      return true;
-    if (const auto *ER = dyn_cast<ElementRegion>(R))
-      R = ER->getSuperRegion();
-    else
-      return false;
-  }
-  return false;
-}
-
-bool PointerSubChecker::checkArrayBounds(CheckerContext &C, const Expr *E,
-                                         const ElementRegion *ElemReg,
-                                         const MemRegion *Reg) const {
-  if (!ElemReg)
-    return true;
-
-  const MemRegion *SuperReg = ElemReg->getSuperRegion();
-  if (!isArrayVar(SuperReg))
-    return true;
-
-  auto ReportBug = [&](const llvm::StringLiteral &Msg) {
-    if (ExplodedNode *N = C.generateNonFatalErrorNode()) {
-      auto R = std::make_unique<PathSensitiveBugReport>(BT, Msg, N);
-      R->addRange(E->getSourceRange());
-      C.emitReport(std::move(R));
-    }
-  };
-
-  ProgramStateRef State = C.getState();
-  SValBuilder &SVB = C.getSValBuilder();
-
-  if (SuperReg == Reg) {
-    // Case like `(&x + 1) - &x`. Only 1 or 0 is allowed as index.
-    if (const llvm::APSInt *I = SVB.getKnownValue(State, ElemReg->getIndex());
-        I && (!I->isOne() && !I->isZero()))
-      ReportBug(Msg_BadVarIndex);
-    return false;
-  }
-
-  DefinedOrUnknownSVal ElemCount =
-      getDynamicElementCount(State, SuperReg, SVB, ElemReg->getElementType());
-  auto IndexTooLarge = SVB.evalBinOp(C.getState(), BO_GT, ElemReg->getIndex(),
-                                     ElemCount, SVB.getConditionType())
-                           .getAs<DefinedOrUnknownSVal>();
-  if (IndexTooLarge) {
-    ProgramStateRef S1, S2;
-    std::tie(S1, S2) = C.getState()->assume(*IndexTooLarge);
-    if (S1 && !S2) {
-      ReportBug(Msg_LargeArrayIndex);
-      return false;
-    }
-  }
-  auto IndexTooSmall = SVB.evalBinOp(State, BO_LT, ElemReg->getIndex(),
-                                     SVB.makeZeroVal(SVB.getArrayIndexType()),
-                                     SVB.getConditionType())
-                           .getAs<DefinedOrUnknownSVal>();
-  if (IndexTooSmall) {
-    ProgramStateRef S1, S2;
-    std::tie(S1, S2) = State->assume(*IndexTooSmall);
-    if (S1 && !S2) {
-      ReportBug(Msg_NegativeArrayIndex);
-      return false;
-    }
-  }
-  return true;
-}
-
 void PointerSubChecker::checkPreStmt(const BinaryOperator *B,
                                      CheckerContext &C) const {
   // When doing pointer subtraction, if the two pointers do not point to the
@@ -151,9 +64,11 @@ void PointerSubChecker::checkPreStmt(const BinaryOperator *B,
   const auto *ElemLR = dyn_cast<ElementRegion>(LR);
   const auto *ElemRR = dyn_cast<ElementRegion>(RR);
 
-  if (!checkArrayBounds(C, B->getLHS(), ElemLR, RR))
+  // Allow cases like "(&x + 1) - &x".
+  if (ElemLR && ElemLR->getSuperRegion() == RR)
     return;
-  if (!checkArrayBounds(C, B->getRHS(), ElemRR, LR))
+  // Allow cases like "&x - (&x + 1)".
+  if (ElemRR && ElemRR->getSuperRegion() == LR)
     return;
 
   const ValueDecl *DiffDeclL = nullptr;
diff --git a/clang/test/Analysis/pointer-sub.c b/clang/test/Analysis/pointer-sub.c
index 194c891889952..57c7fca6064d5 100644
--- a/clang/test/Analysis/pointer-sub.c
+++ b/clang/test/Analysis/pointer-sub.c
@@ -9,13 +9,9 @@ void f1(void) {
   d = (&x + 1) - &x; // no-warning ('&x' is like a single-element array)
   d = &x - (&x + 1); // no-warning
   d = (&x + 0) - &x; // no-warning
-  d = (&x - 1) - &x; // expected-warning{{Indexing the address of a variable with other than 1 at this place is undefined behavior}}
-  d = (&x + 2) - &x; // expected-warning{{Indexing the address of a variable with other than 1 at this place is undefined behavior}}
 
   d = (z + 9) - z; // no-warning (pointers to same array)
   d = (z + 10) - z; // no-warning (pointer to "one after the end")
-  d = (z + 11) - z; // expected-warning{{Using an array index greater than the array size at pointer subtraction is undefined behavior}}
-  d = (z - 1) - z; // expected-warning{{Using a negative array index at pointer subtraction is undefined behavior}}
 }
 
 void f2(void) {
@@ -28,11 +24,6 @@ void f2(void) {
   q = &b[3];
   d = q - p; // expected-warning{{Subtraction of two pointers that}}
 
-  q = a + 10;
-  d = q - p; // no warning (use of pointer to one after the end is allowed)
-  q = a + 11;
-  d = q - a; // expected-warning{{Using an array index greater than the array size at pointer subtraction is undefined behavior}}
-
   d = &a[4] - a; // no-warning
   d = &a[2] - p; // no-warning
   d = &c - p; // expected-warning{{Subtraction of two pointers that}}

Copy link
Contributor

@NagyDonat NagyDonat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the PR/commit message you write that

At least theoretically the array bounds checker (when finalized) should find the same cases that were detected by the PointerSubChecker.

but I'm pretty sure that the array bound checker already does find all these cases (and much more complex cases as well). (The only remaining reason why ArrayBoundV2 is in alpha is that it emits too many false positives, because engine issues like the inaccurate modeling of loops feed it with incorrect information.)

Instead of deleting the array bound testcases, you could also enable ArrayBoundV2 in the pointer-sub test files and validate that it does find those reports.

Also note that the array bounds checker is named ArrayBoundV2 with no "s" in the name. (I also wrote this incorrectly in a comment yesterday.)

Comment on lines 2505 to 2508
searches for different memory objects at subtraction, but does not check if the
array index is correct (
:ref:`alpha.security.ArrayBoundsV2 <alpha-security-ArrayBoundsV2>` checks the
index to some extent).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
searches for different memory objects at subtraction, but does not check if the
array index is correct (
:ref:`alpha.security.ArrayBoundsV2 <alpha-security-ArrayBoundsV2>` checks the
index to some extent).
reports subtraction between different memory objects and does not check whether
the index (or more generally, memory offset) is within bounds. Bounds checking
is done by :ref:`alpha.security.ArrayBoundV2 <alpha-security-ArrayBoundV2>`.

@balazske
Copy link
Collaborator Author

balazske commented Aug 9, 2024

The ArrayBoundV2 checker needs some update to find all cases. For example (&x - 1) - &x is not found, because the checker does only check ArraySubscriptExpr (and others), not a BinaryOperator with pointer and integer.

@NagyDonat
Copy link
Contributor

Oh, you're right, invalid pointer arithmetic like (&x - 1) is not handled by ArrayBoundV2, because right now that's the responsibility of a THIRD checker, alpha.core.PointerArithm.

However, directly after bringing ArrayBoundV2 out of alpha, I'll continue with working on this PointerArithm and either I'll completely merge it with ArrayBoundV2 or I'll keep it as a separate "frontend" which also calls the bounds-checking logic that's behind ArrayBoundV2.

Based on this I'd suggest that you should

  • delete the bounds checking logic from PointerSubChecker,
  • delete the testcases that were testing it (because it would be too complicated to bring in two other checkers + AFAIK PointerArithm is in a bad shape currently),
  • refer to both ArrayBoundV2 and PointerArithm in the documentation.

Copy link
Contributor

@NagyDonat NagyDonat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates, I'm satisfied with the state of this commit. However let's wait for an independent approval from @steakhal @haoNoQ or someone else, because this change does reduce the scope of this checker.

Copy link
Contributor

@steakhal steakhal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't really checked the PR, but we don't map the PointerSub checker to any rules.
Maybe, after it's out of alpha we will come back and check if it makes sense for us :)

Thank you working on these btw. A lot of unappreciated work all front.

Copy link
Member

@benshi001 benshi001 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@balazske balazske merged commit e607360 into llvm:main Aug 12, 2024
9 checks passed
@balazske balazske deleted the pointersub_noarraybounds branch August 12, 2024 09:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:static analyzer clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants