Skip to content

[Sema] Diagnose tautological bounds checks #120222

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Dec 18, 2024

Conversation

nikic
Copy link
Contributor

@nikic nikic commented Dec 17, 2024

This diagnoses comparisons like ptr + unsigned_index < ptr and ptr + unsigned_index >= ptr, which are always false/true because addition of a pointer and an unsigned index cannot wrap (or the behavior is undefined).

This warning is intended to help find broken bounds checks (which must be implemented in terms of uintptr_t instead).

Fixes #120214.

This diagnoses comparisons like `ptr + unsigned_index < ptr` and
`ptr + unsigned_index >= ptr`, which are always false/true because
addition of a pointer and an unsigned index cannot wrap (or the
behavior is undefined).

This warning is intended to help find broken bounds checks (which
must be implemented in terms of uintptr_t instead).

Fixes llvm#120214.
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Dec 17, 2024
@llvmbot
Copy link
Member

llvmbot commented Dec 17, 2024

@llvm/pr-subscribers-clang

Author: Nikita Popov (nikic)

Changes

This diagnoses comparisons like ptr + unsigned_index &lt; ptr and ptr + unsigned_index &gt;= ptr, which are always false/true because addition of a pointer and an unsigned index cannot wrap (or the behavior is undefined).

This warning is intended to help find broken bounds checks (which must be implemented in terms of uintptr_t instead).

Fixes #120214.


Full diff: https://github.com/llvm/llvm-project/pull/120222.diff

3 Files Affected:

  • (modified) clang/include/clang/Basic/DiagnosticSemaKinds.td (+1-1)
  • (modified) clang/lib/Sema/SemaExpr.cpp (+49)
  • (added) clang/test/Sema/tautological-pointer-comparison.c (+69)
diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index 9344b620779b84..d67a81f8564a8e 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -10246,7 +10246,7 @@ def warn_dangling_reference_captured_by_unknown : Warning<
 // should result in a warning, since these always evaluate to a constant.
 // Array comparisons have similar warnings
 def warn_comparison_always : Warning<
-  "%select{self-|array }0comparison always evaluates to "
+  "%select{self-|array |pointer }0comparison always evaluates to "
   "%select{a constant|true|false|'std::strong_ordering::equal'}1">,
   InGroup<TautologicalCompare>;
 def warn_comparison_bitwise_always : Warning<
diff --git a/clang/lib/Sema/SemaExpr.cpp b/clang/lib/Sema/SemaExpr.cpp
index 20bf6f7f6f28ff..4e814972d5c978 100644
--- a/clang/lib/Sema/SemaExpr.cpp
+++ b/clang/lib/Sema/SemaExpr.cpp
@@ -11786,6 +11786,49 @@ static bool checkForArray(const Expr *E) {
   return D->getType()->isArrayType() && !D->isWeak();
 }
 
+/// Detect patterns ptr + size >= ptr and ptr + size < ptr, where ptr is a
+/// pointer and size is an unsigned integer. Return whether the result is
+/// always true/false.
+static std::optional<bool> isTautologicalBoundsCheck(Expr *LHS, Expr *RHS,
+                                                     BinaryOperatorKind Opc) {
+  if (!LHS->getType()->isPointerType())
+    return std::nullopt;
+
+  // Canonicalize to >= or < predicate.
+  switch (Opc) {
+  case BO_GE:
+  case BO_LT:
+    break;
+  case BO_GT:
+    std::swap(LHS, RHS);
+    Opc = BO_LT;
+    break;
+  case BO_LE:
+    std::swap(LHS, RHS);
+    Opc = BO_GE;
+    break;
+  default:
+    return std::nullopt;
+  }
+
+  auto *BO = dyn_cast<BinaryOperator>(LHS);
+  if (!BO || BO->getOpcode() != BO_Add)
+    return std::nullopt;
+
+  Expr *Other;
+  if (Expr::isSameComparisonOperand(BO->getLHS(), RHS))
+    Other = BO->getRHS();
+  else if (Expr::isSameComparisonOperand(BO->getRHS(), RHS))
+    Other = BO->getLHS();
+  else
+    return std::nullopt;
+
+  if (!Other->getType()->isUnsignedIntegerType())
+    return std::nullopt;
+
+  return Opc == BO_GE;
+}
+
 /// Diagnose some forms of syntactically-obvious tautological comparison.
 static void diagnoseTautologicalComparison(Sema &S, SourceLocation Loc,
                                            Expr *LHS, Expr *RHS,
@@ -11895,6 +11938,12 @@ static void diagnoseTautologicalComparison(Sema &S, SourceLocation Loc,
                             S.PDiag(diag::warn_comparison_always)
                                 << 1 /*array comparison*/
                                 << Result);
+    } else if (std::optional<bool> Res =
+                   isTautologicalBoundsCheck(LHS, RHS, Opc)) {
+      S.DiagRuntimeBehavior(Loc, nullptr,
+                            S.PDiag(diag::warn_comparison_always)
+                                << 2 /*pointer comparison*/
+                                << (*Res ? AlwaysTrue : AlwaysFalse));
     }
   }
 
diff --git a/clang/test/Sema/tautological-pointer-comparison.c b/clang/test/Sema/tautological-pointer-comparison.c
new file mode 100644
index 00000000000000..f3f7c7da8d6907
--- /dev/null
+++ b/clang/test/Sema/tautological-pointer-comparison.c
@@ -0,0 +1,69 @@
+// RUN: %clang_cc1 -fsyntax-only -verify %s
+
+int add_ptr_idx_ult_ptr(const char *ptr, unsigned index) {
+  return ptr + index < ptr; // expected-warning {{pointer comparison always evaluates to false}}
+}
+
+int add_idx_ptr_ult_ptr(const char *ptr, unsigned index) {
+  return index + ptr < ptr; // expected-warning {{pointer comparison always evaluates to false}}
+}
+
+int ptr_ugt_add_ptr_idx(const char *ptr, unsigned index) {
+  return ptr > ptr + index; // expected-warning {{pointer comparison always evaluates to false}}
+}
+
+int ptr_ugt_add_idx_ptr(const char *ptr, unsigned index) {
+  return ptr > index + ptr; // expected-warning {{pointer comparison always evaluates to false}}
+}
+
+int add_ptr_idx_uge_ptr(const char *ptr, unsigned index) {
+  return ptr + index >= ptr; // expected-warning {{pointer comparison always evaluates to true}}
+}
+
+int add_idx_ptr_uge_ptr(const char *ptr, unsigned index) {
+  return index + ptr >= ptr; // expected-warning {{pointer comparison always evaluates to true}}
+}
+
+int ptr_ule_add_ptr_idx(const char *ptr, unsigned index) {
+  return ptr <= ptr + index; // expected-warning {{pointer comparison always evaluates to true}}
+}
+
+int ptr_ule_add_idx_ptr(const char *ptr, unsigned index) {
+  return ptr <= index + ptr; // expected-warning {{pointer comparison always evaluates to true}}
+}
+
+// Negative tests with wrong predicate.
+
+int add_ptr_idx_ule_ptr(const char *ptr, unsigned index) {
+  return ptr + index <= ptr;
+}
+
+int add_ptr_idx_ugt_ptr(const char *ptr, unsigned index) {
+  return ptr + index > ptr;
+}
+
+int ptr_uge_add_idx_ptr(const char *ptr, unsigned index) {
+  return ptr >= index + ptr;
+}
+
+int ptr_ult_add_idx_ptr(const char *ptr, unsigned index) {
+  return ptr < index + ptr;
+}
+
+// Negative test with signed index.
+
+int add_ptr_idx_ult_ptr_signed(const char *ptr, int index) {
+  return ptr + index < ptr;
+}
+
+// Negative test with unrelated pointers. 
+
+int add_ptr_idx_ult_ptr2(const char *ptr, const char *ptr2, unsigned index) {
+  return ptr + index < ptr2;
+}
+
+// Negative test with non-pointer operands.
+
+int add_ptr_idx_ult_ptr_not_pointer(unsigned ptr, unsigned index) {
+  return ptr + index < ptr;
+}

Copy link
Contributor

@cor3ntin cor3ntin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally looks great.

Can you add

  • a changelog entry
  • a test with C arrays?

/// Detect patterns ptr + size >= ptr and ptr + size < ptr, where ptr is a
/// pointer and size is an unsigned integer. Return whether the result is
/// always true/false.
static std::optional<bool> isTautologicalBoundsCheck(Expr *LHS, Expr *RHS,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
static std::optional<bool> isTautologicalBoundsCheck(Expr *LHS, Expr *RHS,
static std::optional<bool> isTautologicalBoundsCheck(const Expr *LHS, const Expr *RHS,

Copy link
Contributor

@cor3ntin cor3ntin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@hstk30-hw hstk30-hw merged commit 6d34cfa into llvm:main Dec 18, 2024
9 checks passed
@zmodem
Copy link
Collaborator

zmodem commented Dec 18, 2024

Nice, thank you!

@nathanchance
Copy link
Member

nathanchance commented Dec 18, 2024

@nikic I noticed your comment on #118472 (the motivator for this change AFAICT):

-fwrapv should already cover pointers

but it does not seem like this warning takes that into account (see the last example below)? I noticed a few instances of this warning in the Linux kernel but it sets -fno-strict-overflow for well defined overflow semantics it can depend on because it wants to avoid optimizations like the one that triggered this warning (-fno-delete-null-pointer-checks is a similarly used flag), so claiming the result of the checks are always false just does not seem to line up with reality. It seems like if the optimizer is going to respect this check and not remove it, this warning should not trigger either.

int check(const int* foo, unsigned int idx)
{
        return foo + idx < foo;
}
$ clang-19 -O2 -c test.c

$ llvm-objdump -dr test.o
...
0000000000000000 <check>:
       0: 31 c0                         xorl    %eax, %eax
       2: c3                            retq

$ clang-19 -O2 -fno-strict-overflow -c test.o

$ llvm-objdump -dr test.o
...
0000000000000000 <check>:
       0: 89 f0                         movl    %esi, %eax
       2: 48 8d 0c 87                   leaq    (%rdi,%rax,4), %rcx
       6: 31 c0                         xorl    %eax, %eax
       8: 48 39 f9                      cmpq    %rdi, %rcx
       b: 0f 92 c0                      setb    %al
       e: c3                            retq
$ gcc --version | head -1
gcc (GCC) 14.2.1 20240910

$ gcc -O2 -c test.c

$ llvm-objdump -dr test.o
...
0000000000000000 <check>:
       0: 31 c0                         xorl    %eax, %eax
       2: c3                            retq

$ gcc -O2 -fno-strict-overflow -c test.c

$ llvm-objdump -dr test.o
...
0000000000000000 <check>:
       0: 89 f6                         movl    %esi, %esi
       2: 48 8d 04 b7                   leaq    (%rdi,%rsi,4), %rax
       6: 48 39 f8                      cmpq    %rdi, %rax
       9: 0f 92 c0                      setb    %al
       c: 0f b6 c0                      movzbl  %al, %eax
       f: c3                            retq
$ clang --version | head -1
ClangBuiltLinux clang version 20.0.0git (https://github.com/llvm/llvm-project.git 99c2e3b78210a345afb1b5121f12b0e7bf923543)

$ clang -O2 -c test.c
test.c:3:19: warning: pointer comparison always evaluates to false [-Wtautological-compare]
    3 |         return foo + idx < foo;
      |                          ^
1 warning generated.

$ llvm-objdump -dr test.o
...
0000000000000000 <check>:
       0: 31 c0                         xorl    %eax, %eax
       2: c3                            retq

$ clang -O2 -c -fno-strict-overflow test.c
test.c:3:19: warning: pointer comparison always evaluates to false [-Wtautological-compare]
    3 |         return foo + idx < foo;
      |                          ^
1 warning generated.

$ llvm-objdump -dr test.o
...
0000000000000000 <check>:
       0: 89 f0                         movl    %esi, %eax
       2: 48 8d 0c 87                   leaq    (%rdi,%rax,4), %rcx
       6: 31 c0                         xorl    %eax, %eax
       8: 48 39 f9                      cmpq    %rdi, %rcx
       b: 0f 92 c0                      setb    %al
       e: c3                            retq

I tested something like:

diff --git a/clang/lib/Sema/SemaExpr.cpp b/clang/lib/Sema/SemaExpr.cpp
index e06a092177ef..2dd7c5951d71 100644
--- a/clang/lib/Sema/SemaExpr.cpp
+++ b/clang/lib/Sema/SemaExpr.cpp
@@ -11789,10 +11789,11 @@ static bool checkForArray(const Expr *E) {
 /// Detect patterns ptr + size >= ptr and ptr + size < ptr, where ptr is a
 /// pointer and size is an unsigned integer. Return whether the result is
 /// always true/false.
-static std::optional<bool> isTautologicalBoundsCheck(const Expr *LHS,
+static std::optional<bool> isTautologicalBoundsCheck(Sema &S,
+                                                     const Expr *LHS,
                                                      const Expr *RHS,
                                                      BinaryOperatorKind Opc) {
-  if (!LHS->getType()->isPointerType())
+  if (!LHS->getType()->isPointerType() || S.getLangOpts().isSignedOverflowDefined())
     return std::nullopt;

   // Canonicalize to >= or < predicate.
@@ -11940,7 +11941,7 @@ static void diagnoseTautologicalComparison(Sema &S, SourceLocation Loc,
                                 << 1 /*array comparison*/
                                 << Result);
     } else if (std::optional<bool> Res =
-                   isTautologicalBoundsCheck(LHS, RHS, Opc)) {
+                   isTautologicalBoundsCheck(S, LHS, RHS, Opc)) {
       S.DiagRuntimeBehavior(Loc, nullptr,
                             S.PDiag(diag::warn_comparison_always)
                                 << 2 /*pointer comparison*/

which results in a behavior I would expect:

$ clang -fsyntax-only test.c
test.c:3:19: warning: pointer comparison always evaluates to false [-Wtautological-compare]
    3 |         return foo + idx < foo;
      |                          ^
1 warning generated.

$ clang -fsyntax-only -fwrapv test.c

If I am missing something, please let me know. If that change is acceptable, I don't mind submitting it with some tests.

@nikic
Copy link
Contributor Author

nikic commented Dec 18, 2024

@nathanchance You are correct, this warning should indeed respect -fwrapv/-fno-strict-overflow. Your patch looks reasonable to me as well.

@nathanchance
Copy link
Member

@nathanchance You are correct, this warning should indeed respect -fwrapv/-fno-strict-overflow. Your patch looks reasonable to me as well.

Thanks for the confirmation, I have submitted #120480 for this.

nathanchance added a commit that referenced this pull request Dec 19, 2024
The tautological bounds check warning added in #120222 does not take
into account whether signed integer overflow is well defined or not,
which could result in a developer removing a bounds check that may not
actually be always false because of different overflow semantics.

```c
int check(const int* foo, unsigned int idx)
{
    return foo + idx < foo;
}
```

```
$ clang -O2 -c test.c
test.c:3:19: warning: pointer comparison always evaluates to false [-Wtautological-compare]
    3 |         return foo + idx < foo;
      |                          ^
1 warning generated.

# Bounds check is eliminated without -fwrapv, warning was correct
$ llvm-objdump -dr test.o
...
0000000000000000 <check>:
       0: 31 c0                         xorl    %eax, %eax
       2: c3                            retq
```

```
$ clang -O2 -fwrapv -c test.c
test.c:3:19: warning: pointer comparison always evaluates to false [-Wtautological-compare]
    3 |         return foo + idx < foo;
      |                          ^
1 warning generated.

# Bounds check remains, warning was wrong
$ llvm-objdump -dr test.o
0000000000000000 <check>:
       0: 89 f0                         movl    %esi, %eax
       2: 48 8d 0c 87                   leaq    (%rdi,%rax,4), %rcx
       6: 31 c0                         xorl    %eax, %eax
       8: 48 39 f9                      cmpq    %rdi, %rcx
       b: 0f 92 c0                      setb    %al
       e: c3                            retq
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

-Wtautological-compare should flag tautological pointer bounds checks
7 participants