Skip to content

RFC: [clang-tidy] [analyzer] Move nondeterministic pointer usage check to tidy #110471

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Oct 28, 2024

Conversation

vabridgers
Copy link
Contributor

@vabridgers vabridgers commented Sep 30, 2024

This change moves the alpha.nondeterministic.PointerSorting and alpha.nondeterministic.PointerIteration static analyzer checkers to a single clang-tidy check. Those checkers were implemented as simple clang-tidy check-like code, wrapped in the static analyzer framework. The documentation was updated to describe what the checks can and cannot do, and testing was completed on a broad set of open-source projects.

@llvmbot llvmbot added clang Clang issues not falling into any other category clang-tools-extra clang-tidy clang:static analyzer labels Sep 30, 2024
@llvmbot
Copy link
Member

llvmbot commented Sep 30, 2024

@llvm/pr-subscribers-clang-static-analyzer-1

@llvm/pr-subscribers-clang

Author: None (vabridgers)

Changes

RFC: This PR is not ready to merge yet, preliminary comments are welcome.
Here are some questions to consider...

Q) Is the name and file location ok in the directory structure?
I initially chose bugprone and NondeterministicPointerUsage as a
starting point for review.
Q) I wrote an explanation for the check based on internal discussion
since I felt like this was missing. Please check and comment.
Q) There are more ways to iterate over an unordered container
than captured here. Do those need to be detected as well?
Q) I squashed PointerIteration and PointerSorting. I think that works
in this case, but interested to hear your comments about that.
Q) I ended up expanding upon the C++ simulation header, but had thoughts
about breaking that up into multiple smaller files. Maybe there's an
opportunity to refactor some C++ simulation files across multiple
checkers as a seperate PR first, or maybe even as part of this one?

This change moves the alpha.nondeterministic.PointerSorting and alpha.nondeterministic.PointerIteration static analyzer checkers to a single clang-tidy check. Those checkers were implemented as clang-tidy checks wrapped in the static analyzer framework. The documentation was updated to describe what the checks can and cannot do, and testing was completed on a broad set of open source projects.


Patch is 75.02 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/110471.diff

13 Files Affected:

  • (modified) clang-tools-extra/clang-tidy/bugprone/BugproneTidyModule.cpp (+3)
  • (modified) clang-tools-extra/clang-tidy/bugprone/CMakeLists.txt (+1)
  • (added) clang-tools-extra/clang-tidy/bugprone/NondeterministicPointerUsageCheck.cpp (+67)
  • (added) clang-tools-extra/clang-tidy/bugprone/NondeterministicPointerUsageCheck.h (+36)
  • (added) clang-tools-extra/docs/clang-tidy/checks/bugprone/nondeterministic-pointer-usage.rst (+31)
  • (added) clang-tools-extra/test/clang-tidy/checkers/bugprone/Inputs/system-header-simulator-cxx.h (+1450)
  • (added) clang-tools-extra/test/clang-tidy/checkers/bugprone/nondeterministic-pointer-usage.cpp (+69)
  • (modified) clang/docs/analyzer/checkers.rst (-31)
  • (modified) clang/include/clang/StaticAnalyzer/Checkers/Checkers.td (-18)
  • (modified) clang/lib/StaticAnalyzer/Checkers/CMakeLists.txt (-2)
  • (removed) clang/lib/StaticAnalyzer/Checkers/PointerIterationChecker.cpp (-101)
  • (removed) clang/lib/StaticAnalyzer/Checkers/PointerSortingChecker.cpp (-115)
  • (removed) clang/test/Analysis/ptr-sort.cpp (-36)
diff --git a/clang-tools-extra/clang-tidy/bugprone/BugproneTidyModule.cpp b/clang-tools-extra/clang-tidy/bugprone/BugproneTidyModule.cpp
index 689eb92a3d8d17..7c177196d76f58 100644
--- a/clang-tools-extra/clang-tidy/bugprone/BugproneTidyModule.cpp
+++ b/clang-tools-extra/clang-tidy/bugprone/BugproneTidyModule.cpp
@@ -48,6 +48,7 @@
 #include "MultipleStatementMacroCheck.h"
 #include "NoEscapeCheck.h"
 #include "NonZeroEnumToBoolConversionCheck.h"
+#include "NondeterministicPointerUsageCheck.h"
 #include "NotNullTerminatedResultCheck.h"
 #include "OptionalValueConversionCheck.h"
 #include "ParentVirtualCallCheck.h"
@@ -179,6 +180,8 @@ class BugproneModule : public ClangTidyModule {
     CheckFactories.registerCheck<cppcoreguidelines::NarrowingConversionsCheck>(
         "bugprone-narrowing-conversions");
     CheckFactories.registerCheck<NoEscapeCheck>("bugprone-no-escape");
+    CheckFactories.registerCheck<NondeterministicPointerUsageCheck>(
+        "bugprone-nondeterministic-pointer-usage");
     CheckFactories.registerCheck<NonZeroEnumToBoolConversionCheck>(
         "bugprone-non-zero-enum-to-bool-conversion");
     CheckFactories.registerCheck<NotNullTerminatedResultCheck>(
diff --git a/clang-tools-extra/clang-tidy/bugprone/CMakeLists.txt b/clang-tools-extra/clang-tidy/bugprone/CMakeLists.txt
index cb0d8ae98bac58..5628572b984226 100644
--- a/clang-tools-extra/clang-tidy/bugprone/CMakeLists.txt
+++ b/clang-tools-extra/clang-tidy/bugprone/CMakeLists.txt
@@ -44,6 +44,7 @@ add_clang_library(clangTidyBugproneModule
   MultipleNewInOneExpressionCheck.cpp
   MultipleStatementMacroCheck.cpp
   NoEscapeCheck.cpp
+  NondeterministicPointerUsageCheck.cpp
   NonZeroEnumToBoolConversionCheck.cpp
   NotNullTerminatedResultCheck.cpp
   OptionalValueConversionCheck.cpp
diff --git a/clang-tools-extra/clang-tidy/bugprone/NondeterministicPointerUsageCheck.cpp b/clang-tools-extra/clang-tidy/bugprone/NondeterministicPointerUsageCheck.cpp
new file mode 100644
index 00000000000000..ddc314af739d97
--- /dev/null
+++ b/clang-tools-extra/clang-tidy/bugprone/NondeterministicPointerUsageCheck.cpp
@@ -0,0 +1,67 @@
+//===--- NondetermnisticPointerUsageCheck.cpp - clang-tidy ------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "NondeterministicPointerUsageCheck.h"
+#include "clang/AST/ASTContext.h"
+#include "clang/Lex/Lexer.h"
+
+using namespace clang::ast_matchers;
+
+namespace clang::tidy::bugprone {
+
+void NondeterministicPointerUsageCheck::registerMatchers(MatchFinder *Finder) {
+
+  auto LoopVariable = varDecl(hasType(hasCanonicalType(pointerType())));
+
+  auto RangeInit = declRefExpr(to(varDecl(hasType(recordDecl(
+      anyOf(hasName("std::unordered_set"), hasName("std::unordered_map"),
+            hasName("std::unordered_multiset"),
+            hasName("std::unordered_multimap")))))));
+
+  Finder->addMatcher(
+      stmt(cxxForRangeStmt(hasRangeInit(RangeInit.bind("rangeinit")),
+                           hasLoopVariable(LoopVariable.bind("loopVar"))))
+          .bind("cxxForRangeStmt"),
+      this);
+
+  auto SortFuncM = anyOf(callee(functionDecl(hasName("std::is_sorted"))),
+                         callee(functionDecl(hasName("std::nth_element"))),
+                         callee(functionDecl(hasName("std::sort"))),
+                         callee(functionDecl(hasName("std::partial_sort"))),
+                         callee(functionDecl(hasName("std::partition"))),
+                         callee(functionDecl(hasName("std::stable_partition"))),
+                         callee(functionDecl(hasName("std::stable_sort"))));
+
+  auto IteratesPointerEltsM = hasArgument(
+      0,
+      cxxMemberCallExpr(on(hasType(cxxRecordDecl(has(fieldDecl(hasType(
+          hasCanonicalType(pointsTo(hasCanonicalType(pointerType())))))))))));
+
+  Finder->addMatcher(stmt(callExpr(allOf(SortFuncM, IteratesPointerEltsM)))
+                         .bind("sortsemantic"),
+                     this);
+}
+
+void NondeterministicPointerUsageCheck::check(
+    const MatchFinder::MatchResult &Result) {
+  const auto *ForRangePointers =
+      Result.Nodes.getNodeAs<Stmt>("cxxForRangeStmt");
+  const auto *SortPointers = Result.Nodes.getNodeAs<Stmt>("sortsemantic");
+
+  if ((ForRangePointers) && !(ForRangePointers->getBeginLoc().isMacroID())) {
+    const auto *Node = dyn_cast<CXXForRangeStmt>(ForRangePointers);
+    diag(Node->getRParenLoc(), "Iteration of pointers is nondeterministic");
+  }
+
+  if ((SortPointers) && !(SortPointers->getBeginLoc().isMacroID())) {
+    const auto *Node = dyn_cast<Stmt>(SortPointers);
+    diag(Node->getBeginLoc(), "Sorting pointers is nondeterministic");
+  }
+}
+
+} // namespace clang::tidy::bugprone
diff --git a/clang-tools-extra/clang-tidy/bugprone/NondeterministicPointerUsageCheck.h b/clang-tools-extra/clang-tidy/bugprone/NondeterministicPointerUsageCheck.h
new file mode 100644
index 00000000000000..556a4fce4dffe2
--- /dev/null
+++ b/clang-tools-extra/clang-tidy/bugprone/NondeterministicPointerUsageCheck.h
@@ -0,0 +1,36 @@
+//===--- NondeterministicPointerUsageCheck.h - clang-tidy ---*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_CLANG_TOOLS_EXTRA_CLANG_TIDY_BUGPRONE__NONDETERMINISTICPOINTERUSAGECHECK_H
+#define LLVM_CLANG_TOOLS_EXTRA_CLANG_TIDY_BUGPRONE__NONDETERMINISTICPOINTERUSAGECHECK_H
+
+#include "../ClangTidyCheck.h"
+
+namespace clang::tidy::bugprone {
+
+/// Finds temporaries that look like RAII objects.
+///
+/// For the user-facing documentation see:
+/// http://clang.llvm.org/extra/clang-tidy/checks/bugprone/nondeterministic-pointer-usage.html
+class NondeterministicPointerUsageCheck : public ClangTidyCheck {
+public:
+  NondeterministicPointerUsageCheck(StringRef Name, ClangTidyContext *Context)
+      : ClangTidyCheck(Name, Context) {}
+  bool isLanguageVersionSupported(const LangOptions &LangOpts) const override {
+    return LangOpts.CPlusPlus;
+  }
+  void registerMatchers(ast_matchers::MatchFinder *Finder) override;
+  void check(const ast_matchers::MatchFinder::MatchResult &Result) override;
+  std::optional<TraversalKind> getCheckTraversalKind() const override {
+    return TK_IgnoreUnlessSpelledInSource;
+  }
+};
+
+} // namespace clang::tidy::bugprone
+
+#endif // LLVM_CLANG_TOOLS_EXTRA_CLANG_TIDY_BUGPRONE__NONDETERMINISTICPOINTERUSAGECHECK_H
diff --git a/clang-tools-extra/docs/clang-tidy/checks/bugprone/nondeterministic-pointer-usage.rst b/clang-tools-extra/docs/clang-tidy/checks/bugprone/nondeterministic-pointer-usage.rst
new file mode 100644
index 00000000000000..8625487c6d694a
--- /dev/null
+++ b/clang-tools-extra/docs/clang-tidy/checks/bugprone/nondeterministic-pointer-usage.rst
@@ -0,0 +1,31 @@
+.. title:: clang-tidy - bugprone-nondeterministic-pointer-usage
+
+nondeterministic-pointer-usage
+==============================
+
+Finds nondeterministic usages of pointers in unordered containers.
+
+One canonical example is iteration across a container of pointers.
+
+.. code-block:: c++
+
+  {
+    for (auto i : UnorderedPtrSet)
+      f(i);
+  }
+
+Another such example is sorting a container of pointers.
+
+.. code-block:: c++
+
+  {
+    std::sort(VectorOfPtr.begin(), VectorOfPtr.end());
+  }
+
+Iteration of a containers of pointers may present the order of different
+pointers differently across different runs of a program. In some cases this
+may be acceptable behavior, in others this may be unexpected behavior. This
+check is advisory for this reason.
+
+This check only detects range-based for loops over unordered sets. Other
+similar usages will not be found and are false negatives.
diff --git a/clang-tools-extra/test/clang-tidy/checkers/bugprone/Inputs/system-header-simulator-cxx.h b/clang-tools-extra/test/clang-tidy/checkers/bugprone/Inputs/system-header-simulator-cxx.h
new file mode 100644
index 00000000000000..b279a8f20d8ddd
--- /dev/null
+++ b/clang-tools-extra/test/clang-tidy/checkers/bugprone/Inputs/system-header-simulator-cxx.h
@@ -0,0 +1,1450 @@
+// Like the compiler, the static analyzer treats some functions differently if
+// they come from a system header -- for example, it is assumed that system
+// functions do not arbitrarily free() their parameters, and that some bugs
+// found in system headers cannot be fixed by the user and should be
+// suppressed.
+#pragma clang system_header
+
+typedef unsigned char uint8_t;
+
+typedef __typeof__(sizeof(int)) size_t;
+typedef __typeof__((char*)0-(char*)0) ptrdiff_t;
+void *memmove(void *s1, const void *s2, size_t n);
+
+namespace std {
+  typedef size_t size_type;
+#if __cplusplus >= 201103L
+  using nullptr_t = decltype(nullptr);
+#endif
+}
+
+namespace std {
+  struct input_iterator_tag { };
+  struct output_iterator_tag { };
+  struct forward_iterator_tag : public input_iterator_tag { };
+  struct bidirectional_iterator_tag : public forward_iterator_tag { };
+  struct random_access_iterator_tag : public bidirectional_iterator_tag { };
+
+  template <typename Iterator> struct iterator_traits {
+    typedef typename Iterator::difference_type difference_type;
+    typedef typename Iterator::value_type value_type;
+    typedef typename Iterator::pointer pointer;
+    typedef typename Iterator::reference reference;
+    typedef typename Iterator::iterator_category iterator_category;
+  };
+}
+
+template <typename T, typename Ptr, typename Ref> struct __vector_iterator {
+  typedef __vector_iterator<T, T *, T &> iterator;
+  typedef __vector_iterator<T, const T *, const T &> const_iterator;
+
+  typedef ptrdiff_t difference_type;
+  typedef T value_type;
+  typedef Ptr pointer;
+  typedef Ref reference;
+  typedef std::random_access_iterator_tag iterator_category;
+
+  __vector_iterator(const Ptr p = 0) : ptr(p) {}
+  __vector_iterator(const iterator &rhs): ptr(rhs.base()) {}
+  __vector_iterator<T, Ptr, Ref>& operator++() { ++ ptr; return *this; }
+  __vector_iterator<T, Ptr, Ref> operator++(int) {
+    auto tmp = *this;
+    ++ ptr;
+    return tmp;
+  }
+  __vector_iterator<T, Ptr, Ref> operator--() { -- ptr; return *this; }
+  __vector_iterator<T, Ptr, Ref> operator--(int) {
+    auto tmp = *this; -- ptr;
+    return tmp;
+  }
+  __vector_iterator<T, Ptr, Ref> operator+(difference_type n) {
+    return ptr + n;
+  }
+  friend __vector_iterator<T, Ptr, Ref> operator+(
+      difference_type n,
+      const __vector_iterator<T, Ptr, Ref> &iter) {
+    return n + iter.ptr;
+  }
+  __vector_iterator<T, Ptr, Ref> operator-(difference_type n) {
+    return ptr - n;
+  }
+  __vector_iterator<T, Ptr, Ref> operator+=(difference_type n) {
+    return ptr += n;
+  }
+  __vector_iterator<T, Ptr, Ref> operator-=(difference_type n) {
+    return ptr -= n;
+  }
+
+  template<typename U, typename Ptr2, typename Ref2>
+  difference_type operator-(const __vector_iterator<U, Ptr2, Ref2> &rhs);
+
+  Ref operator*() const { return *ptr; }
+  Ptr operator->() const { return ptr; }
+
+  Ref operator[](difference_type n) {
+    return *(ptr+n);
+  }
+
+  bool operator==(const iterator &rhs) const { return ptr == rhs.ptr; }
+  bool operator==(const const_iterator &rhs) const { return ptr == rhs.ptr; }
+
+  bool operator!=(const iterator &rhs) const { return ptr != rhs.ptr; }
+  bool operator!=(const const_iterator &rhs) const { return ptr != rhs.ptr; }
+
+  const Ptr& base() const { return ptr; }
+
+private:
+  Ptr ptr;
+};
+
+template <typename T, typename Ptr, typename Ref> struct __deque_iterator {
+  typedef __deque_iterator<T, T *, T &> iterator;
+  typedef __deque_iterator<T, const T *, const T &> const_iterator;
+
+  typedef ptrdiff_t difference_type;
+  typedef T value_type;
+  typedef Ptr pointer;
+  typedef Ref reference;
+  typedef std::random_access_iterator_tag iterator_category;
+
+  __deque_iterator(const Ptr p = 0) : ptr(p) {}
+  __deque_iterator(const iterator &rhs): ptr(rhs.base()) {}
+  __deque_iterator<T, Ptr, Ref>& operator++() { ++ ptr; return *this; }
+  __deque_iterator<T, Ptr, Ref> operator++(int) {
+    auto tmp = *this;
+    ++ ptr;
+    return tmp;
+  }
+  __deque_iterator<T, Ptr, Ref> operator--() { -- ptr; return *this; }
+  __deque_iterator<T, Ptr, Ref> operator--(int) {
+    auto tmp = *this; -- ptr;
+    return tmp;
+  }
+  __deque_iterator<T, Ptr, Ref> operator+(difference_type n) {
+    return ptr + n;
+  }
+  friend __deque_iterator<T, Ptr, Ref> operator+(
+      difference_type n,
+      const __deque_iterator<T, Ptr, Ref> &iter) {
+    return n + iter.ptr;
+  }
+  __deque_iterator<T, Ptr, Ref> operator-(difference_type n) {
+    return ptr - n;
+  }
+  __deque_iterator<T, Ptr, Ref> operator+=(difference_type n) {
+    return ptr += n;
+  }
+  __deque_iterator<T, Ptr, Ref> operator-=(difference_type n) {
+    return ptr -= n;
+  }
+
+  Ref operator*() const { return *ptr; }
+  Ptr operator->() const { return ptr; }
+
+  Ref operator[](difference_type n) {
+    return *(ptr+n);
+  }
+
+  bool operator==(const iterator &rhs) const { return ptr == rhs.ptr; }
+  bool operator==(const const_iterator &rhs) const { return ptr == rhs.ptr; }
+
+  bool operator!=(const iterator &rhs) const { return ptr != rhs.ptr; }
+  bool operator!=(const const_iterator &rhs) const { return ptr != rhs.ptr; }
+
+  const Ptr& base() const { return ptr; }
+
+private:
+  Ptr ptr;
+};
+
+template <typename T, typename Ptr, typename Ref> struct __list_iterator {
+  typedef __list_iterator<T, __typeof__(T::data) *, __typeof__(T::data) &> iterator;
+  typedef __list_iterator<T, const __typeof__(T::data) *, const __typeof__(T::data) &> const_iterator;
+
+  typedef ptrdiff_t difference_type;
+  typedef T value_type;
+  typedef Ptr pointer;
+  typedef Ref reference;
+  typedef std::bidirectional_iterator_tag iterator_category;
+
+  __list_iterator(T* it = 0) : item(it) {}
+  __list_iterator(const iterator &rhs): item(rhs.item) {}
+  __list_iterator<T, Ptr, Ref>& operator++() { item = item->next; return *this; }
+  __list_iterator<T, Ptr, Ref> operator++(int) {
+    auto tmp = *this;
+    item = item->next;
+    return tmp;
+  }
+  __list_iterator<T, Ptr, Ref> operator--() { item = item->prev; return *this; }
+  __list_iterator<T, Ptr, Ref> operator--(int) {
+    auto tmp = *this;
+    item = item->prev;
+    return tmp;
+  }
+
+  Ref operator*() const { return item->data; }
+  Ptr operator->() const { return &item->data; }
+
+  bool operator==(const iterator &rhs) const { return item == rhs->item; }
+  bool operator==(const const_iterator &rhs) const { return item == rhs->item; }
+
+  bool operator!=(const iterator &rhs) const { return item != rhs->item; }
+  bool operator!=(const const_iterator &rhs) const { return item != rhs->item; }
+
+  const T* &base() const { return item; }
+
+  template <typename UT, typename UPtr, typename URef>
+  friend struct __list_iterator;
+
+private:
+  T* item;
+};
+
+template <typename T, typename Ptr, typename Ref> struct __fwdl_iterator {
+  typedef __fwdl_iterator<T, __typeof__(T::data) *, __typeof__(T::data) &> iterator;
+  typedef __fwdl_iterator<T, const __typeof__(T::data) *, const __typeof__(T::data) &> const_iterator;
+
+  typedef ptrdiff_t difference_type;
+  typedef T value_type;
+  typedef Ptr pointer;
+  typedef Ref reference;
+  typedef std::forward_iterator_tag iterator_category;
+
+  __fwdl_iterator(T* it = 0) : item(it) {}
+  __fwdl_iterator(const iterator &rhs): item(rhs.item) {}
+  __fwdl_iterator<T, Ptr, Ref>& operator++() { item = item->next; return *this; }
+  __fwdl_iterator<T, Ptr, Ref> operator++(int) {
+    auto tmp = *this;
+    item = item->next;
+    return tmp;
+  }
+  Ref operator*() const { return item->data; }
+  Ptr operator->() const { return &item->data; }
+
+  bool operator==(const iterator &rhs) const { return item == rhs->item; }
+  bool operator==(const const_iterator &rhs) const { return item == rhs->item; }
+
+  bool operator!=(const iterator &rhs) const { return item != rhs->item; }
+  bool operator!=(const const_iterator &rhs) const { return item != rhs->item; }
+
+  const T* &base() const { return item; }
+
+  template <typename UT, typename UPtr, typename URef>
+  friend struct __fwdl_iterator;
+
+private:
+  T* item;
+};
+
+namespace std {
+  template <class T1, class T2>
+  struct pair {
+    T1 first;
+    T2 second;
+
+    pair() : first(), second() {}
+    pair(const T1 &a, const T2 &b) : first(a), second(b) {}
+
+    template<class U1, class U2>
+    pair(const pair<U1, U2> &other) : first(other.first),
+                                      second(other.second) {}
+  };
+
+  template<class T2, class T1>
+  T2& get(pair<T1, T2>& p) ;
+  template<class T1, class T2>
+  T1& get(const pair<T1, T2>& p) ;
+
+  typedef __typeof__(sizeof(int)) size_t;
+
+  template <class T> class initializer_list;
+
+  template< class T > struct remove_reference      {typedef T type;};
+  template< class T > struct remove_reference<T&>  {typedef T type;};
+  template< class T > struct remove_reference<T&&> {typedef T type;};
+
+  template<typename T> typename remove_reference<T>::type&& move(T&& a);
+  template <typename T> T *__addressof(T &x);
+  template <typename T> T *addressof(T &x);
+  template <typename T> const T& as_const(T& x);
+  template <typename T> T&& forward(T&& x);
+  // FIXME: Declare forward_like
+  // FIXME: Declare move_if_noexcept
+
+  template< class T >
+  using remove_reference_t = typename remove_reference<T>::type;
+
+  template <class T>
+  void swap(T &a, T &b) {
+    T c(std::move(a));
+    a = std::move(b);
+    b = std::move(c);
+  }
+
+  template<typename T>
+  class vector {
+    T *_start;
+    T *_finish;
+    T *_end_of_storage;
+
+  public:
+    typedef T value_type;
+    typedef size_t size_type;
+    typedef __vector_iterator<T, T *, T &> iterator;
+    typedef __vector_iterator<T, const T *, const T &> const_iterator;
+
+    vector() : _start(0), _finish(0), _end_of_storage(0) {}
+    template <typename InputIterator>
+    vector(InputIterator first, InputIterator last);
+    vector(const vector &other);
+    vector(vector &&other);
+    ~vector();
+
+    size_t size() const {
+      return size_t(_finish - _start);
+    }
+
+    vector& operator=(const vector &other);
+    vector& operator=(vector &&other);
+    vector& operator=(std::initializer_list<T> ilist);
+
+    void assign(size_type count, const T &value);
+    template <typename InputIterator >
+    void assign(InputIterator first, InputIterator last);
+    void assign(std::initializer_list<T> ilist);
+
+    void clear();
+
+    void push_back(const T &value);
+    void push_back(T &&value);
+    template<class... Args>
+    void emplace_back(Args&&... args);
+    void pop_back();
+
+    iterator insert(const_iterator position, const value_type &val);
+    iterator insert(const_iterator position, size_type n,
+                    const value_type &val);
+    template <typename InputIterator>
+    iterator insert(const_iterator position, InputIterator first,
+                    InputIterator last);
+    iterator insert(const_iterator position, value_type &&val);
+    iterator insert(const_iterator position, initializer_list<value_type> il);
+
+    template <class... Args>
+    iterator emplace(const_iterator position, Args&&... args);
+
+    iterator erase(const_iterator position);
+    iterator erase(const_iterator first, const_iterator last);
+
+    T &operator[](size_t n) {
+      return _start[n];
+    }
+
+    const T &operator[](size_t n) const {
+      return _start[n];
+    }
+
+    iterator begin() { return iterator(_start); }
+    const_iterator begin() const { return const_iterator(_start); }
+    const_iterator cbegin() const { return const_iterator(_start); }
+    iterator end() { return iterator(_finish); }
+    const_iterator end() const { return const_iterator(_finish); }
+    const_iterator cend() const { return const_iterator(_finish); }
+    T& front() { return *begin(); }
+    const T& front() const { return *begin(); }
+    T& back() { return *(end() - 1); }
+    const T& back() const { re...
[truncated]

@llvmbot
Copy link
Member

llvmbot commented Sep 30, 2024

@llvm/pr-subscribers-clang-tidy

Author: None (vabridgers)

Changes

RFC: This PR is not ready to merge yet, preliminary comments are welcome.
Here are some questions to consider...

Q) Is the name and file location ok in the directory structure?
I initially chose bugprone and NondeterministicPointerUsage as a
starting point for review.
Q) I wrote an explanation for the check based on internal discussion
since I felt like this was missing. Please check and comment.
Q) There are more ways to iterate over an unordered container
than captured here. Do those need to be detected as well?
Q) I squashed PointerIteration and PointerSorting. I think that works
in this case, but interested to hear your comments about that.
Q) I ended up expanding upon the C++ simulation header, but had thoughts
about breaking that up into multiple smaller files. Maybe there's an
opportunity to refactor some C++ simulation files across multiple
checkers as a seperate PR first, or maybe even as part of this one?

This change moves the alpha.nondeterministic.PointerSorting and alpha.nondeterministic.PointerIteration static analyzer checkers to a single clang-tidy check. Those checkers were implemented as clang-tidy checks wrapped in the static analyzer framework. The documentation was updated to describe what the checks can and cannot do, and testing was completed on a broad set of open source projects.


Patch is 75.02 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/110471.diff

13 Files Affected:

  • (modified) clang-tools-extra/clang-tidy/bugprone/BugproneTidyModule.cpp (+3)
  • (modified) clang-tools-extra/clang-tidy/bugprone/CMakeLists.txt (+1)
  • (added) clang-tools-extra/clang-tidy/bugprone/NondeterministicPointerUsageCheck.cpp (+67)
  • (added) clang-tools-extra/clang-tidy/bugprone/NondeterministicPointerUsageCheck.h (+36)
  • (added) clang-tools-extra/docs/clang-tidy/checks/bugprone/nondeterministic-pointer-usage.rst (+31)
  • (added) clang-tools-extra/test/clang-tidy/checkers/bugprone/Inputs/system-header-simulator-cxx.h (+1450)
  • (added) clang-tools-extra/test/clang-tidy/checkers/bugprone/nondeterministic-pointer-usage.cpp (+69)
  • (modified) clang/docs/analyzer/checkers.rst (-31)
  • (modified) clang/include/clang/StaticAnalyzer/Checkers/Checkers.td (-18)
  • (modified) clang/lib/StaticAnalyzer/Checkers/CMakeLists.txt (-2)
  • (removed) clang/lib/StaticAnalyzer/Checkers/PointerIterationChecker.cpp (-101)
  • (removed) clang/lib/StaticAnalyzer/Checkers/PointerSortingChecker.cpp (-115)
  • (removed) clang/test/Analysis/ptr-sort.cpp (-36)
diff --git a/clang-tools-extra/clang-tidy/bugprone/BugproneTidyModule.cpp b/clang-tools-extra/clang-tidy/bugprone/BugproneTidyModule.cpp
index 689eb92a3d8d17..7c177196d76f58 100644
--- a/clang-tools-extra/clang-tidy/bugprone/BugproneTidyModule.cpp
+++ b/clang-tools-extra/clang-tidy/bugprone/BugproneTidyModule.cpp
@@ -48,6 +48,7 @@
 #include "MultipleStatementMacroCheck.h"
 #include "NoEscapeCheck.h"
 #include "NonZeroEnumToBoolConversionCheck.h"
+#include "NondeterministicPointerUsageCheck.h"
 #include "NotNullTerminatedResultCheck.h"
 #include "OptionalValueConversionCheck.h"
 #include "ParentVirtualCallCheck.h"
@@ -179,6 +180,8 @@ class BugproneModule : public ClangTidyModule {
     CheckFactories.registerCheck<cppcoreguidelines::NarrowingConversionsCheck>(
         "bugprone-narrowing-conversions");
     CheckFactories.registerCheck<NoEscapeCheck>("bugprone-no-escape");
+    CheckFactories.registerCheck<NondeterministicPointerUsageCheck>(
+        "bugprone-nondeterministic-pointer-usage");
     CheckFactories.registerCheck<NonZeroEnumToBoolConversionCheck>(
         "bugprone-non-zero-enum-to-bool-conversion");
     CheckFactories.registerCheck<NotNullTerminatedResultCheck>(
diff --git a/clang-tools-extra/clang-tidy/bugprone/CMakeLists.txt b/clang-tools-extra/clang-tidy/bugprone/CMakeLists.txt
index cb0d8ae98bac58..5628572b984226 100644
--- a/clang-tools-extra/clang-tidy/bugprone/CMakeLists.txt
+++ b/clang-tools-extra/clang-tidy/bugprone/CMakeLists.txt
@@ -44,6 +44,7 @@ add_clang_library(clangTidyBugproneModule
   MultipleNewInOneExpressionCheck.cpp
   MultipleStatementMacroCheck.cpp
   NoEscapeCheck.cpp
+  NondeterministicPointerUsageCheck.cpp
   NonZeroEnumToBoolConversionCheck.cpp
   NotNullTerminatedResultCheck.cpp
   OptionalValueConversionCheck.cpp
diff --git a/clang-tools-extra/clang-tidy/bugprone/NondeterministicPointerUsageCheck.cpp b/clang-tools-extra/clang-tidy/bugprone/NondeterministicPointerUsageCheck.cpp
new file mode 100644
index 00000000000000..ddc314af739d97
--- /dev/null
+++ b/clang-tools-extra/clang-tidy/bugprone/NondeterministicPointerUsageCheck.cpp
@@ -0,0 +1,67 @@
+//===--- NondetermnisticPointerUsageCheck.cpp - clang-tidy ------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "NondeterministicPointerUsageCheck.h"
+#include "clang/AST/ASTContext.h"
+#include "clang/Lex/Lexer.h"
+
+using namespace clang::ast_matchers;
+
+namespace clang::tidy::bugprone {
+
+void NondeterministicPointerUsageCheck::registerMatchers(MatchFinder *Finder) {
+
+  auto LoopVariable = varDecl(hasType(hasCanonicalType(pointerType())));
+
+  auto RangeInit = declRefExpr(to(varDecl(hasType(recordDecl(
+      anyOf(hasName("std::unordered_set"), hasName("std::unordered_map"),
+            hasName("std::unordered_multiset"),
+            hasName("std::unordered_multimap")))))));
+
+  Finder->addMatcher(
+      stmt(cxxForRangeStmt(hasRangeInit(RangeInit.bind("rangeinit")),
+                           hasLoopVariable(LoopVariable.bind("loopVar"))))
+          .bind("cxxForRangeStmt"),
+      this);
+
+  auto SortFuncM = anyOf(callee(functionDecl(hasName("std::is_sorted"))),
+                         callee(functionDecl(hasName("std::nth_element"))),
+                         callee(functionDecl(hasName("std::sort"))),
+                         callee(functionDecl(hasName("std::partial_sort"))),
+                         callee(functionDecl(hasName("std::partition"))),
+                         callee(functionDecl(hasName("std::stable_partition"))),
+                         callee(functionDecl(hasName("std::stable_sort"))));
+
+  auto IteratesPointerEltsM = hasArgument(
+      0,
+      cxxMemberCallExpr(on(hasType(cxxRecordDecl(has(fieldDecl(hasType(
+          hasCanonicalType(pointsTo(hasCanonicalType(pointerType())))))))))));
+
+  Finder->addMatcher(stmt(callExpr(allOf(SortFuncM, IteratesPointerEltsM)))
+                         .bind("sortsemantic"),
+                     this);
+}
+
+void NondeterministicPointerUsageCheck::check(
+    const MatchFinder::MatchResult &Result) {
+  const auto *ForRangePointers =
+      Result.Nodes.getNodeAs<Stmt>("cxxForRangeStmt");
+  const auto *SortPointers = Result.Nodes.getNodeAs<Stmt>("sortsemantic");
+
+  if ((ForRangePointers) && !(ForRangePointers->getBeginLoc().isMacroID())) {
+    const auto *Node = dyn_cast<CXXForRangeStmt>(ForRangePointers);
+    diag(Node->getRParenLoc(), "Iteration of pointers is nondeterministic");
+  }
+
+  if ((SortPointers) && !(SortPointers->getBeginLoc().isMacroID())) {
+    const auto *Node = dyn_cast<Stmt>(SortPointers);
+    diag(Node->getBeginLoc(), "Sorting pointers is nondeterministic");
+  }
+}
+
+} // namespace clang::tidy::bugprone
diff --git a/clang-tools-extra/clang-tidy/bugprone/NondeterministicPointerUsageCheck.h b/clang-tools-extra/clang-tidy/bugprone/NondeterministicPointerUsageCheck.h
new file mode 100644
index 00000000000000..556a4fce4dffe2
--- /dev/null
+++ b/clang-tools-extra/clang-tidy/bugprone/NondeterministicPointerUsageCheck.h
@@ -0,0 +1,36 @@
+//===--- NondeterministicPointerUsageCheck.h - clang-tidy ---*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_CLANG_TOOLS_EXTRA_CLANG_TIDY_BUGPRONE__NONDETERMINISTICPOINTERUSAGECHECK_H
+#define LLVM_CLANG_TOOLS_EXTRA_CLANG_TIDY_BUGPRONE__NONDETERMINISTICPOINTERUSAGECHECK_H
+
+#include "../ClangTidyCheck.h"
+
+namespace clang::tidy::bugprone {
+
+/// Finds temporaries that look like RAII objects.
+///
+/// For the user-facing documentation see:
+/// http://clang.llvm.org/extra/clang-tidy/checks/bugprone/nondeterministic-pointer-usage.html
+class NondeterministicPointerUsageCheck : public ClangTidyCheck {
+public:
+  NondeterministicPointerUsageCheck(StringRef Name, ClangTidyContext *Context)
+      : ClangTidyCheck(Name, Context) {}
+  bool isLanguageVersionSupported(const LangOptions &LangOpts) const override {
+    return LangOpts.CPlusPlus;
+  }
+  void registerMatchers(ast_matchers::MatchFinder *Finder) override;
+  void check(const ast_matchers::MatchFinder::MatchResult &Result) override;
+  std::optional<TraversalKind> getCheckTraversalKind() const override {
+    return TK_IgnoreUnlessSpelledInSource;
+  }
+};
+
+} // namespace clang::tidy::bugprone
+
+#endif // LLVM_CLANG_TOOLS_EXTRA_CLANG_TIDY_BUGPRONE__NONDETERMINISTICPOINTERUSAGECHECK_H
diff --git a/clang-tools-extra/docs/clang-tidy/checks/bugprone/nondeterministic-pointer-usage.rst b/clang-tools-extra/docs/clang-tidy/checks/bugprone/nondeterministic-pointer-usage.rst
new file mode 100644
index 00000000000000..8625487c6d694a
--- /dev/null
+++ b/clang-tools-extra/docs/clang-tidy/checks/bugprone/nondeterministic-pointer-usage.rst
@@ -0,0 +1,31 @@
+.. title:: clang-tidy - bugprone-nondeterministic-pointer-usage
+
+nondeterministic-pointer-usage
+==============================
+
+Finds nondeterministic usages of pointers in unordered containers.
+
+One canonical example is iteration across a container of pointers.
+
+.. code-block:: c++
+
+  {
+    for (auto i : UnorderedPtrSet)
+      f(i);
+  }
+
+Another such example is sorting a container of pointers.
+
+.. code-block:: c++
+
+  {
+    std::sort(VectorOfPtr.begin(), VectorOfPtr.end());
+  }
+
+Iteration of a containers of pointers may present the order of different
+pointers differently across different runs of a program. In some cases this
+may be acceptable behavior, in others this may be unexpected behavior. This
+check is advisory for this reason.
+
+This check only detects range-based for loops over unordered sets. Other
+similar usages will not be found and are false negatives.
diff --git a/clang-tools-extra/test/clang-tidy/checkers/bugprone/Inputs/system-header-simulator-cxx.h b/clang-tools-extra/test/clang-tidy/checkers/bugprone/Inputs/system-header-simulator-cxx.h
new file mode 100644
index 00000000000000..b279a8f20d8ddd
--- /dev/null
+++ b/clang-tools-extra/test/clang-tidy/checkers/bugprone/Inputs/system-header-simulator-cxx.h
@@ -0,0 +1,1450 @@
+// Like the compiler, the static analyzer treats some functions differently if
+// they come from a system header -- for example, it is assumed that system
+// functions do not arbitrarily free() their parameters, and that some bugs
+// found in system headers cannot be fixed by the user and should be
+// suppressed.
+#pragma clang system_header
+
+typedef unsigned char uint8_t;
+
+typedef __typeof__(sizeof(int)) size_t;
+typedef __typeof__((char*)0-(char*)0) ptrdiff_t;
+void *memmove(void *s1, const void *s2, size_t n);
+
+namespace std {
+  typedef size_t size_type;
+#if __cplusplus >= 201103L
+  using nullptr_t = decltype(nullptr);
+#endif
+}
+
+namespace std {
+  struct input_iterator_tag { };
+  struct output_iterator_tag { };
+  struct forward_iterator_tag : public input_iterator_tag { };
+  struct bidirectional_iterator_tag : public forward_iterator_tag { };
+  struct random_access_iterator_tag : public bidirectional_iterator_tag { };
+
+  template <typename Iterator> struct iterator_traits {
+    typedef typename Iterator::difference_type difference_type;
+    typedef typename Iterator::value_type value_type;
+    typedef typename Iterator::pointer pointer;
+    typedef typename Iterator::reference reference;
+    typedef typename Iterator::iterator_category iterator_category;
+  };
+}
+
+template <typename T, typename Ptr, typename Ref> struct __vector_iterator {
+  typedef __vector_iterator<T, T *, T &> iterator;
+  typedef __vector_iterator<T, const T *, const T &> const_iterator;
+
+  typedef ptrdiff_t difference_type;
+  typedef T value_type;
+  typedef Ptr pointer;
+  typedef Ref reference;
+  typedef std::random_access_iterator_tag iterator_category;
+
+  __vector_iterator(const Ptr p = 0) : ptr(p) {}
+  __vector_iterator(const iterator &rhs): ptr(rhs.base()) {}
+  __vector_iterator<T, Ptr, Ref>& operator++() { ++ ptr; return *this; }
+  __vector_iterator<T, Ptr, Ref> operator++(int) {
+    auto tmp = *this;
+    ++ ptr;
+    return tmp;
+  }
+  __vector_iterator<T, Ptr, Ref> operator--() { -- ptr; return *this; }
+  __vector_iterator<T, Ptr, Ref> operator--(int) {
+    auto tmp = *this; -- ptr;
+    return tmp;
+  }
+  __vector_iterator<T, Ptr, Ref> operator+(difference_type n) {
+    return ptr + n;
+  }
+  friend __vector_iterator<T, Ptr, Ref> operator+(
+      difference_type n,
+      const __vector_iterator<T, Ptr, Ref> &iter) {
+    return n + iter.ptr;
+  }
+  __vector_iterator<T, Ptr, Ref> operator-(difference_type n) {
+    return ptr - n;
+  }
+  __vector_iterator<T, Ptr, Ref> operator+=(difference_type n) {
+    return ptr += n;
+  }
+  __vector_iterator<T, Ptr, Ref> operator-=(difference_type n) {
+    return ptr -= n;
+  }
+
+  template<typename U, typename Ptr2, typename Ref2>
+  difference_type operator-(const __vector_iterator<U, Ptr2, Ref2> &rhs);
+
+  Ref operator*() const { return *ptr; }
+  Ptr operator->() const { return ptr; }
+
+  Ref operator[](difference_type n) {
+    return *(ptr+n);
+  }
+
+  bool operator==(const iterator &rhs) const { return ptr == rhs.ptr; }
+  bool operator==(const const_iterator &rhs) const { return ptr == rhs.ptr; }
+
+  bool operator!=(const iterator &rhs) const { return ptr != rhs.ptr; }
+  bool operator!=(const const_iterator &rhs) const { return ptr != rhs.ptr; }
+
+  const Ptr& base() const { return ptr; }
+
+private:
+  Ptr ptr;
+};
+
+template <typename T, typename Ptr, typename Ref> struct __deque_iterator {
+  typedef __deque_iterator<T, T *, T &> iterator;
+  typedef __deque_iterator<T, const T *, const T &> const_iterator;
+
+  typedef ptrdiff_t difference_type;
+  typedef T value_type;
+  typedef Ptr pointer;
+  typedef Ref reference;
+  typedef std::random_access_iterator_tag iterator_category;
+
+  __deque_iterator(const Ptr p = 0) : ptr(p) {}
+  __deque_iterator(const iterator &rhs): ptr(rhs.base()) {}
+  __deque_iterator<T, Ptr, Ref>& operator++() { ++ ptr; return *this; }
+  __deque_iterator<T, Ptr, Ref> operator++(int) {
+    auto tmp = *this;
+    ++ ptr;
+    return tmp;
+  }
+  __deque_iterator<T, Ptr, Ref> operator--() { -- ptr; return *this; }
+  __deque_iterator<T, Ptr, Ref> operator--(int) {
+    auto tmp = *this; -- ptr;
+    return tmp;
+  }
+  __deque_iterator<T, Ptr, Ref> operator+(difference_type n) {
+    return ptr + n;
+  }
+  friend __deque_iterator<T, Ptr, Ref> operator+(
+      difference_type n,
+      const __deque_iterator<T, Ptr, Ref> &iter) {
+    return n + iter.ptr;
+  }
+  __deque_iterator<T, Ptr, Ref> operator-(difference_type n) {
+    return ptr - n;
+  }
+  __deque_iterator<T, Ptr, Ref> operator+=(difference_type n) {
+    return ptr += n;
+  }
+  __deque_iterator<T, Ptr, Ref> operator-=(difference_type n) {
+    return ptr -= n;
+  }
+
+  Ref operator*() const { return *ptr; }
+  Ptr operator->() const { return ptr; }
+
+  Ref operator[](difference_type n) {
+    return *(ptr+n);
+  }
+
+  bool operator==(const iterator &rhs) const { return ptr == rhs.ptr; }
+  bool operator==(const const_iterator &rhs) const { return ptr == rhs.ptr; }
+
+  bool operator!=(const iterator &rhs) const { return ptr != rhs.ptr; }
+  bool operator!=(const const_iterator &rhs) const { return ptr != rhs.ptr; }
+
+  const Ptr& base() const { return ptr; }
+
+private:
+  Ptr ptr;
+};
+
+template <typename T, typename Ptr, typename Ref> struct __list_iterator {
+  typedef __list_iterator<T, __typeof__(T::data) *, __typeof__(T::data) &> iterator;
+  typedef __list_iterator<T, const __typeof__(T::data) *, const __typeof__(T::data) &> const_iterator;
+
+  typedef ptrdiff_t difference_type;
+  typedef T value_type;
+  typedef Ptr pointer;
+  typedef Ref reference;
+  typedef std::bidirectional_iterator_tag iterator_category;
+
+  __list_iterator(T* it = 0) : item(it) {}
+  __list_iterator(const iterator &rhs): item(rhs.item) {}
+  __list_iterator<T, Ptr, Ref>& operator++() { item = item->next; return *this; }
+  __list_iterator<T, Ptr, Ref> operator++(int) {
+    auto tmp = *this;
+    item = item->next;
+    return tmp;
+  }
+  __list_iterator<T, Ptr, Ref> operator--() { item = item->prev; return *this; }
+  __list_iterator<T, Ptr, Ref> operator--(int) {
+    auto tmp = *this;
+    item = item->prev;
+    return tmp;
+  }
+
+  Ref operator*() const { return item->data; }
+  Ptr operator->() const { return &item->data; }
+
+  bool operator==(const iterator &rhs) const { return item == rhs->item; }
+  bool operator==(const const_iterator &rhs) const { return item == rhs->item; }
+
+  bool operator!=(const iterator &rhs) const { return item != rhs->item; }
+  bool operator!=(const const_iterator &rhs) const { return item != rhs->item; }
+
+  const T* &base() const { return item; }
+
+  template <typename UT, typename UPtr, typename URef>
+  friend struct __list_iterator;
+
+private:
+  T* item;
+};
+
+template <typename T, typename Ptr, typename Ref> struct __fwdl_iterator {
+  typedef __fwdl_iterator<T, __typeof__(T::data) *, __typeof__(T::data) &> iterator;
+  typedef __fwdl_iterator<T, const __typeof__(T::data) *, const __typeof__(T::data) &> const_iterator;
+
+  typedef ptrdiff_t difference_type;
+  typedef T value_type;
+  typedef Ptr pointer;
+  typedef Ref reference;
+  typedef std::forward_iterator_tag iterator_category;
+
+  __fwdl_iterator(T* it = 0) : item(it) {}
+  __fwdl_iterator(const iterator &rhs): item(rhs.item) {}
+  __fwdl_iterator<T, Ptr, Ref>& operator++() { item = item->next; return *this; }
+  __fwdl_iterator<T, Ptr, Ref> operator++(int) {
+    auto tmp = *this;
+    item = item->next;
+    return tmp;
+  }
+  Ref operator*() const { return item->data; }
+  Ptr operator->() const { return &item->data; }
+
+  bool operator==(const iterator &rhs) const { return item == rhs->item; }
+  bool operator==(const const_iterator &rhs) const { return item == rhs->item; }
+
+  bool operator!=(const iterator &rhs) const { return item != rhs->item; }
+  bool operator!=(const const_iterator &rhs) const { return item != rhs->item; }
+
+  const T* &base() const { return item; }
+
+  template <typename UT, typename UPtr, typename URef>
+  friend struct __fwdl_iterator;
+
+private:
+  T* item;
+};
+
+namespace std {
+  template <class T1, class T2>
+  struct pair {
+    T1 first;
+    T2 second;
+
+    pair() : first(), second() {}
+    pair(const T1 &a, const T2 &b) : first(a), second(b) {}
+
+    template<class U1, class U2>
+    pair(const pair<U1, U2> &other) : first(other.first),
+                                      second(other.second) {}
+  };
+
+  template<class T2, class T1>
+  T2& get(pair<T1, T2>& p) ;
+  template<class T1, class T2>
+  T1& get(const pair<T1, T2>& p) ;
+
+  typedef __typeof__(sizeof(int)) size_t;
+
+  template <class T> class initializer_list;
+
+  template< class T > struct remove_reference      {typedef T type;};
+  template< class T > struct remove_reference<T&>  {typedef T type;};
+  template< class T > struct remove_reference<T&&> {typedef T type;};
+
+  template<typename T> typename remove_reference<T>::type&& move(T&& a);
+  template <typename T> T *__addressof(T &x);
+  template <typename T> T *addressof(T &x);
+  template <typename T> const T& as_const(T& x);
+  template <typename T> T&& forward(T&& x);
+  // FIXME: Declare forward_like
+  // FIXME: Declare move_if_noexcept
+
+  template< class T >
+  using remove_reference_t = typename remove_reference<T>::type;
+
+  template <class T>
+  void swap(T &a, T &b) {
+    T c(std::move(a));
+    a = std::move(b);
+    b = std::move(c);
+  }
+
+  template<typename T>
+  class vector {
+    T *_start;
+    T *_finish;
+    T *_end_of_storage;
+
+  public:
+    typedef T value_type;
+    typedef size_t size_type;
+    typedef __vector_iterator<T, T *, T &> iterator;
+    typedef __vector_iterator<T, const T *, const T &> const_iterator;
+
+    vector() : _start(0), _finish(0), _end_of_storage(0) {}
+    template <typename InputIterator>
+    vector(InputIterator first, InputIterator last);
+    vector(const vector &other);
+    vector(vector &&other);
+    ~vector();
+
+    size_t size() const {
+      return size_t(_finish - _start);
+    }
+
+    vector& operator=(const vector &other);
+    vector& operator=(vector &&other);
+    vector& operator=(std::initializer_list<T> ilist);
+
+    void assign(size_type count, const T &value);
+    template <typename InputIterator >
+    void assign(InputIterator first, InputIterator last);
+    void assign(std::initializer_list<T> ilist);
+
+    void clear();
+
+    void push_back(const T &value);
+    void push_back(T &&value);
+    template<class... Args>
+    void emplace_back(Args&&... args);
+    void pop_back();
+
+    iterator insert(const_iterator position, const value_type &val);
+    iterator insert(const_iterator position, size_type n,
+                    const value_type &val);
+    template <typename InputIterator>
+    iterator insert(const_iterator position, InputIterator first,
+                    InputIterator last);
+    iterator insert(const_iterator position, value_type &&val);
+    iterator insert(const_iterator position, initializer_list<value_type> il);
+
+    template <class... Args>
+    iterator emplace(const_iterator position, Args&&... args);
+
+    iterator erase(const_iterator position);
+    iterator erase(const_iterator first, const_iterator last);
+
+    T &operator[](size_t n) {
+      return _start[n];
+    }
+
+    const T &operator[](size_t n) const {
+      return _start[n];
+    }
+
+    iterator begin() { return iterator(_start); }
+    const_iterator begin() const { return const_iterator(_start); }
+    const_iterator cbegin() const { return const_iterator(_start); }
+    iterator end() { return iterator(_finish); }
+    const_iterator end() const { return const_iterator(_finish); }
+    const_iterator cend() const { return const_iterator(_finish); }
+    T& front() { return *begin(); }
+    const T& front() const { return *begin(); }
+    T& back() { return *(end() - 1); }
+    const T& back() const { re...
[truncated]

@llvmbot
Copy link
Member

llvmbot commented Sep 30, 2024

@llvm/pr-subscribers-clang-tools-extra

Author: None (vabridgers)

Changes

RFC: This PR is not ready to merge yet, preliminary comments are welcome.
Here are some questions to consider...

Q) Is the name and file location ok in the directory structure?
I initially chose bugprone and NondeterministicPointerUsage as a
starting point for review.
Q) I wrote an explanation for the check based on internal discussion
since I felt like this was missing. Please check and comment.
Q) There are more ways to iterate over an unordered container
than captured here. Do those need to be detected as well?
Q) I squashed PointerIteration and PointerSorting. I think that works
in this case, but interested to hear your comments about that.
Q) I ended up expanding upon the C++ simulation header, but had thoughts
about breaking that up into multiple smaller files. Maybe there's an
opportunity to refactor some C++ simulation files across multiple
checkers as a seperate PR first, or maybe even as part of this one?

This change moves the alpha.nondeterministic.PointerSorting and alpha.nondeterministic.PointerIteration static analyzer checkers to a single clang-tidy check. Those checkers were implemented as clang-tidy checks wrapped in the static analyzer framework. The documentation was updated to describe what the checks can and cannot do, and testing was completed on a broad set of open source projects.


Patch is 75.02 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/110471.diff

13 Files Affected:

  • (modified) clang-tools-extra/clang-tidy/bugprone/BugproneTidyModule.cpp (+3)
  • (modified) clang-tools-extra/clang-tidy/bugprone/CMakeLists.txt (+1)
  • (added) clang-tools-extra/clang-tidy/bugprone/NondeterministicPointerUsageCheck.cpp (+67)
  • (added) clang-tools-extra/clang-tidy/bugprone/NondeterministicPointerUsageCheck.h (+36)
  • (added) clang-tools-extra/docs/clang-tidy/checks/bugprone/nondeterministic-pointer-usage.rst (+31)
  • (added) clang-tools-extra/test/clang-tidy/checkers/bugprone/Inputs/system-header-simulator-cxx.h (+1450)
  • (added) clang-tools-extra/test/clang-tidy/checkers/bugprone/nondeterministic-pointer-usage.cpp (+69)
  • (modified) clang/docs/analyzer/checkers.rst (-31)
  • (modified) clang/include/clang/StaticAnalyzer/Checkers/Checkers.td (-18)
  • (modified) clang/lib/StaticAnalyzer/Checkers/CMakeLists.txt (-2)
  • (removed) clang/lib/StaticAnalyzer/Checkers/PointerIterationChecker.cpp (-101)
  • (removed) clang/lib/StaticAnalyzer/Checkers/PointerSortingChecker.cpp (-115)
  • (removed) clang/test/Analysis/ptr-sort.cpp (-36)
diff --git a/clang-tools-extra/clang-tidy/bugprone/BugproneTidyModule.cpp b/clang-tools-extra/clang-tidy/bugprone/BugproneTidyModule.cpp
index 689eb92a3d8d17..7c177196d76f58 100644
--- a/clang-tools-extra/clang-tidy/bugprone/BugproneTidyModule.cpp
+++ b/clang-tools-extra/clang-tidy/bugprone/BugproneTidyModule.cpp
@@ -48,6 +48,7 @@
 #include "MultipleStatementMacroCheck.h"
 #include "NoEscapeCheck.h"
 #include "NonZeroEnumToBoolConversionCheck.h"
+#include "NondeterministicPointerUsageCheck.h"
 #include "NotNullTerminatedResultCheck.h"
 #include "OptionalValueConversionCheck.h"
 #include "ParentVirtualCallCheck.h"
@@ -179,6 +180,8 @@ class BugproneModule : public ClangTidyModule {
     CheckFactories.registerCheck<cppcoreguidelines::NarrowingConversionsCheck>(
         "bugprone-narrowing-conversions");
     CheckFactories.registerCheck<NoEscapeCheck>("bugprone-no-escape");
+    CheckFactories.registerCheck<NondeterministicPointerUsageCheck>(
+        "bugprone-nondeterministic-pointer-usage");
     CheckFactories.registerCheck<NonZeroEnumToBoolConversionCheck>(
         "bugprone-non-zero-enum-to-bool-conversion");
     CheckFactories.registerCheck<NotNullTerminatedResultCheck>(
diff --git a/clang-tools-extra/clang-tidy/bugprone/CMakeLists.txt b/clang-tools-extra/clang-tidy/bugprone/CMakeLists.txt
index cb0d8ae98bac58..5628572b984226 100644
--- a/clang-tools-extra/clang-tidy/bugprone/CMakeLists.txt
+++ b/clang-tools-extra/clang-tidy/bugprone/CMakeLists.txt
@@ -44,6 +44,7 @@ add_clang_library(clangTidyBugproneModule
   MultipleNewInOneExpressionCheck.cpp
   MultipleStatementMacroCheck.cpp
   NoEscapeCheck.cpp
+  NondeterministicPointerUsageCheck.cpp
   NonZeroEnumToBoolConversionCheck.cpp
   NotNullTerminatedResultCheck.cpp
   OptionalValueConversionCheck.cpp
diff --git a/clang-tools-extra/clang-tidy/bugprone/NondeterministicPointerUsageCheck.cpp b/clang-tools-extra/clang-tidy/bugprone/NondeterministicPointerUsageCheck.cpp
new file mode 100644
index 00000000000000..ddc314af739d97
--- /dev/null
+++ b/clang-tools-extra/clang-tidy/bugprone/NondeterministicPointerUsageCheck.cpp
@@ -0,0 +1,67 @@
+//===--- NondetermnisticPointerUsageCheck.cpp - clang-tidy ------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "NondeterministicPointerUsageCheck.h"
+#include "clang/AST/ASTContext.h"
+#include "clang/Lex/Lexer.h"
+
+using namespace clang::ast_matchers;
+
+namespace clang::tidy::bugprone {
+
+void NondeterministicPointerUsageCheck::registerMatchers(MatchFinder *Finder) {
+
+  auto LoopVariable = varDecl(hasType(hasCanonicalType(pointerType())));
+
+  auto RangeInit = declRefExpr(to(varDecl(hasType(recordDecl(
+      anyOf(hasName("std::unordered_set"), hasName("std::unordered_map"),
+            hasName("std::unordered_multiset"),
+            hasName("std::unordered_multimap")))))));
+
+  Finder->addMatcher(
+      stmt(cxxForRangeStmt(hasRangeInit(RangeInit.bind("rangeinit")),
+                           hasLoopVariable(LoopVariable.bind("loopVar"))))
+          .bind("cxxForRangeStmt"),
+      this);
+
+  auto SortFuncM = anyOf(callee(functionDecl(hasName("std::is_sorted"))),
+                         callee(functionDecl(hasName("std::nth_element"))),
+                         callee(functionDecl(hasName("std::sort"))),
+                         callee(functionDecl(hasName("std::partial_sort"))),
+                         callee(functionDecl(hasName("std::partition"))),
+                         callee(functionDecl(hasName("std::stable_partition"))),
+                         callee(functionDecl(hasName("std::stable_sort"))));
+
+  auto IteratesPointerEltsM = hasArgument(
+      0,
+      cxxMemberCallExpr(on(hasType(cxxRecordDecl(has(fieldDecl(hasType(
+          hasCanonicalType(pointsTo(hasCanonicalType(pointerType())))))))))));
+
+  Finder->addMatcher(stmt(callExpr(allOf(SortFuncM, IteratesPointerEltsM)))
+                         .bind("sortsemantic"),
+                     this);
+}
+
+void NondeterministicPointerUsageCheck::check(
+    const MatchFinder::MatchResult &Result) {
+  const auto *ForRangePointers =
+      Result.Nodes.getNodeAs<Stmt>("cxxForRangeStmt");
+  const auto *SortPointers = Result.Nodes.getNodeAs<Stmt>("sortsemantic");
+
+  if ((ForRangePointers) && !(ForRangePointers->getBeginLoc().isMacroID())) {
+    const auto *Node = dyn_cast<CXXForRangeStmt>(ForRangePointers);
+    diag(Node->getRParenLoc(), "Iteration of pointers is nondeterministic");
+  }
+
+  if ((SortPointers) && !(SortPointers->getBeginLoc().isMacroID())) {
+    const auto *Node = dyn_cast<Stmt>(SortPointers);
+    diag(Node->getBeginLoc(), "Sorting pointers is nondeterministic");
+  }
+}
+
+} // namespace clang::tidy::bugprone
diff --git a/clang-tools-extra/clang-tidy/bugprone/NondeterministicPointerUsageCheck.h b/clang-tools-extra/clang-tidy/bugprone/NondeterministicPointerUsageCheck.h
new file mode 100644
index 00000000000000..556a4fce4dffe2
--- /dev/null
+++ b/clang-tools-extra/clang-tidy/bugprone/NondeterministicPointerUsageCheck.h
@@ -0,0 +1,36 @@
+//===--- NondeterministicPointerUsageCheck.h - clang-tidy ---*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_CLANG_TOOLS_EXTRA_CLANG_TIDY_BUGPRONE__NONDETERMINISTICPOINTERUSAGECHECK_H
+#define LLVM_CLANG_TOOLS_EXTRA_CLANG_TIDY_BUGPRONE__NONDETERMINISTICPOINTERUSAGECHECK_H
+
+#include "../ClangTidyCheck.h"
+
+namespace clang::tidy::bugprone {
+
+/// Finds temporaries that look like RAII objects.
+///
+/// For the user-facing documentation see:
+/// http://clang.llvm.org/extra/clang-tidy/checks/bugprone/nondeterministic-pointer-usage.html
+class NondeterministicPointerUsageCheck : public ClangTidyCheck {
+public:
+  NondeterministicPointerUsageCheck(StringRef Name, ClangTidyContext *Context)
+      : ClangTidyCheck(Name, Context) {}
+  bool isLanguageVersionSupported(const LangOptions &LangOpts) const override {
+    return LangOpts.CPlusPlus;
+  }
+  void registerMatchers(ast_matchers::MatchFinder *Finder) override;
+  void check(const ast_matchers::MatchFinder::MatchResult &Result) override;
+  std::optional<TraversalKind> getCheckTraversalKind() const override {
+    return TK_IgnoreUnlessSpelledInSource;
+  }
+};
+
+} // namespace clang::tidy::bugprone
+
+#endif // LLVM_CLANG_TOOLS_EXTRA_CLANG_TIDY_BUGPRONE__NONDETERMINISTICPOINTERUSAGECHECK_H
diff --git a/clang-tools-extra/docs/clang-tidy/checks/bugprone/nondeterministic-pointer-usage.rst b/clang-tools-extra/docs/clang-tidy/checks/bugprone/nondeterministic-pointer-usage.rst
new file mode 100644
index 00000000000000..8625487c6d694a
--- /dev/null
+++ b/clang-tools-extra/docs/clang-tidy/checks/bugprone/nondeterministic-pointer-usage.rst
@@ -0,0 +1,31 @@
+.. title:: clang-tidy - bugprone-nondeterministic-pointer-usage
+
+nondeterministic-pointer-usage
+==============================
+
+Finds nondeterministic usages of pointers in unordered containers.
+
+One canonical example is iteration across a container of pointers.
+
+.. code-block:: c++
+
+  {
+    for (auto i : UnorderedPtrSet)
+      f(i);
+  }
+
+Another such example is sorting a container of pointers.
+
+.. code-block:: c++
+
+  {
+    std::sort(VectorOfPtr.begin(), VectorOfPtr.end());
+  }
+
+Iteration of a containers of pointers may present the order of different
+pointers differently across different runs of a program. In some cases this
+may be acceptable behavior, in others this may be unexpected behavior. This
+check is advisory for this reason.
+
+This check only detects range-based for loops over unordered sets. Other
+similar usages will not be found and are false negatives.
diff --git a/clang-tools-extra/test/clang-tidy/checkers/bugprone/Inputs/system-header-simulator-cxx.h b/clang-tools-extra/test/clang-tidy/checkers/bugprone/Inputs/system-header-simulator-cxx.h
new file mode 100644
index 00000000000000..b279a8f20d8ddd
--- /dev/null
+++ b/clang-tools-extra/test/clang-tidy/checkers/bugprone/Inputs/system-header-simulator-cxx.h
@@ -0,0 +1,1450 @@
+// Like the compiler, the static analyzer treats some functions differently if
+// they come from a system header -- for example, it is assumed that system
+// functions do not arbitrarily free() their parameters, and that some bugs
+// found in system headers cannot be fixed by the user and should be
+// suppressed.
+#pragma clang system_header
+
+typedef unsigned char uint8_t;
+
+typedef __typeof__(sizeof(int)) size_t;
+typedef __typeof__((char*)0-(char*)0) ptrdiff_t;
+void *memmove(void *s1, const void *s2, size_t n);
+
+namespace std {
+  typedef size_t size_type;
+#if __cplusplus >= 201103L
+  using nullptr_t = decltype(nullptr);
+#endif
+}
+
+namespace std {
+  struct input_iterator_tag { };
+  struct output_iterator_tag { };
+  struct forward_iterator_tag : public input_iterator_tag { };
+  struct bidirectional_iterator_tag : public forward_iterator_tag { };
+  struct random_access_iterator_tag : public bidirectional_iterator_tag { };
+
+  template <typename Iterator> struct iterator_traits {
+    typedef typename Iterator::difference_type difference_type;
+    typedef typename Iterator::value_type value_type;
+    typedef typename Iterator::pointer pointer;
+    typedef typename Iterator::reference reference;
+    typedef typename Iterator::iterator_category iterator_category;
+  };
+}
+
+template <typename T, typename Ptr, typename Ref> struct __vector_iterator {
+  typedef __vector_iterator<T, T *, T &> iterator;
+  typedef __vector_iterator<T, const T *, const T &> const_iterator;
+
+  typedef ptrdiff_t difference_type;
+  typedef T value_type;
+  typedef Ptr pointer;
+  typedef Ref reference;
+  typedef std::random_access_iterator_tag iterator_category;
+
+  __vector_iterator(const Ptr p = 0) : ptr(p) {}
+  __vector_iterator(const iterator &rhs): ptr(rhs.base()) {}
+  __vector_iterator<T, Ptr, Ref>& operator++() { ++ ptr; return *this; }
+  __vector_iterator<T, Ptr, Ref> operator++(int) {
+    auto tmp = *this;
+    ++ ptr;
+    return tmp;
+  }
+  __vector_iterator<T, Ptr, Ref> operator--() { -- ptr; return *this; }
+  __vector_iterator<T, Ptr, Ref> operator--(int) {
+    auto tmp = *this; -- ptr;
+    return tmp;
+  }
+  __vector_iterator<T, Ptr, Ref> operator+(difference_type n) {
+    return ptr + n;
+  }
+  friend __vector_iterator<T, Ptr, Ref> operator+(
+      difference_type n,
+      const __vector_iterator<T, Ptr, Ref> &iter) {
+    return n + iter.ptr;
+  }
+  __vector_iterator<T, Ptr, Ref> operator-(difference_type n) {
+    return ptr - n;
+  }
+  __vector_iterator<T, Ptr, Ref> operator+=(difference_type n) {
+    return ptr += n;
+  }
+  __vector_iterator<T, Ptr, Ref> operator-=(difference_type n) {
+    return ptr -= n;
+  }
+
+  template<typename U, typename Ptr2, typename Ref2>
+  difference_type operator-(const __vector_iterator<U, Ptr2, Ref2> &rhs);
+
+  Ref operator*() const { return *ptr; }
+  Ptr operator->() const { return ptr; }
+
+  Ref operator[](difference_type n) {
+    return *(ptr+n);
+  }
+
+  bool operator==(const iterator &rhs) const { return ptr == rhs.ptr; }
+  bool operator==(const const_iterator &rhs) const { return ptr == rhs.ptr; }
+
+  bool operator!=(const iterator &rhs) const { return ptr != rhs.ptr; }
+  bool operator!=(const const_iterator &rhs) const { return ptr != rhs.ptr; }
+
+  const Ptr& base() const { return ptr; }
+
+private:
+  Ptr ptr;
+};
+
+template <typename T, typename Ptr, typename Ref> struct __deque_iterator {
+  typedef __deque_iterator<T, T *, T &> iterator;
+  typedef __deque_iterator<T, const T *, const T &> const_iterator;
+
+  typedef ptrdiff_t difference_type;
+  typedef T value_type;
+  typedef Ptr pointer;
+  typedef Ref reference;
+  typedef std::random_access_iterator_tag iterator_category;
+
+  __deque_iterator(const Ptr p = 0) : ptr(p) {}
+  __deque_iterator(const iterator &rhs): ptr(rhs.base()) {}
+  __deque_iterator<T, Ptr, Ref>& operator++() { ++ ptr; return *this; }
+  __deque_iterator<T, Ptr, Ref> operator++(int) {
+    auto tmp = *this;
+    ++ ptr;
+    return tmp;
+  }
+  __deque_iterator<T, Ptr, Ref> operator--() { -- ptr; return *this; }
+  __deque_iterator<T, Ptr, Ref> operator--(int) {
+    auto tmp = *this; -- ptr;
+    return tmp;
+  }
+  __deque_iterator<T, Ptr, Ref> operator+(difference_type n) {
+    return ptr + n;
+  }
+  friend __deque_iterator<T, Ptr, Ref> operator+(
+      difference_type n,
+      const __deque_iterator<T, Ptr, Ref> &iter) {
+    return n + iter.ptr;
+  }
+  __deque_iterator<T, Ptr, Ref> operator-(difference_type n) {
+    return ptr - n;
+  }
+  __deque_iterator<T, Ptr, Ref> operator+=(difference_type n) {
+    return ptr += n;
+  }
+  __deque_iterator<T, Ptr, Ref> operator-=(difference_type n) {
+    return ptr -= n;
+  }
+
+  Ref operator*() const { return *ptr; }
+  Ptr operator->() const { return ptr; }
+
+  Ref operator[](difference_type n) {
+    return *(ptr+n);
+  }
+
+  bool operator==(const iterator &rhs) const { return ptr == rhs.ptr; }
+  bool operator==(const const_iterator &rhs) const { return ptr == rhs.ptr; }
+
+  bool operator!=(const iterator &rhs) const { return ptr != rhs.ptr; }
+  bool operator!=(const const_iterator &rhs) const { return ptr != rhs.ptr; }
+
+  const Ptr& base() const { return ptr; }
+
+private:
+  Ptr ptr;
+};
+
+template <typename T, typename Ptr, typename Ref> struct __list_iterator {
+  typedef __list_iterator<T, __typeof__(T::data) *, __typeof__(T::data) &> iterator;
+  typedef __list_iterator<T, const __typeof__(T::data) *, const __typeof__(T::data) &> const_iterator;
+
+  typedef ptrdiff_t difference_type;
+  typedef T value_type;
+  typedef Ptr pointer;
+  typedef Ref reference;
+  typedef std::bidirectional_iterator_tag iterator_category;
+
+  __list_iterator(T* it = 0) : item(it) {}
+  __list_iterator(const iterator &rhs): item(rhs.item) {}
+  __list_iterator<T, Ptr, Ref>& operator++() { item = item->next; return *this; }
+  __list_iterator<T, Ptr, Ref> operator++(int) {
+    auto tmp = *this;
+    item = item->next;
+    return tmp;
+  }
+  __list_iterator<T, Ptr, Ref> operator--() { item = item->prev; return *this; }
+  __list_iterator<T, Ptr, Ref> operator--(int) {
+    auto tmp = *this;
+    item = item->prev;
+    return tmp;
+  }
+
+  Ref operator*() const { return item->data; }
+  Ptr operator->() const { return &item->data; }
+
+  bool operator==(const iterator &rhs) const { return item == rhs->item; }
+  bool operator==(const const_iterator &rhs) const { return item == rhs->item; }
+
+  bool operator!=(const iterator &rhs) const { return item != rhs->item; }
+  bool operator!=(const const_iterator &rhs) const { return item != rhs->item; }
+
+  const T* &base() const { return item; }
+
+  template <typename UT, typename UPtr, typename URef>
+  friend struct __list_iterator;
+
+private:
+  T* item;
+};
+
+template <typename T, typename Ptr, typename Ref> struct __fwdl_iterator {
+  typedef __fwdl_iterator<T, __typeof__(T::data) *, __typeof__(T::data) &> iterator;
+  typedef __fwdl_iterator<T, const __typeof__(T::data) *, const __typeof__(T::data) &> const_iterator;
+
+  typedef ptrdiff_t difference_type;
+  typedef T value_type;
+  typedef Ptr pointer;
+  typedef Ref reference;
+  typedef std::forward_iterator_tag iterator_category;
+
+  __fwdl_iterator(T* it = 0) : item(it) {}
+  __fwdl_iterator(const iterator &rhs): item(rhs.item) {}
+  __fwdl_iterator<T, Ptr, Ref>& operator++() { item = item->next; return *this; }
+  __fwdl_iterator<T, Ptr, Ref> operator++(int) {
+    auto tmp = *this;
+    item = item->next;
+    return tmp;
+  }
+  Ref operator*() const { return item->data; }
+  Ptr operator->() const { return &item->data; }
+
+  bool operator==(const iterator &rhs) const { return item == rhs->item; }
+  bool operator==(const const_iterator &rhs) const { return item == rhs->item; }
+
+  bool operator!=(const iterator &rhs) const { return item != rhs->item; }
+  bool operator!=(const const_iterator &rhs) const { return item != rhs->item; }
+
+  const T* &base() const { return item; }
+
+  template <typename UT, typename UPtr, typename URef>
+  friend struct __fwdl_iterator;
+
+private:
+  T* item;
+};
+
+namespace std {
+  template <class T1, class T2>
+  struct pair {
+    T1 first;
+    T2 second;
+
+    pair() : first(), second() {}
+    pair(const T1 &a, const T2 &b) : first(a), second(b) {}
+
+    template<class U1, class U2>
+    pair(const pair<U1, U2> &other) : first(other.first),
+                                      second(other.second) {}
+  };
+
+  template<class T2, class T1>
+  T2& get(pair<T1, T2>& p) ;
+  template<class T1, class T2>
+  T1& get(const pair<T1, T2>& p) ;
+
+  typedef __typeof__(sizeof(int)) size_t;
+
+  template <class T> class initializer_list;
+
+  template< class T > struct remove_reference      {typedef T type;};
+  template< class T > struct remove_reference<T&>  {typedef T type;};
+  template< class T > struct remove_reference<T&&> {typedef T type;};
+
+  template<typename T> typename remove_reference<T>::type&& move(T&& a);
+  template <typename T> T *__addressof(T &x);
+  template <typename T> T *addressof(T &x);
+  template <typename T> const T& as_const(T& x);
+  template <typename T> T&& forward(T&& x);
+  // FIXME: Declare forward_like
+  // FIXME: Declare move_if_noexcept
+
+  template< class T >
+  using remove_reference_t = typename remove_reference<T>::type;
+
+  template <class T>
+  void swap(T &a, T &b) {
+    T c(std::move(a));
+    a = std::move(b);
+    b = std::move(c);
+  }
+
+  template<typename T>
+  class vector {
+    T *_start;
+    T *_finish;
+    T *_end_of_storage;
+
+  public:
+    typedef T value_type;
+    typedef size_t size_type;
+    typedef __vector_iterator<T, T *, T &> iterator;
+    typedef __vector_iterator<T, const T *, const T &> const_iterator;
+
+    vector() : _start(0), _finish(0), _end_of_storage(0) {}
+    template <typename InputIterator>
+    vector(InputIterator first, InputIterator last);
+    vector(const vector &other);
+    vector(vector &&other);
+    ~vector();
+
+    size_t size() const {
+      return size_t(_finish - _start);
+    }
+
+    vector& operator=(const vector &other);
+    vector& operator=(vector &&other);
+    vector& operator=(std::initializer_list<T> ilist);
+
+    void assign(size_type count, const T &value);
+    template <typename InputIterator >
+    void assign(InputIterator first, InputIterator last);
+    void assign(std::initializer_list<T> ilist);
+
+    void clear();
+
+    void push_back(const T &value);
+    void push_back(T &&value);
+    template<class... Args>
+    void emplace_back(Args&&... args);
+    void pop_back();
+
+    iterator insert(const_iterator position, const value_type &val);
+    iterator insert(const_iterator position, size_type n,
+                    const value_type &val);
+    template <typename InputIterator>
+    iterator insert(const_iterator position, InputIterator first,
+                    InputIterator last);
+    iterator insert(const_iterator position, value_type &&val);
+    iterator insert(const_iterator position, initializer_list<value_type> il);
+
+    template <class... Args>
+    iterator emplace(const_iterator position, Args&&... args);
+
+    iterator erase(const_iterator position);
+    iterator erase(const_iterator first, const_iterator last);
+
+    T &operator[](size_t n) {
+      return _start[n];
+    }
+
+    const T &operator[](size_t n) const {
+      return _start[n];
+    }
+
+    iterator begin() { return iterator(_start); }
+    const_iterator begin() const { return const_iterator(_start); }
+    const_iterator cbegin() const { return const_iterator(_start); }
+    iterator end() { return iterator(_finish); }
+    const_iterator end() const { return const_iterator(_finish); }
+    const_iterator cend() const { return const_iterator(_finish); }
+    T& front() { return *begin(); }
+    const T& front() const { return *begin(); }
+    T& back() { return *(end() - 1); }
+    const T& back() const { re...
[truncated]

@vabridgers vabridgers force-pushed the nondeterministic-pointer-usage branch from faf3644 to 60c0977 Compare September 30, 2024 12:54
Copy link
Contributor

@steakhal steakhal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish we had something like this. It might be possible, but really really challenging to implement.
To me the problem is that the iteration or lookups are not necessarily bad. And to determine if it's bad one needs to understand how the result of the lookup or the individual elements of the iteration are used.

For example, this is completely fine and efficient:

int numInterestingExprs = 0; // TODO: Use accumulate.
for (const Expr *E : MyBlocks/*some unordered set-like container*/) {
  if (isInteresting(E)) ++numInterestingExprs;
}

But this is not fine, because we apply a sideffect that is dependent on the nondeterministic iteration order. In some context, this could be still fine btw. It depends on what your constraints are:

for (const Expr *E : MyBlocks/*some unordered set-like container*/) {
  if (isInteresting(E)) llvm::errs() << E->printPretty() << "\n"; // eeeh, the lines may get reshuffled from run-to-run...
}

Only looking at the for loop here is not enough to decide this.


Limiting the check to only trigger if it sees the iteration AND also the use-site - like in the case you match std algorithms, like is_sorted or partition is a much more careful approach. It should work unless the algorithm takes a callback/functor that we can't really reason about and we could end up with a FP.

Remember that begin might be an ADL free-function, and not a memer function, such as for int *arr[10]. C++ also has std::ranges::*, which we may wanna cover too in the future.


You mentioned that you evaluated the checker at scale. What was the result of the evaluation? I assume you have seen FPs, and TPs as well.

Copy link
Contributor

@EugeneZelenko EugeneZelenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please mention new check in Release Notes.

@whisperity whisperity marked this pull request as draft September 30, 2024 19:45
@whisperity
Copy link
Member

Q) I ended up expanding upon the C++ simulation header, but had thoughts about breaking that up into multiple smaller files. Maybe there's an opportunity to refactor some C++ simulation files across multiple checkers as a seperate PR first, or maybe even as part of this one?

@vabridgers Definitely do such things in separate PRs. It makes the current patch cleaner if the additions to the sysroot simulation only contains what is needed for your patch.

Copy link
Member

@whisperity whisperity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Also, for reference, I dug out the original CSA checkers' initial reviews from the history: https://reviews.llvm.org/D50488 and https://reviews.llvm.org/D59279.)

Even though the removed checks are alpha, maybe their removal is worth a mention in the release notes for CSA, if we have such a thing in the first place. @Szelethus?

@whisperity
Copy link
Member

And to determine if it's bad one needs to understand how the result of the lookup or the individual elements of the iteration are used.

And the issue here is that, in the extreme case, even a completely omniscient call graph oracle would still fail at proving immunity to non-determinism even in your second case, because the determinism requirement could come from something like FileCheck ran externally to the code that executed a non-deterministic order.

I'm unsure as to what the right approach would be here. I was thinking about instead marking every variable or field declared with a type like known-unordered-container-type <T*> instead of the for loops themselves, but then we are essentially trying to say that this partial specialisation is "deprecated" in a way... "you shouldn't put pointers into this because no matter what you do, it will have a non-deterministic order"... and???

But your second example, @steakhal, touches on the problem nicely. What if we tried to match archetypes of "container elements are printed" or "container elements are sent to the network"? It will never be a complete list, but we could reasonably assume (at least this is a safe-ish generalisation, and for everything else, there is NOLINT...) that if the order escapes via some I/O, then non-determinism is... maybe if not outright bad, but definitely uncanny, if for nothing else, due to the troubles it causes when debugging. 😉

@vabridgers vabridgers force-pushed the nondeterministic-pointer-usage branch from 60c0977 to 52e3908 Compare September 30, 2024 21:11
@vabridgers
Copy link
Contributor Author

I wish we had something like this. It might be possible, but really really challenging to implement. To me the problem is that the iteration or lookups are not necessarily bad. And to determine if it's bad one needs to understand how the result of the lookup or the individual elements of the iteration are used.

For example, this is completely fine and efficient:

int numInterestingExprs = 0; // TODO: Use accumulate.
for (const Expr *E : MyBlocks/*some unordered set-like container*/) {
  if (isInteresting(E)) ++numInterestingExprs;
}

But this is not fine, because we apply a sideffect that is dependent on the nondeterministic iteration order. In some context, this could be still fine btw. It depends on what your constraints are:

for (const Expr *E : MyBlocks/*some unordered set-like container*/) {
  if (isInteresting(E)) llvm::errs() << E->printPretty() << "\n"; // eeeh, the lines may get reshuffled from run-to-run...
}

Only looking at the for loop here is not enough to decide this.

Limiting the check to only trigger if it sees the iteration AND also the use-site - like in the case you match std algorithms, like is_sorted or partition is a much more careful approach. It should work unless the algorithm takes a callback/functor that we can't really reason about and we could end up with a FP.

Remember that begin might be an ADL free-function, and not a memer function, such as for int *arr[10]. C++ also has std::ranges::*, which we may wanna cover too in the future.

You mentioned that you evaluated the checker at scale. What was the result of the evaluation? I assume you have seen FPs, and TPs as well.

Hi @steakhal , I completed a test run on the LLVM source base and detected no issues with this check. I will be expanding the tests, and wanted to mention I just found

I wish we had something like this. It might be possible, but really really challenging to implement. To me the problem is that the iteration or lookups are not necessarily bad. And to determine if it's bad one needs to understand how the result of the lookup or the individual elements of the iteration are used.

For example, this is completely fine and efficient:

int numInterestingExprs = 0; // TODO: Use accumulate.
for (const Expr *E : MyBlocks/*some unordered set-like container*/) {
  if (isInteresting(E)) ++numInterestingExprs;
}

But this is not fine, because we apply a sideffect that is dependent on the nondeterministic iteration order. In some context, this could be still fine btw. It depends on what your constraints are:

for (const Expr *E : MyBlocks/*some unordered set-like container*/) {
  if (isInteresting(E)) llvm::errs() << E->printPretty() << "\n"; // eeeh, the lines may get reshuffled from run-to-run...
}

Only looking at the for loop here is not enough to decide this.

Limiting the check to only trigger if it sees the iteration AND also the use-site - like in the case you match std algorithms, like is_sorted or partition is a much more careful approach. It should work unless the algorithm takes a callback/functor that we can't really reason about and we could end up with a FP.

Remember that begin might be an ADL free-function, and not a memer function, such as for int *arr[10]. C++ also has std::ranges::*, which we may wanna cover too in the future.

You mentioned that you evaluated the checker at scale. What was the result of the evaluation? I assume you have seen FPs, and TPs as well.

I completed an initial test run on the llvm/clang repo with no crashes or results at all. Meaning no FPs or TPs. I will be expanding the tests to more repos, and would love to hear suggestions if you have any. I suppose abseil would be a good one. Also just found @Xazax-hun 's testbench to help with this here - https://github.com/Xazax-hun/csa-testbench. Really looking forward to trying that :)

Thanks for the comments. Will keep driving this forward.

@carlosgalvezp
Copy link
Contributor

carlosgalvezp commented Sep 30, 2024

creation of an advisory group.

Why isn't misc suitable for this use case? It would be confusing to have both groups I think, and may lead to bike-shedding discussions in the future about whether a check belongs in misc or advisory.

@vabridgers
Copy link
Contributor Author

Why isn't misc suitable for this use case?

bugprone was just an initial thought. If the group think leads us to misc I'm ok with moving in that direction.
Other thoughts?

@carlosgalvezp
Copy link
Contributor

Why isn't misc suitable for this use case?

bugprone was just an initial thought. If the group think leads us to misc I'm ok with moving in that direction. Other thoughts?

Maybe i misunderstood, I thought you meant creating a new clang-tidy module called advisory. bugprone sounds good to me!

@vabridgers
Copy link
Contributor Author

vabridgers commented Oct 6, 2024

As I understand the collective comments so far:

  • bugprone group is fine. I've heard no objections to bugprone so will stay with that.
  • Break up the simulation header files, only include what's actually needed for the test
  • change the checker name to something like bugprone-nondeterministic-pointer-iteration-order
  • provide testing results, expand upon testing
  • Update the documentation to meet requirements
  • I've heard no concrete comments on expanding beyond the few cases this check currently addresses. I'll assume that's ok for now.
  • While there has been at least one comment to add additional patterns, I'd like to keep the scope limited for this PR to just moving the static analysis checks to a single clang-tidy check. Would that be ok with the reviewers?

Next update should address these points. Thanks for the comments!

@vabridgers vabridgers force-pushed the nondeterministic-pointer-usage branch from 52e3908 to 4d8d105 Compare October 6, 2024 13:31
@5chmidti
Copy link
Contributor

5chmidti commented Oct 8, 2024

I'd like to keep the scope limited for this PR to just moving the static analysis checks to a single clang-tidy check. Would that be ok with the reviewers?

Fine with me, yes, but there needs to be some sort of plan on how to address some of the pointed out challenges with this check, that were raised by others.


It might be possible, but really really challenging to implement.
...
Only looking at the for loop here is not enough ...

+1

because we apply a sideffect that is dependent on the nondeterministic iteration order

What if we tried to match archetypes of "container elements are printed" or "container elements are sent to the network"? It will never be a complete list, but we could reasonably assume (at least this is a safe-ish generalisation, and for everything else, there is NOLINT...) that if the order escapes via some I/O, then non-determinism is... maybe if not outright bad, but definitely uncanny, if for nothing else, due to the troubles it causes when debugging.

That's of course possible, but may grow to 'too many' patterns fast, and then there comes the question of which ones to add and which are considered to be too infrequent. The other issue is that doing this would mean that either clang-tidy has to encode the use of many different libraries, and how side effects can happen by using them, or by providing customization options, which would in turn be limited by the patterns that are considered generic enough to warrant the customizable detection pattern (e.g., printing and formatting, and then there are formatToString member functions as well, that e.g., print into a preallocated buffer).

Maybe Expr::hasSideEffets can be used as a baseline?
I.e.,

for all references r to var of type unordered map
  traverse up the AST until the top-level expression e (not stmt) is found (containing r)
    if e->hasSideEffects()
      diagnose()

(maybe not literally, to avoid using the parent map too much for traversal)

For calling functions, there could also be a (cached) lookup of their bodies (if available), using Expr::hasSideEffects for the DeclRefExpr of the parameter. bugprone-exception-escape is doing something similar, but for throw statements, instead of checking for variable uses with side effects.

@vabridgers
Copy link
Contributor Author

@5chmidti , thanks for the comments. I'll start to work through those.

@vabridgers vabridgers force-pushed the nondeterministic-pointer-usage branch from 3ab0347 to 1cf734f Compare October 20, 2024 21:45
@vabridgers
Copy link
Contributor Author

Thanks for the comments, I've resolved most, but not all. Not requested updated review comments yet, but if you're so inclined, would appreciate a quick scan to see if if I missed anything. Thanks.

@vabridgers vabridgers force-pushed the nondeterministic-pointer-usage branch from 1cf734f to ec58222 Compare October 22, 2024 23:56
@vabridgers vabridgers force-pushed the nondeterministic-pointer-usage branch 2 times, most recently from 6b6f578 to 75b49ca Compare October 23, 2024 09:41
@vabridgers vabridgers requested a review from 5chmidti October 23, 2024 09:42
@vabridgers
Copy link
Contributor Author

Thanks for the review comments. I believe I've resolved all review comments to date, including a small increase in scope for the original change (this check detects a few cases not detectable by the original CSA based check), and am ready for another round of review. @5chmidti , would you mind having a look?

I'm happy to document future changes beyond this check, but would like to keep this PR's scope limited to just porting the original CSA based check to a tidy check as much as possible.

Also, I've not yet completed a full systems test on a large open source project (like a clang build), but will do so after I get some indications this PR is good to proceed with that. I'd like to minimize re-testing if possible since it is not a small task.

Thank you

@vabridgers vabridgers force-pushed the nondeterministic-pointer-usage branch 2 times, most recently from f9142f6 to cb8e452 Compare October 23, 2024 11:58
@vabridgers vabridgers force-pushed the nondeterministic-pointer-usage branch 3 times, most recently from ec6c1f8 to 95cd675 Compare October 23, 2024 23:33
@whisperity
Copy link
Member

I'm happy to document future changes beyond this check, but would like to keep this PR's scope limited to just porting the original CSA based check to a tidy check as much as possible.

@vabridgers I believe it is perfectly fine to add a "Limitations" section to the documentation where the "future problems" are outlined, even if only to a "mention" level and not as a full solution plan.

@vabridgers vabridgers force-pushed the nondeterministic-pointer-usage branch 2 times, most recently from 3fb376d to 5be6cce Compare October 24, 2024 23:51
Copy link
Contributor

@5chmidti 5chmidti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to document future changes beyond this check, but would like to keep this PR's scope limited to just porting the original CSA based check to a tidy check as much as possible.

+1


Also, I've not yet completed a full systems test on a large open source project (like a clang build), but will do so after I get some indications this PR is good to proceed with that. I'd like to minimize re-testing if possible since it is not a small task.

I've run the current implementation on clang;clang-tools-extra and found a single diagnosed issue for

for (auto &I : FrameSamples) {
and no crashes when running a debug build.

Minus my new comments and maybe documenting the current limitations, this looks good to me. Do you plan to work on those improvements?

Another approval would be nice, but I don't think it should be a blocker for this PR.

Comment on lines 60 to 68
const llvm::StringRef AlgoName = ClassTemplate->getName();
const bool Unordered = AlgoName.contains("unordered");
const bool SetLike = AlgoName.contains("set");
const bool MapLike = AlgoName.contains("map");
const bool IsAlgoArgPointer =
TemplateArgs[0].getAsType()->isPointerType();
assert(Unordered && "Expecting unordered algo!");
assert((SetLike ^ MapLike) &&
"Expecting either SetLike or Maplike Unordered Algo!");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: swap Algo for Class

@vabridgers
Copy link
Contributor Author

vabridgers commented Oct 27, 2024

Thanks @5chmidti ! I do plan to work on improvements, but would like to get this baseline checker in for now as a "scope of work" for this PR. I don't mind documenting the improvement steps per @whisperity suggestions, and addressing your most recent comments. Do you have suggestions on where to document improvements? Same as @whisperity ?

I'll plan on a "next iteration" of changes to address your recent comments, and document improvement opportunities.

Thanks for the comments, and happy this found something interesting.

@5chmidti
Copy link
Contributor

I do plan to work on improvements

Great

would like to get this baseline checker in for now as a "scope of work" for this PR

Yep, what I meant as well

I don't mind documenting the improvement step
Do you have suggestions on where to document improvements?

As suggested, simply do this in the docs. Because that documentation is user-facing, you can be very high-level and don't need to point out everything, IMO. E.g.: This check currently does not check if a nondeterministic iteration order is likely to be a mistake, and instead marks all such iterations as bugprone.

@vabridgers vabridgers force-pushed the nondeterministic-pointer-usage branch from 5be6cce to 4dc4e52 Compare October 27, 2024 23:25
…tidy

This change moves the alpha.nondeterministic.PointerSorting and
alpha.nondeterministic.PointerIteration static analyzer checkers to
a single clang-tidy check. Those checkers were implemented as clang-tidy
checks wrapped in the static analyzer framework. The documentation was
updated to describe what the checks can and cannot do, and testing
was completed on a broad set of open source projects.
@vabridgers vabridgers force-pushed the nondeterministic-pointer-usage branch from 4dc4e52 to ba04e96 Compare October 28, 2024 00:56
@vabridgers vabridgers merged commit 3d6923d into llvm:main Oct 28, 2024
9 checks passed
NoumanAmir657 pushed a commit to NoumanAmir657/llvm-project that referenced this pull request Nov 4, 2024
…k to tidy (llvm#110471)

This change moves the `alpha.nondeterministic.PointerSorting` and
`alpha.nondeterministic.PointerIteration` static analyzer checkers to a
single `clang-tidy` check. Those checkers were implemented as simple
`clang-tidy` check-like code, wrapped in the static analyzer framework.
The documentation was updated to describe what the checks can and cannot
do, and testing was completed on a broad set of open-source projects.

Co-authored-by: Vince Bridgers <[email protected]>
@5chmidti
Copy link
Contributor

5chmidti commented Nov 9, 2024

FYI: #115620 & #115621

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:static analyzer clang Clang issues not falling into any other category clang-tidy clang-tools-extra
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants