Skip to content

[libc++][hardening] Categorize assertions related to strict weak ordering #77405

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

var-const
Copy link
Member

@var-const var-const commented Jan 9, 2024

If a user passes a comparator that doesn't satisfy strict weak ordering
(see https://eel.is/c++draft/algorithms#alg.sorting.general) to
a sorting algorithm, the algorithm can produce an incorrect result or even lead
to an out-of-bounds access. Unfortunately, comprehensively validating
that a given comparator indeed satisfies the strict weak ordering
requirement is prohibitively expensive (see the related RFC).
As a result, we have three independent sets of checks:

  • assertions that catch out-of-bounds accesses within the algorithms'
    implementation. These are relatively cheap; however, they cannot catch
    the underlying cause and cannot prevent the case where an invalid
    comparator would result in an incorrectly-sorted sequence without
    actually triggering an OOB access;

  • debug comparators that wrap a given comparator and on each comparison
    check that if (a < b), then !(b < a), where < stands for the
    user-provided comparator. This performs up to 2x number of comparisons
    but doesn't affect the algorithmic complexity. While this approach can
    find more issues, it is still a heuristic;

  • a comprehensive check of the comparator that validates up to 100
    elements in the resulting sorted sequence (see the RFC above for
    details). The check is expensive but the 100 element limit can somewhat
    compensate for that, especially for large values of N.

The first set of checks is enabled in the fast hardening mode while the
other two are only enabled in the debug mode.

This patch also removes the
_LIBCPP_DEBUG_STRICT_WEAK_ORDERING_CHECK macro that
previously was used to selectively enable the 100-element check.
Now this check is enabled unconditionally in the debug mode.

Also, introduce a new category
_LIBCPP_ASSERT_SEMANTIC_REQUIREMENT. This category is
intended for checking the semantic requirements from the Standard.
Typically, these are hard or impossible to completely validate, so
these checks are expected to be heuristic in nature and potentially
quite expensive.

See https://reviews.llvm.org/D150264 for additional background.
Fixes #71496

@var-const var-const requested a review from a team as a code owner January 9, 2024 03:11
@llvmbot llvmbot added the libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. label Jan 9, 2024
@llvmbot
Copy link
Member

llvmbot commented Jan 9, 2024

@llvm/pr-subscribers-libcxx

Author: Konstantin Varlamov (var-const)

Changes

If a user passes a comparator that doesn't satisfy strict weak ordering
(see https://eel.is/c++draft/algorithms#alg.sorting.general) to
a sorting algorithm, this can produce an incorrect result or even lead
to an out-of-bounds access. Unfortunately, comprehensively validating
that a given comparator indeed satisfies the strict weak ordering
requirement is prohibitively expensive (see https://discourse.llvm.org/t/rfc-strict-weak-ordering-checks-in-the-debug-libc/70217).
As a result, we have three independent sets of checks, in order from
least to most expensive:

  • assertions that catch out-of-bounds accesses within the algorithms'
    implementation. These are relatively cheap; however, they cannot catch
    the underlying cause and cannot prevent the case where an invalid
    comparator would result in an uncorrectly-sorted sequence without
    producing an OOB access;

  • debug comparators that wrap a given comparator and on each comparison
    check that if (a &lt; b), then !(b &lt; a), where &lt; stands for the
    user-provided comparator. This performs up to 2x number of comparisons
    but doesn't affect the algorithmic complexity. While this approach can
    find more issues, it is still a heuristic;

  • a comprehensive check of the comparator that validates up to 100
    elements in the resulting sorted sequence (see the RFC above for
    details), imposing a significant performance overhead.

Accordingly, the first set of checks is enabled in the fast hardening
mode, the second in the extensive mode, and the third only in the debug
mode.

For the most expensive checks, introduce a new category
_LIBCPP_ASSERT_INTRUSIVE. This category is intended for expensive
checks that perform intrusive, extensive validation within the
implementation (either of the user input or of invariants or
postconditions of a function).

See https://reviews.llvm.org/D150264 for additional background.


Full diff: https://github.com/llvm/llvm-project/pull/77405.diff

6 Files Affected:

  • (modified) libcxx/include/__algorithm/comp_ref_type.h (+1-1)
  • (modified) libcxx/include/__algorithm/nth_element.h (+4-4)
  • (modified) libcxx/include/__algorithm/sort.h (+11-11)
  • (modified) libcxx/include/__algorithm/three_way_comp_ref_type.h (+6-4)
  • (modified) libcxx/include/__config (+23)
  • (modified) libcxx/include/__debug_utils/strict_weak_ordering_check.h (+5-5)
diff --git a/libcxx/include/__algorithm/comp_ref_type.h b/libcxx/include/__algorithm/comp_ref_type.h
index 15f4a535a30bf0..7550bc3d9d7b9d 100644
--- a/libcxx/include/__algorithm/comp_ref_type.h
+++ b/libcxx/include/__algorithm/comp_ref_type.h
@@ -44,7 +44,7 @@ struct __debug_less {
   _LIBCPP_CONSTEXPR_SINCE_CXX14 inline _LIBCPP_HIDE_FROM_ABI decltype((void)std::declval<_Compare&>()(
       std::declval<_LHS&>(), std::declval<_RHS&>()))
   __do_compare_assert(int, _LHS& __l, _RHS& __r) {
-    _LIBCPP_ASSERT_UNCATEGORIZED(!__comp_(__l, __r), "Comparator does not induce a strict weak ordering");
+    _LIBCPP_ASSERT_ARGUMENT_WITHIN_DOMAIN(!__comp_(__l, __r), "Comparator does not induce a strict weak ordering");
     (void)__l;
     (void)__r;
   }
diff --git a/libcxx/include/__algorithm/nth_element.h b/libcxx/include/__algorithm/nth_element.h
index a0597051259518..37ddfbdacf044d 100644
--- a/libcxx/include/__algorithm/nth_element.h
+++ b/libcxx/include/__algorithm/nth_element.h
@@ -114,12 +114,12 @@ __nth_element(
         while (true) {
           while (!__comp(*__first, *__i)) {
             ++__i;
-            _LIBCPP_ASSERT_UNCATEGORIZED(
+            _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
                 __i != __last,
                 "Would read out of bounds, does your comparator satisfy the strict-weak ordering requirement?");
           }
           do {
-            _LIBCPP_ASSERT_UNCATEGORIZED(
+            _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
                 __j != __first,
                 "Would read out of bounds, does your comparator satisfy the strict-weak ordering requirement?");
             --__j;
@@ -150,13 +150,13 @@ __nth_element(
         // __m still guards upward moving __i
         while (__comp(*__i, *__m)) {
           ++__i;
-          _LIBCPP_ASSERT_UNCATEGORIZED(
+          _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
               __i != __last,
               "Would read out of bounds, does your comparator satisfy the strict-weak ordering requirement?");
         }
         // It is now known that a guard exists for downward moving __j
         do {
-          _LIBCPP_ASSERT_UNCATEGORIZED(
+          _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
               __j != __first,
               "Would read out of bounds, does your comparator satisfy the strict-weak ordering requirement?");
           --__j;
diff --git a/libcxx/include/__algorithm/sort.h b/libcxx/include/__algorithm/sort.h
index ac47489af0aac9..451133a2d193dd 100644
--- a/libcxx/include/__algorithm/sort.h
+++ b/libcxx/include/__algorithm/sort.h
@@ -325,7 +325,7 @@ __insertion_sort_unguarded(_RandomAccessIterator const __first, _RandomAccessIte
       do {
         *__j = _Ops::__iter_move(__k);
         __j  = __k;
-        _LIBCPP_ASSERT_UNCATEGORIZED(
+        _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
             __k != __leftmost,
             "Would read out of bounds, does your comparator satisfy the strict-weak ordering requirement?");
       } while (__comp(__t, *--__k)); // No need for bounds check due to the assumption stated above.
@@ -544,7 +544,7 @@ __bitset_partition(_RandomAccessIterator __first, _RandomAccessIterator __last,
     // Not guarded since we know the last element is greater than the pivot.
     do {
       ++__first;
-      _LIBCPP_ASSERT_UNCATEGORIZED(
+      _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
           __first != __end,
           "Would read out of bounds, does your comparator satisfy the strict-weak ordering requirement?");
     } while (!__comp(__pivot, *__first));
@@ -557,7 +557,7 @@ __bitset_partition(_RandomAccessIterator __first, _RandomAccessIterator __last,
     // It will be always guarded because __introsort will do the median-of-three
     // before calling this.
     do {
-      _LIBCPP_ASSERT_UNCATEGORIZED(
+      _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
           __last != __begin,
           "Would read out of bounds, does your comparator satisfy the strict-weak ordering requirement?");
       --__last;
@@ -635,7 +635,7 @@ __partition_with_equals_on_right(_RandomAccessIterator __first, _RandomAccessIte
   // this.
   do {
     ++__first;
-    _LIBCPP_ASSERT_UNCATEGORIZED(
+    _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
         __first != __end,
         "Would read out of bounds, does your comparator satisfy the strict-weak ordering requirement?");
   } while (__comp(*__first, __pivot));
@@ -647,7 +647,7 @@ __partition_with_equals_on_right(_RandomAccessIterator __first, _RandomAccessIte
   } else {
     // Guarded.
     do {
-      _LIBCPP_ASSERT_UNCATEGORIZED(
+      _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
           __last != __begin,
           "Would read out of bounds, does your comparator satisfy the strict-weak ordering requirement?");
       --__last;
@@ -665,12 +665,12 @@ __partition_with_equals_on_right(_RandomAccessIterator __first, _RandomAccessIte
     _Ops::iter_swap(__first, __last);
     do {
       ++__first;
-      _LIBCPP_ASSERT_UNCATEGORIZED(
+      _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
           __first != __end,
           "Would read out of bounds, does your comparator satisfy the strict-weak ordering requirement?");
     } while (__comp(*__first, __pivot));
     do {
-      _LIBCPP_ASSERT_UNCATEGORIZED(
+      _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
           __last != __begin,
           "Would read out of bounds, does your comparator satisfy the strict-weak ordering requirement?");
       --__last;
@@ -702,7 +702,7 @@ __partition_with_equals_on_left(_RandomAccessIterator __first, _RandomAccessIter
     // Guarded.
     do {
       ++__first;
-      _LIBCPP_ASSERT_UNCATEGORIZED(
+      _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
           __first != __end,
           "Would read out of bounds, does your comparator satisfy the strict-weak ordering requirement?");
     } while (!__comp(__pivot, *__first));
@@ -715,7 +715,7 @@ __partition_with_equals_on_left(_RandomAccessIterator __first, _RandomAccessIter
     // It will be always guarded because __introsort will do the
     // median-of-three before calling this.
     do {
-      _LIBCPP_ASSERT_UNCATEGORIZED(
+      _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
           __last != __begin,
           "Would read out of bounds, does your comparator satisfy the strict-weak ordering requirement?");
       --__last;
@@ -725,12 +725,12 @@ __partition_with_equals_on_left(_RandomAccessIterator __first, _RandomAccessIter
     _Ops::iter_swap(__first, __last);
     do {
       ++__first;
-      _LIBCPP_ASSERT_UNCATEGORIZED(
+      _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
           __first != __end,
           "Would read out of bounds, does your comparator satisfy the strict-weak ordering requirement?");
     } while (!__comp(__pivot, *__first));
     do {
-      _LIBCPP_ASSERT_UNCATEGORIZED(
+      _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
           __last != __begin,
           "Would read out of bounds, does your comparator satisfy the strict-weak ordering requirement?");
       --__last;
diff --git a/libcxx/include/__algorithm/three_way_comp_ref_type.h b/libcxx/include/__algorithm/three_way_comp_ref_type.h
index 8fd15c5d66f2fc..19533270b497d2 100644
--- a/libcxx/include/__algorithm/three_way_comp_ref_type.h
+++ b/libcxx/include/__algorithm/three_way_comp_ref_type.h
@@ -50,15 +50,17 @@ struct __debug_three_way_comp {
       __expected = _Order::greater;
     if (__o == _Order::greater)
       __expected = _Order::less;
-    _LIBCPP_ASSERT_UNCATEGORIZED(__comp_(__l, __r) == __expected, "Comparator does not induce a strict weak ordering");
+    _LIBCPP_ASSERT_ARGUMENT_WITHIN_DOMAIN(
+        __comp_(__l, __r) == __expected, "Comparator does not induce a strict weak ordering");
     (void)__l;
     (void)__r;
   }
 };
 
-// Pass the comparator by lvalue reference. Or in debug mode, using a
-// debugging wrapper that stores a reference.
-#  if _LIBCPP_HARDENING_MODE == _LIBCPP_HARDENING_MODE_DEBUG
+// Pass the comparator by lvalue reference. Or in the extensive hardening mode and above, using a debugging wrapper that
+// stores a reference.
+#  if _LIBCPP_HARDENING_MODE == _LIBCPP_HARDENING_MODE_EXTENSIVE ||                                                    \
+      _LIBCPP_HARDENING_MODE == _LIBCPP_HARDENING_MODE_DEBUG
 template <class _Comp>
 using __three_way_comp_ref_type = __debug_three_way_comp<_Comp>;
 #  else
diff --git a/libcxx/include/__config b/libcxx/include/__config
index 082c73e672c749..f2b60f9fad3645 100644
--- a/libcxx/include/__config
+++ b/libcxx/include/__config
@@ -283,9 +283,20 @@
 // - `_LIBCPP_ASSERT_COMPATIBLE_ALLOCATOR` -- checks any operations that exchange nodes between containers to make sure
 //   the containers have compatible allocators.
 //
+// - `_LIBCPP_ASSERT_ARGUMENT_WITHIN_DOMAIN` -- checks that the given argument is within the domain of valid arguments
+//   for the function. Violating this typically produces an incorrect result (e.g. the clamp algorithm returns the
+//   original value without clamping it due to incorrect functors) or puts an object into an invalid state (e.g.
+//   a string view where only a subset of elements is possible to access). This doesn't cause an immediate issue within
+//   the library but is always a logic bug and is likely to cause problems within user code.
+//   This is somewhat of a catch-all (or fallback) category -- it covers errors triggered by user input that don't have
+//   a more specific category defined (which is always preferable when available).
+//
 // - `_LIBCPP_ASSERT_PEDANTIC` -- checks prerequisites which are imposed by the Standard, but violating which happens to
 //   be benign in our implementation.
 //
+// - `_LIBCPP_ASSERT_INTRUSIVE` -- for assertions that perform intrusive and typically very expensive validations of
+//   user input or function results.
+//
 // - `_LIBCPP_ASSERT_INTERNAL` -- checks that internal invariants of the library hold. These assertions don't depend on
 //   user input.
 //
@@ -327,8 +338,10 @@ _LIBCPP_HARDENING_MODE_DEBUG
 // Overlapping ranges will make algorithms produce incorrect results but don't directly lead to a security
 // vulnerability.
 #    define _LIBCPP_ASSERT_NON_OVERLAPPING_RANGES(expression, message)   _LIBCPP_ASSUME(expression)
+#    define _LIBCPP_ASSERT_ARGUMENT_WITHIN_DOMAIN(expression, message)   _LIBCPP_ASSUME(expression)
 #    define _LIBCPP_ASSERT_COMPATIBLE_ALLOCATOR(expression, message)     _LIBCPP_ASSUME(expression)
 #    define _LIBCPP_ASSERT_PEDANTIC(expression, message)                 _LIBCPP_ASSUME(expression)
+#    define _LIBCPP_ASSERT_INTRUSIVE(expression, message)                _LIBCPP_ASSUME(expression)
 #    define _LIBCPP_ASSERT_INTERNAL(expression, message)                 _LIBCPP_ASSUME(expression)
 #    define _LIBCPP_ASSERT_UNCATEGORIZED(expression, message)            _LIBCPP_ASSUME(expression)
 
@@ -341,23 +354,31 @@ _LIBCPP_HARDENING_MODE_DEBUG
 #    define _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(expression, message)     _LIBCPP_ASSERT(expression, message)
 #    define _LIBCPP_ASSERT_NON_NULL(expression, message)                 _LIBCPP_ASSERT(expression, message)
 #    define _LIBCPP_ASSERT_NON_OVERLAPPING_RANGES(expression, message)   _LIBCPP_ASSERT(expression, message)
+#    define _LIBCPP_ASSERT_ARGUMENT_WITHIN_DOMAIN(expression, message)   _LIBCPP_ASSERT(expression, message)
 #    define _LIBCPP_ASSERT_COMPATIBLE_ALLOCATOR(expression, message)     _LIBCPP_ASSERT(expression, message)
 #    define _LIBCPP_ASSERT_UNCATEGORIZED(expression, message)            _LIBCPP_ASSERT(expression, message)
 #    define _LIBCPP_ASSERT_PEDANTIC(expression, message)                 _LIBCPP_ASSERT(expression, message)
 // Disabled checks.
+#    define _LIBCPP_ASSERT_INTRUSIVE(expression, message)                _LIBCPP_ASSUME(expression)
 #    define _LIBCPP_ASSERT_INTERNAL(expression, message)                 _LIBCPP_ASSUME(expression)
 
 // Debug hardening mode checks.
 
 #  elif _LIBCPP_HARDENING_MODE == _LIBCPP_HARDENING_MODE_DEBUG
 
+#ifndef _LIBCPP_DEBUG_STRICT_WEAK_ORDERING_CHECK
+#define _LIBCPP_DEBUG_STRICT_WEAK_ORDERING_CHECK
+#endif
+
 // All checks enabled.
 #    define _LIBCPP_ASSERT_VALID_INPUT_RANGE(expression, message)         _LIBCPP_ASSERT(expression, message)
 #    define _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(expression, message)      _LIBCPP_ASSERT(expression, message)
 #    define _LIBCPP_ASSERT_NON_NULL(expression, message)                  _LIBCPP_ASSERT(expression, message)
 #    define _LIBCPP_ASSERT_NON_OVERLAPPING_RANGES(expression, message)    _LIBCPP_ASSERT(expression, message)
+#    define _LIBCPP_ASSERT_ARGUMENT_WITHIN_DOMAIN(expression, message)    _LIBCPP_ASSERT(expression, message)
 #    define _LIBCPP_ASSERT_COMPATIBLE_ALLOCATOR(expression, message)      _LIBCPP_ASSERT(expression, message)
 #    define _LIBCPP_ASSERT_PEDANTIC(expression, message)                  _LIBCPP_ASSERT(expression, message)
+#    define _LIBCPP_ASSERT_INTRUSIVE(expression, message)                 _LIBCPP_ASSERT(expression, message)
 #    define _LIBCPP_ASSERT_INTERNAL(expression, message)                  _LIBCPP_ASSERT(expression, message)
 #    define _LIBCPP_ASSERT_UNCATEGORIZED(expression, message)             _LIBCPP_ASSERT(expression, message)
 
@@ -370,8 +391,10 @@ _LIBCPP_HARDENING_MODE_DEBUG
 #    define _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(expression, message)      _LIBCPP_ASSUME(expression)
 #    define _LIBCPP_ASSERT_NON_NULL(expression, message)                  _LIBCPP_ASSUME(expression)
 #    define _LIBCPP_ASSERT_NON_OVERLAPPING_RANGES(expression, message)    _LIBCPP_ASSUME(expression)
+#    define _LIBCPP_ASSERT_ARGUMENT_WITHIN_DOMAIN(expression, message)    _LIBCPP_ASSUME(expression)
 #    define _LIBCPP_ASSERT_COMPATIBLE_ALLOCATOR(expression, message)      _LIBCPP_ASSUME(expression)
 #    define _LIBCPP_ASSERT_PEDANTIC(expression, message)                  _LIBCPP_ASSUME(expression)
+#    define _LIBCPP_ASSERT_INTRUSIVE(expression, message)                 _LIBCPP_ASSUME(expression)
 #    define _LIBCPP_ASSERT_INTERNAL(expression, message)                  _LIBCPP_ASSUME(expression)
 #    define _LIBCPP_ASSERT_UNCATEGORIZED(expression, message)             _LIBCPP_ASSUME(expression)
 
diff --git a/libcxx/include/__debug_utils/strict_weak_ordering_check.h b/libcxx/include/__debug_utils/strict_weak_ordering_check.h
index 99200ad65eac92..7082019d0c61ae 100644
--- a/libcxx/include/__debug_utils/strict_weak_ordering_check.h
+++ b/libcxx/include/__debug_utils/strict_weak_ordering_check.h
@@ -31,7 +31,7 @@ __check_strict_weak_ordering_sorted(_RandomAccessIterator __first, _RandomAccess
   using _Comp_ref = __comp_ref_type<_Comp>;
   if (!__libcpp_is_constant_evaluated()) {
     // Check if the range is actually sorted.
-    _LIBCPP_ASSERT_UNCATEGORIZED(
+    _LIBCPP_ASSERT_INTRUSIVE(
         (std::is_sorted<_RandomAccessIterator, _Comp_ref>(__first, __last, _Comp_ref(__comp))),
         "The range is not sorted after the sort, your comparator is not a valid strict-weak ordering");
     // Limit the number of elements we need to check.
@@ -46,18 +46,18 @@ __check_strict_weak_ordering_sorted(_RandomAccessIterator __first, _RandomAccess
       // Check that the elements from __p to __q are equal between each other.
       for (__diff_t __b = __p; __b < __q; ++__b) {
         for (__diff_t __a = __p; __a <= __b; ++__a) {
-          _LIBCPP_ASSERT_UNCATEGORIZED(
+          _LIBCPP_ASSERT_INTRUSIVE(
               !__comp(*(__first + __a), *(__first + __b)), "Your comparator is not a valid strict-weak ordering");
-          _LIBCPP_ASSERT_UNCATEGORIZED(
+          _LIBCPP_ASSERT_INTRUSIVE(
               !__comp(*(__first + __b), *(__first + __a)), "Your comparator is not a valid strict-weak ordering");
         }
       }
       // Check that elements between __p and __q are less than between __q and __size.
       for (__diff_t __a = __p; __a < __q; ++__a) {
         for (__diff_t __b = __q; __b < __size; ++__b) {
-          _LIBCPP_ASSERT_UNCATEGORIZED(
+          _LIBCPP_ASSERT_INTRUSIVE(
               __comp(*(__first + __a), *(__first + __b)), "Your comparator is not a valid strict-weak ordering");
-          _LIBCPP_ASSERT_UNCATEGORIZED(
+          _LIBCPP_ASSERT_INTRUSIVE(
               !__comp(*(__first + __b), *(__first + __a)), "Your comparator is not a valid strict-weak ordering");
         }
       }

@var-const var-const added the hardening Issues related to the hardening effort label Jan 9, 2024
@var-const
Copy link
Member Author

Cc @danlark1

…ring

If a user passes a comparator that doesn't satisfy strict weak ordering
(see https://eel.is/c++draft/algorithms#alg.sorting.general) to
a sorting algorithm, this can produce an incorrect result or even lead
to an out-of-bounds access. Unfortunately, comprehensively validating
that a given comparator indeed satisfies the strict weak ordering
requirement is prohibitively expensive (see https://discourse.llvm.org/t/rfc-strict-weak-ordering-checks-in-the-debug-libc/70217).
As a result, we have three independent sets of checks, in order from
least to most expensive:

- assertions that catch out-of-bounds accesses within the algorithms'
  implementation. These are relatively cheap; however, they cannot catch
  the underlying cause and cannot prevent the case where an invalid
  comparator would result in an uncorrectly-sorted sequence without
  producing an OOB access;

- debug comparators that wrap a given comparator and on each comparison
  check that if `(a < b)`, then `!(b < a)`, where `<` stands for the
  user-provided comparator. This performs up to 2x number of comparisons
  but doesn't affect the algorithmic complexity. While this approach can
  find more issues, it is still a heuristic;

- a comprehensive check of the comparator that validates up to 100
  elements in the resulting sorted sequence (see the RFC above for
  details), imposing a significant performance overhead.

Accordingly, the first set of checks is enabled in the fast hardening
mode, the second in the extensive mode, and the third only in the debug
mode.

For the most expensive checks, introduce a new category
`_LIBCPP_ASSERT_INTRUSIVE`. This category is intended for expensive
checks that perform intrusive, extensive validation within the
implementation (either of the user input or of invariants or
postconditions of a function).

See https://reviews.llvm.org/D150264 for additional background.
@var-const var-const force-pushed the varconst/hardening-categorization-ordering branch from b265633 to f6a3ba6 Compare January 9, 2024 03:22
@@ -283,9 +283,20 @@
// - `_LIBCPP_ASSERT_COMPATIBLE_ALLOCATOR` -- checks any operations that exchange nodes between containers to make sure
// the containers have compatible allocators.
//
// - `_LIBCPP_ASSERT_ARGUMENT_WITHIN_DOMAIN` -- checks that the given argument is within the domain of valid arguments
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This category is also added by #77183 (I made these sibling patches to make managing the patches a little easier for me).

// - `_LIBCPP_ASSERT_PEDANTIC` -- checks prerequisites which are imposed by the Standard, but violating which happens to
// be benign in our implementation.
//
// - `_LIBCPP_ASSERT_INTRUSIVE` -- for assertions that perform intrusive and typically very expensive validations of
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think having a separate category for expensive assertions makes sense -- I'm a little hesitant to add a separate category for strict weak ordering specifically, thinking it might be too narrow (but open to be persuaded otherwise). Some alternative names:

  • expensive -- I think it's a reasonable name but "intrusive" seemed a little more specific to me (I expect these kinds of checks to be intrusive, although if that's not true, that would be an argument against "intrusive");
  • debug -- while I expect these to only be enabled in the debug mode, I think we should still aim to maintain some orthogonality between categories and modes;
  • extensive -- overlaps with the category name in a confusing manner;
  • paranoid -- there's definitely precedent for this term in the security field but perhaps I'd prefer a more neutral-sounding name.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not a huge fan of the word "intrusive", but I could live with it. However, I would suggest maybe piggy-backing on the notion of a "semantic constraint" or "semantic requirement" would be fruitful, since that's what this strict-weak-ordering checking is doing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Went with _LIBCPP_ASSERT_SEMANTIC_REQUIREMENT -- I think this captures the intention well and implies two important properties of the check (usually a heuristic and often expensive).

template <class _Comp>
// Pass the comparator by lvalue reference. Or in the extensive hardening mode and above, using a debugging wrapper that
// stores a reference.
#if _LIBCPP_HARDENING_MODE == _LIBCPP_HARDENING_MODE_EXTENSIVE || _LIBCPP_HARDENING_MODE == _LIBCPP_HARDENING_MODE_DEBUG
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unfortunate and breaks the elegant separation between assertion categories and hardening modes to some extent. However, even if the assertion is a no-op, I'm not sure we can allow the potential extra overhead of the comparator wrapper in the unchecked or fast modes (keeping in mind that not all builds are optimized).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per @danlark1 's comment, I suggest we start by keeping this in the debug mode only. This is also the status quo, so it's more conservative.

// - `_LIBCPP_ASSERT_PEDANTIC` -- checks prerequisites which are imposed by the Standard, but violating which happens to
// be benign in our implementation.
//
// - `_LIBCPP_ASSERT_INTRUSIVE` -- for assertions that perform intrusive and typically very expensive validations of
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not a huge fan of the word "intrusive", but I could live with it. However, I would suggest maybe piggy-backing on the notion of a "semantic constraint" or "semantic requirement" would be fruitful, since that's what this strict-weak-ordering checking is doing.

# if _LIBCPP_HARDENING_MODE == _LIBCPP_HARDENING_MODE_DEBUG
// Pass the comparator by lvalue reference. Or in the extensive hardening mode and above, using a debugging wrapper that
// stores a reference.
# if _LIBCPP_HARDENING_MODE == _LIBCPP_HARDENING_MODE_EXTENSIVE || \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@danlark1 @var-const I am not certain about this decision:

  • Extensive mode -> turn on __debug_less
  • Debug mode -> also turn on the 100-or-less algorithm checking

I feel like real-world experience might help us tell whether this is the right split. @danlark1 do you have any insight? Other options would be:

  • Extensive mode -> turn on the 100-or-less algorithm checking
  • Debug mode -> also turn on __debug_less

or

  • Extensive mode -> nothing
  • Debug mode -> turn on both

For example, I know that Chromium is using the Extensive mode (probably still via the Safe mode enablement). @nico Would you expect Chrome in production to use __debug_less, the 100-or-less algorithm, both or neither?

Copy link
Contributor

@danlark1 danlark1 Jan 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer enabling both in debug mode in order to remove the macro

The second best option is to turn on the 100 elements checks for a more extensive than debug mode. Because if __debug_less asserts, the 100 element check will almost certainly assert.

Edit(ldionne): reformulated for clarity, this looks like this was typed on a phone

@ldionne ldionne added this to the LLVM 18.0.X Release milestone Jan 16, 2024
@var-const
Copy link
Member Author

Pinging @nico and @danlark1 again -- we would strongly appreciate your input on this patch if possible.

@vitalybuka vitalybuka requested a review from nico January 21, 2024 00:34
@vitalybuka
Copy link
Collaborator

I suspect we will have to wait ~1mo for @danlark1 reply.
Maybe @nilayvaish can comment?

Copy link

github-actions bot commented Jan 22, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@@ -360,11 +365,13 @@ _LIBCPP_HARDENING_MODE_DEBUG
# define _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(expression, message) _LIBCPP_ASSERT(expression, message)
# define _LIBCPP_ASSERT_NON_NULL(expression, message) _LIBCPP_ASSERT(expression, message)
# define _LIBCPP_ASSERT_NON_OVERLAPPING_RANGES(expression, message) _LIBCPP_ASSERT(expression, message)
# define _LIBCPP_ASSERT_ARGUMENT_WITHIN_DOMAIN(expression, message) _LIBCPP_ASSERT(expression, message)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a leftover.

@@ -52,204 +51,234 @@

class ComparisonResults {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to take the parts of this test that would reproduce an OOB access inside std::sort & friends to a new test file. And that one should be enabled whenever the valid-element-access checks are enabled.

And then this test becomes a test for the debug mode, so we should be able to remove XFAIL: libcpp-hardening-mode=debug.

// algorithm actually runs and goes out of bounds, so the test will terminate before the tested assertions are
// triggered.
// UNSUPPORTED: libcpp-hardening-mode=none, libcpp-hardening-mode=debug
// XFAIL: libcpp-hardening-mode=debug && availability-verbose_abort-missing
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This XFAIL is superfluous since the test is UNSUPPORTED in the debug mode.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I deliberately wrote it that way with the idea that it's easier to just blindly annotate all hardening tests; it's also a little more future-proof if a check gets reclassified. I plan to add a Lit feature to cover all the check_assertion.h-related feature together, which will hopefully make this moot. Unless you feel strongly about it, I'd rather leave this check as is.

std::vector<std::size_t*> copy = fixture.create_elements();
TEST_LIBCPP_ASSERT_FAILURE(
std::sort(copy.begin(), copy.end(), fixture.checked_predicate()),
"Would read out of bounds, does your comparator satisfy the strict-weak ordering requirement?");
Copy link
Member

@ldionne ldionne Jan 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should figure out why this doesn't fail earlier with the strict weak ordering checks, but this can be done in a follow-up. (tracked by #79080)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for filing the issue! Wrote a comment -- briefly, this seems to be expected given our current state, but our current state is probably not ideal. What I'm trying to say, this is a little unexpected but nothing seems horribly broken.

Copy link
Member

@ldionne ldionne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM once the CI is green. I like that we split up the OOB checks again, I think it was always a mistake to merge them with the strict-weak-ordering checks since it led to the test essentially never being enabled. This is much better.

@var-const
Copy link
Member Author

I will go ahead and merge while waiting for the backlogged Windows and FreeBSD jobs to finish (everything else is green).

@var-const var-const merged commit 8938bc0 into llvm:main Jan 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hardening Issues related to the hardening effort libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

_LIBCPP_DEBUG_STRICT_WEAK_ORDERING_CHECK should be made part of some hardening modes
5 participants