Skip to content

[libc++] Add tombstone traits and use them in optional #98498

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

philnik777
Copy link
Contributor

@philnik777 philnik777 commented Jul 11, 2024

This adds tombstone traits, which describe how (and whether) the bit representation of a given type can be checked to know whether there is a valid object of that type in a memory location. This can be used to optimize the size of sum types like optional and variant. This patch only optimizes optional to keep it as small as possible.


_LIBCPP_HIDE_FROM_ABI constexpr bool __is_engaged() const noexcept {
if (__libcpp_is_constant_evaluated())
return !__builtin_constant_p(__tombstone_.__is_disengaged_ == _TombstoneLayout::__disengaged_value_);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should get the compiler folks to actually stamp this kind of usage of __builtin_constant_p, just to make sure we don't get ourselves in trouble.

static constexpr size_t __is_disengaged_offset_ = 0;
};

struct __tombstone_pointer_layout {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a fan of defining __tombstone_pointer_layout since there are unwritten assumptions you're making in that type (namely the alignment of the pointed-to type).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope the new name makes it better. I'd rather avoid conditionally defining __is_disengaged_offset_ everywhere.

@philnik777 philnik777 changed the title [libc++] Add tombstone traits to use in optional, variant [libc++] Add tombstone traits to use in optional Nov 7, 2024
@philnik777 philnik777 changed the title [libc++] Add tombstone traits to use in optional [libc++] Add tombstone traits and use them in optional Nov 7, 2024
Copy link

github-actions bot commented Nov 7, 2024

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff 811186764d1add4d83972db3ad0d2e7c96bb15a7 0ec30761300b6ef974deea804b217cca5b506bf7 --extensions ,cpp,h -- libcxx/include/__memory/tombstone_traits.h libcxx/test/std/utilities/optional/tombstone_types.pass.cpp libcxx/include/__configuration/abi.h libcxx/include/__functional/reference_wrapper.h libcxx/include/__locale libcxx/include/__memory/shared_ptr.h libcxx/include/__memory/unique_ptr.h libcxx/include/__type_traits/enable_if.h libcxx/include/__type_traits/is_fundamental.h libcxx/include/__utility/pair.h libcxx/include/__vector/vector.h libcxx/include/optional libcxx/include/string libcxx/include/string_view libcxx/test/libcxx/utilities/optional/optional.object/optional_size.pass.cpp
View the diff from clang-format here.
diff --git a/libcxx/include/string b/libcxx/include/string
index 59a35877ef..681409b7bd 100644
--- a/libcxx/include/string
+++ b/libcxx/include/string
@@ -2336,7 +2336,7 @@ basic_string(from_range_t, _Range&&, _Allocator = _Allocator())
 template <class _CharT, class _Traits, class _Allocator>
 struct __tombstone_traits<basic_string<_CharT, _Traits, _Allocator>> {
   static constexpr uint8_t __disengaged_value_ = 65;
-#ifdef _LIBCPP_ABI_ALTERNATE_STRING_LAYOUT
+#  ifdef _LIBCPP_ABI_ALTERNATE_STRING_LAYOUT
   static constexpr size_t __is_disengaged_offset_ = -1;
   static_assert(__is_disengaged_offset_ == 0,
                 "The disengaged offset for the alternate string layout isn't correctly set");

@AaronBallman AaronBallman self-requested a review November 7, 2024 16:02
@philnik777 philnik777 force-pushed the tombstone_traits branch 2 times, most recently from 81cca50 to 2580273 Compare November 8, 2024 11:46
@philnik777 philnik777 marked this pull request as ready for review November 8, 2024 12:25
@philnik777 philnik777 requested a review from a team as a code owner November 8, 2024 12:25
@llvmbot llvmbot added the libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. label Nov 8, 2024
@llvmbot
Copy link
Member

llvmbot commented Nov 8, 2024

@llvm/pr-subscribers-libcxx

Author: Nikolas Klauser (philnik777)

Changes

This adds tombstone traits, which describe how (and whether) the bit representation of a given type can be checked to know whether there is a valid object of that type in a memory location. This can be used to optimize the size of sum types like optional and variant. This patch only optimizes optional to keep it as small as possible.


Patch is 33.42 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/98498.diff

16 Files Affected:

  • (modified) libcxx/include/CMakeLists.txt (+1)
  • (modified) libcxx/include/__configuration/abi.h (+5)
  • (modified) libcxx/include/__functional/reference_wrapper.h (+10-1)
  • (modified) libcxx/include/__locale (+6)
  • (modified) libcxx/include/__memory/shared_ptr.h (+17-2)
  • (added) libcxx/include/__memory/tombstone_traits.h (+277)
  • (modified) libcxx/include/__memory/unique_ptr.h (+11)
  • (modified) libcxx/include/__type_traits/enable_if.h (+3)
  • (modified) libcxx/include/__utility/pair.h (+19-1)
  • (modified) libcxx/include/__vector/vector.h (+8)
  • (modified) libcxx/include/module.modulemap (+4)
  • (modified) libcxx/include/optional (+65-3)
  • (modified) libcxx/include/string (+14)
  • (modified) libcxx/include/string_view (+8)
  • (modified) libcxx/test/libcxx/utilities/optional/optional.object/optional_size.pass.cpp (+29)
  • (added) libcxx/test/std/utilities/optional/tombstone_types.pass.cpp (+68)
diff --git a/libcxx/include/CMakeLists.txt b/libcxx/include/CMakeLists.txt
index ae2e8bcb32aaa4..aca8ea182ec511 100644
--- a/libcxx/include/CMakeLists.txt
+++ b/libcxx/include/CMakeLists.txt
@@ -560,6 +560,7 @@ set(files
   __memory/swap_allocator.h
   __memory/temp_value.h
   __memory/temporary_buffer.h
+  __memory/tombstone_traits.h
   __memory/uninitialized_algorithms.h
   __memory/unique_ptr.h
   __memory/unique_temporary_buffer.h
diff --git a/libcxx/include/__configuration/abi.h b/libcxx/include/__configuration/abi.h
index c6ef6fdcdf96e6..124e349e09024b 100644
--- a/libcxx/include/__configuration/abi.h
+++ b/libcxx/include/__configuration/abi.h
@@ -124,6 +124,8 @@
 // This setting disables the addition of such artificial padding, leading to a more optimal
 // representation for several types.
 #  define _LIBCPP_ABI_NO_COMPRESSED_PAIR_PADDING
+// Use __tombstone_traits to optimize the memory layout of std::optional.
+#  define _LIBCPP_ABI_OPTIONAL_USE_TOMBSTONE_TRAITS
 #elif _LIBCPP_ABI_VERSION == 1
 #  if !(defined(_LIBCPP_OBJECT_FORMAT_COFF) || defined(_LIBCPP_OBJECT_FORMAT_XCOFF))
 // Enable compiling copies of now inline methods into the dylib to support
@@ -141,6 +143,9 @@
 #  if defined(__FreeBSD__) && __FreeBSD__ < 14
 #    define _LIBCPP_DEPRECATED_ABI_DISABLE_PAIR_TRIVIAL_COPY_CTOR
 #  endif
+
+// TODO: This shouldn't be in the final commit - this just to test the changes across all the different configurations
+#  define _LIBCPP_ABI_OPTIONAL_USE_TOMBSTONE_TRAITS
 #endif
 
 // We had some bugs where we use [[no_unique_address]] together with construct_at,
diff --git a/libcxx/include/__functional/reference_wrapper.h b/libcxx/include/__functional/reference_wrapper.h
index a4a66a50cf84ca..c150bbc7736db0 100644
--- a/libcxx/include/__functional/reference_wrapper.h
+++ b/libcxx/include/__functional/reference_wrapper.h
@@ -13,8 +13,10 @@
 #include <__compare/synth_three_way.h>
 #include <__concepts/boolean_testable.h>
 #include <__config>
+#include <__cstddef/size_t.h>
 #include <__functional/weak_result_type.h>
 #include <__memory/addressof.h>
+#include <__memory/tombstone_traits.h>
 #include <__type_traits/enable_if.h>
 #include <__type_traits/invoke.h>
 #include <__type_traits/is_const.h>
@@ -22,6 +24,7 @@
 #include <__type_traits/void_t.h>
 #include <__utility/declval.h>
 #include <__utility/forward.h>
+#include <cstdint>
 
 #if !defined(_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER)
 #  pragma GCC system_header
@@ -120,7 +123,13 @@ class _LIBCPP_TEMPLATE_VIS reference_wrapper : public __weak_result_type<_Tp> {
 #if _LIBCPP_STD_VER >= 17
 template <class _Tp>
 reference_wrapper(_Tp&) -> reference_wrapper<_Tp>;
-#endif
+
+template <class _Tp>
+struct __tombstone_traits<reference_wrapper<_Tp>> {
+  static constexpr uintptr_t __disengaged_value_  = 0;
+  static constexpr size_t __is_disengaged_offset_ = 0;
+};
+#endif // _LIBCPP_STD_VER >= 17
 
 template <class _Tp>
 inline _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 reference_wrapper<_Tp> ref(_Tp& __t) _NOEXCEPT {
diff --git a/libcxx/include/__locale b/libcxx/include/__locale
index b07b9f3329f42c..3a40dc24ad1116 100644
--- a/libcxx/include/__locale
+++ b/libcxx/include/__locale
@@ -13,6 +13,7 @@
 #include <__config>
 #include <__locale_dir/locale_base_api.h>
 #include <__memory/shared_ptr.h> // __shared_count
+#include <__memory/tombstone_traits.h>
 #include <__mutex/once_flag.h>
 #include <__type_traits/make_unsigned.h>
 #include <__utility/no_destroy.h>
@@ -114,6 +115,11 @@ private:
   friend const _Facet& use_facet(const locale&);
 };
 
+#if _LIBCPP_STD_VER >= 17
+template <>
+struct __tombstone_traits<locale> : __tombstone_traits_assume_aligned_pointer {};
+#endif
+
 class _LIBCPP_EXPORTED_FROM_ABI locale::facet : public __shared_count {
 protected:
   _LIBCPP_HIDE_FROM_ABI explicit facet(size_t __refs = 0) : __shared_count(static_cast<long>(__refs) - 1) {}
diff --git a/libcxx/include/__memory/shared_ptr.h b/libcxx/include/__memory/shared_ptr.h
index 5a84c2ce9bfe17..1e88e38b6804c9 100644
--- a/libcxx/include/__memory/shared_ptr.h
+++ b/libcxx/include/__memory/shared_ptr.h
@@ -30,6 +30,7 @@
 #include <__memory/compressed_pair.h>
 #include <__memory/construct_at.h>
 #include <__memory/pointer_traits.h>
+#include <__memory/tombstone_traits.h>
 #include <__memory/uninitialized_algorithms.h>
 #include <__memory/unique_ptr.h>
 #include <__type_traits/add_lvalue_reference.h>
@@ -831,7 +832,14 @@ template <class _Tp>
 shared_ptr(weak_ptr<_Tp>) -> shared_ptr<_Tp>;
 template <class _Tp, class _Dp>
 shared_ptr(unique_ptr<_Tp, _Dp>) -> shared_ptr<_Tp>;
-#endif
+
+template <class _Tp>
+struct __tombstone_traits<shared_ptr<_Tp>> {
+  static constexpr auto __disengaged_value_ = __tombstone_traits_assume_aligned_pointer::__disengaged_value_;
+  static constexpr size_t __is_disengaged_offset_ =
+      sizeof(void*) + __tombstone_traits_assume_aligned_pointer::__is_disengaged_offset_;
+};
+#endif // _LIBCPP_STD_VER >= 17
 
 //
 // std::allocate_shared and std::make_shared
@@ -1381,7 +1389,14 @@ class _LIBCPP_SHARED_PTR_TRIVIAL_ABI _LIBCPP_TEMPLATE_VIS weak_ptr {
 #if _LIBCPP_STD_VER >= 17
 template <class _Tp>
 weak_ptr(shared_ptr<_Tp>) -> weak_ptr<_Tp>;
-#endif
+
+template <class _Tp>
+struct __tombstone_traits<weak_ptr<_Tp>> {
+  static constexpr auto __disengaged_value_ = __tombstone_traits_assume_aligned_pointer::__disengaged_value_;
+  static constexpr size_t __is_disengaged_offset_ =
+      sizeof(void*) + __tombstone_traits_assume_aligned_pointer::__is_disengaged_offset_;
+};
+#endif // _LIBCPP_STD_VER >= 17
 
 template <class _Tp>
 inline _LIBCPP_CONSTEXPR weak_ptr<_Tp>::weak_ptr() _NOEXCEPT : __ptr_(nullptr), __cntrl_(nullptr) {}
diff --git a/libcxx/include/__memory/tombstone_traits.h b/libcxx/include/__memory/tombstone_traits.h
new file mode 100644
index 00000000000000..4cb05ab3612ee9
--- /dev/null
+++ b/libcxx/include/__memory/tombstone_traits.h
@@ -0,0 +1,277 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef _LIBCPP___TYPE_TRAITS_DISENGAGED_TRAITS_H
+#define _LIBCPP___TYPE_TRAITS_DISENGAGED_TRAITS_H
+
+#include <__assert>
+#include <__config>
+#include <__cstddef/size_t.h>
+#include <__memory/construct_at.h>
+#include <__type_traits/datasizeof.h>
+#include <__type_traits/enable_if.h>
+#include <__type_traits/is_constant_evaluated.h>
+#include <__type_traits/is_fundamental.h>
+#include <__type_traits/is_integral.h>
+#include <__type_traits/is_trivial.h>
+#include <__type_traits/is_trivially_destructible.h>
+#include <__type_traits/remove_cv.h>
+#include <__type_traits/void_t.h>
+#include <__utility/forward.h>
+#include <__utility/move.h>
+#include <__utility/piecewise_construct.h>
+#include <cstdint>
+
+#if !defined(_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER)
+#  pragma GCC system_header
+#endif
+
+#if _LIBCPP_STD_VER >= 17
+
+_LIBCPP_BEGIN_NAMESPACE_STD
+
+template <class>
+struct __tombstone_traits;
+
+// bools have always exactly one bit set. If there is more than one set it's disengaged.
+template <>
+struct __tombstone_traits<bool> {
+  static constexpr uint8_t __disengaged_value_    = 3;
+  static constexpr size_t __is_disengaged_offset_ = 0;
+};
+
+struct __tombstone_traits_assume_aligned_pointer {
+  static constexpr uint8_t __disengaged_value_ = 1;
+#ifdef _LIBCPP_LITTLE_ENDIAN
+  static constexpr size_t __is_disengaged_offset_ = 0;
+#else
+  static constexpr size_t __is_disengaged_offset_ = sizeof(void*) - 1;
+#endif
+};
+
+// TODO: Look into
+// - filesystem::directory_iterator
+// - vector<T> with alignof(T) == 1
+// - string_view (basic_string_view<T> works with alignof(T) >= 2)
+
+// This is constrained on fundamental types because we might not always know the alignment of a user-defined type.
+// For example, in one TU there may only be a forward declaration and in another there is already the definition
+// available. If we made this optimization conditional on the completeness of the type this would result in a non-benign
+// ODR violation.
+template <class _Tp>
+struct __tombstone_traits<__enable_specialization_if<is_fundamental_v<_Tp> && alignof(_Tp) >= 2, _Tp*>>
+    : __tombstone_traits_assume_aligned_pointer {};
+
+template <class _Tp>
+struct __tombstone_traits<_Tp**> : __tombstone_traits_assume_aligned_pointer {
+  static_assert(alignof(_Tp*) >= 2, "alignment of a pointer isn't at least 2!?");
+};
+
+inline constexpr struct __init_engaged_t {
+} __init_engaged;
+inline constexpr struct __init_disengaged_t {
+} __init_disengaged;
+
+template <class _Tp, class _Payload, bool = __tombstone_traits<_Tp>::__is_disengaged_offset_ == 0>
+struct __tombstone_is_disengaged {
+  using _TombstoneLayout = __tombstone_traits<_Tp>;
+  using _IsDisengagedT   = remove_cv_t<decltype(_TombstoneLayout::__disengaged_value_)>;
+
+  char __padding_[_TombstoneLayout::__is_disengaged_offset_];
+  _IsDisengagedT __is_disengaged_;
+};
+
+template <class _Tp, class _Payload>
+struct __tombstone_is_disengaged<_Tp, _Payload, true> {
+  using _TombstoneLayout = __tombstone_traits<_Tp>;
+  using _IsDisengagedT   = remove_cv_t<decltype(_TombstoneLayout::__disengaged_value_)>;
+
+  _IsDisengagedT __is_disengaged_;
+};
+
+template <class _Tp, class _Payload, bool = __tombstone_traits<_Tp>::__is_disengaged_offset_ == 0>
+struct __tombstone_data {
+  using _TombstoneLayout = __tombstone_traits<_Tp>;
+  using _IsDisengagedT   = remove_cv_t<decltype(_TombstoneLayout::__disengaged_value_)>;
+
+  static_assert(is_trivial<_IsDisengagedT>::value, "disengaged type has to be trivial!");
+  static_assert(_TombstoneLayout::__is_disengaged_offset_ >= __datasizeof_v<_Payload>);
+
+  _LIBCPP_NO_UNIQUE_ADDRESS _Payload __payload_;
+  char __padding_[_TombstoneLayout::__is_disengaged_offset_ - __datasizeof_v<_Payload>];
+  _IsDisengagedT __is_disengaged_;
+
+  template <class... _Args>
+  _LIBCPP_HIDE_FROM_ABI constexpr __tombstone_data(_Args&&... __args)
+      : __payload_(std::forward<_Args>(__args)...), __is_disengaged_(_TombstoneLayout::__disengaged_value_) {}
+};
+
+template <class _Tp, class _Payload>
+struct __tombstone_data<_Tp, _Payload, true> {
+  using _TombstoneLayout = __tombstone_traits<_Tp>;
+  using _IsDisengagedT   = remove_cv_t<decltype(_TombstoneLayout::__disengaged_value_)>;
+
+  _IsDisengagedT __is_disengaged_;
+  _LIBCPP_NO_UNIQUE_ADDRESS _Payload __payload_;
+
+  template <class... _Args>
+  _LIBCPP_HIDE_FROM_ABI constexpr __tombstone_data(_Args&&... __args)
+      : __is_disengaged_(_TombstoneLayout::__disengaged_value_), __payload_(std::forward<_Args>(__args)...) {}
+};
+
+template <class _Tp, class _Payload>
+struct __tombstoned_value final {
+  using _TombstoneLayout = __tombstone_traits<_Tp>;
+  using _TombstoneData   = __tombstone_data<_Tp, _Payload>;
+
+  template <bool = is_trivially_destructible_v<_Tp> && is_trivially_destructible_v<_Payload>>
+  union _MaybeTombstone {
+    _Tp __value_;
+    _TombstoneData __tombstone_;
+
+    template <class... _Args>
+    constexpr _MaybeTombstone(__init_disengaged_t, _Args&&... __args) : __tombstone_(std::forward<_Args>(__args)...) {}
+
+    template <class... _Args>
+    constexpr _MaybeTombstone(__init_engaged_t, _Args&&... __args) : __value_(std::forward<_Args>(__args)...) {}
+
+    _MaybeTombstone(const _MaybeTombstone&)            = default;
+    _MaybeTombstone(_MaybeTombstone&&)                 = default;
+    _MaybeTombstone& operator=(const _MaybeTombstone&) = default;
+    _MaybeTombstone& operator=(_MaybeTombstone&&)      = default;
+
+    _LIBCPP_HIDE_FROM_ABI constexpr bool __is_engaged() const noexcept {
+      if (__libcpp_is_constant_evaluated())
+        return !__builtin_constant_p(__tombstone_.__is_disengaged_ == _TombstoneLayout::__disengaged_value_);
+      __tombstone_is_disengaged<_Tp, _Payload> __is_disengaged;
+      static_assert(sizeof(__tombstone_is_disengaged<_Tp, _Payload>) <= sizeof(_MaybeTombstone));
+      __builtin_memcpy(&__is_disengaged, this, sizeof(__tombstone_is_disengaged<_Tp, _Payload>));
+
+      return __is_disengaged.__is_disengaged_ != _TombstoneLayout::__disengaged_value_;
+    }
+
+    _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 ~_MaybeTombstone() {
+      if (__is_engaged()) {
+        std::destroy_at(&__value_);
+      } else {
+        std::destroy_at(&__tombstone_);
+      }
+    }
+  };
+
+  template <>
+  union _MaybeTombstone<true> {
+    _Tp __value_;
+    _TombstoneData __tombstone_;
+
+    template <class... _Args>
+    constexpr _MaybeTombstone(__init_disengaged_t, _Args&&... __args) : __tombstone_(std::forward<_Args>(__args)...) {}
+
+    template <class... _Args>
+    constexpr _MaybeTombstone(__init_engaged_t, _Args&&... __args) : __value_(std::forward<_Args>(__args)...) {}
+
+    _MaybeTombstone(const _MaybeTombstone&)            = default;
+    _MaybeTombstone(_MaybeTombstone&&)                 = default;
+    _MaybeTombstone& operator=(const _MaybeTombstone&) = default;
+    _MaybeTombstone& operator=(_MaybeTombstone&&)      = default;
+
+    _LIBCPP_HIDE_FROM_ABI constexpr bool __is_engaged() const noexcept {
+      if (__libcpp_is_constant_evaluated())
+        return !__builtin_constant_p(__tombstone_.__is_disengaged_ == _TombstoneLayout::__disengaged_value_);
+      __tombstone_is_disengaged<_Tp, _Payload> __is_disengaged;
+      static_assert(sizeof(__tombstone_is_disengaged<_Tp, _Payload>) <= sizeof(_MaybeTombstone));
+      __builtin_memcpy(&__is_disengaged, this, sizeof(__tombstone_is_disengaged<_Tp, _Payload>));
+
+      return __is_disengaged.__is_disengaged_ != _TombstoneLayout::__disengaged_value_;
+    }
+
+    _LIBCPP_CONSTEXPR_SINCE_CXX20 ~_MaybeTombstone() = default;
+  };
+
+  _MaybeTombstone<> __data_;
+
+  static_assert(sizeof(__tombstone_data<_Tp, _Payload>) <= sizeof(_Tp));
+  static_assert(is_integral_v<decltype(_TombstoneLayout::__disengaged_value_)>);
+  static_assert(__builtin_offsetof(_TombstoneData, __is_disengaged_) == _TombstoneLayout::__is_disengaged_offset_);
+
+  template <class... _Args>
+  _LIBCPP_HIDE_FROM_ABI constexpr __tombstoned_value(__init_disengaged_t, _Args&&... __args)
+      : __data_(__init_disengaged, std::forward<_Args>(__args)...) {}
+
+  template <class... _Args>
+  _LIBCPP_HIDE_FROM_ABI constexpr __tombstoned_value(__init_engaged_t, _Args&&... __args)
+      : __data_(__init_engaged, std::forward<_Args>(__args)...) {}
+
+  _LIBCPP_HIDE_FROM_ABI constexpr bool __is_engaged() const noexcept { return __data_.__is_engaged(); }
+
+  _LIBCPP_HIDE_FROM_ABI constexpr auto&& __get_value() & noexcept {
+    _LIBCPP_ASSERT_INTERNAL(__self.__is_engaged(), "Trying to get the value of a disenagaged tombstoned value");
+    return __data_.__value_;
+  }
+
+  _LIBCPP_HIDE_FROM_ABI constexpr auto&& __get_value() const & noexcept {
+    _LIBCPP_ASSERT_INTERNAL(__self.__is_engaged(), "Trying to get the value of a disenagaged tombstoned value");
+    return __data_.__value_;
+  }
+
+  _LIBCPP_HIDE_FROM_ABI constexpr auto&& __get_value() && noexcept {
+    _LIBCPP_ASSERT_INTERNAL(__self.__is_engaged(), "Trying to get the value of a disenagaged tombstoned value");
+    return std::move(__data_.__value_);
+  }
+
+  _LIBCPP_HIDE_FROM_ABI constexpr auto&& __get_value() const && noexcept {
+    _LIBCPP_ASSERT_INTERNAL(__self.__is_engaged(), "Trying to get the value of a disenagaged tombstoned value");
+    return std::move(__data_.__value_);
+  }
+
+  _LIBCPP_HIDE_FROM_ABI constexpr auto&& __get_payload() & noexcept {
+    _LIBCPP_ASSERT_INTERNAL(!__self.__is_engaged(), "Trying to get the payload of an enagaged tombstoned value");
+    return __data_.__tombstone_.__payload_;
+  }
+
+  _LIBCPP_HIDE_FROM_ABI constexpr auto&& __get_payload() const & noexcept {
+    _LIBCPP_ASSERT_INTERNAL(!__self.__is_engaged(), "Trying to get the payload of an enagaged tombstoned value");
+    return __data_.__tombstone_.__payload_;
+  }
+
+  _LIBCPP_HIDE_FROM_ABI constexpr auto&& __get_payload() && noexcept {
+    _LIBCPP_ASSERT_INTERNAL(!__self.__is_engaged(), "Trying to get the payload of an enagaged tombstoned value");
+    return std::move(__data_.__tombstone_.__payload_);
+  }
+
+  _LIBCPP_HIDE_FROM_ABI constexpr auto&& __get_payload() const && noexcept {
+    _LIBCPP_ASSERT_INTERNAL(!__self.__is_engaged(), "Trying to get the payload of an enagaged tombstoned value");
+    return std::move(__data_.__tombstone_.__payload_);
+  }
+
+  template <class... _Args>
+  _LIBCPP_HIDE_FROM_ABI constexpr void __engage(piecewise_construct_t, _Args&&... __args) {
+    _LIBCPP_ASSERT_INTERNAL(!__self.__is_engaged(), "Trying to enage a already engaged tombstoned value");
+    std::destroy_at(&__data_.__tombstone_);
+    std::__construct_at(&__data_.__value_, std::forward<_Args>(__args)...);
+  }
+
+  template <class... _Args>
+  _LIBCPP_HIDE_FROM_ABI constexpr void __disengage(piecewise_construct_t, _Args&&... __args) {
+    _LIBCPP_ASSERT_INTERNAL(!__self.__is_engaged(), "Trying to disenage a disengaged tombstoned value");
+    std::destroy_at(&__data_.__value_);
+    std::__construct_at(&__data_.__tombstone_, std::forward<_Args>(__args)...);
+  }
+};
+
+template <class _Tp, class = void>
+inline constexpr bool __has_tombstone_v = false;
+
+template <class _Tp>
+inline constexpr bool __has_tombstone_v<_Tp, void_t<decltype(sizeof(__tombstone_traits<_Tp>))>> = true;
+
+_LIBCPP_END_NAMESPACE_STD
+
+#endif // _LIBCPP_STD_VER >= 17
+
+#endif // _LIBCPP___TYPE_TRAITS_DISENGAGED_TRAITS_H
diff --git a/libcxx/include/__memory/unique_ptr.h b/libcxx/include/__memory/unique_ptr.h
index 28c62e13566e24..965395e0e79626 100644
--- a/libcxx/include/__memory/unique_ptr.h
+++ b/libcxx/include/__memory/unique_ptr.h
@@ -24,6 +24,7 @@
 #include <__memory/auto_ptr.h>
 #include <__memory/compressed_pair.h>
 #include <__memory/pointer_traits.h>
+#include <__memory/tombstone_traits.h>
 #include <__type_traits/add_lvalue_reference.h>
 #include <__type_traits/common_type.h>
 #include <__type_traits/conditional.h>
@@ -405,6 +406,16 @@ struct __unique_ptr_array_bounds_stored {
   size_t __size_;
 };
 
+#if _LIBCPP_STD_VER >= 17
+template <class _Tp>
+struct __tombstone_traits<__enable_specialization_if<__has_tombstone_v<_Tp*>, unique_ptr<_Tp>>>
+    : __tombstone_traits<_Tp*> {};
+
+template <class _Tp>
+struct __tombstone_traits<__enable_specialization_if<__has_tombstone_v<_Tp*>, unique_ptr<_Tp[]>>>
+    : __tombstone_traits<_Tp*> {};
+#endif // _LIBCPP_STD_VER >= 17
+
 template <class _Tp, class _Dp>
 class _LIBCPP_UNIQUE_PTR_TRIVIAL_ABI _LIBCPP_TEMPLATE_VIS unique_ptr<_Tp[], _Dp> {
 public:
diff --git a/libcxx/include/__type_traits/enable_if.h b/libcxx/include/__type_traits/enable_if.h
index 77da9622ca28fc..67c3f7ffa24c14 100644
--- a/libcxx/include/__type_traits/enable_if.h
+++ b/libcxx/include/__type_traits/enable_if.h
@@ -32,6 +32,9 @@ template <bool _Bp, class _Tp = void>
 using enable_if_t = typename enable_if<_Bp, _Tp>::type;
 #endif
 
+template <bool _Bp, class _Tp, class = __enable_if_t<_Bp> >
+using __enable_specialization_if = _Tp;
+
 _LIBCPP_END_NAMESPACE_STD
 
 #endif // _LIBCPP___TYPE_TRAITS_ENABLE_IF_H
diff --git a/libcxx/include/__utility/pair.h b/libcxx/include/__utility/pair.h
index f9d0f4e4723113..965b29bbae27f3 100644
--- a/libcxx/include/__utility/pair.h
+++ b/libcxx/include/__utility/pair.h
@@ -17,6 +17,7 @@
 #include <__fwd/array.h>
 #include <__fwd/pair.h>
 #include <__fwd/tuple.h>
+#include <__memory/tombstone_traits.h>
 #include <__tuple/tuple_indices.h>
 #include <__tuple/tuple_like_no_subrange.h>
 #include <__tuple/tuple_size.h>
@@ -444,7 +445,24 @@ struct _LIBCPP_TEMPLATE_VIS pair
 #if _LIBCPP_STD_VER >= 17
 template <class _T1, class _T2>
 pair(_T1, _T2) -> pair<_T1, _T2>;
-#endif
+
+template <class _Tp, class _Up, bool __first_tombstone>
+struct __tombstone_traits_pair;
+
+template <class _Tp, class _Up>
+struct __tombstone_traits_pair<_Tp, _Up, true> : __tombstone_traits<_Tp> {};
+
+template <class _Tp, class _Up>
+struct __tombstone_traits_pair<_Tp, _Up, false> {
+  static constexpr auto __disengaged_value_       = __tombstone_traits<_Up>::__disengaged_value_;
+  static constexpr size_t __is_disengaged_offset_ = sizeof(_Tp) + __tombstone_traits<_Up>::__is_disengaged_offset_;
+};
+
+templa...
[truncated]

@philnik777 philnik777 force-pushed the tombstone_traits branch 2 times, most recently from f67cde6 to e2bd728 Compare November 8, 2024 15:43
Comment on lines +39 to +40
template <class>
struct __tombstone_traits;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Documentation?

Comment on lines 129 to 130
static constexpr uintptr_t __disengaged_value_ = __builtin_offsetof(reference_wrapper<_Tp>, __f_);
static constexpr size_t __is_disengaged_offset_ = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you noticed during live review, these are reversed. This needs a test that would at least fail if we changed the offsetof value. In fact, I'd like a basic test with optional<T> for every T that we give tombstone_traits to, just to make sure that basic stuff works.

};

template <class _Tp, class _Payload>
struct __tombstoned_value final {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case the "unexpected" object can't fit inside the tombstoned_value (either due to its size, or due to the need to place it at a higher offset inside the tombstoned_value), we need to ensure that a compile-time error is produced. There should be some libc++ specific tests for that, it's tricky and very important to get right.

_MaybeTombstone& operator=(const _MaybeTombstone&) = default;
_MaybeTombstone& operator=(_MaybeTombstone&&) = default;

_LIBCPP_HIDE_FROM_ABI constexpr bool __is_engaged() const noexcept {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand it might not actually be possible to implement it that way, but I would find an interface like this for tombstone_traits to be more intuitive:

// Option #1
template <>
struct __tombstone_traits<std::string> {
  static constexpr bool __is_disengaged(std::byte (&__bytes)[sizeof(std::string)]) {
    bool __is_short = __bytes[__builtin_offsetof(std::string, something)] == 0;
    std::size_t __size = ...;
    return __is_short && __size > __min_cap;
  }
};


// Option #2
template <class _Tp, std::size_t _Offset, auto _DisengagedValue>
struct __tombstone_traits_from_offset {
  static constexpr auto __disengaged_value_    = _DisengagedValue;
  static constexpr size_t __is_disengaged_offset_ = _Offset;

  static constexpr bool __is_disengaged(std::byte (&__bytes)[sizeof(_Tp)]) {
    __tombstone_is_disengaged<_Tp, _Payload> __is_disengaged;
    static_assert(sizeof(__tombstone_is_disengaged<_Tp, _Payload>) <= sizeof(_MaybeTombstone));
    __builtin_memcpy(&__is_disengaged, this, sizeof(__tombstone_is_disengaged<_Tp, _Payload>));
    return __is_disengaged.__is_disengaged_ != _TombstoneLayout::__disengaged_value_;
  }
};

template <>
struct __tombstone_traits<std::string> : __tombstone_traits_from_offset<std::string, 0, std::uint8_t(65)> { };

I think what I'm after is to avoid basing the whole tombstone mechanism on passing a magic value and a magic offset, and instead use something where (in theory) arbitrary code could run. In practice, I understand that we're limited in what we can do since there may not be an actual object in memory, but it still seems like defining __is_disengaged in terms of a function (with potentially a commonly-used helper) is more flexible. This would allow for example checking more than just integral equality to determine whether the object is engaged or not.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I played around with this some more:

#define private public
#include <string>
#include <cassert>

template <class _Tp>
struct __tombstone_traits;

template <>
struct __tombstone_traits<std::string> {
  static bool __is_disengaged(char const (&__bytes)[sizeof(std::string)]) {
    std::string::__rep __representation;
    __builtin_memcpy(&__representation, __bytes, sizeof(__representation));
    return !__representation.__s.__is_long_ && __representation.__s.__size_ > std::string::__min_cap;
  }

  static void __set_disengaged(char (&__bytes)[sizeof(std::string)]) {
    std::string::__rep __representation;
    __builtin_memcpy(&__representation, __bytes, sizeof(__representation));
    __representation.__s.__is_long_ = false;
    __representation.__s.__size_    = std::string::__min_cap + 1;
    __builtin_memcpy(__bytes, &__representation, sizeof(__representation));
  }
};

template <class _Tp>
struct Optional {
  union {
    _Tp value;
    char bytes[sizeof(_Tp)];
  };

  constexpr Optional() : bytes{} {
    if (!std::is_constant_evaluated()) {
      __tombstone_traits<_Tp>::__set_disengaged(bytes);
    }
  }

  constexpr explicit Optional(_Tp const& val) : value(val) {}

  constexpr bool is_engaged() const {
    if (std::is_constant_evaluated()) {
      return !__builtin_constant_p(bytes[0]);
    } else {
      return !__tombstone_traits<_Tp>::__is_disengaged(bytes);
    }
  }

  constexpr _Tp& operator*() {
    assert(is_engaged());
    return value;
  }

  constexpr ~Optional() {
    if (is_engaged()) {
      std::destroy_at(std::addressof(value));
    }
  }
};

constexpr bool test() {
  {
    Optional<std::string> opt;
    assert(!opt.is_engaged());
  }

  // Short string
  {
    Optional<std::string> opt("short");
    assert(opt.is_engaged());
    assert(*opt == "short");
  }

  // Long string
  {
    Optional<std::string> opt(
        "looooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooong");
    assert(opt.is_engaged());
    assert(*opt == "looooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooong");
  }

  return true;
}

int main(int, char**) {
  test();
  static_assert(test());

  return 0;
}

Godbolt: https://godbolt.org/z/cfqev914a

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a huge fan of this approach, since it still requires type punning - which the current version doesn't. It's also not clear to me what the benefit of switching to a function is. IIUC we'd also have to introduce __rep versions for every type where we have a tombstone trait specialization.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants