Skip to content

[libc++] Split up ABI and platform configuration to their own headers #90863

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 28, 2024

Conversation

ldionne
Copy link
Member

@ldionne ldionne commented May 2, 2024

This is a first step towards splitting up the <__config> header. The <__config> header is large and rather disorganized at this point, leading to confusion and subtle mistakes. For example, we never noticed that the string layout used on arm64 was only enabled for the Clang compiler, as the setting being in the compiler == clang block was probably never intentional.

The danger of splitting up the <__config> header is to implicitly use undefined macros that should have been defined prior to their usage, however this can be remediated with -Wundef and we've started moving towards -Wundef enforceable macros.

@ldionne ldionne requested a review from a team as a code owner May 2, 2024 15:08
@llvmbot llvmbot added the libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. label May 2, 2024
@llvmbot
Copy link
Member

llvmbot commented May 2, 2024

@llvm/pr-subscribers-libcxx

Author: Louis Dionne (ldionne)

Changes

This is a first step towards splitting up the <__config> header. The <__config> header is large and rather disorganized at this point, leading to confusion and subtle mistakes. For example, we never noticed that the string layout used on arm64 was only enabled for the Clang compiler, as the setting being in the compiler == clang block was probably never intentional.

The danger of splitting up the <__config> header is to implicitly use undefined macros that should have been defined prior to their usage, however this can be remediated with -Wundef and we've started moving towards -Wundef enforceable macros.


Patch is 23.71 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/90863.diff

7 Files Affected:

  • (modified) libcxx/include/CMakeLists.txt (+3)
  • (modified) libcxx/include/__config (+3-191)
  • (added) libcxx/include/__configuration/abi.h (+147)
  • (added) libcxx/include/__configuration/compiler.h (+51)
  • (added) libcxx/include/__configuration/platform.h (+53)
  • (modified) libcxx/include/module.modulemap (+3)
  • (modified) libcxx/utils/generate_iwyu_mapping.py (+2)
diff --git a/libcxx/include/CMakeLists.txt b/libcxx/include/CMakeLists.txt
index 1296c536bc882c..d3c9553d8bdd0f 100644
--- a/libcxx/include/CMakeLists.txt
+++ b/libcxx/include/CMakeLists.txt
@@ -327,6 +327,9 @@ set(files
   __concepts/totally_ordered.h
   __condition_variable/condition_variable.h
   __config
+  __configuration/abi.h
+  __configuration/compiler.h
+  __configuration/platform.h
   __coroutine/coroutine_handle.h
   __coroutine/coroutine_traits.h
   __coroutine/noop_coroutine_handle.h
diff --git a/libcxx/include/__config b/libcxx/include/__config
index e4c5c685a45645..192756258947df 100644
--- a/libcxx/include/__config
+++ b/libcxx/include/__config
@@ -11,41 +11,16 @@
 #define _LIBCPP___CONFIG
 
 #include <__config_site>
+#include <__configuration/abi.h>
+#include <__configuration/compiler.h>
+#include <__configuration/platform.h>
 
 #ifndef _LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER
 #  pragma GCC system_header
 #endif
 
-#if defined(__apple_build_version__)
-// Given AppleClang XX.Y.Z, _LIBCPP_APPLE_CLANG_VER is XXYZ (e.g. AppleClang 14.0.3 => 1403)
-#  define _LIBCPP_COMPILER_CLANG_BASED
-#  define _LIBCPP_APPLE_CLANG_VER (__apple_build_version__ / 10000)
-#elif defined(__clang__)
-#  define _LIBCPP_COMPILER_CLANG_BASED
-#  define _LIBCPP_CLANG_VER (__clang_major__ * 100 + __clang_minor__)
-#elif defined(__GNUC__)
-#  define _LIBCPP_COMPILER_GCC
-#  define _LIBCPP_GCC_VER (__GNUC__ * 100 + __GNUC_MINOR__)
-#endif
-
 #ifdef __cplusplus
 
-// Warn if a compiler version is used that is not supported anymore
-// LLVM RELEASE Update the minimum compiler versions
-#  if defined(_LIBCPP_CLANG_VER)
-#    if _LIBCPP_CLANG_VER < 1700
-#      warning "Libc++ only supports Clang 17 and later"
-#    endif
-#  elif defined(_LIBCPP_APPLE_CLANG_VER)
-#    if _LIBCPP_APPLE_CLANG_VER < 1500
-#      warning "Libc++ only supports AppleClang 15 and later"
-#    endif
-#  elif defined(_LIBCPP_GCC_VER)
-#    if _LIBCPP_GCC_VER < 1300
-#      warning "Libc++ only supports GCC 13 and later"
-#    endif
-#  endif
-
 // The attributes supported by clang are documented at https://clang.llvm.org/docs/AttributeReference.html
 
 // _LIBCPP_VERSION represents the version of libc++, which matches the version of LLVM.
@@ -79,135 +54,6 @@
 #  endif // _LIBCPP_STD_VER
 // NOLINTEND(libcpp-cpp-version-check)
 
-#  if defined(__ELF__)
-#    define _LIBCPP_OBJECT_FORMAT_ELF 1
-#  elif defined(__MACH__)
-#    define _LIBCPP_OBJECT_FORMAT_MACHO 1
-#  elif defined(_WIN32)
-#    define _LIBCPP_OBJECT_FORMAT_COFF 1
-#  elif defined(__wasm__)
-#    define _LIBCPP_OBJECT_FORMAT_WASM 1
-#  elif defined(_AIX)
-#    define _LIBCPP_OBJECT_FORMAT_XCOFF 1
-#  else
-// ... add new file formats here ...
-#  endif
-
-// ABI {
-
-#  if _LIBCPP_ABI_VERSION >= 2
-// Change short string representation so that string data starts at offset 0,
-// improving its alignment in some cases.
-#    define _LIBCPP_ABI_ALTERNATE_STRING_LAYOUT
-// Fix deque iterator type in order to support incomplete types.
-#    define _LIBCPP_ABI_INCOMPLETE_TYPES_IN_DEQUE
-// Fix undefined behavior in how std::list stores its linked nodes.
-#    define _LIBCPP_ABI_LIST_REMOVE_NODE_POINTER_UB
-// Fix undefined behavior in  how __tree stores its end and parent nodes.
-#    define _LIBCPP_ABI_TREE_REMOVE_NODE_POINTER_UB
-// Fix undefined behavior in how __hash_table stores its pointer types.
-#    define _LIBCPP_ABI_FIX_UNORDERED_NODE_POINTER_UB
-#    define _LIBCPP_ABI_FORWARD_LIST_REMOVE_NODE_POINTER_UB
-#    define _LIBCPP_ABI_FIX_UNORDERED_CONTAINER_SIZE_TYPE
-// Override the default return value of exception::what() for bad_function_call::what()
-// with a string that is specific to bad_function_call (see http://wg21.link/LWG2233).
-// This is an ABI break on platforms that sign and authenticate vtable function pointers
-// because it changes the mangling of the virtual function located in the vtable, which
-// changes how it gets signed.
-#    define _LIBCPP_ABI_BAD_FUNCTION_CALL_GOOD_WHAT_MESSAGE
-// Enable optimized version of __do_get_(un)signed which avoids redundant copies.
-#    define _LIBCPP_ABI_OPTIMIZED_LOCALE_NUM_GET
-// Give reverse_iterator<T> one data member of type T, not two.
-// Also, in C++17 and later, don't derive iterator types from std::iterator.
-#    define _LIBCPP_ABI_NO_ITERATOR_BASES
-// Use the smallest possible integer type to represent the index of the variant.
-// Previously libc++ used "unsigned int" exclusively.
-#    define _LIBCPP_ABI_VARIANT_INDEX_TYPE_OPTIMIZATION
-// Unstable attempt to provide a more optimized std::function
-#    define _LIBCPP_ABI_OPTIMIZED_FUNCTION
-// All the regex constants must be distinct and nonzero.
-#    define _LIBCPP_ABI_REGEX_CONSTANTS_NONZERO
-// Re-worked external template instantiations for std::string with a focus on
-// performance and fast-path inlining.
-#    define _LIBCPP_ABI_STRING_OPTIMIZED_EXTERNAL_INSTANTIATION
-// Enable clang::trivial_abi on std::unique_ptr.
-#    define _LIBCPP_ABI_ENABLE_UNIQUE_PTR_TRIVIAL_ABI
-// Enable clang::trivial_abi on std::shared_ptr and std::weak_ptr
-#    define _LIBCPP_ABI_ENABLE_SHARED_PTR_TRIVIAL_ABI
-// std::random_device holds some state when it uses an implementation that gets
-// entropy from a file (see _LIBCPP_USING_DEV_RANDOM). When switching from this
-// implementation to another one on a platform that has already shipped
-// std::random_device, one needs to retain the same object layout to remain ABI
-// compatible. This switch removes these workarounds for platforms that don't care
-// about ABI compatibility.
-#    define _LIBCPP_ABI_NO_RANDOM_DEVICE_COMPATIBILITY_LAYOUT
-// Don't export the legacy __basic_string_common class and its methods from the built library.
-#    define _LIBCPP_ABI_DO_NOT_EXPORT_BASIC_STRING_COMMON
-// Don't export the legacy __vector_base_common class and its methods from the built library.
-#    define _LIBCPP_ABI_DO_NOT_EXPORT_VECTOR_BASE_COMMON
-// According to the Standard, `bitset::operator[] const` returns bool
-#    define _LIBCPP_ABI_BITSET_VECTOR_BOOL_CONST_SUBSCRIPT_RETURN_BOOL
-// Fix the implementation of CityHash used for std::hash<fundamental-type>.
-// This is an ABI break because `std::hash` will return a different result,
-// which means that hashing the same object in translation units built against
-// different versions of libc++ can return inconsistent results. This is especially
-// tricky since std::hash is used in the implementation of unordered containers.
-//
-// The incorrect implementation of CityHash has the problem that it drops some
-// bits on the floor.
-#    define _LIBCPP_ABI_FIX_CITYHASH_IMPLEMENTATION
-// Remove the base 10 implementation of std::to_chars from the dylib.
-// The implementation moved to the header, but we still export the symbols from
-// the dylib for backwards compatibility.
-#    define _LIBCPP_ABI_DO_NOT_EXPORT_TO_CHARS_BASE_10
-// Define std::array/std::string_view iterators to be __wrap_iters instead of raw
-// pointers, which prevents people from relying on a non-portable implementation
-// detail. This is especially useful because enabling bounded iterators hardening
-// requires code not to make these assumptions.
-#    define _LIBCPP_ABI_USE_WRAP_ITER_IN_STD_ARRAY
-#    define _LIBCPP_ABI_USE_WRAP_ITER_IN_STD_STRING_VIEW
-#  elif _LIBCPP_ABI_VERSION == 1
-#    if !(defined(_LIBCPP_OBJECT_FORMAT_COFF) || defined(_LIBCPP_OBJECT_FORMAT_XCOFF))
-// Enable compiling copies of now inline methods into the dylib to support
-// applications compiled against older libraries. This is unnecessary with
-// COFF dllexport semantics, since dllexport forces a non-inline definition
-// of inline functions to be emitted anyway. Our own non-inline copy would
-// conflict with the dllexport-emitted copy, so we disable it. For XCOFF,
-// the linker will take issue with the symbols in the shared object if the
-// weak inline methods get visibility (such as from -fvisibility-inlines-hidden),
-// so disable it.
-#      define _LIBCPP_DEPRECATED_ABI_LEGACY_LIBRARY_DEFINITIONS_FOR_INLINE_FUNCTIONS
-#    endif
-// Feature macros for disabling pre ABI v1 features. All of these options
-// are deprecated.
-#    if defined(__FreeBSD__) && __FreeBSD__ < 14
-#      define _LIBCPP_DEPRECATED_ABI_DISABLE_PAIR_TRIVIAL_COPY_CTOR
-#    endif
-#  endif
-
-// We had some bugs where we use [[no_unique_address]] together with construct_at,
-// which causes UB as the call on construct_at could write to overlapping subobjects
-//
-// https://github.com/llvm/llvm-project/issues/70506
-// https://github.com/llvm/llvm-project/issues/70494
-//
-// To fix the bug we had to change the ABI of some classes to remove [[no_unique_address]] under certain conditions.
-// The macro below is used for all classes whose ABI have changed as part of fixing these bugs.
-#  define _LIBCPP_ABI_LLVM18_NO_UNIQUE_ADDRESS __attribute__((__abi_tag__("llvm18_nua")))
-
-// Changes the iterator type of select containers (see below) to a bounded iterator that keeps track of whether it's
-// within the bounds of the original container and asserts it on every dereference.
-//
-// ABI impact: changes the iterator type of the relevant containers.
-//
-// Supported containers:
-// - `span`;
-// - `string_view`;
-// - `array`.
-// #define _LIBCPP_ABI_BOUNDED_ITERATORS
-
-// } ABI
-
 // HARDENING {
 
 // TODO(hardening): deprecate this in LLVM 19.
@@ -425,31 +271,10 @@ _LIBCPP_HARDENING_MODE_DEBUG
 #    define _LIBCPP_HAS_NO_EXPERIMENTAL_SYNCSTREAM
 #  endif
 
-// Need to detect which libc we're using if we're on Linux.
-#  if defined(__linux__)
-#    include <features.h>
-#    if defined(__GLIBC_PREREQ)
-#      define _LIBCPP_GLIBC_PREREQ(a, b) __GLIBC_PREREQ(a, b)
-#    else
-#      define _LIBCPP_GLIBC_PREREQ(a, b) 0
-#    endif // defined(__GLIBC_PREREQ)
-#  endif   // defined(__linux__)
-
 #  if defined(__MVS__)
 #    include <features.h> // for __NATIVE_ASCII_F
 #  endif
 
-#  ifndef __BYTE_ORDER__
-#    error                                                                                                             \
-        "Your compiler doesn't seem to define __BYTE_ORDER__, which is required by libc++ to know the endianness of your target platform"
-#  endif
-
-#  if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
-#    define _LIBCPP_LITTLE_ENDIAN
-#  elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
-#    define _LIBCPP_BIG_ENDIAN
-#  endif // __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
-
 #  if defined(_WIN32)
 #    define _LIBCPP_WIN32API
 #    define _LIBCPP_SHORT_WCHAR 1
@@ -561,19 +386,6 @@ typedef __char32_t char32_t;
 
 #  if defined(_LIBCPP_COMPILER_CLANG_BASED)
 
-#    if defined(__APPLE__)
-#      if defined(__i386__) || defined(__x86_64__)
-// use old string layout on x86_64 and i386
-#      elif defined(__arm__)
-// use old string layout on arm (which does not include aarch64/arm64), except on watch ABIs
-#        if defined(__ARM_ARCH_7K__) && __ARM_ARCH_7K__ >= 2
-#          define _LIBCPP_ABI_ALTERNATE_STRING_LAYOUT
-#        endif
-#      else
-#        define _LIBCPP_ABI_ALTERNATE_STRING_LAYOUT
-#      endif
-#    endif
-
 // Objective-C++ features (opt-in)
 #    if __has_feature(objc_arc)
 #      define _LIBCPP_HAS_OBJC_ARC
diff --git a/libcxx/include/__configuration/abi.h b/libcxx/include/__configuration/abi.h
new file mode 100644
index 00000000000000..c570b9aaba48d0
--- /dev/null
+++ b/libcxx/include/__configuration/abi.h
@@ -0,0 +1,147 @@
+// -*- C++ -*-
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef _LIBCPP___CONFIGURATION_ABI_H
+#define _LIBCPP___CONFIGURATION_ABI_H
+
+#include <__config_site>
+#include <__configuration/compiler.h>
+#include <__configuration/platform.h>
+
+#ifndef _LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER
+#  pragma GCC system_header
+#endif
+
+#if _LIBCPP_ABI_VERSION >= 2
+// Change short string representation so that string data starts at offset 0,
+// improving its alignment in some cases.
+#  define _LIBCPP_ABI_ALTERNATE_STRING_LAYOUT
+// Fix deque iterator type in order to support incomplete types.
+#  define _LIBCPP_ABI_INCOMPLETE_TYPES_IN_DEQUE
+// Fix undefined behavior in how std::list stores its linked nodes.
+#  define _LIBCPP_ABI_LIST_REMOVE_NODE_POINTER_UB
+// Fix undefined behavior in  how __tree stores its end and parent nodes.
+#  define _LIBCPP_ABI_TREE_REMOVE_NODE_POINTER_UB
+// Fix undefined behavior in how __hash_table stores its pointer types.
+#  define _LIBCPP_ABI_FIX_UNORDERED_NODE_POINTER_UB
+#  define _LIBCPP_ABI_FORWARD_LIST_REMOVE_NODE_POINTER_UB
+#  define _LIBCPP_ABI_FIX_UNORDERED_CONTAINER_SIZE_TYPE
+// Override the default return value of exception::what() for bad_function_call::what()
+// with a string that is specific to bad_function_call (see http://wg21.link/LWG2233).
+// This is an ABI break on platforms that sign and authenticate vtable function pointers
+// because it changes the mangling of the virtual function located in the vtable, which
+// changes how it gets signed.
+#  define _LIBCPP_ABI_BAD_FUNCTION_CALL_GOOD_WHAT_MESSAGE
+// Enable optimized version of __do_get_(un)signed which avoids redundant copies.
+#  define _LIBCPP_ABI_OPTIMIZED_LOCALE_NUM_GET
+// Give reverse_iterator<T> one data member of type T, not two.
+// Also, in C++17 and later, don't derive iterator types from std::iterator.
+#  define _LIBCPP_ABI_NO_ITERATOR_BASES
+// Use the smallest possible integer type to represent the index of the variant.
+// Previously libc++ used "unsigned int" exclusively.
+#  define _LIBCPP_ABI_VARIANT_INDEX_TYPE_OPTIMIZATION
+// Unstable attempt to provide a more optimized std::function
+#  define _LIBCPP_ABI_OPTIMIZED_FUNCTION
+// All the regex constants must be distinct and nonzero.
+#  define _LIBCPP_ABI_REGEX_CONSTANTS_NONZERO
+// Re-worked external template instantiations for std::string with a focus on
+// performance and fast-path inlining.
+#  define _LIBCPP_ABI_STRING_OPTIMIZED_EXTERNAL_INSTANTIATION
+// Enable clang::trivial_abi on std::unique_ptr.
+#  define _LIBCPP_ABI_ENABLE_UNIQUE_PTR_TRIVIAL_ABI
+// Enable clang::trivial_abi on std::shared_ptr and std::weak_ptr
+#  define _LIBCPP_ABI_ENABLE_SHARED_PTR_TRIVIAL_ABI
+// std::random_device holds some state when it uses an implementation that gets
+// entropy from a file (see _LIBCPP_USING_DEV_RANDOM). When switching from this
+// implementation to another one on a platform that has already shipped
+// std::random_device, one needs to retain the same object layout to remain ABI
+// compatible. This switch removes these workarounds for platforms that don't care
+// about ABI compatibility.
+#  define _LIBCPP_ABI_NO_RANDOM_DEVICE_COMPATIBILITY_LAYOUT
+// Don't export the legacy __basic_string_common class and its methods from the built library.
+#  define _LIBCPP_ABI_DO_NOT_EXPORT_BASIC_STRING_COMMON
+// Don't export the legacy __vector_base_common class and its methods from the built library.
+#  define _LIBCPP_ABI_DO_NOT_EXPORT_VECTOR_BASE_COMMON
+// According to the Standard, `bitset::operator[] const` returns bool
+#  define _LIBCPP_ABI_BITSET_VECTOR_BOOL_CONST_SUBSCRIPT_RETURN_BOOL
+// Fix the implementation of CityHash used for std::hash<fundamental-type>.
+// This is an ABI break because `std::hash` will return a different result,
+// which means that hashing the same object in translation units built against
+// different versions of libc++ can return inconsistent results. This is especially
+// tricky since std::hash is used in the implementation of unordered containers.
+//
+// The incorrect implementation of CityHash has the problem that it drops some
+// bits on the floor.
+#  define _LIBCPP_ABI_FIX_CITYHASH_IMPLEMENTATION
+// Remove the base 10 implementation of std::to_chars from the dylib.
+// The implementation moved to the header, but we still export the symbols from
+// the dylib for backwards compatibility.
+#  define _LIBCPP_ABI_DO_NOT_EXPORT_TO_CHARS_BASE_10
+// Define std::array/std::string_view iterators to be __wrap_iters instead of raw
+// pointers, which prevents people from relying on a non-portable implementation
+// detail. This is especially useful because enabling bounded iterators hardening
+// requires code not to make these assumptions.
+#  define _LIBCPP_ABI_USE_WRAP_ITER_IN_STD_ARRAY
+#  define _LIBCPP_ABI_USE_WRAP_ITER_IN_STD_STRING_VIEW
+#elif _LIBCPP_ABI_VERSION == 1
+#  if !(defined(_LIBCPP_OBJECT_FORMAT_COFF) || defined(_LIBCPP_OBJECT_FORMAT_XCOFF))
+// Enable compiling copies of now inline methods into the dylib to support
+// applications compiled against older libraries. This is unnecessary with
+// COFF dllexport semantics, since dllexport forces a non-inline definition
+// of inline functions to be emitted anyway. Our own non-inline copy would
+// conflict with the dllexport-emitted copy, so we disable it. For XCOFF,
+// the linker will take issue with the symbols in the shared object if the
+// weak inline methods get visibility (such as from -fvisibility-inlines-hidden),
+// so disable it.
+#    define _LIBCPP_DEPRECATED_ABI_LEGACY_LIBRARY_DEFINITIONS_FOR_INLINE_FUNCTIONS
+#  endif
+// Feature macros for disabling pre ABI v1 features. All of these options
+// are deprecated.
+#  if defined(__FreeBSD__) && __FreeBSD__ < 14
+#    define _LIBCPP_DEPRECATED_ABI_DISABLE_PAIR_TRIVIAL_COPY_CTOR
+#  endif
+#endif
+
+// We had some bugs where we use [[no_unique_address]] together with construct_at,
+// which causes UB as the call on construct_at could write to overlapping subobjects
+//
+// https://github.com/llvm/llvm-project/issues/70506
+// https://github.com/llvm/llvm-project/issues/70494
+//
+// To fix the bug we had to change the ABI of some classes to remove [[no_unique_address]] under certain conditions.
+// The macro below is used for all classes whose ABI have changed as part of fixing these bugs.
+#define _LIBCPP_ABI_LLVM18_NO_UNIQUE_ADDRESS __attribute__((__abi_tag__("llvm18_nua")))
+
+// Changes the iterator type of select containers (see below) to a bounded iterator that keeps track of whether it's
+// within the bounds of the original container and asserts it on every dereference.
+//
+// ABI impact: changes the iterator type of the relevant containers.
+//
+// Supported containers:
+// - `span`;
+// - `string_view`;
+// - `array`.
+// #define _LIBCPP_ABI_BOUNDED_ITERATORS
+
+#if defined(_LIBCPP_COMPILER_CLANG_BASED)
+#  if defined(__APPLE__)
+#    if defined(__i386__) || defined(__x86_64__)
+// use old string layout on x86_64 and i386
+#    elif defined(__arm__)
+// use old string layout on arm (which does not include aarch64/arm64), except on watch ABIs
+#      if defined(__ARM_ARCH_7K__) && __ARM_ARCH_7K__ >= 2
+#        define _LIBCPP_ABI_ALTERNATE_STRING_LAYOUT
+#      endif
+#    else
+#      define _LIBCPP_ABI_ALTERNATE_STRING_LAYOUT
+#    endif
+#  endif
+#endif
+
+#endif // _LIBCPP___CONFIGURATION_ABI_H
diff --git a/libcxx/include/__configuration/compiler.h b/libcxx/include/__configuration/compiler.h
new file mode 100644
index 00000000000000..a9fc3498220ab9
--- /dev/null
+++ b/libcxx/include/__configuration/compiler.h
@@ -0,0 +1,51 @@
+// -*- C++ -*-
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef _LIBCPP___CONFIGURATION_COMPILER_H
+#define _LIBCPP___CONFIGURATION_COMPILER_H
+
+#include <__config_site>
+
+#ifndef _LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER
+#  pragma GCC system_header
+#endif
+
+#if defined(__apple_build_version__)
+// Given AppleClang XX.Y.Z, _LIBCPP_APPLE_CLANG_VER is XXYZ (e.g. AppleClang 14.0.3 => 1403)
+#  define _LIBCPP_COMPILER_CLANG_BASED
+#  define _LIBCPP_APPLE_CLANG_VER (__apple_build_version__ / 10000)
+#elif defined(__clang__)
+#  define _LIBCPP_COMPILER_CLANG_BASED
+#  define _LIBCPP_CLANG_VER (__clang_major__ * 100 + __clang_minor__)
+#elif defined(__GNUC__)
+#  define _LIBCPP_COMPILER_GCC
+#  define _LIBCPP_GCC_VER (__GNUC__ * 100 + __GNUC_MIN...
[truncated]

@EricWF
Copy link
Member

EricWF commented May 2, 2024

I would really really really like to keep config as a single file.

I find tracing context across multiple split files to be harder than doing it across a single file.

It's certainly true that __config is a monolith, and it is a bit hairy.
I think this new approach adds more complexity than it addresses. In particular I'm concerned about how this may affect modules. Additionally, __config has restrictions on what it can do (it can't include other headers*, it can't depend on other internals, etc). I'm concerned it'll be harder to treat a group of headers with the same care and concern that __config currently gets.

Most of the current issues could be addressed by using #else and #endif comments, and adding comment banners to separate logical sections.

I think the issues with _LIBCPP_ABI_ALTERNATIVE_LAYOUT was that it was defined in the wrong part of the file entirely, as we have normally configured the ABI. It should have also been tested better?

@ldionne
Copy link
Member Author

ldionne commented May 3, 2024

I would really really really like to keep config as a single file.

I find tracing context across multiple split files to be harder than doing it across a single file.

I believe this is subjective, because I find the exact opposite. I find it a lot easier to understand the organization of things when they are grouped logically and separated into headers as makes sense. I'm not arguing for a hundred small configuration headers here, just a few broad categories as makes sense. __config has 1200 lines, it's very difficult to navigate as-is.

It's certainly true that __config is a monolith, and it is a bit hairy. I think this new approach adds more complexity than it addresses. In particular I'm concerned about how this may affect modules.

These headers are all textual, so I don't think this split would cause any issues. Do you have specific concerns in mind w.r.t. modules?

Additionally, __config has restrictions on what it can do (it can't include other headers*, it can't depend on other internals, etc). I'm concerned it'll be harder to treat a group of headers with the same care and concern that __config currently gets.

Most of the current issues could be addressed by using #else and #endif comments, and adding comment banners to separate logical sections.

This comment makes me think that we both agree that grouping the configuration settings logically makes sense. It then becomes a matter of whether they should live in separate headers or not. I will argue that seeing different headers in the navigation bar (for example) makes it easier to jump to the group that you want, instead of having to scroll to all the lines in __config to find the commented block you're looking for.

I think the issues with _LIBCPP_ABI_ALTERNATIVE_LAYOUT was that it was defined in the wrong part of the file entirely, as we have normally configured the ABI. It should have also been tested better?

This kind of issue is less likely to happen when things are split in a logical manner instead of crammed in a file with 1200 lines.

This change is motivated by the desire to expand our ABI configuration facilities and formalize them better in a way similar to what we have for __availability. I am reluctant to do that in the current state of things due to __config being basically a huge monolithic mess, and expanding the ABI configuration would just make that even bigger. Note that __availability is a configuration header exactly like __configuration/abi.h would become, and in fact I'd move __availability to __configuration/availability.h in the future. I don't know if you'd agree that __config would not be made simpler by the concatenation of __availability to it, but if so this change basically operates under the same principle.

@ldionne ldionne force-pushed the review/config-refactor branch from b141966 to 3578d18 Compare May 3, 2024 20:13
@ldionne
Copy link
Member Author

ldionne commented May 3, 2024

Put differently: I want to split out ABI macros so I can improve that system without things getting out of hand inside __config, but I can't do that unless I also split out the platform configuration from __config because I run into circular dependencies otherwise. So my goal is not to split as much as possible, it's to do the minimum split that makes sense to get ABI macros out of __config.

@EricWF
Copy link
Member

EricWF commented May 3, 2024

This kind of issue is less likely to happen when things are split in a logical manner instead of crammed in a file with 1200 lines.

I disagree. But that's OK, we can disagree on that. I'm not going to block this patch based wholly on my opinion here.

Would you mind being explicit about the dependency graph between the various units?

For example, we never noticed that the string layout used on arm64 was only enabled for the Clang compiler, as the setting being in the compiler == clang block was probably never intentional.

I see that this underlying issue hasn't been addressed and still exists in this change. What's the plan for addressing it?
Since that bug is the cited motivation for this change, I would like to see how this change would have improved things, which is hard to imagine when the issue remains.

Copy link
Member

@mordante mordante left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've no strong feelings regarding keeping one __config header or multiple headers.

@EricWF
Copy link
Member

EricWF commented May 9, 2024

@ldionne I'm open to this change, but I would like the last comments addressed before proceeding.

Copy link
Contributor

@philnik777 philnik777 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this quite a bit. IMO it makes things much tidier. My main concern is that we don't properly include things, which would result in misconfigurations. As long as we keep including <__config> that shouldn't be a bigger issue than it already is though. Given that the ABI configurations probably needs some significant rework, I think it's a good idea to have that separately.

@ldionne
Copy link
Member Author

ldionne commented May 22, 2024

Would you mind being explicit about the dependency graph between the various units?

I don't know what the final state will look like. However, I imagine having a small number of headers that are dependency-free (apart from including __config_site) since they only need to use compiler-provided macros to detect stuff, and then a few headers that build on top of that (like the ABI configuration) which basically include what they use like we do in the rest of the library.

For example, we never noticed that the string layout used on arm64 was only enabled for the Clang compiler, as the setting being in the compiler == clang block was probably never intentional.

I see that this underlying issue hasn't been addressed and still exists in this change. What's the plan for addressing it? Since that bug is the cited motivation for this change, I would like to see how this change would have improved things, which is hard to imagine when the issue remains.

That's right, this change doesn't attempt to change that, since that would break the ABI of libc++ on GCC / arm64. I'll have to think about whether this is something we want to (and can) do, but that change belongs in a separate patch due to how tricky it is.

This is a first step towards splitting up the <__config> header. The
<__config> header is large and rather disorganized at this point, leading
to confusion and subtle mistakes. For example, we never noticed that the
string layout used on arm64 was only enabled for the Clang compiler, as
the setting being in the compiler == clang block was probably never
intentional.

The danger of splitting up the <__config> header is to implicitly use
undefined macros that should have been defined prior to their usage,
however this can be remediated with -Wundef and we've started moving
towards -Wundef enforceable macros.
@ldionne ldionne force-pushed the review/config-refactor branch from 3578d18 to 10bcd68 Compare May 22, 2024 18:09
@ldionne ldionne merged commit 23e1ed6 into llvm:main May 28, 2024
53 of 54 checks passed
@ldionne ldionne deleted the review/config-refactor branch May 28, 2024 11:22
vg0204 pushed a commit to vg0204/llvm-project that referenced this pull request May 29, 2024
…llvm#90863)

This is a first step towards splitting up the <__config> header. The
<__config> header is large and rather disorganized at this point,
leading to confusion and subtle mistakes. For example, we never noticed
that the string layout used on arm64 was only enabled for the Clang
compiler, as the setting being in the compiler == clang block was
probably never intentional.

The danger of splitting up the <__config> header is to implicitly use
undefined macros that should have been defined prior to their usage,
however this can be remediated with -Wundef and we've started moving
towards -Wundef enforceable macros.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants