[libc++] Optimize num_put integral functions #120859

philnik777 · 2024-12-21T23:12:28Z

-------------------------------------------------------
Benchmark                              old          new
-------------------------------------------------------
BM_num_put<bool>                   76.2 ns      32.0 ns
BM_num_put<long>                   76.9 ns      33.1 ns
BM_num_put<long long>              77.9 ns      34.2 ns
BM_num_put<unsigned long>          78.4 ns      33.1 ns
BM_num_put<unsigned long long>     78.0 ns      34.4 ns
BM_num_put<double>                  224 ns       228 ns
BM_num_put<long double>             239 ns       230 ns
BM_num_put<const void*>            68.7 ns      35.1 ns

Fixes #40109.

github-actions · 2024-12-21T23:16:04Z

✅ With the latest revision this PR passed the C/C++ code formatter.

llvmbot · 2024-12-27T15:41:29Z

@llvm/pr-subscribers-libcxx

Author: Nikolas Klauser (philnik777)

Changes

-------------------------------------------------------
Benchmark                              old          new
-------------------------------------------------------
BM_num_put&lt;bool&gt;                   76.2 ns      32.0 ns
BM_num_put&lt;long&gt;                   76.9 ns      33.1 ns
BM_num_put&lt;long long&gt;              77.9 ns      34.2 ns
BM_num_put&lt;unsigned long&gt;          78.4 ns      33.1 ns
BM_num_put&lt;unsigned long long&gt;     78.0 ns      34.4 ns
BM_num_put&lt;double&gt;                  224 ns       228 ns
BM_num_put&lt;long double&gt;             239 ns       230 ns
BM_num_put&lt;const void*&gt;            68.7 ns      35.1 ns

Fixes #40109.

Patch is 35.29 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/120859.diff

14 Files Affected:

(modified) libcxx/docs/ReleaseNotes/20.rst (+2)
(modified) libcxx/include/__charconv/tables.h (+8-12)
(modified) libcxx/include/__charconv/to_chars_base_10.h (+14-18)
(modified) libcxx/include/__charconv/to_chars_integral.h (+45-27)
(modified) libcxx/include/__charconv/to_chars_result.h (+9)
(modified) libcxx/include/__charconv/traits.h (+3-7)
(modified) libcxx/include/__format/formatter_floating_point.h (+1)
(modified) libcxx/include/__format/formatter_integral.h (+1-1)
(modified) libcxx/include/__format/formatter_output.h (-18)
(modified) libcxx/include/locale (+65-39)
(modified) libcxx/include/module.modulemap (+5-1)
(added) libcxx/test/benchmarks/locale/num_put.bench.cpp (+39)
(modified) libcxx/test/std/input.output/iostream.format/output.streams/ostream.formatted/ostream.inserters.arithmetic/minus1.pass.cpp (+2-1)
(modified) libcxx/test/std/localization/locale.categories/category.numeric/locale.nm.put/facet.num.put.members/put_pointer.pass.cpp (+21-26)

diff --git a/libcxx/docs/ReleaseNotes/20.rst b/libcxx/docs/ReleaseNotes/20.rst
index c8a07fb8b73348..4cac642f22948d 100644
--- a/libcxx/docs/ReleaseNotes/20.rst
+++ b/libcxx/docs/ReleaseNotes/20.rst
@@ -73,6 +73,8 @@ Improvements and New Features
   optimized, resulting in a performance improvement of up to 2x for trivial element types (e.g., `std::vector<int>`),
   and up to 3.4x for non-trivial element types (e.g., `std::vector<std::vector<int>>`).
 
+- The ``num_get::do_put`` integral overloads have been optimized, resulting in a performance improvement of up to 2.4x.
+
 Deprecations and Removals
 -------------------------
 
diff --git a/libcxx/include/__charconv/tables.h b/libcxx/include/__charconv/tables.h
index 9568bf841cd029..b8c6fd8af0a0f8 100644
--- a/libcxx/include/__charconv/tables.h
+++ b/libcxx/include/__charconv/tables.h
@@ -19,16 +19,14 @@
 
 _LIBCPP_BEGIN_NAMESPACE_STD
 
-#if _LIBCPP_STD_VER >= 17
-
 namespace __itoa {
 
-inline constexpr char __base_2_lut[64] = {
+inline _LIBCPP_CONSTEXPR const char __base_2_lut[64] = {
     '0', '0', '0', '0', '0', '0', '0', '1', '0', '0', '1', '0', '0', '0', '1', '1', '0', '1', '0', '0', '0', '1',
     '0', '1', '0', '1', '1', '0', '0', '1', '1', '1', '1', '0', '0', '0', '1', '0', '0', '1', '1', '0', '1', '0',
     '1', '0', '1', '1', '1', '1', '0', '0', '1', '1', '0', '1', '1', '1', '1', '0', '1', '1', '1', '1'};
 
-inline constexpr char __base_8_lut[128] = {
+inline _LIBCPP_CONSTEXPR const char __base_8_lut[128] = {
     '0', '0', '0', '1', '0', '2', '0', '3', '0', '4', '0', '5', '0', '6', '0', '7', '1', '0', '1', '1', '1', '2',
     '1', '3', '1', '4', '1', '5', '1', '6', '1', '7', '2', '0', '2', '1', '2', '2', '2', '3', '2', '4', '2', '5',
     '2', '6', '2', '7', '3', '0', '3', '1', '3', '2', '3', '3', '3', '4', '3', '5', '3', '6', '3', '7', '4', '0',
@@ -36,7 +34,7 @@ inline constexpr char __base_8_lut[128] = {
     '5', '4', '5', '5', '5', '6', '5', '7', '6', '0', '6', '1', '6', '2', '6', '3', '6', '4', '6', '5', '6', '6',
     '6', '7', '7', '0', '7', '1', '7', '2', '7', '3', '7', '4', '7', '5', '7', '6', '7', '7'};
 
-inline constexpr char __base_16_lut[512] = {
+inline _LIBCPP_CONSTEXPR const char __base_16_lut[512] = {
     '0', '0', '0', '1', '0', '2', '0', '3', '0', '4', '0', '5', '0', '6', '0', '7', '0', '8', '0', '9', '0', 'a', '0',
     'b', '0', 'c', '0', 'd', '0', 'e', '0', 'f', '1', '0', '1', '1', '1', '2', '1', '3', '1', '4', '1', '5', '1', '6',
     '1', '7', '1', '8', '1', '9', '1', 'a', '1', 'b', '1', 'c', '1', 'd', '1', 'e', '1', 'f', '2', '0', '2', '1', '2',
@@ -61,7 +59,7 @@ inline constexpr char __base_16_lut[512] = {
     '1', 'f', '2', 'f', '3', 'f', '4', 'f', '5', 'f', '6', 'f', '7', 'f', '8', 'f', '9', 'f', 'a', 'f', 'b', 'f', 'c',
     'f', 'd', 'f', 'e', 'f', 'f'};
 
-inline constexpr uint32_t __pow10_32[10] = {
+inline _LIBCPP_CONSTEXPR const uint32_t __pow10_32[10] = {
     UINT32_C(0),
     UINT32_C(10),
     UINT32_C(100),
@@ -73,7 +71,7 @@ inline constexpr uint32_t __pow10_32[10] = {
     UINT32_C(100000000),
     UINT32_C(1000000000)};
 
-inline constexpr uint64_t __pow10_64[20] = {
+inline _LIBCPP_CONSTEXPR const uint64_t __pow10_64[20] = {
     UINT64_C(0),
     UINT64_C(10),
     UINT64_C(100),
@@ -96,8 +94,8 @@ inline constexpr uint64_t __pow10_64[20] = {
     UINT64_C(10000000000000000000)};
 
 #  if _LIBCPP_HAS_INT128
-inline constexpr int __pow10_128_offset      = 0;
-inline constexpr __uint128_t __pow10_128[40] = {
+inline _LIBCPP_CONSTEXPR const int __pow10_128_offset      = 0;
+inline _LIBCPP_CONSTEXPR const __uint128_t __pow10_128[40] = {
     UINT64_C(0),
     UINT64_C(10),
     UINT64_C(100),
@@ -140,7 +138,7 @@ inline constexpr __uint128_t __pow10_128[40] = {
     (__uint128_t(UINT64_C(10000000000000000000)) * UINT64_C(10000000000000000000)) * 10};
 #  endif
 
-inline constexpr char __digits_base_10[200] = {
+inline _LIBCPP_CONSTEXPR const char __digits_base_10[200] = {
     // clang-format off
     '0', '0', '0', '1', '0', '2', '0', '3', '0', '4', '0', '5', '0', '6', '0', '7', '0', '8', '0', '9',
     '1', '0', '1', '1', '1', '2', '1', '3', '1', '4', '1', '5', '1', '6', '1', '7', '1', '8', '1', '9',
@@ -156,8 +154,6 @@ inline constexpr char __digits_base_10[200] = {
 
 } // namespace __itoa
 
-#endif // _LIBCPP_STD_VER >= 17
-
 _LIBCPP_END_NAMESPACE_STD
 
 #endif // _LIBCPP___CHARCONV_TABLES
diff --git a/libcxx/include/__charconv/to_chars_base_10.h b/libcxx/include/__charconv/to_chars_base_10.h
index 06e4e692337df5..d90952ea71f356 100644
--- a/libcxx/include/__charconv/to_chars_base_10.h
+++ b/libcxx/include/__charconv/to_chars_base_10.h
@@ -26,55 +26,53 @@ _LIBCPP_PUSH_MACROS
 
 _LIBCPP_BEGIN_NAMESPACE_STD
 
-#if _LIBCPP_STD_VER >= 17
-
 namespace __itoa {
 
-_LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI inline char* __append1(char* __first, uint32_t __value) noexcept {
+_LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI inline char* __append1(char* __first, uint32_t __value) _NOEXCEPT {
   *__first = '0' + static_cast<char>(__value);
   return __first + 1;
 }
 
-_LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI inline char* __append2(char* __first, uint32_t __value) noexcept {
+_LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI inline char* __append2(char* __first, uint32_t __value) _NOEXCEPT {
   return std::copy_n(&__digits_base_10[__value * 2], 2, __first);
 }
 
-_LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI inline char* __append3(char* __first, uint32_t __value) noexcept {
+_LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI inline char* __append3(char* __first, uint32_t __value) _NOEXCEPT {
   return __itoa::__append2(__itoa::__append1(__first, __value / 100), __value % 100);
 }
 
-_LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI inline char* __append4(char* __first, uint32_t __value) noexcept {
+_LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI inline char* __append4(char* __first, uint32_t __value) _NOEXCEPT {
   return __itoa::__append2(__itoa::__append2(__first, __value / 100), __value % 100);
 }
 
-_LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI inline char* __append5(char* __first, uint32_t __value) noexcept {
+_LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI inline char* __append5(char* __first, uint32_t __value) _NOEXCEPT {
   return __itoa::__append4(__itoa::__append1(__first, __value / 10000), __value % 10000);
 }
 
-_LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI inline char* __append6(char* __first, uint32_t __value) noexcept {
+_LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI inline char* __append6(char* __first, uint32_t __value) _NOEXCEPT {
   return __itoa::__append4(__itoa::__append2(__first, __value / 10000), __value % 10000);
 }
 
-_LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI inline char* __append7(char* __first, uint32_t __value) noexcept {
+_LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI inline char* __append7(char* __first, uint32_t __value) _NOEXCEPT {
   return __itoa::__append6(__itoa::__append1(__first, __value / 1000000), __value % 1000000);
 }
 
-_LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI inline char* __append8(char* __first, uint32_t __value) noexcept {
+_LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI inline char* __append8(char* __first, uint32_t __value) _NOEXCEPT {
   return __itoa::__append6(__itoa::__append2(__first, __value / 1000000), __value % 1000000);
 }
 
-_LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI inline char* __append9(char* __first, uint32_t __value) noexcept {
+_LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI inline char* __append9(char* __first, uint32_t __value) _NOEXCEPT {
   return __itoa::__append8(__itoa::__append1(__first, __value / 100000000), __value % 100000000);
 }
 
 template <class _Tp>
-_LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI char* __append10(char* __first, _Tp __value) noexcept {
+_LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI char* __append10(char* __first, _Tp __value) _NOEXCEPT {
   return __itoa::__append8(__itoa::__append2(__first, static_cast<uint32_t>(__value / 100000000)),
                            static_cast<uint32_t>(__value % 100000000));
 }
 
 _LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI inline char*
-__base_10_u32(char* __first, uint32_t __value) noexcept {
+__base_10_u32(char* __first, uint32_t __value) _NOEXCEPT {
   if (__value < 1000000) {
     if (__value < 10000) {
       if (__value < 100) {
@@ -110,7 +108,7 @@ __base_10_u32(char* __first, uint32_t __value) noexcept {
 }
 
 _LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI inline char*
-__base_10_u64(char* __buffer, uint64_t __value) noexcept {
+__base_10_u64(char* __buffer, uint64_t __value) _NOEXCEPT {
   if (__value <= UINT32_MAX)
     return __itoa::__base_10_u32(__buffer, static_cast<uint32_t>(__value));
 
@@ -132,13 +130,13 @@ __base_10_u64(char* __buffer, uint64_t __value) noexcept {
 /// \note The lookup table contains a partial set of exponents limiting the
 /// range that can be used. However the range is sufficient for
 /// \ref __base_10_u128.
-_LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI inline __uint128_t __pow_10(int __exp) noexcept {
+_LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI inline __uint128_t __pow_10(int __exp) _NOEXCEPT {
   _LIBCPP_ASSERT_INTERNAL(__exp >= __pow10_128_offset, "Index out of bounds");
   return __pow10_128[__exp - __pow10_128_offset];
 }
 
 _LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI inline char*
-__base_10_u128(char* __buffer, __uint128_t __value) noexcept {
+__base_10_u128(char* __buffer, __uint128_t __value) _NOEXCEPT {
   _LIBCPP_ASSERT_INTERNAL(
       __value > numeric_limits<uint64_t>::max(), "The optimizations for this algorithm fails when this isn't true.");
 
@@ -179,8 +177,6 @@ __base_10_u128(char* __buffer, __uint128_t __value) noexcept {
 #  endif
 } // namespace __itoa
 
-#endif // _LIBCPP_STD_VER >= 17
-
 _LIBCPP_END_NAMESPACE_STD
 
 _LIBCPP_POP_MACROS
diff --git a/libcxx/include/__charconv/to_chars_integral.h b/libcxx/include/__charconv/to_chars_integral.h
index 710299df9b4da8..238c96d7c7a041 100644
--- a/libcxx/include/__charconv/to_chars_integral.h
+++ b/libcxx/include/__charconv/to_chars_integral.h
@@ -39,16 +39,12 @@ _LIBCPP_PUSH_MACROS
 
 _LIBCPP_BEGIN_NAMESPACE_STD
 
-#if _LIBCPP_STD_VER >= 17
-
-to_chars_result to_chars(char*, char*, bool, int = 10) = delete;
-
 template <typename _Tp>
-inline _LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI to_chars_result
+inline _LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI __to_chars_result
 __to_chars_itoa(char* __first, char* __last, _Tp __value, false_type);
 
 template <typename _Tp>
-inline _LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI to_chars_result
+inline _LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI __to_chars_result
 __to_chars_itoa(char* __first, char* __last, _Tp __value, true_type) {
   auto __x = std::__to_unsigned_like(__value);
   if (__value < 0 && __first != __last) {
@@ -60,7 +56,7 @@ __to_chars_itoa(char* __first, char* __last, _Tp __value, true_type) {
 }
 
 template <typename _Tp>
-inline _LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI to_chars_result
+inline _LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI __to_chars_result
 __to_chars_itoa(char* __first, char* __last, _Tp __value, false_type) {
   using __tx  = __itoa::__traits<_Tp>;
   auto __diff = __last - __first;
@@ -73,7 +69,7 @@ __to_chars_itoa(char* __first, char* __last, _Tp __value, false_type) {
 
 #  if _LIBCPP_HAS_INT128
 template <>
-inline _LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI to_chars_result
+inline _LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI __to_chars_result
 __to_chars_itoa(char* __first, char* __last, __uint128_t __value, false_type) {
   // When the value fits in 64-bits use the 64-bit code path. This reduces
   // the number of expensive calculations on 128-bit values.
@@ -92,20 +88,20 @@ __to_chars_itoa(char* __first, char* __last, __uint128_t __value, false_type) {
 }
 #  endif
 
-template <class _Tp>
-inline _LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI to_chars_result
-__to_chars_integral(char* __first, char* __last, _Tp __value, int __base, false_type);
+template <class _Tp, __enable_if_t<!is_signed<_Tp>::value, int> = 0>
+inline _LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI __to_chars_result
+__to_chars_integral(char* __first, char* __last, _Tp __value, int __base);
 
-template <typename _Tp>
-inline _LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI to_chars_result
-__to_chars_integral(char* __first, char* __last, _Tp __value, int __base, true_type) {
+template <class _Tp, __enable_if_t<is_signed<_Tp>::value, int> = 0>
+inline _LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI __to_chars_result
+__to_chars_integral(char* __first, char* __last, _Tp __value, int __base) {
   auto __x = std::__to_unsigned_like(__value);
   if (__value < 0 && __first != __last) {
     *__first++ = '-';
     __x        = std::__complement(__x);
   }
 
-  return std::__to_chars_integral(__first, __last, __x, __base, false_type());
+  return std::__to_chars_integral(__first, __last, __x, __base);
 }
 
 namespace __itoa {
@@ -116,7 +112,7 @@ struct _LIBCPP_HIDDEN __integral;
 template <>
 struct _LIBCPP_HIDDEN __integral<2> {
   template <typename _Tp>
-  _LIBCPP_HIDE_FROM_ABI static constexpr int __width(_Tp __value) noexcept {
+  _LIBCPP_HIDE_FROM_ABI static _LIBCPP_CONSTEXPR int __width(_Tp __value) _NOEXCEPT {
     // If value == 0 still need one digit. If the value != this has no
     // effect since the code scans for the most significant bit set. (Note
     // that __libcpp_clz doesn't work for 0.)
@@ -124,7 +120,7 @@ struct _LIBCPP_HIDDEN __integral<2> {
   }
 
   template <typename _Tp>
-  _LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI static to_chars_result
+  _LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI static __to_chars_result
   __to_chars(char* __first, char* __last, _Tp __value) {
     ptrdiff_t __cap = __last - __first;
     int __n         = __width(__value);
@@ -152,7 +148,7 @@ struct _LIBCPP_HIDDEN __integral<2> {
 template <>
 struct _LIBCPP_HIDDEN __integral<8> {
   template <typename _Tp>
-  _LIBCPP_HIDE_FROM_ABI static constexpr int __width(_Tp __value) noexcept {
+  _LIBCPP_HIDE_FROM_ABI static _LIBCPP_CONSTEXPR int __width(_Tp __value) _NOEXCEPT {
     // If value == 0 still need one digit. If the value != this has no
     // effect since the code scans for the most significat bit set. (Note
     // that __libcpp_clz doesn't work for 0.)
@@ -160,7 +156,7 @@ struct _LIBCPP_HIDDEN __integral<8> {
   }
 
   template <typename _Tp>
-  _LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI static to_chars_result
+  _LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI static __to_chars_result
   __to_chars(char* __first, char* __last, _Tp __value) {
     ptrdiff_t __cap = __last - __first;
     int __n         = __width(__value);
@@ -188,7 +184,7 @@ struct _LIBCPP_HIDDEN __integral<8> {
 template <>
 struct _LIBCPP_HIDDEN __integral<16> {
   template <typename _Tp>
-  _LIBCPP_HIDE_FROM_ABI static constexpr int __width(_Tp __value) noexcept {
+  _LIBCPP_HIDE_FROM_ABI static _LIBCPP_CONSTEXPR int __width(_Tp __value) _NOEXCEPT {
     // If value == 0 still need one digit. If the value != this has no
     // effect since the code scans for the most significat bit set. (Note
     // that __libcpp_clz doesn't work for 0.)
@@ -196,7 +192,7 @@ struct _LIBCPP_HIDDEN __integral<16> {
   }
 
   template <typename _Tp>
-  _LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI static to_chars_result
+  _LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI static __to_chars_result
   __to_chars(char* __first, char* __last, _Tp __value) {
     ptrdiff_t __cap = __last - __first;
     int __n         = __width(__value);
@@ -235,13 +231,13 @@ _LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI int __to_chars_integral_widt
 }
 
 template <unsigned _Base, typename _Tp, __enable_if_t<(sizeof(_Tp) >= sizeof(unsigned)), int> = 0>
-_LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI to_chars_result
+_LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI __to_chars_result
 __to_chars_integral(char* __first, char* __last, _Tp __value) {
   return __itoa::__integral<_Base>::__to_chars(__first, __last, __value);
 }
 
 template <unsigned _Base, typename _Tp, __enable_if_t<(sizeof(_Tp) < sizeof(unsigned)), int> = 0>
-_LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI to_chars_result
+_LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI __to_chars_result
 __to_chars_integral(char* __first, char* __last, _Tp __value) {
   return std::__to_chars_integral<_Base>(__first, __last, static_cast<unsigned>(__value));
 }
@@ -272,9 +268,9 @@ _LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI int __to_chars_integral_widt
   __libcpp_unreachable();
 }
 
-template <typename _Tp>
-inline _LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI to_chars_result
-__to_chars_integral(char* __first, char* __last, _Tp __value, int __base, false_type) {
+template <class _Tp, __enable_if_t<!is_signed<_Tp>::value, int> >
+inline _LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI __to_chars_result
+__to_chars_integral(char* __first, char* __last, _Tp __value, int __base) {
   if (__base == 10) [[likely]]
     return std::__to_chars_itoa(__first, __last, __value, false_type());
 
@@ -302,6 +298,28 @@ __to_chars_integral(char* __first, char* __last, _Tp __value, int __base, false_
   return {__last, errc(0)};
 }
 
+_LIBCPP_HIDE_FROM_ABI inline _LIBCPP_CONSTEXPR_SINCE_CXX14 char __hex_to_upper(char __c) {
+  switch (__c) {
+  case 'a':
+    return 'A';
+  case 'b':
+    return 'B';
+  case 'c':
+    return 'C';
+  case 'd':
+    return 'D';
+  case 'e':
+    return 'E';
+  case 'f':
+    return 'F';
+  }
+  return __c;
+}
+
+#if _LIBCPP_STD_VER >= 17
+
+to_chars_result to_chars(char*, char*, bool, int = 10) = delete;
+
 template <typename _Tp, __enable_if_t<is_integral<_Tp>::value, int> = 0>
 inline _LIBCPP_CONSTEXPR_SINCE_CXX23 _LIBCPP_HIDE_FROM_ABI to_chars_result
 to_chars(char* __first, char* __last, _Tp __value) {
@@ -316,7 +334,7 @@ to_chars(char* __first, char* __last, _Tp __value, int __base) {
   _LIBCPP_ASSERT_UNCATEGORIZED(2 <= __base && __base <= 36, "base not in [2, 36]");
 
   using _Type = __make_32_64_or_128_bit_t<_Tp>;
-  return std::__to_chars_integral(__first, __last, static_cast<_Type>(__value), __base, is_signed<_Tp>());
+  return std::__to_chars_integral(__first, __last, static_cast<_Type>(__value), __base);
 }
 
 #endif // _LIBCPP_STD_VER >= 17
diff --git a/libcxx/include/__charconv/to_chars_result.h b/libcxx/include/__charconv/to_chars_result.h
index 8df0897a49fbbd..2ff7bd39ea3e1d 100644
--- a/libcxx/include/__charconv/to_chars_result.h
+++ b/libcxx/include/__charconv/to_chars_result.h
@@ -34,6 +34,15 @@ struct _LIBCPP_EXPORTED_FROM_ABI to_chars_result {
 
 #endif // _LIBCPP_STD_VER >= 17
 
+struct _LIBCPP_EXPORTED_FROM_ABI __to_chars_result {
+  char* __ptr;
+  errc __ec;
+
+#if _LIBCPP_STD_VER >= 17
+  _LIBCPP_HIDE_FROM_ABI constexpr operator to_chars_result() { return {__ptr, __ec}; }
+#endif
+};
+
 _LIBCPP_END_NAMESPACE_STD
 
 #endif // _LIBCPP___CHARCONV_TO_CHARS_RESULT_H
diff --git a/libcxx/include/__charconv/traits.h b/libcxx/include/__charconv/traits.h
index 2cb37c8cfb0238..36339c8c5ebf1b 100644
--- a/libcxx/include/__charconv/traits.h
+++ b/libcxx/include/__charconv/traits.h
@@ -29,15 +29,13 @@ _LIBCPP_PUSH_MACROS
 
 _LIBCPP_BEGIN_NAMESPACE_STD
 
-#if _LIBCPP_STD_VER >= 17
-
 namespace __itoa {
 
 template <typename _Tp, typename = void>
 struct _LIBCPP_HIDDEN __traits_base;
 
 template <typename _Tp>
-struct _LIBCPP_HIDDEN __traits_base<_Tp, __enable_if_t<sizeof(_Tp) <= sizeof(uint32_t)>> {
+struct _LIBCPP_HIDDEN __traits_base<_Tp, __enable_if_t<sizeof(_Tp) <= sizeof(uint32_t)> > {
   using type = uint32_t;
 
   /// The width estimation using a log10 algorithm.
@@ -63,7 +61,7 @@ struct _LIBCPP_HIDDEN __traits_base<_Tp, __enable_if_t<sizeof(_Tp) <= sizeof(uin
 };
 
 template <typename _Tp>
-struct _LIBCPP_HIDDEN __traits_base<_Tp, __enable_if_t<sizeof(_Tp) == sizeof(uint64_t)>> {
+struct _LIBCPP_HIDDEN __traits_base<_Tp, __enable_if_t<sizeof(_Tp) == sizeof(uint64_t)> > {
   using type = uint64_t;
 
   /// The width estimation using a log10 algorithm.
@@ -152,7 +150,7 @@ inline _LIBCPP_HIDE_FROM_ABI bool _LIBCPP_CONSTEXPR_SINCE_CXX23 __mul_overflowed
 
 template <typename _Tp>
 struct _LIBCPP_H...
[truncated]

ldionne

LGTM with a LWG issue (please post the link)

libcxx/include/__charconv/to_chars_result.h

libcxx/include/locale

...ostream.format/output.streams/ostream.formatted/ostream.inserters.arithmetic/minus1.pass.cpp

libcxx/include/module.modulemap

.../locale.categories/category.numeric/locale.nm.put/facet.num.put.members/put_pointer.pass.cpp

philnik777 · 2025-03-05T13:18:44Z

The failures look unrelated.

nico · 2025-03-05T19:18:51Z

We have a test that checks that streaming a pointer to an ostringstream prints a 0x prefix on non-win but not on win.

(I suppose the test is just supposed to document current behavior.)

This seems to change the behavior on windows to print a 0x prefix there too. Is that intentional?

nico · 2025-03-05T19:20:15Z

Also, one of these now seems to fail on chromeos: https://source.chromium.org/chromium/chromium/src/+/main:third_party/shell-encryption/src/int256_test.cc;l=527

philnik777 · 2025-03-05T19:29:55Z

We have a test that checks that streaming a pointer to an ostringstream prints a 0x prefix on non-win but not on win.

(I suppose the test is just supposed to document current behavior.)

This seems to change the behavior on windows to print a 0x prefix there too. Is that intentional?

Depends on what you mean by "intentional". We're aware that this changed and think that the improved performance is worth the change in behaviour. It also makes the output more consistent across platforms. Do you think this is important enough that we should try to mimic the platforms behaviour?

Also, one of these now seems to fail on chromeos: https://source.chromium.org/chromium/chromium/src/+/main:third_party/shell-encryption/src/int256_test.cc;l=527

Since int256 seems to be a custom type of yours, I'm not sure how it outputs stuff. Could you give a bit more detail on what change you observe?

nico · 2025-03-05T19:32:43Z

Depends on what you mean by "intentional". We're aware that this changed and think that the improved performance is worth the change in behaviour. It also makes the output more consistent across platforms. Do you think this is important enough that we should try to mimic the platforms behaviour?

No, I was just wondering if it's intentional, since the commit description says "optimize" and normally optimizations don't change behavior, and it added a libcxx/docs/ReleaseNotes/20.rst change but didn't mention the behavior change. We can update the test if it's intentional.

philnik777 · 2025-03-05T19:51:55Z

Depends on what you mean by "intentional". We're aware that this changed and think that the improved performance is worth the change in behaviour. It also makes the output more consistent across platforms. Do you think this is important enough that we should try to mimic the platforms behaviour?

No, I was just wondering if it's intentional, since the commit description says "optimize" and normally optimizations don't change behavior, and it added a libcxx/docs/ReleaseNotes/20.rst change but didn't mention the behavior change. We can update the test if it's intentional.

Yeah, maybe we should add a release notes about that. Seems like it's worth mentioning.

nico · 2025-03-11T17:55:45Z

Did you end up adding a release notes entry? (This would be useful to understand what exactly the intentional change is, and to point our upstream projects to as motivation for updating their tests.)

I reduced the int256 thing mentioned above to

#include <algorithm>
#include <sstream>
#include <utility>

#include <gtest/gtest.h>

struct MyType {};

std::ostream& operator<<(std::ostream& o, const MyType& b) {
  return o << 0;
}

TEST(Int256, OStream) {
  struct {
    std::ios_base::fmtflags flags;
    const char* rep;
  } cases[] = {
      // showbase does nothing on zero
      { std::ios::dec | std::ios::showbase, "0"},
      { std::ios::oct | std::ios::showbase, "0"},
      { std::ios::hex | std::ios::showbase, "0"},
  };
  for (size_t i = 0; i < ABSL_ARRAYSIZE(cases); ++i) {
    std::ostringstream os;
    os.flags(cases[i].flags);
    MyType type;
    os << type;
    EXPECT_EQ(cases[i].rep, os.str());
  }
}

That used to pass; now it outputs 0, 00, and 0x0 instead. (I.e. showbase on 0 used to not add a prefix and now it does.)

I'm guessing that's also intentional?

Upsteam libc++ is now also showing the octal `0` and hex `0x` prefixes with zero, see llvm/llvm-project#120859 These tests were failing with upstream libc++. The previous three lines test this enough; just remove the tests that now have inconsistent output across C++ standard libraries.

alexfh · 2025-03-14T07:54:32Z

No, I was just wondering if it's intentional, since the commit description says "optimize" and normally optimizations don't change behavior, and it added a libcxx/docs/ReleaseNotes/20.rst change but didn't mention the behavior change. We can update the test if it's intentional.

Yeah, maybe we should add a release notes about that. Seems like it's worth mentioning.

It's definitely worth adding release notes, especially given that the new behavior diverges both from the previous libc++ and the current libstdc++ behavior - https://gcc.godbolt.org/z/zK5689Y8P.

alexfh · 2025-03-14T08:22:53Z

Actually, this behavior change affects a non-trivial amount of code, both Google-internal and open-source (for now, I found breakages in Abseil and tinyformat - https://github.com/c42f/tinyformat). I wonder if the C++ standard says anything about the particular formatting that should be used in these cases? https://gcc.godbolt.org/z/zK5689Y8P

alexfh · 2025-03-14T14:10:20Z

I was pointed at https://bugzilla.redhat.com/show_bug.cgi?id=166735, which seems to claim that 0x0 should just be 0, IIUC.

alexfh · 2025-03-14T14:40:02Z

The relevant parts of the recent C++ and C standards are: https://eel.is/c++draft/facet.num.put.virtuals, https://eel.is/c++draft/intro.refs, and https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2310.pdf 7.21.6.1 paragraph 6 (page 229).

mordante · 2025-03-14T16:45:19Z

I was pointed at https://bugzilla.redhat.com/show_bug.cgi?id=166735, which seems to claim that 0x0 should just be 0, IIUC.

@nico @alexfh Thanks for the information. I agree that # for octal and hexadeximal should print 0 instead of 00 and 0x0.

alexfh · 2025-03-15T01:15:32Z

I was pointed at https://bugzilla.redhat.com/show_bug.cgi?id=166735, which seems to claim that 0x0 should just be 0, IIUC.

@nico @alexfh Thanks for the information. I agree that # for octal and hexadeximal should print 0 instead of 00 and 0x0.

Apart from that, showpos should always print sign, i.e. for 0 the formatting output should be +0 (https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2310.pdf says + The result of a signed conversion always begins with a plus or minus sign.)

As for pointers with null value, The value of the pointer is converted to a sequence of printing characters, in an implementation-defined manner., so this part isn't regulated by the standard.

@ldionne @philnik777 do you agree with the above arguments?

This reverts commit 15edf87.

alexfh · 2025-03-17T13:44:13Z

I've sent #131613 to test the feasibility of a revert at this point, but a timely fix forward is definitely an option as well. @ldionne @philnik777 what's your take on this?

Reverts #120859 This change breaks formatting of `0` with `std::showbase` + `std::hex` or `std::oct`, as well as `+0` with `std::showpos`. I believe the new behavior is violating the standard. See #120859 (comment) and later comments for details and explanation.

…1613) Reverts llvm/llvm-project#120859 This change breaks formatting of `0` with `std::showbase` + `std::hex` or `std::oct`, as well as `+0` with `std::showpos`. I believe the new behavior is violating the standard. See llvm/llvm-project#120859 (comment) and later comments for details and explanation.

ldionne · 2025-03-18T16:05:08Z

Thanks a lot for the investigation. I filed #131710 for @philnik777 to take a look at this again when he's back from vacation.

FWIW I'm not surprised that our tests in this area were not 100% sufficient: we talked about it during the review but I'm not very surprised that something was missed, since this is a very widely used API. During review we had discussed testing against other implementations as well but I think this wasn't followed through -- when we re-land this we should try to also increase coverage.

``` ------------------------------------------------------- Benchmark old new ------------------------------------------------------- BM_num_put<bool> 76.2 ns 32.0 ns BM_num_put<long> 76.9 ns 33.1 ns BM_num_put<long long> 77.9 ns 34.2 ns BM_num_put<unsigned long> 78.4 ns 33.1 ns BM_num_put<unsigned long long> 78.0 ns 34.4 ns BM_num_put<double> 224 ns 228 ns BM_num_put<long double> 239 ns 230 ns BM_num_put<const void*> 68.7 ns 35.1 ns ``` Fixes llvm#40109.

philnik777 changed the title ~~[libc++] Optimize num_get integral functions~~ [libc++] Optimize num_put integral functions Dec 22, 2024

philnik777 force-pushed the optimize_num_put branch 8 times, most recently from b6118e1 to bb7789e Compare December 27, 2024 10:11

philnik777 marked this pull request as ready for review December 27, 2024 15:40

philnik777 requested a review from a team as a code owner December 27, 2024 15:40

llvmbot added the libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. label Dec 27, 2024

ldionne approved these changes Jan 15, 2025

View reviewed changes

philnik777 force-pushed the optimize_num_put branch from bb7789e to edeb995 Compare February 26, 2025 12:22

[libc++] Optimize num_get integral functions

92fa847

philnik777 force-pushed the optimize_num_put branch from edeb995 to 92fa847 Compare February 28, 2025 12:41

philnik777 merged commit 15edf87 into llvm:main Mar 5, 2025
68 of 78 checks passed

nico mentioned this pull request Mar 11, 2025

Remove int256 operator<< test for showbase with 0 google/shell-encryption#22

Merged

alexfh added a commit that referenced this pull request Mar 17, 2025

Revert "[libc++] Optimize num_put integral functions (#120859)"

5c1188d

This reverts commit 15edf87.

alexfh mentioned this pull request Mar 17, 2025

Revert "[libc++] Optimize num_put integral functions" #131613

Merged

philnik777 deleted the optimize_num_put branch March 29, 2025 08:08

[libc++] Optimize num_put integral functions #120859

[libc++] Optimize num_put integral functions #120859

Uh oh!

Conversation

philnik777 commented Dec 21, 2024

Uh oh!

github-actions bot commented Dec 21, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented Dec 27, 2024

Uh oh!

ldionne left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

philnik777 commented Mar 5, 2025

Uh oh!

Uh oh!

nico commented Mar 5, 2025

Uh oh!

nico commented Mar 5, 2025

Uh oh!

philnik777 commented Mar 5, 2025

Uh oh!

nico commented Mar 5, 2025

Uh oh!

philnik777 commented Mar 5, 2025

Uh oh!

nico commented Mar 11, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

alexfh commented Mar 14, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

alexfh commented Mar 14, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

alexfh commented Mar 14, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

alexfh commented Mar 14, 2025

Uh oh!

mordante commented Mar 14, 2025

Uh oh!

alexfh commented Mar 15, 2025

Uh oh!

alexfh commented Mar 17, 2025

Uh oh!

ldionne commented Mar 18, 2025

Uh oh!

Uh oh!

github-actions bot commented Dec 21, 2024 •

edited

Loading

nico commented Mar 11, 2025 •

edited

Loading

alexfh commented Mar 14, 2025 •

edited

Loading

alexfh commented Mar 14, 2025 •

edited

Loading

alexfh commented Mar 14, 2025 •

edited

Loading