Skip to content

[libc++] Define an internal locale API as a shim on top of the current one #114596

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

ldionne
Copy link
Member

@ldionne ldionne commented Nov 1, 2024

Our current locale base API is a mix of non-reserved system names that we incorrectly (re)define and internal functions and macros starting with __libcpp. This patch introduces a function-based internal interface to isolate the rest of the code base from that mess, so that we can work on refactoring how each platform implements the base API in subsequent patches. This makes it possible to refactor how each platform implements the base localization API without impacting the rest of the code base.

@ldionne ldionne requested a review from a team as a code owner November 1, 2024 19:24
@llvmbot llvmbot added the libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. label Nov 1, 2024
@llvmbot
Copy link
Member

llvmbot commented Nov 1, 2024

@llvm/pr-subscribers-libcxx

Author: Louis Dionne (ldionne)

Changes

Our current locale base API is a mix of non-reserved system names that we incorrectly (re)define and internal functions and macros starting with __libcpp. This patch introduces a function-based internal interface to isolate the rest of the code base from that mess, so that we can work on refactoring how each platform implements the base API in subsequent patches. This makes it possible to refactor how each platform implements the base localization API without impacting the rest of the code base.


Patch is 63.84 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/114596.diff

7 Files Affected:

  • (modified) libcxx/include/__locale (+5-5)
  • (modified) libcxx/include/__locale_dir/locale_base_api.h (+237-64)
  • (modified) libcxx/include/__locale_dir/locale_base_api/bsd_locale_defaults.h (+1)
  • (modified) libcxx/include/__locale_dir/locale_guard.h (+5-5)
  • (modified) libcxx/include/locale (+13-13)
  • (modified) libcxx/src/iostream.cpp (+2-2)
  • (modified) libcxx/src/locale.cpp (+143-143)
diff --git a/libcxx/include/__locale b/libcxx/include/__locale
index 5fb0c19903cd5b..f251eb638af4c0 100644
--- a/libcxx/include/__locale
+++ b/libcxx/include/__locale
@@ -247,7 +247,7 @@ class _LIBCPP_TEMPLATE_VIS collate_byname;
 
 template <>
 class _LIBCPP_EXPORTED_FROM_ABI collate_byname<char> : public collate<char> {
-  locale_t __l_;
+  __locale::__locale_t __l_;
 
 public:
   typedef char char_type;
@@ -266,7 +266,7 @@ protected:
 #ifndef _LIBCPP_HAS_NO_WIDE_CHARACTERS
 template <>
 class _LIBCPP_EXPORTED_FROM_ABI collate_byname<wchar_t> : public collate<wchar_t> {
-  locale_t __l_;
+  __locale::__locale_t __l_;
 
 public:
   typedef wchar_t char_type;
@@ -616,7 +616,7 @@ class _LIBCPP_TEMPLATE_VIS ctype_byname;
 
 template <>
 class _LIBCPP_EXPORTED_FROM_ABI ctype_byname<char> : public ctype<char> {
-  locale_t __l_;
+  __locale::__locale_t __l_;
 
 public:
   explicit ctype_byname(const char*, size_t = 0);
@@ -633,7 +633,7 @@ protected:
 #ifndef _LIBCPP_HAS_NO_WIDE_CHARACTERS
 template <>
 class _LIBCPP_EXPORTED_FROM_ABI ctype_byname<wchar_t> : public ctype<wchar_t> {
-  locale_t __l_;
+  __locale::__locale_t __l_;
 
 public:
   explicit ctype_byname(const char*, size_t = 0);
@@ -824,7 +824,7 @@ protected:
 #ifndef _LIBCPP_HAS_NO_WIDE_CHARACTERS
 template <>
 class _LIBCPP_EXPORTED_FROM_ABI codecvt<wchar_t, char, mbstate_t> : public locale::facet, public codecvt_base {
-  locale_t __l_;
+  __locale::__locale_t __l_;
 
 public:
   typedef wchar_t intern_type;
diff --git a/libcxx/include/__locale_dir/locale_base_api.h b/libcxx/include/__locale_dir/locale_base_api.h
index bef10db1e1e59b..253799a049ceee 100644
--- a/libcxx/include/__locale_dir/locale_base_api.h
+++ b/libcxx/include/__locale_dir/locale_base_api.h
@@ -11,6 +11,93 @@
 
 #include <__config>
 
+#if !defined(_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER)
+#  pragma GCC system_header
+#endif
+
+// The platform-specific headers have to provide the following interface.
+//
+// These functions are equivalent to their C counterparts, except that __locale::__locale_t
+// is used instead of the current global locale.
+//
+// Variadic functions may be implemented as templates with a parameter pack instead
+// of C-style variadic functions.
+//
+// TODO: I think __uselocale() is not necessary if we refactor a bit.
+// TODO: __localeconv shouldn't take a reference, but the Windows implementation doesn't allow copying __locale_t
+//
+// Locale management
+// -----------------
+// namespace __locale {
+//  using __locale_t = implementation-defined;
+//  __locale_t  __uselocale(__locale_t);
+//  __locale_t  __newlocale(int, const char*, __locale_t);
+//  void        __freelocale(__locale_t);
+//  lconv*      __localeconv(__locale_t&);
+// }
+//
+// Strtonum functions
+// ------------------
+// namespace __locale {
+//  float               __strtof(const char*, char**, __locale_t);
+//  double              __strtod(const char*, char**, __locale_t);
+//  long double         __strtold(const char*, char**, __locale_t);
+//  long long           __strtoll(const char*, char**, __locale_t);
+//  unsigned long long  __strtoull(const char*, char**, __locale_t);
+// }
+//
+// Character manipulation functions
+// --------------------------------
+// namespace __locale {
+//  int     __islower(int, __locale_t);
+//  int     __isupper(int, __locale_t);
+//  int     __isdigit(int, __locale_t);
+//  int     __isxdigit(int, __locale_t);
+//  int     __toupper(int, __locale_t);
+//  int     __tolower(int, __locale_t);
+//  int     __strcoll(const char*, const char*, __locale_t);
+//  size_t  __strxfrm(char*, const char*, size_t, __locale_t);
+//
+//  int     __iswspace(wint_t, __locale_t);
+//  int     __iswprint(wint_t, __locale_t);
+//  int     __iswcntrl(wint_t, __locale_t);
+//  int     __iswupper(wint_t, __locale_t);
+//  int     __iswlower(wint_t, __locale_t);
+//  int     __iswalpha(wint_t, __locale_t);
+//  int     __iswblank(wint_t, __locale_t);
+//  int     __iswdigit(wint_t, __locale_t);
+//  int     __iswpunct(wint_t, __locale_t);
+//  int     __iswxdigit(wint_t, __locale_t);
+//  wint_t  __towupper(wint_t, __locale_t);
+//  wint_t  __towlower(wint_t, __locale_t);
+//  int     __wcscoll(const wchar_t*, const wchar_t*, __locale_t);
+//  size_t  __wcsxfrm(wchar_t*, const wchar_t*, size_t, __locale_t);
+//
+//  size_t  __strftime(char*, size_t, const char*, const tm*, __locale_t);
+// }
+//
+// Other functions
+// ---------------
+// namespace __locale {
+//  implementation-defined __mb_cur_max(__locale_t);
+//  wint_t  __btowc(int, __locale_t);
+//  int     __wctob(wint_t, __locale_t);
+//  size_t  __wcsnrtombs(char*, const wchar_t**, size_t, size_t, mbstate_t*, __locale_t);
+//  size_t  __wcrtomb(char*, wchar_t, mbstate_t*, __locale_t);
+//  size_t  __mbsnrtowcs(wchar_t*, const char**, size_t, size_t, mbstate_t*, __locale_t);
+//  size_t  __mbrtowc(wchar_t*, const char*, size_t, mbstate_t*, __locale_t);
+//  int     __mbtowc(wchar_t*, const char*, size_t, __locale_t);
+//  size_t  __mbrlen(const char*, size_t, mbstate_t*, __locale_t);
+//  size_t  __mbsrtowcs(wchar_t*, const char**, size_t, mbstate_t*, __locale_t);
+//  int     __snprintf(char*, size_t, __locale_t, const char*, ...);
+//  int     __asprintf(char**, __locale_t, const char*, ...);
+//  int     __sscanf(const char*, __locale_t, const char*, ...);
+// }
+
+// TODO: This is a temporary definition to bridge between the old way we defined the locale base API
+//       (by providing global non-reserved names) and the new API. As we move individual platforms
+//       towards the new way of defining the locale base API, this should disappear since each platform
+//       will define those directly.
 #if defined(_LIBCPP_MSVCRT_LIKE)
 #  include <__locale_dir/locale_base_api/win32.h>
 #elif defined(_AIX) || defined(__MVS__)
@@ -35,71 +122,157 @@
 #  include <__locale_dir/locale_base_api/bsd_locale_fallbacks.h>
 #endif
 
-#if !defined(_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER)
-#  pragma GCC system_header
+#include <__cstddef/size_t.h>
+#include <__utility/forward.h>
+#include <string.h>
+#ifndef _LIBCPP_HAS_NO_WIDE_CHARACTERS
+#  include <wctype.h>
+#endif
+_LIBCPP_BEGIN_NAMESPACE_STD
+namespace __locale {
+//
+// Locale management
+//
+using __locale_t = locale_t;
+
+inline _LIBCPP_HIDE_FROM_ABI __locale_t __uselocale(__locale_t __loc) { return uselocale(__loc); }
+
+inline _LIBCPP_HIDE_FROM_ABI __locale_t __newlocale(int __category_mask, const char* __name, __locale_t __loc) {
+  return newlocale(__category_mask, __name, __loc);
+}
+
+inline _LIBCPP_HIDE_FROM_ABI void __freelocale(__locale_t __loc) { freelocale(__loc); }
+
+inline _LIBCPP_HIDE_FROM_ABI lconv* __localeconv(__locale_t& __loc) { return __libcpp_localeconv_l(__loc); }
+
+//
+// Strtonum functions
+//
+inline _LIBCPP_HIDE_FROM_ABI float __strtof(const char* __nptr, char** __endptr, __locale_t __loc) {
+  return strtof_l(__nptr, __endptr, __loc);
+}
+
+inline _LIBCPP_HIDE_FROM_ABI double __strtod(const char* __nptr, char** __endptr, __locale_t __loc) {
+  return strtod_l(__nptr, __endptr, __loc);
+}
+
+inline _LIBCPP_HIDE_FROM_ABI long double __strtold(const char* __nptr, char** __endptr, __locale_t __loc) {
+  return strtold_l(__nptr, __endptr, __loc);
+}
+
+inline _LIBCPP_HIDE_FROM_ABI long long __strtoll(const char* __nptr, char** __endptr, int __base, __locale_t __loc) {
+  return strtoll_l(__nptr, __endptr, __base, __loc);
+}
+
+inline _LIBCPP_HIDE_FROM_ABI unsigned long long
+__strtoull(const char* __nptr, char** __endptr, int __base, __locale_t __loc) {
+  return strtoull_l(__nptr, __endptr, __base, __loc);
+}
+
+//
+// Character manipulation functions
+//
+inline _LIBCPP_HIDE_FROM_ABI int __islower(int __ch, __locale_t __loc) { return islower_l(__ch, __loc); }
+inline _LIBCPP_HIDE_FROM_ABI int __isupper(int __ch, __locale_t __loc) { return isupper_l(__ch, __loc); }
+inline _LIBCPP_HIDE_FROM_ABI int __isdigit(int __ch, __locale_t __loc) { return isdigit_l(__ch, __loc); }
+inline _LIBCPP_HIDE_FROM_ABI int __isxdigit(int __ch, __locale_t __loc) { return isxdigit_l(__ch, __loc); }
+inline _LIBCPP_HIDE_FROM_ABI int __strcoll(const char* __s1, const char* __s2, __locale_t __loc) {
+  return strcoll_l(__s1, __s2, __loc);
+}
+inline _LIBCPP_HIDE_FROM_ABI size_t __strxfrm(char* __dest, const char* __src, size_t __n, __locale_t __loc) {
+  return strxfrm_l(__dest, __src, __n, __loc);
+}
+inline _LIBCPP_HIDE_FROM_ABI int __toupper(int __ch, __locale_t __loc) { return toupper_l(__ch, __loc); }
+inline _LIBCPP_HIDE_FROM_ABI int __tolower(int __ch, __locale_t __loc) { return tolower_l(__ch, __loc); }
+
+#ifndef _LIBCPP_HAS_NO_WIDE_CHARACTERS
+inline _LIBCPP_HIDE_FROM_ABI int __wcscoll(const wchar_t* __s1, const wchar_t* __s2, __locale_t __loc) {
+  return wcscoll_l(__s1, __s2, __loc);
+}
+inline _LIBCPP_HIDE_FROM_ABI size_t __wcsxfrm(wchar_t* __dest, const wchar_t* __src, size_t __n, __locale_t __loc) {
+  return wcsxfrm_l(__dest, __src, __n, __loc);
+}
+inline _LIBCPP_HIDE_FROM_ABI int __iswspace(wint_t __ch, __locale_t __loc) { return iswspace_l(__ch, __loc); }
+inline _LIBCPP_HIDE_FROM_ABI int __iswprint(wint_t __ch, __locale_t __loc) { return iswprint_l(__ch, __loc); }
+inline _LIBCPP_HIDE_FROM_ABI int __iswcntrl(wint_t __ch, __locale_t __loc) { return iswcntrl_l(__ch, __loc); }
+inline _LIBCPP_HIDE_FROM_ABI int __iswupper(wint_t __ch, __locale_t __loc) { return iswupper_l(__ch, __loc); }
+inline _LIBCPP_HIDE_FROM_ABI int __iswlower(wint_t __ch, __locale_t __loc) { return iswlower_l(__ch, __loc); }
+inline _LIBCPP_HIDE_FROM_ABI int __iswalpha(wint_t __ch, __locale_t __loc) { return iswalpha_l(__ch, __loc); }
+inline _LIBCPP_HIDE_FROM_ABI int __iswblank(wint_t __ch, __locale_t __loc) { return iswblank_l(__ch, __loc); }
+inline _LIBCPP_HIDE_FROM_ABI int __iswdigit(wint_t __ch, __locale_t __loc) { return iswdigit_l(__ch, __loc); }
+inline _LIBCPP_HIDE_FROM_ABI int __iswpunct(wint_t __ch, __locale_t __loc) { return iswpunct_l(__ch, __loc); }
+inline _LIBCPP_HIDE_FROM_ABI int __iswxdigit(wint_t __ch, __locale_t __loc) { return iswxdigit_l(__ch, __loc); }
+inline _LIBCPP_HIDE_FROM_ABI wint_t __towupper(wint_t __ch, __locale_t __loc) { return towupper_l(__ch, __loc); }
+inline _LIBCPP_HIDE_FROM_ABI wint_t __towlower(wint_t __ch, __locale_t __loc) { return towlower_l(__ch, __loc); }
+#endif
+
+inline _LIBCPP_HIDE_FROM_ABI size_t
+__strftime(char* __s, size_t __max, const char* __format, const tm* __tm, __locale_t __loc) {
+  return strftime_l(__s, __max, __format, __tm, __loc);
+}
+
+//
+// Other functions
+//
+inline _LIBCPP_HIDE_FROM_ABI decltype(__libcpp_mb_cur_max_l(__locale_t())) __mb_cur_max(__locale_t __loc) {
+  return __libcpp_mb_cur_max_l(__loc);
+}
+inline _LIBCPP_HIDE_FROM_ABI wint_t __btowc(int __ch, __locale_t __loc) { return __libcpp_btowc_l(__ch, __loc); }
+inline _LIBCPP_HIDE_FROM_ABI int __wctob(wint_t __ch, __locale_t __loc) { return __libcpp_wctob_l(__ch, __loc); }
+inline _LIBCPP_HIDE_FROM_ABI size_t
+__wcsnrtombs(char* __dest, const wchar_t** __src, size_t __nwc, size_t __len, mbstate_t* __ps, __locale_t __loc) {
+  return __libcpp_wcsnrtombs_l(__dest, __src, __nwc, __len, __ps, __loc);
+}
+inline _LIBCPP_HIDE_FROM_ABI size_t __wcrtomb(char* __s, wchar_t __ch, mbstate_t* __ps, __locale_t __loc) {
+  return __libcpp_wcrtomb_l(__s, __ch, __ps, __loc);
+}
+inline _LIBCPP_HIDE_FROM_ABI size_t
+__mbsnrtowcs(wchar_t* __dest, const char** __src, size_t __nms, size_t __len, mbstate_t* __ps, __locale_t __loc) {
+  return __libcpp_mbsnrtowcs_l(__dest, __src, __nms, __len, __ps, __loc);
+}
+inline _LIBCPP_HIDE_FROM_ABI size_t
+__mbrtowc(wchar_t* __pwc, const char* __s, size_t __n, mbstate_t* __ps, __locale_t __loc) {
+  return __libcpp_mbrtowc_l(__pwc, __s, __n, __ps, __loc);
+}
+inline _LIBCPP_HIDE_FROM_ABI int __mbtowc(wchar_t* __pwc, const char* __pmb, size_t __max, __locale_t __loc) {
+  return __libcpp_mbtowc_l(__pwc, __pmb, __max, __loc);
+}
+inline _LIBCPP_HIDE_FROM_ABI size_t __mbrlen(const char* __s, size_t __n, mbstate_t* __ps, __locale_t __loc) {
+  return __libcpp_mbrlen_l(__s, __n, __ps, __loc);
+}
+inline _LIBCPP_HIDE_FROM_ABI size_t
+__mbsrtowcs(wchar_t* __dest, const char** __src, size_t __len, mbstate_t* __ps, __locale_t __loc) {
+  return __libcpp_mbsrtowcs_l(__dest, __src, __len, __ps, __loc);
+}
+
+_LIBCPP_DIAGNOSTIC_PUSH
+_LIBCPP_CLANG_DIAGNOSTIC_IGNORED("-Wgcc-compat")
+_LIBCPP_GCC_DIAGNOSTIC_IGNORED("-Wformat-nonliteral") // GCC doesn't support [[gnu::format]] on variadic templates
+#ifdef _LIBCPP_COMPILER_CLANG_BASED
+#  define _LIBCPP_VARIADIC_ATTRIBUTE_FORMAT(...) _LIBCPP_ATTRIBUTE_FORMAT(__VA_ARGS__)
+#else
+#  define _LIBCPP_VARIADIC_ATTRIBUTE_FORMAT
 #endif
 
-/*
-The platform-specific headers have to provide the following interface:
-
-// TODO: rename this to __libcpp_locale_t
-using locale_t = implementation-defined;
-
-implementation-defined __libcpp_mb_cur_max_l(locale_t);
-wint_t __libcpp_btowc_l(int, locale_t);
-int __libcpp_wctob_l(wint_t, locale_t);
-size_t __libcpp_wcsnrtombs_l(char* dest, const wchar_t** src, size_t wide_char_count, size_t len, mbstate_t, locale_t);
-size_t __libcpp_wcrtomb_l(char* str, wchar_t wide_char, mbstate_t*, locale_t);
-size_t __libcpp_mbsnrtowcs_l(wchar_t* dest, const char** src, size_t max_out, size_t len, mbstate_t*, locale_t);
-size_t __libcpp_mbrtowc_l(wchar_t* dest, cosnt char* src, size_t count, mbstate_t*, locale_t);
-int __libcpp_mbtowc_l(wchar_t* dest, const char* src, size_t count, locale_t);
-size_t __libcpp_mbrlen_l(const char* str, size_t count, mbstate_t*, locale_t);
-// TODO: __libcpp_localeconv_l shouldn't take a reference, but the Windows implementation doesn't allow copying locale_t
-lconv* __libcpp_localeconv_l(locale_t&);
-size_t __libcpp_mbsrtowcs_l(wchar_t* dest, const char** src, size_t len, mbstate_t*, locale_t);
-int __libcpp_snprintf_l(char* dest, size_t buff_size, locale_t, const char* format, ...);
-int __libcpp_asprintf_l(char** dest, locale_t, const char* format, ...);
-int __libcpp_sscanf_l(const char* dest, locale_t, const char* format, ...);
-
-// TODO: change these to reserved names
-float strtof_l(const char* str, char** str_end, locale_t);
-double strtod_l(const char* str, char** str_end, locale_t);
-long double strtold_l(const char* str, char** str_end, locale_t);
-long long strtoll_l(const char* str, char** str_end, locale_t);
-unsigned long long strtoull_l(const char* str, char** str_end, locale_t);
-
-locale_t newlocale(int category_mask, const char* locale, locale_t base);
-void freelocale(locale_t);
-
-int islower_l(int ch, locale_t);
-int isupper_l(int ch, locale_t);
-int isdigit_l(int ch, locale_t);
-int isxdigit_l(int ch, locale_t);
-int strcoll_l(const char* lhs, const char* rhs, locale_t);
-size_t strxfrm_l(char* dst, const char* src, size_t n, locale_t);
-int wcscoll_l(const char* lhs, const char* rhs, locale_t);
-size_t wcsxfrm_l(wchar_t* dst, const wchar_t* src, size_t n, locale_t);
-int toupper_l(int ch, locale_t);
-int tolower_l(int ch, locale_t);
-int iswspace_l(wint_t ch, locale_t);
-int iswprint_l(wint_t ch, locale_t);
-int iswcntrl_l(wint_t ch, locale_t);
-int iswupper_l(wint_t ch, locale_t);
-int iswlower_l(wint_t ch, locale_t);
-int iswalpha_l(wint_t ch, locale_t);
-int iswblank_l(wint_t ch, locale_t);
-int iswdigit_l(wint_t ch, locale_t);
-int iswpunct_l(wint_t ch, locale_t);
-int iswxdigit_l(wint_t ch, locale_t);
-wint_t towupper_l(wint_t ch, locale_t);
-wint_t towlower_l(wint_t ch, locale_t);
-size_t strftime_l(char* str, size_t len, const char* format, const tm*, locale_t);
-
-
-These functions are equivalent to their C counterparts,
-except that locale_t is used instead of the current global locale.
-
-The variadic functions may be implemented as templates with a parameter pack instead of variadic functions.
-*/
+template <class... _Args>
+_LIBCPP_HIDE_FROM_ABI _LIBCPP_VARIADIC_ATTRIBUTE_FORMAT(__printf__, 4, 5) int __snprintf(
+    char* __s, size_t __n, __locale_t __loc, const char* __format, _Args&&... __args) {
+  return __libcpp_snprintf_l(__s, __n, __loc, __format, std::forward<_Args>(__args)...);
+}
+template <class... _Args>
+_LIBCPP_HIDE_FROM_ABI _LIBCPP_VARIADIC_ATTRIBUTE_FORMAT(__printf__, 3, 4) int __asprintf(
+    char** __s, __locale_t __loc, const char* __format, _Args&&... __args) {
+  return __libcpp_asprintf_l(__s, __loc, __format, std::forward<_Args>(__args)...);
+}
+template <class... _Args>
+_LIBCPP_HIDE_FROM_ABI _LIBCPP_VARIADIC_ATTRIBUTE_FORMAT(__scanf__, 3, 4) int __sscanf(
+    const char* __s, __locale_t __loc, const char* __format, _Args&&... __args) {
+  return __libcpp_sscanf_l(__s, __loc, __format, std::forward<_Args>(__args)...);
+}
+_LIBCPP_DIAGNOSTIC_POP
+#undef _LIBCPP_VARIADIC_ATTRIBUTE_FORMAT
+
+} // namespace __locale
+_LIBCPP_END_NAMESPACE_STD
 
 #endif // _LIBCPP___LOCALE_DIR_LOCALE_BASE_API_H
diff --git a/libcxx/include/__locale_dir/locale_base_api/bsd_locale_defaults.h b/libcxx/include/__locale_dir/locale_base_api/bsd_locale_defaults.h
index 52f31fb37236c1..33009606aa923e 100644
--- a/libcxx/include/__locale_dir/locale_base_api/bsd_locale_defaults.h
+++ b/libcxx/include/__locale_dir/locale_base_api/bsd_locale_defaults.h
@@ -109,6 +109,7 @@ _LIBCPP_HIDE_FROM_ABI _LIBCPP_VARIADIC_ATTRIBUTE_FORMAT(__scanf__, 3, 4) int __l
   return ::sscanf_l(__s, __loc, __format, std::forward<_Args>(__args)...);
 }
 _LIBCPP_DIAGNOSTIC_POP
+#undef _LIBCPP_VARIADIC_ATTRIBUTE_FORMAT
 
 _LIBCPP_END_NAMESPACE_STD
 
diff --git a/libcxx/include/__locale_dir/locale_guard.h b/libcxx/include/__locale_dir/locale_guard.h
index e0c414c001c41f..82b263de1f7f4c 100644
--- a/libcxx/include/__locale_dir/locale_guard.h
+++ b/libcxx/include/__locale_dir/locale_guard.h
@@ -10,7 +10,7 @@
 #define _LIBCPP___LOCALE_DIR_LOCALE_GUARD_H
 
 #include <__config>
-#include <__locale> // for locale_t
+#include <__locale>
 #include <clocale>
 
 #if !defined(_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER)
@@ -21,7 +21,7 @@ _LIBCPP_BEGIN_NAMESPACE_STD
 
 #if defined(_LIBCPP_MSVCRT_LIKE)
 struct __locale_guard {
-  __locale_guard(locale_t __l) : __status(_configthreadlocale(_ENABLE_PER_THREAD_LOCALE)) {
+  __locale_guard(__locale::__locale_t __l) : __status(_configthreadlocale(_ENABLE_PER_THREAD_LOCALE)) {
     // Setting the locale can be expensive even when the locale given is
     // already the current locale, so do an explicit check to see if the
     // current locale is already the one we want.
@@ -59,14 +59,14 @@ struct __locale_guard {
 };
 #else
 struct __locale_guard {
-  _LIBCPP_HIDE_FROM_ABI __locale_guard(locale_t& __loc) : __old_loc_(uselocale(__loc)) {}
+  _LIBCPP_HIDE_FROM_ABI __locale_guard(__locale::__locale_t& __loc) : __old_loc_(__locale::__uselocale(__loc)) {}
 
   _LIBCPP_HIDE_FROM_ABI ~__locale_guard() {
     if (__old_loc_)
-      uselocale(__old_loc_);
+      __locale::__uselocale(__old_loc_);
   }
 
-  locale_t __old_loc_;
+  __locale::__locale_t __old_loc_;
 
   __locale_guard(__locale_guard const&)            = delete;
   __locale_guard& operator=(__locale_guard const&) = delete;
diff --git a/libcxx/include/locale b/libcxx/include/locale
index 4706515b0a6c86..d6f1e3595f2da3 100644
--- a/libcxx/include/locale
+++ b/libcxx/include/locale
@@ -246,7 +246,7 @@ _LIBCPP_BEGIN_NAMESPACE_STD
 #  else
 #    define _LIBCPP_GET_C_LOCALE __cloc()
 // Get the C locale object
-_LIBCPP_EXPORTED_FROM_ABI locale_t __cloc();
+_LIBCPP_EXPORTED_FROM_ABI __locale::__locale_t __cloc();
 #    define __cloc_defined
 #  endif
 
@@ -1042,7 +1042,7 @@ _InputIterator num_get<_CharT, _InputIterator>::do_get(
   }
   // Stage 3
   __buf.resize(__a_end - __a);
-  if (__libcpp_sscanf_l(__buf.c_str(), _LIBCPP_GET_C_LOCALE, "%p", &__v) != 1)
+  if (__locale::__sscanf(__buf.c_str(), _LIBCPP_GET_C_LOCALE, "%p", &__v) != 1)
     __err = ios_base::failbit;
   // EOF checked
   if (__b == __e)
@@ -1125,11 +1125,11 @@ void __num_put<_CharT>::__widen_and_group_float(
     *__oe++ = __ct.widen(*__nf++);
     *__oe++ = __ct.widen(*__nf++);
     for (__ns = __nf; __ns < __ne; ++__ns)
-      if (!isxdigit_l(*__ns, _LIBCPP_GET_C_LOCALE))
+      if (!__locale::__isxdigit(*__ns, _LIBCPP_GET_C_LOCALE))
         break;
   } else {
     for (__ns = __nf; __ns < __ne; ++__ns)
-      if (!isdigit_l(*__ns, _LIBCPP_GET_C_LOCALE))
+      if (!__locale::__isdigit(*__ns, _LIBCPP_GET_C_LOCALE))
         break;
   }
   if (__grouping.empty()) {
@@ -1330,7 +1330,7 @@ _LIBCPP_HIDE_FROM_ABI inline _OutputIterator num_put<_CharT, _OutputIterator>::_
   _LIBCPP_DIAGNO...
[truncated]

@ldionne ldionne force-pushed the review/locale-6-define-proper-shim-interface branch 2 times, most recently from 1b3cf4c to 827f749 Compare November 5, 2024 16:04
…t one

Our current locale base API is a mix of non-reserved system names that
we incorrectly (re)define and internal functions and macros starting
with __libcpp. This patch introduces a function-based internal interface
to isolate the rest of the code base from that mess, so that we can work
on refactoring how each platform implements the base API in subsequent
patches. This makes it possible to refactor how each platform implements
the base localization API without impacting the rest of the code base.
@ldionne ldionne force-pushed the review/locale-6-define-proper-shim-interface branch from 4a08252 to 2144d6b Compare November 6, 2024 13:10
@ldionne ldionne merged commit 5d8be4c into llvm:main Nov 6, 2024
61 of 62 checks passed
@ldionne ldionne deleted the review/locale-6-define-proper-shim-interface branch November 6, 2024 14:57
@perry-ca
Copy link
Contributor

@ldionne a number of the functions like __islower() are colliding with macro definitions in the ctype.h system header on z/OS. What do you think about renaming these to be like __libcpp_islower()?

return wcsxfrm_l(__dest, __src, __n, __loc);
}
inline _LIBCPP_HIDE_FROM_ABI int __iswspace(wint_t __ch, __locale_t __loc) { return iswspace_l(__ch, __loc); }
inline _LIBCPP_HIDE_FROM_ABI int __iswprint(wint_t __ch, __locale_t __loc) { return iswprint_l(__ch, __loc); }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before this commit landed, symbols like iswspace_l were only referenced if _LIBCPP_WCTYPE_IS_MASK was false. But now it looks like the guard here is _LIBCPP_HAS_WIDE_CHARACTERS. Is this intentional/expected?

If I made a patch to add #if defined(_LIBCPP_HAS_WIDE_CHARACTERS) && !_LIBCPP_HAS_WIDE_CHARACTERS to guard these implementations for __iswspace and friends -- would that be acceptable?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I think it would be better if we tried to simplify this a bit. It looks like _LIBCPP_WCTYPE_IS_MASK and iswupper_l & friends were introduced in 0081892. I don't fully understand what's the problem they were working around, but I think iswctype_l should do the same thing on all platforms that provide it (contrary to the commit message). So instead I'd be curious to try assuming that iswctype_l behaves like it does on FreeBSD and Apple, and see what happens in that case. That would allow us to remove many of these __iswFOO functions in one go, unconditionally. So, basically, I'd try reverting 0081892 and see what happens with the CI bots.

Alternatively, if that doesn't work out, we could potentially move from iswctype_l to __iswFOO on all platforms unconditionally.

TLDR I'd like to try simplifying this instead of adding even more complexity by tweaking the locale base API using configuration options, since it's already quite a mess.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually just tried that out locally and I don't think that helps a lot, since we still reference all of these __iswFOO functions even after ripping out _LIBCPP_WCTYPE_IS_MASK.

I actually fail to see that we were not referencing iswspace_l outside of _LIBCPP_WCTYPE_IS_MASK before this patch. What's the actual issue you're running into and what's your configuration like?

Copy link
Member

@androm3da androm3da Dec 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the actual issue you're running into and what's your configuration like?

Well, when we cherry-pick this commit downstream, it fails to compile libc++ now because our C library doesn't have iswspace_l et al.

My assumptions regarding _LIBCPP_WCTYPE_IS_MASK were based on a brief inspection of the change here and maybe I misunderstood. Let me try to reduce the problem and confirm whether indeed _LIBCPP_WCTYPE_IS_MASK is the critical factor or perhaps something else.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC we lack a platform-specific __locale_dir/locale_base_api/foo.h and that might be the source of our problem here. I will explore creating one of those that accepts the locale from these functions and discards it, calling the corresponding non-_l() function for these missing ones.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, this series of changes aims to make it a lot easier for platforms to define the exact base API they want. It may result in a bit more lines of code than reusing the BSD-like code path, but in the end I'd expect it to be more robust and simpler to understand if you define the API for your downstream platform from scratch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants