Add basic char*_t support for libc (partial WG14 N2653) #90360
@llvm/pr-subscribers-libc Author: Fabian Keßler (Febbe) Changes
Full diff: https://github.com/llvm/llvm-project/pull/90360.diff 9 Files Affected:
diff --git a/libc/config/baremetal/api.td b/libc/config/baremetal/api.td
index 25aa06aacb642e..a6547d843c85ee 100644
--- a/libc/config/baremetal/api.td
+++ b/libc/config/baremetal/api.td
@@ -85,5 +85,10 @@ def TimeAPI : PublicAPI<"time.h"> {
}
def UCharAPI : PublicAPI<"uchar.h"> {
- let Types = ["mbstate_t"];
+ let Types = [
+ "mbstate_t",
+ "char8_t",
+ "char16_t",
+ "char32_t",
+ ];
}
diff --git a/libc/config/linux/x86_64/headers.txt b/libc/config/linux/x86_64/headers.txt
index e51c7931942706..44d640b75e2bf7 100644
--- a/libc/config/linux/x86_64/headers.txt
+++ b/libc/config/linux/x86_64/headers.txt
@@ -29,6 +29,7 @@ set(TARGET_PUBLIC_HEADERS
libc.include.time
libc.include.unistd
libc.include.wchar
+ libc.include.uchar
libc.include.arpa_inet
diff --git a/libc/include/CMakeLists.txt b/libc/include/CMakeLists.txt
index aeef46aabfce5c..6dea8e539969d0 100644
--- a/libc/include/CMakeLists.txt
+++ b/libc/include/CMakeLists.txt
@@ -603,6 +603,9 @@ add_gen_header(
DEPENDS
.llvm_libc_common_h
.llvm-libc-types.mbstate_t
+ .llvm-libc-types.char8_t
+ .llvm-libc-types.char16_t
+ .llvm-libc-types.char32_t
)
add_gen_header(
diff --git a/libc/include/llvm-libc-types/CMakeLists.txt b/libc/include/llvm-libc-types/CMakeLists.txt
index 310374fb62ffe0..c8999f3d25f4cd 100644
--- a/libc/include/llvm-libc-types/CMakeLists.txt
+++ b/libc/include/llvm-libc-types/CMakeLists.txt
@@ -90,6 +90,9 @@ add_header(tcflag_t HDR tcflag_t.h)
add_header(struct_termios HDR struct_termios.h DEPENDS .cc_t .speed_t .tcflag_t)
add_header(__getoptargv_t HDR __getoptargv_t.h)
add_header(wchar_t HDR wchar_t.h)
+add_header(char8_t HDR char8_t.h)
+add_header(char16_t HDR char16_t.h)
+add_header(char32_t HDR char32_t.h)
add_header(wint_t HDR wint_t.h)
add_header(sa_family_t HDR sa_family_t.h)
add_header(socklen_t HDR socklen_t.h)
diff --git a/libc/include/llvm-libc-types/char16_t.h b/libc/include/llvm-libc-types/char16_t.h
new file mode 100644
index 00000000000000..59810d0f6e5d85
--- /dev/null
+++ b/libc/include/llvm-libc-types/char16_t.h
@@ -0,0 +1,17 @@
+//===-- Definition of char16_t type ---------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_LIBC_TYPES_CHAR16_T_H
+#define LLVM_LIBC_TYPES_CHAR16_T_H
+
+#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L
+#include <stdint.h>
+typedef uint_least16_t char16_t;
+#endif
+
+#endif // LLVM_LIBC_TYPES_CHAR16_T_H
\ No newline at end of file
diff --git a/libc/include/llvm-libc-types/char32_t.h b/libc/include/llvm-libc-types/char32_t.h
new file mode 100644
index 00000000000000..5cbd21e78a808a
--- /dev/null
+++ b/libc/include/llvm-libc-types/char32_t.h
@@ -0,0 +1,17 @@
+//===-- Definition of char32_t type ---------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_LIBC_TYPES_CHAR32_T_H
+#define LLVM_LIBC_TYPES_CHAR32_T_H
+
+#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L
+#include <stdint.h>
+typedef uint_least32_t char32_t;
+#endif
+
+#endif // LLVM_LIBC_TYPES_CHAR32_T_H
\ No newline at end of file
diff --git a/libc/include/llvm-libc-types/char8_t.h b/libc/include/llvm-libc-types/char8_t.h
new file mode 100644
index 00000000000000..12972161c7e466
--- /dev/null
+++ b/libc/include/llvm-libc-types/char8_t.h
@@ -0,0 +1,16 @@
+//===-- Definition of char8_t type ----------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_LIBC_TYPES_CHAR8_T_H
+#define LLVM_LIBC_TYPES_CHAR8_T_H
+
+#if !defined(__cplusplus) && defined(__STDC_VERSION__) && __STDC_VERSION__ >= 202311L
+typedef unsigned char char8_t;
+#endif
+
+#endif // LLVM_LIBC_TYPES_CHAR8_T_H
\ No newline at end of file
diff --git a/libc/spec/spec.td b/libc/spec/spec.td
index 87bf4435e16724..ea8fa4cd373cf3 100644
--- a/libc/spec/spec.td
+++ b/libc/spec/spec.td
@@ -65,6 +65,9 @@ def SizeTType : NamedType<"size_t">;
def SizeTPtr : PtrType<SizeTType>;
def RestrictedSizeTPtr : RestrictedPtrType<SizeTType>;
+def Char8TType : NamedType<"char8_t">;
+def Char16TType : NamedType<"char16_t">;
+def Char32TType : NamedType<"char32_t">;
def WCharType : NamedType<"wchar_t">;
def WIntType : NamedType<"wint_t">;
diff --git a/libc/spec/stdc.td b/libc/spec/stdc.td
index 01aa7c70b3b9df..88758dec643fd4 100644
--- a/libc/spec/stdc.td
+++ b/libc/spec/stdc.td
@@ -1396,6 +1396,9 @@ def StdC : StandardSpec<"stdc"> {
[], // Macros
[ //Types
MBStateTType,
+ Char8TType,
+ Char16TType,
+ Char32TType,
],
[], // Enumerations
[]
✅ With the latest revision this PR passed the C/C++ code formatter.
Force-pushed from 68236e5 to c73c973.
- Define C23 char8_t
- Define C11 char16_t
- Define C11 char32_t

Preparation for functions like `mbrtoc8` and `c8rtomb`, which are introduced in C23.
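As a rough, standalone illustration of what these three definitions amount to, the sketch below declares equivalent typedefs and checks the minimum widths C guarantees for them. It is not the PR's header code: the `demo_` names and the `demo_utf16_units` helper are hypothetical, chosen so the snippet does not clash with a toolchain's own `<uchar.h>`.

```c
#include <assert.h> /* static_assert (C11) */
#include <limits.h> /* CHAR_BIT */
#include <stdint.h> /* uint_least16_t, uint_least32_t */

/* Hypothetical stand-ins for the types the PR adds to <uchar.h>. */
typedef unsigned char demo_char8_t;   /* C23: char8_t is unsigned char */
typedef uint_least16_t demo_char16_t; /* C11: char16_t is uint_least16_t */
typedef uint_least32_t demo_char32_t; /* C11: char32_t is uint_least32_t */

/* Width checks: one byte for UTF-8 code units, and at least 16 and 32 bits
 * for UTF-16 and UTF-32 code units respectively. */
static_assert(sizeof(demo_char8_t) == 1, "char8_t is a single byte");
static_assert(sizeof(demo_char16_t) * CHAR_BIT >= 16, "need 16 bits");
static_assert(sizeof(demo_char32_t) * CHAR_BIT >= 32, "need 32 bits");

/* Number of UTF-16 code units needed for a code point, assuming cp is a
 * valid Unicode scalar value: code points from U+10000 upward need a
 * surrogate pair, everything below fits in one unit. */
static int demo_utf16_units(demo_char32_t cp) {
  return cp >= 0x10000 ? 2 : 1;
}
```

For example, `demo_utf16_units(0x41)` yields 1 and `demo_utf16_units(0x1F600)` yields 2.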
Force-pushed from c73c973 to 1a3366e.
@tahonermann I am asking for a review.
Force-pushed from 288221b to 40e9779.
Thank you for the feedback. I applied all of your suggestions and also added uchar (and wchar) to the list of included headers on linux-aarch64, linux-arm, and linux-riscv.
Overall looks good with some minor style nits
- headers fixed
- included `stdint-macros.h` instead of `stdint.h`
- Updated dependencies of `char16_t` and `char32_t`
- Added uchar support for linux-riscv
- Added uchar & wchar support for linux-arm & linux-aarch64
- Added UCharAPI type to linux/api.td
Force-pushed from 40e9779 to 97ad4a7.
OK, now everything should be fixed. I can squash the commits together if that's desired. I don't have merge rights, though, so someone else has to merge this.
GitHub squashes automatically on merge, so I can merge this for you.
@Febbe Congratulations on having your first Pull Request (PR) merged into the LLVM Project!
This PR implements a part of WG14 N2653:
The following goals of N2653 are not yet implemented:
- The type of UTF-8 character literals is changed from unsigned char to char8_t. (Since UTF-8 character literals already have type unsigned char, this is not a semantic change).
- New mbrtoc8() and c8rtomb() functions declared in <uchar.h> enable conversions between multibyte characters and UTF-8.
- A new ATOMIC_CHAR8_T_LOCK_FREE macro.
- A new atomic_char8_t typedef name.
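The UTF-8 side of the conversions that `mbrtoc8()` and `c8rtomb()` deal with can be illustrated without those functions. The following is a minimal sketch, not LLVM libc's implementation and not the standard API: `demo_encode_utf8` and the `demo_` typedefs are hypothetical names, and the function simply encodes one code point into `char8_t`-style code units.

```c
#include <assert.h> /* static_assert (C11) */
#include <stddef.h> /* size_t */
#include <stdint.h> /* uint_least32_t */

typedef unsigned char demo_char8_t;   /* stand-in for C23 char8_t */
typedef uint_least32_t demo_char32_t; /* stand-in for C11 char32_t */

static_assert(sizeof(demo_char8_t) == 1, "UTF-8 code units are one byte");

/* Encodes cp as UTF-8 into out, which must have room for 4 code units.
 * Returns the number of code units written, or 0 if cp is a surrogate
 * or lies above U+10FFFF. */
static size_t demo_encode_utf8(demo_char32_t cp, demo_char8_t out[4]) {
  if (cp < 0x80) { /* 1 unit: 0xxxxxxx */
    out[0] = (demo_char8_t)cp;
    return 1;
  }
  if (cp < 0x800) { /* 2 units: 110xxxxx 10xxxxxx */
    out[0] = (demo_char8_t)(0xC0 | (cp >> 6));
    out[1] = (demo_char8_t)(0x80 | (cp & 0x3F));
    return 2;
  }
  if (cp >= 0xD800 && cp <= 0xDFFF) /* surrogates are not encodable */
    return 0;
  if (cp < 0x10000) { /* 3 units: 1110xxxx 10xxxxxx 10xxxxxx */
    out[0] = (demo_char8_t)(0xE0 | (cp >> 12));
    out[1] = (demo_char8_t)(0x80 | ((cp >> 6) & 0x3F));
    out[2] = (demo_char8_t)(0x80 | (cp & 0x3F));
    return 3;
  }
  if (cp <= 0x10FFFF) { /* 4 units: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
    out[0] = (demo_char8_t)(0xF0 | (cp >> 18));
    out[1] = (demo_char8_t)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (demo_char8_t)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (demo_char8_t)(0x80 | (cp & 0x3F));
    return 4;
  }
  return 0; /* beyond the Unicode range */
}
```

Called with U+20AC (the euro sign), the function writes the three code units `0xE2 0x82 0xAC` and returns 3. The real `mbrtoc8()` and `c8rtomb()` additionally take an `mbstate_t` and convert to and from the locale's multibyte encoding, which this sketch leaves out.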