[Clang] Update Unicode version to 15.1 #77147

Merged 2 commits on Jan 17, 2024

Conversation

cor3ntin
Contributor

@cor3ntin cor3ntin commented Jan 5, 2024

This updates all of our Unicode tables to Unicode 15.1. This is a minor version, so only a relatively small number of characters are added, mainly ideographs.

https://www.unicode.org/versions/Unicode15.1.0/#Appendices_nb
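
For readers unfamiliar with these tables, here is a minimal sketch of how a sorted table of inclusive code point ranges can be queried, and how the "XID_Continue, excluding XID_Start" split can be used. The `CharRange` struct, the sample arrays, and the helper names are hypothetical stand-ins written for illustration; they are not the `llvm::sys::UnicodeCharRange` API, and the sample entries are only a handful copied from the updated tables in this PR.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <iterator>

// Hypothetical stand-in for a range entry; both bounds are inclusive.
struct CharRange {
  uint32_t Lower;
  uint32_t Upper;
};

// A few entries copied from the updated XID_Start table (the real table is
// far longer). Entries must be sorted and non-overlapping.
static const CharRange XIDStartSample[] = {
    {0x0041, 0x005A},   {0x0061, 0x007A},   {0x2CEB0, 0x2EBE0},
    {0x2EBF0, 0x2EE5D}, {0x2F800, 0x2FA1D}};

// A few entries from the "XID_Continue, excluding XID_Start" table.
static const CharRange XIDContinueOnlySample[] = {
    {0x203F, 0x2040}, {0x3099, 0x309A}, {0x30FB, 0x30FB}, {0xFF65, 0xFF65}};

// Binary search: find the first range whose upper bound is >= CP, then
// check that CP is not below that range's lower bound.
template <std::size_t N>
bool contains(const CharRange (&Ranges)[N], uint32_t CP) {
  const CharRange *It = std::lower_bound(
      std::begin(Ranges), std::end(Ranges), CP,
      [](const CharRange &R, uint32_t C) { return R.Upper < C; });
  return It != std::end(Ranges) && It->Lower <= CP;
}

bool isXIDStart(uint32_t CP) { return contains(XIDStartSample, CP); }

// XID_Continue is a superset of XID_Start, so both tables are consulted.
bool isXIDContinue(uint32_t CP) {
  return isXIDStart(CP) || contains(XIDContinueOnlySample, CP);
}
```

With these samples, U+2EBF0 (the code point added to the lexer test) is reported as XID_Start, while U+30FB is XID_Continue only.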

@cor3ntin added the clang:frontend label on Jan 5, 2024
@llvmbot added the clang and llvm:support labels on Jan 5, 2024
@llvmbot
Member

llvmbot commented Jan 5, 2024

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-llvm-support

Author: cor3ntin (cor3ntin)

Patch is 3.00 MiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/77147.diff

9 Files Affected:

  • (modified) clang/docs/ReleaseNotes.rst (+2)
  • (modified) clang/lib/Lex/UnicodeCharSets.h (+68-66)
  • (modified) clang/test/Lexer/unicode.c (+4-3)
  • (modified) llvm/lib/Support/Unicode.cpp (+171-171)
  • (modified) llvm/lib/Support/UnicodeCaseFold.cpp (+11-2)
  • (modified) llvm/lib/Support/UnicodeNameToCodepoint.cpp (+2-1)
  • (modified) llvm/lib/Support/UnicodeNameToCodepointGenerated.cpp (+19816-19812)
  • (modified) llvm/unittests/Support/UnicodeTest.cpp (+3)
  • (modified) llvm/utils/UnicodeData/UnicodeNameMappingGenerator.cpp (+3-3)
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 9b6e00b231216b..1a0fad1d9de8d4 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -238,6 +238,8 @@ Non-comprehensive list of changes in this release
 
 * Added ``#pragma clang fp reciprocal``.
 
+* The version of Unicode used by Clang (primarily to parse identifiers) has been updated to 15.1.
+
 New Compiler Flags
 ------------------
 
diff --git a/clang/lib/Lex/UnicodeCharSets.h b/clang/lib/Lex/UnicodeCharSets.h
index 5316d2540b76ce..9fee964562fe74 100644
--- a/clang/lib/Lex/UnicodeCharSets.h
+++ b/clang/lib/Lex/UnicodeCharSets.h
@@ -10,7 +10,7 @@
 
 #include "llvm/Support/UnicodeCharRanges.h"
 
-// Unicode 15.0 XID_Start
+// Unicode 15.1 XID_Start
 static const llvm::sys::UnicodeCharRange XIDStartRanges[] = {
     {0x0041, 0x005A},   {0x0061, 0x007A},   {0x00AA, 0x00AA},
     {0x00B5, 0x00B5},   {0x00BA, 0x00BA},   {0x00C0, 0x00D6},
@@ -233,9 +233,10 @@ static const llvm::sys::UnicodeCharRange XIDStartRanges[] = {
     {0x1EE8B, 0x1EE9B}, {0x1EEA1, 0x1EEA3}, {0x1EEA5, 0x1EEA9},
     {0x1EEAB, 0x1EEBB}, {0x20000, 0x2A6DF}, {0x2A700, 0x2B739},
     {0x2B740, 0x2B81D}, {0x2B820, 0x2CEA1}, {0x2CEB0, 0x2EBE0},
-    {0x2F800, 0x2FA1D}, {0x30000, 0x3134A}, {0x31350, 0x323AF}};
+    {0x2EBF0, 0x2EE5D}, {0x2F800, 0x2FA1D}, {0x30000, 0x3134A},
+    {0x31350, 0x323AF}};
 
-// Unicode 15.0 XID_Continue, excluding XID_Start
+// Unicode 15.1 XID_Continue, excluding XID_Start
 // The Unicode Property XID_Continue is a super set of XID_Start.
 // To save Space, the table below only contains the codepoints
 // that are not also in XID_Start.
@@ -302,69 +303,70 @@ static const llvm::sys::UnicodeCharRange XIDContinueRanges[] = {
     {0x203F, 0x2040},   {0x2054, 0x2054},   {0x20D0, 0x20DC},
     {0x20E1, 0x20E1},   {0x20E5, 0x20F0},   {0x2CEF, 0x2CF1},
     {0x2D7F, 0x2D7F},   {0x2DE0, 0x2DFF},   {0x302A, 0x302F},
-    {0x3099, 0x309A},   {0xA620, 0xA629},   {0xA66F, 0xA66F},
-    {0xA674, 0xA67D},   {0xA69E, 0xA69F},   {0xA6F0, 0xA6F1},
-    {0xA802, 0xA802},   {0xA806, 0xA806},   {0xA80B, 0xA80B},
-    {0xA823, 0xA827},   {0xA82C, 0xA82C},   {0xA880, 0xA881},
-    {0xA8B4, 0xA8C5},   {0xA8D0, 0xA8D9},   {0xA8E0, 0xA8F1},
-    {0xA8FF, 0xA909},   {0xA926, 0xA92D},   {0xA947, 0xA953},
-    {0xA980, 0xA983},   {0xA9B3, 0xA9C0},   {0xA9D0, 0xA9D9},
-    {0xA9E5, 0xA9E5},   {0xA9F0, 0xA9F9},   {0xAA29, 0xAA36},
-    {0xAA43, 0xAA43},   {0xAA4C, 0xAA4D},   {0xAA50, 0xAA59},
-    {0xAA7B, 0xAA7D},   {0xAAB0, 0xAAB0},   {0xAAB2, 0xAAB4},
-    {0xAAB7, 0xAAB8},   {0xAABE, 0xAABF},   {0xAAC1, 0xAAC1},
-    {0xAAEB, 0xAAEF},   {0xAAF5, 0xAAF6},   {0xABE3, 0xABEA},
-    {0xABEC, 0xABED},   {0xABF0, 0xABF9},   {0xFB1E, 0xFB1E},
-    {0xFE00, 0xFE0F},   {0xFE20, 0xFE2F},   {0xFE33, 0xFE34},
-    {0xFE4D, 0xFE4F},   {0xFF10, 0xFF19},   {0xFF3F, 0xFF3F},
-    {0xFF9E, 0xFF9F},   {0x101FD, 0x101FD}, {0x102E0, 0x102E0},
-    {0x10376, 0x1037A}, {0x104A0, 0x104A9}, {0x10A01, 0x10A03},
-    {0x10A05, 0x10A06}, {0x10A0C, 0x10A0F}, {0x10A38, 0x10A3A},
-    {0x10A3F, 0x10A3F}, {0x10AE5, 0x10AE6}, {0x10D24, 0x10D27},
-    {0x10D30, 0x10D39}, {0x10EAB, 0x10EAC}, {0x10EFD, 0x10EFF},
-    {0x10F46, 0x10F50}, {0x10F82, 0x10F85}, {0x11000, 0x11002},
-    {0x11038, 0x11046}, {0x11066, 0x11070}, {0x11073, 0x11074},
-    {0x1107F, 0x11082}, {0x110B0, 0x110BA}, {0x110C2, 0x110C2},
-    {0x110F0, 0x110F9}, {0x11100, 0x11102}, {0x11127, 0x11134},
-    {0x11136, 0x1113F}, {0x11145, 0x11146}, {0x11173, 0x11173},
-    {0x11180, 0x11182}, {0x111B3, 0x111C0}, {0x111C9, 0x111CC},
-    {0x111CE, 0x111D9}, {0x1122C, 0x11237}, {0x1123E, 0x1123E},
-    {0x11241, 0x11241}, {0x112DF, 0x112EA}, {0x112F0, 0x112F9},
-    {0x11300, 0x11303}, {0x1133B, 0x1133C}, {0x1133E, 0x11344},
-    {0x11347, 0x11348}, {0x1134B, 0x1134D}, {0x11357, 0x11357},
-    {0x11362, 0x11363}, {0x11366, 0x1136C}, {0x11370, 0x11374},
-    {0x11435, 0x11446}, {0x11450, 0x11459}, {0x1145E, 0x1145E},
-    {0x114B0, 0x114C3}, {0x114D0, 0x114D9}, {0x115AF, 0x115B5},
-    {0x115B8, 0x115C0}, {0x115DC, 0x115DD}, {0x11630, 0x11640},
-    {0x11650, 0x11659}, {0x116AB, 0x116B7}, {0x116C0, 0x116C9},
-    {0x1171D, 0x1172B}, {0x11730, 0x11739}, {0x1182C, 0x1183A},
-    {0x118E0, 0x118E9}, {0x11930, 0x11935}, {0x11937, 0x11938},
-    {0x1193B, 0x1193E}, {0x11940, 0x11940}, {0x11942, 0x11943},
-    {0x11950, 0x11959}, {0x119D1, 0x119D7}, {0x119DA, 0x119E0},
-    {0x119E4, 0x119E4}, {0x11A01, 0x11A0A}, {0x11A33, 0x11A39},
-    {0x11A3B, 0x11A3E}, {0x11A47, 0x11A47}, {0x11A51, 0x11A5B},
-    {0x11A8A, 0x11A99}, {0x11C2F, 0x11C36}, {0x11C38, 0x11C3F},
-    {0x11C50, 0x11C59}, {0x11C92, 0x11CA7}, {0x11CA9, 0x11CB6},
-    {0x11D31, 0x11D36}, {0x11D3A, 0x11D3A}, {0x11D3C, 0x11D3D},
-    {0x11D3F, 0x11D45}, {0x11D47, 0x11D47}, {0x11D50, 0x11D59},
-    {0x11D8A, 0x11D8E}, {0x11D90, 0x11D91}, {0x11D93, 0x11D97},
-    {0x11DA0, 0x11DA9}, {0x11EF3, 0x11EF6}, {0x11F00, 0x11F01},
-    {0x11F03, 0x11F03}, {0x11F34, 0x11F3A}, {0x11F3E, 0x11F42},
-    {0x11F50, 0x11F59}, {0x13440, 0x13440}, {0x13447, 0x13455},
-    {0x16A60, 0x16A69}, {0x16AC0, 0x16AC9}, {0x16AF0, 0x16AF4},
-    {0x16B30, 0x16B36}, {0x16B50, 0x16B59}, {0x16F4F, 0x16F4F},
-    {0x16F51, 0x16F87}, {0x16F8F, 0x16F92}, {0x16FE4, 0x16FE4},
-    {0x16FF0, 0x16FF1}, {0x1BC9D, 0x1BC9E}, {0x1CF00, 0x1CF2D},
-    {0x1CF30, 0x1CF46}, {0x1D165, 0x1D169}, {0x1D16D, 0x1D172},
-    {0x1D17B, 0x1D182}, {0x1D185, 0x1D18B}, {0x1D1AA, 0x1D1AD},
-    {0x1D242, 0x1D244}, {0x1D7CE, 0x1D7FF}, {0x1DA00, 0x1DA36},
-    {0x1DA3B, 0x1DA6C}, {0x1DA75, 0x1DA75}, {0x1DA84, 0x1DA84},
-    {0x1DA9B, 0x1DA9F}, {0x1DAA1, 0x1DAAF}, {0x1E000, 0x1E006},
-    {0x1E008, 0x1E018}, {0x1E01B, 0x1E021}, {0x1E023, 0x1E024},
-    {0x1E026, 0x1E02A}, {0x1E08F, 0x1E08F}, {0x1E130, 0x1E136},
-    {0x1E140, 0x1E149}, {0x1E2AE, 0x1E2AE}, {0x1E2EC, 0x1E2F9},
-    {0x1E4EC, 0x1E4F9}, {0x1E8D0, 0x1E8D6}, {0x1E944, 0x1E94A},
-    {0x1E950, 0x1E959}, {0x1FBF0, 0x1FBF9}, {0xE0100, 0xE01EF}};
+    {0x3099, 0x309A},   {0x30FB, 0x30FB},   {0xA620, 0xA629},
+    {0xA66F, 0xA66F},   {0xA674, 0xA67D},   {0xA69E, 0xA69F},
+    {0xA6F0, 0xA6F1},   {0xA802, 0xA802},   {0xA806, 0xA806},
+    {0xA80B, 0xA80B},   {0xA823, 0xA827},   {0xA82C, 0xA82C},
+    {0xA880, 0xA881},   {0xA8B4, 0xA8C5},   {0xA8D0, 0xA8D9},
+    {0xA8E0, 0xA8F1},   {0xA8FF, 0xA909},   {0xA926, 0xA92D},
+    {0xA947, 0xA953},   {0xA980, 0xA983},   {0xA9B3, 0xA9C0},
+    {0xA9D0, 0xA9D9},   {0xA9E5, 0xA9E5},   {0xA9F0, 0xA9F9},
+    {0xAA29, 0xAA36},   {0xAA43, 0xAA43},   {0xAA4C, 0xAA4D},
+    {0xAA50, 0xAA59},   {0xAA7B, 0xAA7D},   {0xAAB0, 0xAAB0},
+    {0xAAB2, 0xAAB4},   {0xAAB7, 0xAAB8},   {0xAABE, 0xAABF},
+    {0xAAC1, 0xAAC1},   {0xAAEB, 0xAAEF},   {0xAAF5, 0xAAF6},
+    {0xABE3, 0xABEA},   {0xABEC, 0xABED},   {0xABF0, 0xABF9},
+    {0xFB1E, 0xFB1E},   {0xFE00, 0xFE0F},   {0xFE20, 0xFE2F},
+    {0xFE33, 0xFE34},   {0xFE4D, 0xFE4F},   {0xFF10, 0xFF19},
+    {0xFF3F, 0xFF3F},   {0xFF65, 0xFF65},   {0xFF9E, 0xFF9F},
+    {0x101FD, 0x101FD}, {0x102E0, 0x102E0}, {0x10376, 0x1037A},
+    {0x104A0, 0x104A9}, {0x10A01, 0x10A03}, {0x10A05, 0x10A06},
+    {0x10A0C, 0x10A0F}, {0x10A38, 0x10A3A}, {0x10A3F, 0x10A3F},
+    {0x10AE5, 0x10AE6}, {0x10D24, 0x10D27}, {0x10D30, 0x10D39},
+    {0x10EAB, 0x10EAC}, {0x10EFD, 0x10EFF}, {0x10F46, 0x10F50},
+    {0x10F82, 0x10F85}, {0x11000, 0x11002}, {0x11038, 0x11046},
+    {0x11066, 0x11070}, {0x11073, 0x11074}, {0x1107F, 0x11082},
+    {0x110B0, 0x110BA}, {0x110C2, 0x110C2}, {0x110F0, 0x110F9},
+    {0x11100, 0x11102}, {0x11127, 0x11134}, {0x11136, 0x1113F},
+    {0x11145, 0x11146}, {0x11173, 0x11173}, {0x11180, 0x11182},
+    {0x111B3, 0x111C0}, {0x111C9, 0x111CC}, {0x111CE, 0x111D9},
+    {0x1122C, 0x11237}, {0x1123E, 0x1123E}, {0x11241, 0x11241},
+    {0x112DF, 0x112EA}, {0x112F0, 0x112F9}, {0x11300, 0x11303},
+    {0x1133B, 0x1133C}, {0x1133E, 0x11344}, {0x11347, 0x11348},
+    {0x1134B, 0x1134D}, {0x11357, 0x11357}, {0x11362, 0x11363},
+    {0x11366, 0x1136C}, {0x11370, 0x11374}, {0x11435, 0x11446},
+    {0x11450, 0x11459}, {0x1145E, 0x1145E}, {0x114B0, 0x114C3},
+    {0x114D0, 0x114D9}, {0x115AF, 0x115B5}, {0x115B8, 0x115C0},
+    {0x115DC, 0x115DD}, {0x11630, 0x11640}, {0x11650, 0x11659},
+    {0x116AB, 0x116B7}, {0x116C0, 0x116C9}, {0x1171D, 0x1172B},
+    {0x11730, 0x11739}, {0x1182C, 0x1183A}, {0x118E0, 0x118E9},
+    {0x11930, 0x11935}, {0x11937, 0x11938}, {0x1193B, 0x1193E},
+    {0x11940, 0x11940}, {0x11942, 0x11943}, {0x11950, 0x11959},
+    {0x119D1, 0x119D7}, {0x119DA, 0x119E0}, {0x119E4, 0x119E4},
+    {0x11A01, 0x11A0A}, {0x11A33, 0x11A39}, {0x11A3B, 0x11A3E},
+    {0x11A47, 0x11A47}, {0x11A51, 0x11A5B}, {0x11A8A, 0x11A99},
+    {0x11C2F, 0x11C36}, {0x11C38, 0x11C3F}, {0x11C50, 0x11C59},
+    {0x11C92, 0x11CA7}, {0x11CA9, 0x11CB6}, {0x11D31, 0x11D36},
+    {0x11D3A, 0x11D3A}, {0x11D3C, 0x11D3D}, {0x11D3F, 0x11D45},
+    {0x11D47, 0x11D47}, {0x11D50, 0x11D59}, {0x11D8A, 0x11D8E},
+    {0x11D90, 0x11D91}, {0x11D93, 0x11D97}, {0x11DA0, 0x11DA9},
+    {0x11EF3, 0x11EF6}, {0x11F00, 0x11F01}, {0x11F03, 0x11F03},
+    {0x11F34, 0x11F3A}, {0x11F3E, 0x11F42}, {0x11F50, 0x11F59},
+    {0x13440, 0x13440}, {0x13447, 0x13455}, {0x16A60, 0x16A69},
+    {0x16AC0, 0x16AC9}, {0x16AF0, 0x16AF4}, {0x16B30, 0x16B36},
+    {0x16B50, 0x16B59}, {0x16F4F, 0x16F4F}, {0x16F51, 0x16F87},
+    {0x16F8F, 0x16F92}, {0x16FE4, 0x16FE4}, {0x16FF0, 0x16FF1},
+    {0x1BC9D, 0x1BC9E}, {0x1CF00, 0x1CF2D}, {0x1CF30, 0x1CF46},
+    {0x1D165, 0x1D169}, {0x1D16D, 0x1D172}, {0x1D17B, 0x1D182},
+    {0x1D185, 0x1D18B}, {0x1D1AA, 0x1D1AD}, {0x1D242, 0x1D244},
+    {0x1D7CE, 0x1D7FF}, {0x1DA00, 0x1DA36}, {0x1DA3B, 0x1DA6C},
+    {0x1DA75, 0x1DA75}, {0x1DA84, 0x1DA84}, {0x1DA9B, 0x1DA9F},
+    {0x1DAA1, 0x1DAAF}, {0x1E000, 0x1E006}, {0x1E008, 0x1E018},
+    {0x1E01B, 0x1E021}, {0x1E023, 0x1E024}, {0x1E026, 0x1E02A},
+    {0x1E08F, 0x1E08F}, {0x1E130, 0x1E136}, {0x1E140, 0x1E149},
+    {0x1E2AE, 0x1E2AE}, {0x1E2EC, 0x1E2F9}, {0x1E4EC, 0x1E4F9},
+    {0x1E8D0, 0x1E8D6}, {0x1E944, 0x1E94A}, {0x1E950, 0x1E959},
+    {0x1FBF0, 0x1FBF9}, {0xE0100, 0xE01EF}};
 
 // Clang supports the "Mathematical notation profile" as an extension,
 // as described in https://www.unicode.org/L2/L2022/22230-math-profile.pdf
diff --git a/clang/test/Lexer/unicode.c b/clang/test/Lexer/unicode.c
index d86ac2d5e26049..6ae948c3122e15 100644
--- a/clang/test/Lexer/unicode.c
+++ b/clang/test/Lexer/unicode.c
@@ -38,9 +38,10 @@ extern int ༀ;
 extern int 𑩐;
 extern int 𐠈;
 extern int ꙮ;
-extern int  \u1B4C;     // BALINESE LETTER ARCHAIC JNYA - Added in Unicode 14
-extern int  \U00016AA2; // TANGSA LETTER GA - Added in Unicode 14
-extern int  \U0001E4D0; // 𞓐 NAG MUNDARI LETTER O - Added in Unicode 15
+extern int \u1B4C;     // BALINESE LETTER ARCHAIC JNYA - Added in Unicode 14
+extern int \U00016AA2; // TANGSA LETTER GA - Added in Unicode 14
+extern int \U0001E4D0; // 𞓐 NAG MUNDARI LETTER O - Added in Unicode 15
+extern int \u{2EBF0}; // CJK UNIFIED IDEOGRAPH-2EBF0 - Added in Unicode 15.1
 extern int a\N{TANGSA LETTER GA};
 extern int a\N{TANGSALETTERGA}; // expected-error {{'TANGSALETTERGA' is not a valid Unicode character name}} \
                                 // expected-error {{expected ';' after top level declarator}} \
diff --git a/llvm/lib/Support/Unicode.cpp b/llvm/lib/Support/Unicode.cpp
index 621ffc712187a5..288b75c872e175 100644
--- a/llvm/lib/Support/Unicode.cpp
+++ b/llvm/lib/Support/Unicode.cpp
@@ -25,7 +25,7 @@ namespace unicode {
 /// it's actually displayed on most terminals. \return true if the character is
 /// considered printable.
 bool isPrintable(int UCS) {
-  // https://unicode.org/Public/15.0.0/ucdxml/
+  // https://unicode.org/Public/15.1.0/ucdxml/
   static const UnicodeCharRange PrintableRanges[] = {
       {0x0020, 0x007E},   {0x00A0, 0x00AC},   {0x00AE, 0x0377},
       {0x037A, 0x037F},   {0x0384, 0x038A},   {0x038C, 0x038C},
@@ -119,151 +119,152 @@ bool isPrintable(int UCS) {
       {0x2DB8, 0x2DBE},   {0x2DC0, 0x2DC6},   {0x2DC8, 0x2DCE},
       {0x2DD0, 0x2DD6},   {0x2DD8, 0x2DDE},   {0x2DE0, 0x2E5D},
       {0x2E80, 0x2E99},   {0x2E9B, 0x2EF3},   {0x2F00, 0x2FD5},
-      {0x2FF0, 0x2FFB},   {0x3000, 0x303F},   {0x3041, 0x3096},
-      {0x3099, 0x30FF},   {0x3105, 0x312F},   {0x3131, 0x318E},
-      {0x3190, 0x31E3},   {0x31F0, 0x321E},   {0x3220, 0xA48C},
-      {0xA490, 0xA4C6},   {0xA4D0, 0xA62B},   {0xA640, 0xA6F7},
-      {0xA700, 0xA7CA},   {0xA7D0, 0xA7D1},   {0xA7D3, 0xA7D3},
-      {0xA7D5, 0xA7D9},   {0xA7F2, 0xA82C},   {0xA830, 0xA839},
-      {0xA840, 0xA877},   {0xA880, 0xA8C5},   {0xA8CE, 0xA8D9},
-      {0xA8E0, 0xA953},   {0xA95F, 0xA97C},   {0xA980, 0xA9CD},
-      {0xA9CF, 0xA9D9},   {0xA9DE, 0xA9FE},   {0xAA00, 0xAA36},
-      {0xAA40, 0xAA4D},   {0xAA50, 0xAA59},   {0xAA5C, 0xAAC2},
-      {0xAADB, 0xAAF6},   {0xAB01, 0xAB06},   {0xAB09, 0xAB0E},
-      {0xAB11, 0xAB16},   {0xAB20, 0xAB26},   {0xAB28, 0xAB2E},
-      {0xAB30, 0xAB6B},   {0xAB70, 0xABED},   {0xABF0, 0xABF9},
-      {0xAC00, 0xD7A3},   {0xD7B0, 0xD7C6},   {0xD7CB, 0xD7FB},
-      {0xF900, 0xFA6D},   {0xFA70, 0xFAD9},   {0xFB00, 0xFB06},
-      {0xFB13, 0xFB17},   {0xFB1D, 0xFB36},   {0xFB38, 0xFB3C},
-      {0xFB3E, 0xFB3E},   {0xFB40, 0xFB41},   {0xFB43, 0xFB44},
-      {0xFB46, 0xFBC2},   {0xFBD3, 0xFD8F},   {0xFD92, 0xFDC7},
-      {0xFDCF, 0xFDCF},   {0xFDF0, 0xFE19},   {0xFE20, 0xFE52},
-      {0xFE54, 0xFE66},   {0xFE68, 0xFE6B},   {0xFE70, 0xFE74},
-      {0xFE76, 0xFEFC},   {0xFF01, 0xFFBE},   {0xFFC2, 0xFFC7},
-      {0xFFCA, 0xFFCF},   {0xFFD2, 0xFFD7},   {0xFFDA, 0xFFDC},
-      {0xFFE0, 0xFFE6},   {0xFFE8, 0xFFEE},   {0xFFFC, 0xFFFD},
-      {0x10000, 0x1000B}, {0x1000D, 0x10026}, {0x10028, 0x1003A},
-      {0x1003C, 0x1003D}, {0x1003F, 0x1004D}, {0x10050, 0x1005D},
-      {0x10080, 0x100FA}, {0x10100, 0x10102}, {0x10107, 0x10133},
-      {0x10137, 0x1018E}, {0x10190, 0x1019C}, {0x101A0, 0x101A0},
-      {0x101D0, 0x101FD}, {0x10280, 0x1029C}, {0x102A0, 0x102D0},
-      {0x102E0, 0x102FB}, {0x10300, 0x10323}, {0x1032D, 0x1034A},
-      {0x10350, 0x1037A}, {0x10380, 0x1039D}, {0x1039F, 0x103C3},
-      {0x103C8, 0x103D5}, {0x10400, 0x1049D}, {0x104A0, 0x104A9},
-      {0x104B0, 0x104D3}, {0x104D8, 0x104FB}, {0x10500, 0x10527},
-      {0x10530, 0x10563}, {0x1056F, 0x1057A}, {0x1057C, 0x1058A},
-      {0x1058C, 0x10592}, {0x10594, 0x10595}, {0x10597, 0x105A1},
-      {0x105A3, 0x105B1}, {0x105B3, 0x105B9}, {0x105BB, 0x105BC},
-      {0x10600, 0x10736}, {0x10740, 0x10755}, {0x10760, 0x10767},
-      {0x10780, 0x10785}, {0x10787, 0x107B0}, {0x107B2, 0x107BA},
-      {0x10800, 0x10805}, {0x10808, 0x10808}, {0x1080A, 0x10835},
-      {0x10837, 0x10838}, {0x1083C, 0x1083C}, {0x1083F, 0x10855},
-      {0x10857, 0x1089E}, {0x108A7, 0x108AF}, {0x108E0, 0x108F2},
-      {0x108F4, 0x108F5}, {0x108FB, 0x1091B}, {0x1091F, 0x10939},
-      {0x1093F, 0x1093F}, {0x10980, 0x109B7}, {0x109BC, 0x109CF},
-      {0x109D2, 0x10A03}, {0x10A05, 0x10A06}, {0x10A0C, 0x10A13},
-      {0x10A15, 0x10A17}, {0x10A19, 0x10A35}, {0x10A38, 0x10A3A},
-      {0x10A3F, 0x10A48}, {0x10A50, 0x10A58}, {0x10A60, 0x10A9F},
-      {0x10AC0, 0x10AE6}, {0x10AEB, 0x10AF6}, {0x10B00, 0x10B35},
-      {0x10B39, 0x10B55}, {0x10B58, 0x10B72}, {0x10B78, 0x10B91},
-      {0x10B99, 0x10B9C}, {0x10BA9, 0x10BAF}, {0x10C00, 0x10C48},
-      {0x10C80, 0x10CB2}, {0x10CC0, 0x10CF2}, {0x10CFA, 0x10D27},
-      {0x10D30, 0x10D39}, {0x10E60, 0x10E7E}, {0x10E80, 0x10EA9},
-      {0x10EAB, 0x10EAD}, {0x10EB0, 0x10EB1}, {0x10EFD, 0x10F27},
-      {0x10F30, 0x10F59}, {0x10F70, 0x10F89}, {0x10FB0, 0x10FCB},
-      {0x10FE0, 0x10FF6}, {0x11000, 0x1104D}, {0x11052, 0x11075},
-      {0x1107F, 0x110BC}, {0x110BE, 0x110C2}, {0x110D0, 0x110E8},
-      {0x110F0, 0x110F9}, {0x11100, 0x11134}, {0x11136, 0x11147},
-      {0x11150, 0x11176}, {0x11180, 0x111DF}, {0x111E1, 0x111F4},
-      {0x11200, 0x11211}, {0x11213, 0x11241}, {0x11280, 0x11286},
-      {0x11288, 0x11288}, {0x1128A, 0x1128D}, {0x1128F, 0x1129D},
-      {0x1129F, 0x112A9}, {0x112B0, 0x112EA}, {0x112F0, 0x112F9},
-      {0x11300, 0x11303}, {0x11305, 0x1130C}, {0x1130F, 0x11310},
-      {0x11313, 0x11328}, {0x1132A, 0x11330}, {0x11332, 0x11333},
-      {0x11335, 0x11339}, {0x1133B, 0x11344}, {0x11347, 0x11348},
-      {0x1134B, 0x1134D}, {0x11350, 0x11350}, {0x11357, 0x11357},
-      {0x1135D, 0x11363}, {0x11366, 0x1136C}, {0x11370, 0x11374},
-      {0x11400, 0x1145B}, {0x1145D, 0x11461}, {0x11480, 0x114C7},
-      {0x114D0, 0x114D9}, {0x11580, 0x115B5}, {0x115B8, 0x115DD},
-      {0x11600, 0x11644}, {0x11650, 0x11659}, {0x11660, 0x1166C},
-      {0x11680, 0x116B9}, {0x116C0, 0x116C9}, {0x11700, 0x1171A},
-      {0x1171D, 0x1172B}, {0x11730, 0x11746}, {0x11800, 0x1183B},
-      {0x118A0, 0x118F2}, {0x118FF, 0x11906}, {0x11909, 0x11909},
-      {0x1190C, 0x11913}, {0x11915, 0x11916}, {0x11918, 0x11935},
-      {0x11937, 0x11938}, {0x1193B, 0x11946}, {0x11950, 0x11959},
-      {0x119A0, 0x119A7}, {0x119AA, 0x119D7}, {0x119DA, 0x119E4},
-      {0x11A00, 0x11A47}, {0x11A50, 0x11AA2}, {0x11AB0, 0x11AF8},
-      {0x11B00, 0x11B09}, {0x11C00, 0x11C08}, {0x11C0A, 0x11C36},
-      {0x11C38, 0x11C45}, {0x11C50, 0x11C6C}, {0x11C70, 0x11C8F},
-      {0x11C92, 0x11CA7}, {0x11CA9, 0x11CB6}, {0x11D00, 0x11D06},
-      {0x11D08, 0x11D09}, {0x11D0B, 0x11D36}, {0x11D3A, 0x11D3A},
-      {0x11D3C, 0x11D3D}, {0x11D3F, 0x11D47}, {0x11D50, 0x11D59},
-      {0x11D60, 0x11D65}, {0x11D67, 0x11D68}, {0x11D6A, 0x11D8E},
-      {0x11D90, 0x11D91}, {0x11D93, 0x11D98}, {0x11DA0, 0x11DA9},
-      {0x11EE0, 0x11EF8}, {0x11F00, 0x11F10}, {0x11F12, 0x11F3A},
-      {0x11F3E, 0x11F59}, {0x11FB0, 0x11FB0}, {0x11FC0, 0x11FF1},
-      {0x11FFF, 0x12399}, {0x12400, 0x1246E}, {0x12470, 0x12474},
-      {0x12480, 0x12543}, {0x12F90, 0x12FF2}, {0x13000, 0x1342F},
-      {0x13440, 0x13455}, {0x14400, 0x14646}, {0x16800, 0x16A38},
-      {0x16A40, 0x16A5E}, {0x16A60, 0x16A69}, {0x16A6E, 0x16ABE},
-      {0x16AC0, 0x16AC9}, {0x16AD0, 0x16AED}, {0x16AF0, 0x16AF5},
-      {0x16B00, 0x16B45}, {0x16B50, 0x16B59}, {0x16B5B, 0x16B61},
-      {0x16B63, 0x16B77}, {0x16B7D, 0x16B8F}, {0x16E40, 0x16E9A},
-      {0x16F00, 0x16F4A}, {0x16F4F, 0x16F87}, {0x16F8F, 0x16F9F},
-      {0x16FE0, 0x16FE4}, {0x16FF0, 0x16FF1}, {0x17000, 0x187F7},
-      {0x18800, 0x18CD5}, {0x18D00, 0x18D08}, {0x1AFF0, 0x1AFF3},
-      {0x1AFF5, 0x1AFFB}, {0x1AFFD, 0x1AFFE}, {0x1B000, 0x1B122},
-      {0x1B132, 0x1B132}, {0x1B150, 0x1B152}, {0x1B155, 0x1B155},
-      {0x1B164, 0x1B167}, {0x1B170, 0x1B2FB}, {0x1BC00, 0x1BC6A},
-      {0x1BC70, 0x1BC7C}, {0x1BC80, 0x1BC88}, {0x1BC90, 0x1BC99},
-      {0x1BC9C, 0x1BC9F}, {0x1CF00, 0x1CF2D}, {0x1CF30, 0x1CF46},
-      {0x1CF50, 0x1CFC3}, {0x1D000, 0x1D0F5}, {0x1D100, 0x1D126},
-      {0x1D129, 0x1D172}, {0x1D17B, 0x1D1EA}, {0x1D200, 0x1D245},
-      {0x1D2C0, 0x1D2D3}, {0x1D2E0, 0x1D2F3}, {0x1D300, 0x1D356},
-      {0x1D360, 0x1D378}, {0x1D400, 0x1D454}, {0x1D456, 0x1D49C},
-      {0x1D49E, 0x1D49F}, {0x1D4A2, 0x1D4A2}, {0x1D4A5, 0x1D4A6},
-      {0x1D4A9, 0x1D4AC}, {0x1D4AE, 0x1D4B9}, {0x1D4BB, 0x1D4BB},
-      {0x1D4BD, 0x1D4C3}, {0x1D4C5, 0x1D505}, {0x1D507, 0x1D50A},
-      {0x1D50D, 0x1D514}, {0x1D516, 0x1D51C}, {0x1D51E, 0x1D539},
-      {0x1D53B, 0x1D53E}, {0x1D540, 0x1D544}, {0x1D546, 0x1D546},
-      {0x1D54A, 0x1D550}, {0x1D552, 0x1D6A5}, {0x1D6A8, 0x1D7CB},
-      {0x1D7CE, 0x1DA8B}, {0x1DA9B, 0x1DA9F}, {0x1DAA1, 0x1DAAF},
-      {0x1DF00, 0x1DF1E}, {0x1DF25, 0x1DF2A}, {0x1E000, 0x1E006},
-      {0x1E008, 0x1E018}, {0x1E01B, 0x1E021}, {0x1E023, 0x1E024},
-      {0x1E026, 0x1E02A}, {0x1E030, 0x1E06D}, {0x1E08F, 0x1E08F},
-      {0x1E100, 0x1E12C}, {0x1E130, 0x1E13D}, {0x1E140, 0x1E149},
-      {0x1E14E, 0x1E14F}, {0x1E290, 0x1E2AE}, {0x1E2C0, 0x1E2F9},
-      {0x1E2FF, 0x1E2FF}, {0x1E4D0, 0x1E4F9}, {0x1E7E0, 0x1E7E6},
-      {0x1E7E8, 0x1E7EB}, {0x1E7ED, 0x1E7EE}, {0x1E7F0, 0x1E7FE},
-      {0x1E800, 0x1E8C4}, {0x1E8C7, 0x1E8D6}, {0x1E900, 0x1E94B},
-      {0x1E950, 0x1E959}, {0x1E95E, 0x1E95F}, {0x1EC71, 0x1ECB4},
-      {0x1ED01, 0x1ED3D}, {0x1EE00, 0x1EE03}, {0x1EE05, 0x1EE1F},
-      {0x1EE21, 0x1EE22}, {0x1EE24, 0x1EE24}, {0x1EE27, 0x1EE27},
-      {0x1EE29, 0x1EE32}, {0x1EE34, 0x1EE37}, {0x1EE39, 0x1EE39},
-      {0x1EE3B, 0x1EE3B}, {0x1EE42, 0x1EE42}, {0x1EE47, 0x1EE47},
-      {0x1EE49, 0x1EE49}, {0x1EE4B, 0x1EE4B}, {0x1EE4D, 0x1EE4F},
-      {0x1EE51, 0x1EE52}, {0x1EE54, 0x1EE54}, {0x1EE57, 0x1EE57},
-      {0x1EE59, 0x1EE59}, {0x1EE5B, 0x1EE5B}, {0x1EE5D,...
[truncated]
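
A note for reviewers: in the visible part of the XID_Continue hunk, the only new entries appear to be {0x30FB, 0x30FB} and {0xFF65, 0xFF65}; the other changed lines look like the same ranges reflowed across the three-per-row layout. One way to check this kind of thing is to compare the tables entry by entry instead of line by line. The sketch below is a hypothetical helper, not part of this patch, and the two slices are excerpts with most entries elided.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <tuple>
#include <vector>

// Hypothetical stand-in for a range entry; both bounds are inclusive.
struct CharRange {
  uint32_t Lower;
  uint32_t Upper;
};

// Order entries by (Lower, Upper); sorted, non-overlapping tables already
// satisfy this ordering.
static bool rangeLess(const CharRange &A, const CharRange &B) {
  return std::tie(A.Lower, A.Upper) < std::tie(B.Lower, B.Upper);
}

// Entries present in New but not (verbatim) in Old. A range whose bounds
// changed shows up here as a new entry.
std::vector<CharRange> addedEntries(const std::vector<CharRange> &Old,
                                    const std::vector<CharRange> &New) {
  std::vector<CharRange> Added;
  std::set_difference(New.begin(), New.end(), Old.begin(), Old.end(),
                      std::back_inserter(Added), rangeLess);
  return Added;
}

// Excerpts of the old and new XID_Continue-only tables (entries elided).
static const std::vector<CharRange> OldSlice = {
    {0x302A, 0x302F}, {0x3099, 0x309A}, {0xA620, 0xA629},
    {0xFF3F, 0xFF3F}, {0xFF9E, 0xFF9F}};
static const std::vector<CharRange> NewSlice = {
    {0x302A, 0x302F}, {0x3099, 0x309A}, {0x30FB, 0x30FB}, {0xA620, 0xA629},
    {0xFF3F, 0xFF3F}, {0xFF65, 0xFF65}, {0xFF9E, 0xFF9F}};
```

Calling `addedEntries(OldSlice, NewSlice)` on these excerpts yields the two single-code-point entries for U+30FB and U+FF65.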

github-actions bot commented Jan 5, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@cor3ntin
Contributor Author

cor3ntin commented Jan 9, 2024

@nikic looks like you closed the wrong issue

@cor3ntin cor3ntin reopened this Jan 9, 2024
@nikic
Contributor

nikic commented Jan 9, 2024

Ooops, sorry about that.

Collaborator

@AaronBallman AaronBallman left a comment

LGTM

@cor3ntin cor3ntin merged commit 03e43cf into llvm:main Jan 17, 2024
@MaskRay
Member

MaskRay commented Jan 17, 2024

Nice! I wonder whether there are any documentation or procedure notes to facilitate future updates, e.g. which files need to be updated?

@cor3ntin
Contributor Author

@MaskRay sadly, at the moment we have no automation whatsoever for some of the tables (only character names and case folding have upstream scripts). We really should write a script to generate the tables. (The one I use is fairly nasty, and I need to copy each table into the corresponding file manually; it's... not great.)
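
To make the idea of a table generator concrete, here is a rough sketch of its core step: collapsing a sorted list of code points into inclusive ranges and printing them as an initializer list. This is not the script mentioned above. The names `toRanges` and `formatTable` are hypothetical, reading the code points out of the Unicode data files is left out, and the final column alignment is assumed to be handled by clang-format.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical stand-in for a range entry; both bounds are inclusive.
struct CharRange {
  uint32_t Lower;
  uint32_t Upper;
};

// Collapse a sorted, duplicate-free list of code points into ranges by
// extending the last range whenever the next code point is adjacent.
std::vector<CharRange> toRanges(const std::vector<uint32_t> &CodePoints) {
  std::vector<CharRange> Ranges;
  for (uint32_t CP : CodePoints) {
    if (!Ranges.empty() && Ranges.back().Upper + 1 == CP)
      Ranges.back().Upper = CP;
    else
      Ranges.push_back({CP, CP});
  }
  return Ranges;
}

// Render ranges as rows of a C++ initializer list, three entries per row.
std::string formatTable(const std::vector<CharRange> &Ranges) {
  std::string Out;
  char Buf[40];
  for (std::size_t I = 0; I < Ranges.size(); ++I) {
    std::snprintf(Buf, sizeof(Buf), "{0x%04X, 0x%04X}",
                  static_cast<unsigned>(Ranges[I].Lower),
                  static_cast<unsigned>(Ranges[I].Upper));
    Out += (I % 3 == 0) ? "    " : " "; // indent row starts, space otherwise
    Out += Buf;
    if (I + 1 != Ranges.size())
      Out += ",";
    if (I % 3 == 2 || I + 1 == Ranges.size())
      Out += "\n";
  }
  return Out;
}
```

Feeding the output of `toRanges` to `formatTable` gives text that can be pasted between the braces of a table definition, which would at least remove the manual range-building step.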
