Skip to content

Add support for char8_t (from C++20). #26153

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions include/swift/ClangImporter/BuiltinMappedTypes.def
Original file line number Diff line number Diff line change
Expand Up @@ -38,6 +38,7 @@ MAP_BUILTIN_INTEGER_TYPE(ULongLong, CUnsignedLongLong)
MAP_BUILTIN_INTEGER_TYPE(UInt128, CUnsignedInt128)
MAP_BUILTIN_INTEGER_TYPE(WChar_S, CWideChar)
MAP_BUILTIN_INTEGER_TYPE(WChar_U, CWideChar)
MAP_BUILTIN_INTEGER_TYPE(Char8, CChar8)
MAP_BUILTIN_INTEGER_TYPE(Char16, CChar16)
MAP_BUILTIN_INTEGER_TYPE(Char32, CChar32)
MAP_BUILTIN_INTEGER_TYPE(SChar, CSignedChar)
Expand Down
2 changes: 2 additions & 0 deletions lib/ClangImporter/ImportMacro.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -258,6 +258,8 @@ static Optional<clang::QualType> builtinTypeForToken(const clang::Token &tok,
return clang::QualType(context.WCharTy);
case clang::tok::kw_bool:
return clang::QualType(context.BoolTy);
case clang::tok::kw_char8_t:
return clang::QualType(context.Char8Ty);
case clang::tok::kw_char16_t:
return clang::QualType(context.Char16Ty);
case clang::tok::kw_char32_t:
Expand Down
1 change: 0 additions & 1 deletion lib/ClangImporter/ImportType.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -260,7 +260,6 @@ namespace {
case clang::BuiltinType::Float16:
case clang::BuiltinType::Float128:
case clang::BuiltinType::NullPtr:
case clang::BuiltinType::Char8:
return Type();

// Objective-C types that aren't mapped directly; rather, pointers to
Expand Down
2 changes: 2 additions & 0 deletions lib/PrintAsObjC/PrintAsObjC.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1494,6 +1494,7 @@ class ObjCPrinter : private DeclVisitor<ObjCPrinter>,

MAP(CChar, "char", false);
MAP(CWideChar, "wchar_t", false);
MAP(CChar8, "char8_t", false);
MAP(CChar16, "char16_t", false);
MAP(CChar32, "char32_t", false);

Expand Down Expand Up @@ -2579,6 +2580,7 @@ class ModuleWriter {
"# if __has_include(<uchar.h>)\n"
"# include <uchar.h>\n"
"# elif !defined(__cplusplus)\n"
"typedef unsigned char char8_t;\n"
"typedef uint_least16_t char16_t;\n"
"typedef uint_least32_t char32_t;\n"
"# endif\n"
Expand Down
3 changes: 3 additions & 0 deletions stdlib/public/core/CTypes.swift
Original file line number Diff line number Diff line change
Expand Up @@ -100,6 +100,9 @@ public typealias CLongDouble = Double
/// The C++ 'wchar_t' type.
public typealias CWideChar = Unicode.Scalar

/// The C++20 'char8_t' type, which has UTF-8 encoding.
public typealias CChar8 = UInt8
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@airspeedswift, how are we handling availability for new top-level typealiases? These can't affect ABI.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I suggested Varun pick this up as a starter-ish task, but this is a public change to the stdlib, and CChar8 is close to CChar. Does it need formal review or even a proposal?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, given that WG14 is working on a TN for char16_t and char32_t, I could see char8_t being introduced into C, where it gets interesting with -fsigned-char and -fno-signed-char. I think that on Darwin targets where you default to -fsigned-char, having CChar be signed char, and CChar8 be unsigned char is slightly scary. Might be worth bringing this up through a proposal.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@compnerd but what's the alternative, really? Yeah, it would be slightly messy, but there's really no other option on the table except maybe "just don't import these types". I think this is a thing that we just live with.

Copy link
Member

@compnerd compnerd Jul 18, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not convinced that "just don't import these types" is a viable solution (particularly long term). Sure, we can get away with it for the time being, but, I don't think that we would be able to escape it forever.

Now, I'm not saying that this is what we should do or even that what I'm suggesting is good. Hopefully someone can come up with something more clever.

We could import this as CXX.Char8 (where CXX is a @frozen enum which I believe should allow inling).

Yes, I am a bad person doing bad things to the compiler :-(.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the record, non-generic containing contexts don't really affect codegen at all for nested types. So yeah, that is technically an option. It's a weird choice if C22 adopts char8_t as well, though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But we could introduce CChar8 at that time for the C char8_t and if they make it compatible with char (including -fsigned-char), we can handle both.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't see char8_t ever being signed; it's defined as a UTF-8 code unit. And we can't introduce CChar8 separate from CXX.Char8 because they'd be different types you could overload on (unless one is a typealias).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, they're all typealiases. Retracted, at least partially.


// FIXME: Swift should probably have a UTF-16 type other than UInt16.
//
/// The C++11 'char16_t' type, which has UTF-16 encoding.
Expand Down
8 changes: 4 additions & 4 deletions test/IRGen/objc_type_encoding.swift
Original file line number Diff line number Diff line change
Expand Up @@ -55,10 +55,10 @@ import gizmo
// CHECK-ios: private unnamed_addr constant [23 x i8] c"v44@0:8C16S20I24Q28Q36\00"
// CHECK-tvos: private unnamed_addr constant [23 x i8] c"v44@0:8C16S20I24Q28Q36\00"

@objc func testCChars(_ basic: CChar, wchar wide: CWideChar, char16: CChar16, char32: CChar32) {}
// CHECK-macosx: private unnamed_addr constant [20 x i8] c"v32@0:8c16i20S24i28\00"
// CHECK-ios: private unnamed_addr constant [20 x i8] c"v32@0:8c16i20S24i28\00"
// CHECK-tvos: private unnamed_addr constant [20 x i8] c"v32@0:8c16i20S24i28\00"
@objc func testCChars(_ basic: CChar, wchar wide: CWideChar, char8: CChar8, char16: CChar16, char32: CChar32) {}
// CHECK-macosx: private unnamed_addr constant [23 x i8] c"v36@0:8c16i20C24S28i32\00"
// CHECK-ios: private unnamed_addr constant [23 x i8] c"v36@0:8c16i20C24S28i32\00"
// CHECK-tvos: private unnamed_addr constant [23 x i8] c"v36@0:8c16i20C24S28i32\00"

@objc func testCBool(_ a: CBool) {}
// CHECK-macosx: private unnamed_addr constant [11 x i8] c"v20@0:8c16\00"
Expand Down
4 changes: 2 additions & 2 deletions test/PrintAsObjC/classes.swift
Original file line number Diff line number Diff line change
Expand Up @@ -198,7 +198,7 @@ class NotObjC {}
// CHECK-NEXT: - (void)testSelector:(SEL _Nonnull)sel boolean:(BOOL)b;
// CHECK-NEXT: - (void)testCSignedTypes:(signed char)a b:(short)b c:(int)c d:(long)d e:(long long)e;
// CHECK-NEXT: - (void)testCUnsignedTypes:(unsigned char)a b:(unsigned short)b c:(unsigned int)c d:(unsigned long)d e:(unsigned long long)e;
// CHECK-NEXT: - (void)testCChars:(char)basic wchar:(wchar_t)wide char16:(char16_t)char16 char32:(char32_t)char32;
// CHECK-NEXT: - (void)testCChars:(char)basic wchar:(wchar_t)wide char8:(char8_t)char8 char16:(char16_t)char16 char32:(char32_t)char32;
// CHECK-NEXT: - (void)testCFloats:(float)a b:(double)b;
// CHECK-NEXT: - (void)testCBool:(bool)a;
// CHECK-NEXT: - (void)testSizedSignedTypes:(int8_t)a b:(int16_t)b c:(int32_t)c d:(int64_t)d;
Expand Down Expand Up @@ -244,7 +244,7 @@ class NotObjC {}

@objc func testCSignedTypes(_ a: CSignedChar, b: CShort, c: CInt, d: CLong, e: CLongLong) {}
@objc func testCUnsignedTypes(_ a: CUnsignedChar, b: CUnsignedShort, c: CUnsignedInt, d: CUnsignedLong, e: CUnsignedLongLong) {}
@objc func testCChars(_ basic: CChar, wchar wide: CWideChar, char16: CChar16, char32: CChar32) {}
@objc func testCChars(_ basic: CChar, wchar wide: CWideChar, char8: CChar8, char16: CChar16, char32: CChar32) {}
@objc func testCFloats(_ a: CFloat, b: CDouble) {}
@objc func testCBool(_ a: CBool) {}

Expand Down
3 changes: 3 additions & 0 deletions test/stdlib/PrintInteger.swift
Original file line number Diff line number Diff line change
Expand Up @@ -34,6 +34,7 @@ PrintTests.test("CustomStringConvertible") {
hasDescription(CLong(42))
hasDescription(CLongLong(42))
hasDescription(CWideChar(42)!)
hasDescription(CChar8(42))
hasDescription(CChar16(42))
hasDescription(CChar32(42)!)
}
Expand All @@ -51,6 +52,7 @@ PrintTests.test("Printable") {
expectPrinted("42", CLong(42))
expectPrinted("42", CLongLong(42))
expectPrinted("*", CWideChar(42)!)
expectPrinted("42", CChar8(42))
expectPrinted("42", CChar16(42))
expectPrinted("*", CChar32(42)!)

Expand Down Expand Up @@ -142,6 +144,7 @@ PrintTests.test("Printable") {
expectPrinted("42", CLongLong(42))

expectPrinted("*", CWideChar(42)!)
expectPrinted("42", CChar8(42))
expectPrinted("42", CChar16(42))
expectPrinted("*", CChar32(42)!)
}
Expand Down