-
Notifications
You must be signed in to change notification settings - Fork 1.2k
SR-7455: Allow NUL in Strings to match Darwin #1677
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@swift-ci test |
result = str.withMemoryRebound(to: UInt8.self, capacity: length) { ptr in | ||
let buffer = UnsafeBufferPointer(start: ptr, count: length) | ||
return String(decoding: buffer, as: UTF8.self) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good. If you're open to a small addition to your fix, we may want to also consider trying to call CFStringGetCharactersPtr
if CFStringGetCStringPtr
fails. That one may return a pointer to a UTF16 encoded buffer (or NULL), which we can also use without re-allocating in the slow path below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/* These functions attempt to return in O(1) time the desired format for the string.
Note that although this means a pointer to the internal structure is being returned,
this can't always be counted on. Please see note at the top of the file for more
details.
*/
CF_EXPORT
const UniChar *CFStringGetCharactersPtr(CFStringRef theString); /* May return NULL at any time; be prepared for NULL, if not now, in some other time or place. See discussion at top of this file. */
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pakera I will look into making that change. BTW, is it correct to use CFStringGetLength
to find the number of bytes in the UTF8 string or is there something better to use?
@swift-ci test |
1 similar comment
@swift-ci test |
@parkera does the 2nd commit look ok? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CFStringGetLength returns the number of characters, which is not the number of UTF-8 bytes of their encoding. I’m pretty sure that’s not okay.
(Where ‘characters’ means ‘UTF-16 code points’.) |
@millenomi Are |
I've made four attempts at this and no longer have the time to do so, so I'm glad you're taking a look. There are the four attempts, which hopefully will give some insight: It's been a while since I've thought about this, but I recall being greatly dismayed by the following:
|
@swift-ci test |
1 similar comment
@swift-ci test |
@millenomi I think I have addressed the UTF-8 multibyte issue when using Some testing actually revealed that constructing a multibyte UTF-8 |
@millenomi Can you check if my last comment about only requesting an ASCII string when using |
Yeah, in that encoding byte length is guaranteed to be equal to character length. |
@swift-ci test and merge |
1 similar comment
@swift-ci test and merge |
@swift-ci please test and merge |
1 similar comment
@swift-ci please test and merge |
- String._conditionallyBridgeFromObjectiveC: Use String(decoding:as: UTF8.self) instead of String(cString:) to allow NULs in the decoded string. - Update tests which use buffers with trailing zeros to create strings, making the tests work correctly against Darwin's native Foundation.
- If CFStringGetLength() returns 0 just create an empty string (""). - Try CFStringGetCharactersPtr() before CFStringGetCharacters() to avoid allocating a buffer.
- Add tests for multibyte UTF-8/ UTF-16.
@spevans I missed the fact this has conflicts; even with them, tho, the build failure is worrying. Let's try again tomorrow. |
@swift-ci test |
The Mojave merge had some whitespace that caused a conflict, Ive just rebased. But yes, that build error is odd and not one Ive seen before. |
@swift-ci please test and merge |
1 similar comment
@swift-ci please test and merge |
String._conditionallyBridgeFromObjectiveC:
Use String(decoding:as: UTF8.self) instead of String(cString:) to
allow NULs in the decoded string.
Update tests which use buffers with trailing zeros to create strings,
making the tests work correctly against Darwin's native Foundation.