NSString bridging and internal NUL, take two [SR-2225] #549
What's going on in this PR
Existing bridging is buggy
The existing implementation of bridging from _NSCFString to String was broken in several ways:
As a consequence, if UTF8-encoded bytes that encode internal NUL characters and multi-byte characters were used to initialize an NSString, the "fast path" would clobber bridging in one way, and the "slow path" would clobber bridging in a different way.
Revised bridging
Since CF UTF16-related functions are erratic, I use a buffer to store UTF8-encoded bytes. If some way can be found to determine (reliably, not relying on the result of a buggy implementation in CF) the length in bytes of a CFString that uses UTF8-encoded bytes for storage, then a fast path can be restored that doesn't require allocating a buffer.
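To illustrate why a buffer with an explicit byte count matters here, the following is a minimal sketch in plain Swift (standard library only). It is not the code from this PR; the byte sequence and variable names are hypothetical, chosen to combine internal NULs, consecutive NULs, and a multi-byte character.

```swift
// Hypothetical input: "a", NUL, "é" (two UTF-8 bytes), NUL, NUL, "b".
let bytes: [UInt8] = [0x61, 0x00, 0xC3, 0xA9, 0x00, 0x00, 0x62]

// Decoding from a buffer whose length is known keeps every NUL.
let full = String(decoding: bytes, as: UTF8.self)

// Treating the same bytes as a NUL-terminated C string stops at the first NUL.
// A terminator is appended so the C-string read is always in bounds.
let truncated = (bytes + [0]).withUnsafeBufferPointer { buffer in
    String(cString: buffer.baseAddress!)
}

print(full.unicodeScalars.count)  // scalars recovered from the full buffer
print(full.utf8.count)            // bytes in the round-tripped string
print(truncated.utf8.count)       // bytes seen by the C-string path
```

Any path that infers the length from the content rather than from the buffer risks the kind of loss shown by `truncated`.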
Modified tests
After implementing this revised bridging, five tests began to fail. On further examination, these tests were found to be incorrect because they relied on multiple tandem NUL characters being lost on bridging. On a Mac, Darwin Foundation fails these five tests as well. I have corrected the tests.
This PR additionally adds a test for constructing an NSString with data that contains an internal NUL character.
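A check along these lines might look like the sketch below. It is an assumption about the shape of such a test, not the test added in this PR: the input bytes and names are hypothetical, and it is written against the current `Data` and `NSString` APIs rather than the XCTest harness used in the repository.

```swift
import Foundation

// Hypothetical input: "a", NUL, "b" encoded as UTF-8.
let input: [UInt8] = [0x61, 0x00, 0x62]
let data = Data(input)

// Construct the NSString from data that contains an internal NUL.
let string = NSString(data: data, encoding: String.Encoding.utf8.rawValue)

// The NUL should count toward the length and be readable as a UTF-16 unit.
let length = string?.length
let middle = string?.character(at: 1)

// Constructing a Swift String from the same data should keep all three bytes.
let swiftString = String(data: data, encoding: .utf8)
```

In the repository's tests these values would be compared with `XCTAssertEqual`; the expected results are a length of 3, a middle unit of 0, and a three-byte Swift string.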
Miscellaneous changes
This PR rolls in the previously discussed (#496) change in init?(data:encoding:).
This PR also fixes three nits: a typo in a test name; an alphabetization issue in the Xcode project for a test file; and inconsistent use of an unindented //.