[stdlib] Make String.Index(_:within:) initializers more permissive #42442
Merged
In Swift 5.6 and below, (broken) code that acquired indices from a UTF-16-encoded string bridged from Cocoa and kept using them after a `makeContiguousUTF8` call (or other mutation) may have appeared to be working correctly as long as the string was ASCII.

Since swiftlang#41417, the `String.Index(_:within:)` initializers recognize miscoded indices and reject them by returning nil. This is technically correct, but it unfortunately may be a binary compatibility issue, as these used to return non-nil in previous versions.

Mitigate this issue by accepting UTF-16 indices on a UTF-8 string, transcoding their offset as needed. (Attempting to use a UTF-8 index on a UTF-16 string is still rejected; we do not implicitly convert strings in that direction.)

rdar://89369680
…tring.Index(_:within:)`

Fix a long-standing issue where the UTF16View overload of `String.Index.init(_:within:)` used to return nil for valid indices that happened to point to a trailing surrogate in a UTF-8-encoded string.

rdar://91935537
There is little point to having `isUTF16` properties when they simply return `!isUTF8`; remove them. Rename `String.Index._copyEncoding(from:)` to `_copyingEncoding(from:)`.
@swift-ci test
Azoy approved these changes on Apr 19, 2022
glessard reviewed on Apr 19, 2022
glessard approved these changes on Apr 19, 2022
@swift-ci test
@swift-ci test Windows platform
Windows seems to be down with an unrelated failure, but it did seem to pass the new tests:
In Swift 5.6 and below, (broken) code that acquired indices from a UTF-16-encoded string bridged from Cocoa and kept using them after a `makeContiguousUTF8` call (or other mutation) may have appeared to be working correctly as long as the string consisted of ASCII characters up to the given index.

Since #41417, the `String.Index(_:within:)` initializers recognize miscoded indices and reject them by returning nil, so code following this pattern prints `Invalid index`. This is technically correct, but it unfortunately may be a binary compatibility issue: in Swift 5.6 and below, this initializer used to return a non-nil value, and because the string starts with ASCII characters, the returned index happened to address the expected character ('c' in this case).

Mitigate this issue by accepting UTF-16 indices on a UTF-8 string, transcoding their offsets as needed. (Attempting to use a UTF-8 index on a UTF-16 string is still rejected; we do not implicitly convert strings in that direction.) This restores Swift 5.6's behavior, and also returns the expected value when the string's prefix includes non-ASCII scalars.
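
The snippet from the original description is not preserved in this text, so the following is only a minimal sketch of the pattern described above. The sample string and the names `cocoa`, `stale`, and `resolved` are hypothetical, and whether the bridged string is actually UTF-16-backed depends on the platform and on how Foundation stores it (on non-Apple platforms it is most likely already native UTF-8, in which case the index is never miscoded).

```swift
import Foundation

// Hypothetical sample: a Cocoa string with a non-ASCII character near the end.
// On Apple platforms such a string may bridge lazily with UTF-16 storage;
// this is an assumption, not something the standard library guarantees.
let cocoa = NSString(string: "abcdé")
var str = cocoa as String

// Acquire an index while the string may still be UTF-16-encoded.
let stale = str.utf16.index(str.utf16.startIndex, offsetBy: 2)

// Switch the storage to native UTF-8. The index above is now technically
// invalid if it was measured in UTF-16 code units.
str.makeContiguousUTF8()

// (Broken) code keeps using the old index. With the mitigation described
// above, the initializer is expected to accept it and transcode its offset.
let resolved: Character? = String.Index(stale, within: str).map { str[$0] }
print(resolved.map { String($0) } ?? "Invalid index")
```

With the mitigation in place, the initializer should return a non-nil index addressing the third character; a toolchain that rejects miscoded indices would take the `Invalid index` branch instead.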
While we’re here, also fix a long-standing issue where the UTF16View overload of `String.Index.init(_:within:)` used to return `nil` for valid indices that happened to point to a trailing surrogate in a UTF-8 encoded string.

rdar://89369680&91935537
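
A small sketch of the trailing-surrogate case, under the assumption that a native Swift string literal is UTF-8-encoded. The sample string and the names `units`, `trailing`, and `converted` are illustrative only.

```swift
// "😀" (U+1F600) is one scalar but two UTF-16 code units (a surrogate pair),
// so the UTF-16 view of this string has four code units.
let s = "a😀b"
let units = s.utf16

// Offset 2 lands on the second half of the surrogate pair.
let trailing = units.index(units.startIndex, offsetBy: 2)
print(UTF16.isTrailSurrogate(units[trailing]))

// `trailing` is a valid position in the UTF-16 view, so converting it
// within that view should succeed. Per the description above, this
// overload used to return nil here.
let converted = String.Index(trailing, within: units)
print(converted != nil)
```

The index is valid only in the UTF-16 view; it does not fall on a scalar boundary, so it remains unsuitable for addressing a `Character` in the string itself.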