-
Notifications
You must be signed in to change notification settings - Fork 10.5k
[String.Index] Obsolete encodedOffset var/init #22108
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
dd38b3b
to
0cff8fb
Compare
@swift-ci please smoke test |
@swift-ci please test |
Build failed |
@swift-ci please test |
Build failed |
Build failed |
b596724
to
e210805
Compare
String.Index has an encodedOffset-based initializer and computed property that exists for serialization purposes. It was documented as UTF-16 in the SE proposal introducing it, which was String's underlying encoding at the time, but the dream of String even then was to abstract away whatever encoding happend to be used. Serialization needs an explicit encoding for serialized indices to make sense: the offsets need to align with the view. With String utilizing UTF-8 encoding for native contents in Swift 5, serialization isn't necessarily the most efficient in UTF-16. Furthermore, the majority of usage of encodedOffset in the wild is buggy and operates under the assumption that a UTF-16 code unit was a Swift Character, which isn't even valid if the String is known to be all-ASCII (because CR-LF). This change introduces a pair of semantics-preserving alternatives to encodedOffset that explicitly call out the UTF-16 assumption. These serve as a gentle off-ramp for current mis-uses of encodedOffset.
e210805
to
415cc8f
Compare
This proposal has been accepted and can be merged. |
@swift-ci please test |
Build failed |
Linux Failures seems unrelated:
|
@swift-ci please test Linux Platform |
@available(swift, deprecated: 4.2, message: """ | ||
encodedOffset has been deprecated as most common usage is incorrect. \ | ||
Use utf16Offset(in:) to achieve the same behavior. | ||
""") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-
Should the initializer and property be
deprecated: 4.2
ordeprecated: 5.0
? -
Should the deprecated APIs be moved to MigrationSupport.swift if possible?
-
A multi-line message (even with escaped line breaks) won't display properly in Xcode 10.1 (and perhaps in Xcode 10.2 beta).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-
I believe that availability matches the language version and not deployment target. This should be deprecated regardless of what is passed in
-swift-version
. @jrose-apple / @moiseev what do you think? -
I think MigrationSupport is done for obsoleted. We should probably obsolete this around 5.1 or so and move it then.
-
I'll have to try that then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MigrationSupport has some non-obsoleted APIs:
-
UTF8
,UTF16
,UTF32
,UnicodeScalar
(which are also non-deprecated). -
_boundsCheck(_:)
(I'm not sure what this is doing here, it seems to be unused).
Build failed |
Linux PR bot has been failing in multiple PRs. Trying again |
@swift-ci please test linux platform |
Build failed |
@swift-ci clean test |
Build failed |
@swift-ci please test linux platform |
1 similar comment
@swift-ci please test linux platform |
Build failed |
Linux failure is unrelated and appearing on other PRs: |
@swift-ci please test linux platform |
[String.Index] Deprecate encodedOffset var/init
String.Index has an encodedOffset-based initializer and computed
property that exists for serialization purposes. It was documented as
UTF-16 in the SE proposal introducing it, which was String's
underlying encoding at the time, but the dream of String even then was
to abstract away whatever encoding happend to be used.
Serialization needs an explicit encoding for serialized indices to
make sense: the offsets need to align with the view. With String
utilizing UTF-8 encoding for native contents in Swift 5, serialization
isn't necessarily the most efficient in UTF-16.
Furthermore, the majority of usage of encodedOffset in the wild is
buggy and operates under the assumption that a UTF-16 code unit
was a Swift Character, which isn't even valid if the String is known
to be all-ASCII (because CR-LF).
This change introduces a set of new initializers and methods with an
explicit encoding (through the type of the provided view) for
serialization. It also adds an init and method for String's default
view, which most code in the wild should migrate to if it eliminates
their underlying bugs.
Resolves SR-9749. Much thanks to @norio-nomura for catching this!
rdar://problem/47605131