Skip to content

Fix for a crash in String.substring(with:) [SR-2483] #647

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Mar 5, 2017

Conversation

pushkarnk
Copy link
Member

@pushkarnk pushkarnk commented Sep 18, 2016

I think this is just one way to fix: https://bugs.swift.org/browse/SR-2483
I've documented some additional findings in the bug report.

@phausler
Copy link
Contributor

could you possibly add a unit test to verify this behavior?

@pushkarnk
Copy link
Member Author

pushkarnk commented Sep 26, 2016

@phausler : I am no longer comfortable with my fix. The API reference says String.init?(_ utf16) returns nil only when the utf16 contains unpaired surrogates - so, the case of it returning nil for "\r\n"[1..<1] does not match the spec and we cannot simply return "". This needs more investigation on the stdlib part. If you don't mind, I'll keep this pull request open until we have a better understanding of how this should be fixed.

@phausler
Copy link
Contributor

Certainly, it might be useful to verify the ground truth of how that behaves in objc - since that is our baseline here

@pushkarnk pushkarnk force-pushed the sr2483 branch 2 times, most recently from 8ccc37a to 4d87308 Compare October 4, 2016 12:36
@pushkarnk
Copy link
Member Author

pushkarnk commented Oct 4, 2016

When using substring(with:) if either end of the range breaks a surrogate pair, we crash on Linux but not on Darwin. On Darwin, I found the unpaired surrogate being replaced by the Unicode replacement character. We need to do the same on Linux. Also, for ranges of the type t..<t we need to make sure we return an empty string.

@phausler @parkera : can you please review the new changes?

@pushkarnk pushkarnk force-pushed the sr2483 branch 4 times, most recently from 93aa217 to 8f6103a Compare October 4, 2016 13:36
@pushkarnk
Copy link
Member Author

@swift-ci please test

@pushkarnk
Copy link
Member Author

pushkarnk commented Feb 9, 2017

Updated this to include a handful of corner cases with surrogate pairs and \r\n. I've made sure we behave like Darwin!

@pushkarnk
Copy link
Member Author

@swift-ci please test

//the range probably breaks surrogate pairs at both the ends
guard min.advanced(by: 1) <= max.advanced(by: -1) else { return prefix + suffix }

let substr = String(_storage.utf16[min.advanced(by: 1)..<max.advanced(by: -1)])!
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After all the prior checks, I think this can be safely unwrapped.

@pushkarnk
Copy link
Member Author

@phausler @parkera Sorry for the delay on this.

I've tried to make substring(with: NSRange) more robust (and behave like Darwin) here, in the process of fixing SR-2483. Can you please review? Thanks.

@parkera
Copy link
Contributor

parkera commented Feb 15, 2017

@swift-ci please test

@pushkarnk
Copy link
Member Author

@parkera @phausler is there anything else you'd want me to consider here?

@parkera
Copy link
Contributor

parkera commented Mar 5, 2017

This took me a while to review because I was digging through the Darwin version, which really defers this operation to CFString. That's where the replacement character substitution happens. In the long term we may want to make NSString.swift do something similar so we don't end up replicating this logic in two places, but in the short term this fix seems appropriate.

@parkera
Copy link
Contributor

parkera commented Mar 5, 2017

@swift-ci test and merge

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants