[5.8][stdlib] String: Fix forward implementation of grapheme breaking rule 11 #63047

Merged
lorentey merged 1 commit into swiftlang:release/5.8 from gb11-5.8 on Jan 16, 2023

lorentey
Member

(Cherry picked from #63043.)

Rule GB11 in Unicode Annex 29 is:

GB11: Extended_Pictographic Extend* ZWJ × Extended_Pictographic

However, our forward grapheme breaking state machine implements it as:

GB11: Extended_Pictographic (Extend | ZWJ)* ZWJ × Extended_Pictographic

We implement the correct rule when going backward, so a String value can report a different number of characters depending on whether it is traversed forward or backward.

The rule as implemented would be fine (Unicode doesn’t care much about the placement of grapheme breaks in invalid sequences), but the directional inconsistency messes with String’s Collection conformance.
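The inconsistency described above can be probed with a short script. This is a minimal sketch, not the test added by the PR: the input string and the helper names `forwardCount` and `backwardCount` are hypothetical, chosen only for illustration. The input places two ZWJs between two Extended_Pictographic scalars, which is the kind of invalid sequence where the spec rule and the forward rule as implemented disagree.

```swift
// Hypothetical probe input: WOMAN (U+1F469), ZWJ, ZWJ, GIRL (U+1F467).
// The doubled ZWJ makes this an invalid emoji ZWJ sequence, so the
// spec's "Extend* ZWJ" pattern and a more permissive pattern can
// disagree about whether to break before the second emoji.
let probe = "\u{1F469}\u{200D}\u{200D}\u{1F467}"

// Count characters by stepping forward with index(after:).
func forwardCount(_ s: String) -> Int {
    var count = 0
    var i = s.startIndex
    while i != s.endIndex {
        i = s.index(after: i)
        count += 1
    }
    return count
}

// Count characters by stepping backward with index(before:).
func backwardCount(_ s: String) -> Int {
    var count = 0
    var i = s.endIndex
    while i != s.startIndex {
        i = s.index(before: i)
        count += 1
    }
    return count
}

// Collection semantics require both traversals to agree.
// On a standard library without this fix, the two numbers may differ.
print("forward: \(forwardCount(probe)), backward: \(backwardCount(probe))")
```

Comparing the two counts, rather than checking either against a fixed number, keeps the probe focused on the property that matters for String's Collection conformance: both directions must visit the same set of indices.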

rdar://104279671

Rule GB11 in Unicode Annex 29 is:

GB11: Extended_Pictographic Extend* ZWJ × Extended_Pictographic

However, our forward grapheme breaking state machine implements it as:

GB11: Extended_Pictographic Extend* ZWJ+ × Extended_Pictographic

We implement the correct rule when going backward, so a String value can report a different number of characters depending on whether it is traversed forward or backward.

The rule as implemented would be fine (Unicode doesn’t care much about the placement of grapheme breaks in invalid sequences), but the directional inconsistency messes with String’s Collection conformance.

rdar://104279671
(cherry picked from commit a3e517e)
@lorentey lorentey changed the base branch from main to release/5.8 January 16, 2023 04:10
@lorentey
Member Author

@swift-ci test

@lorentey lorentey merged commit 2405ce4 into swiftlang:release/5.8 Jan 16, 2023
@lorentey lorentey deleted the gb11-5.8 branch January 16, 2023 18:46
@AnthonyLatsis AnthonyLatsis added the 🍒 release cherry pick Flag: Release branch cherry picks label May 3, 2023