Make the lexer UTF-8 RFC 3629 correct re: prefix octets #6088

Merged
merged 1 commit into from
Dec 6, 2016
@bitjammer bitjammer commented Dec 6, 2016

RFC 2279 states that, in UTF-8:
"The octet values FE and FF never appear."

RFC 3629 states that, in UTF-8:
"The octet values C0, C1, F5 to FF never appear."

Generalize the check to advance past invalid starting bytes for
a UTF-8 sequence to fix a crash in the lexer.

rdar://problem/28822218
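
As an illustration of the generalized check, here is a minimal Python sketch. It is not the lexer's actual code: the function names `is_valid_utf8_lead_byte` and `skip_invalid_lead_bytes` are hypothetical, and the sketch assumes the lexer's job at this point is only to step over bytes that cannot begin a UTF-8 sequence. The ranges follow RFC 3629: C0, C1, and F5 to FF never appear, and continuation bytes (80 to BF) cannot start a sequence.

```python
def is_valid_utf8_lead_byte(byte: int) -> bool:
    """Return True if `byte` may begin a UTF-8 sequence under RFC 3629.

    Hypothetical helper for illustration; not taken from the Swift lexer.
    """
    if byte <= 0x7F:
        # Single-octet (ASCII) sequence.
        return True
    if 0xC2 <= byte <= 0xF4:
        # Lead octet of a 2-, 3-, or 4-octet sequence.
        return True
    # Everything else is rejected:
    #   0x80-0xBF  continuation octets, never a starting byte
    #   0xC0-0xC1  would only encode overlong forms
    #   0xF5-0xFF  beyond U+10FFFF (includes FE and FF from RFC 2279)
    return False


def skip_invalid_lead_bytes(data: bytes, pos: int) -> int:
    """Advance `pos` past bytes that cannot start a UTF-8 sequence.

    Returns the index of the next plausible lead byte, or len(data)
    if the rest of the buffer contains none.
    """
    while pos < len(data) and not is_valid_utf8_lead_byte(data[pos]):
        pos += 1
    return pos
```

A check that rejects only FE and FF, as the RFC 2279 wording would suggest, lets C0, C1, and F5 to FD through as apparent lead bytes; the range test above treats them all the same way.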

@bitjammer

@swift-ci Please smoke test and merge

@swift-ci swift-ci merged commit 5b82297 into swiftlang:master Dec 6, 2016
@bitjammer bitjammer deleted the correct-utf-8-illegal-octets branch December 6, 2016 01:51