[Parse] Adjust Lexer to allow Multi-line string literals #2275
This is very interesting, but we'd need an accepted swift-evolution proposal before we can take this.
Absolutely. It is being discussed on the swift-evolution mailing-list thread “multi-line string literals” at the moment, to refine the proposal before it is submitted. Apologies if I’ve got the process backwards. A 3.0 toolchain supporting multi-line strings can be installed as follows: $ curl http://johnholdsworth.com/swift-LOCAL-2016-04-24-a-osx.tar.gz > multiline.tar.gz
No worries, I just wanted to explain why it wasn't going to get a lot of immediate review.
This PR has been modified to use “continuation quotes”, as suggested on the thread https://lists.swift.org/pipermail/swift-evolution/Week-of-Mon-20160418/015596.html. This would make the following strings valid: (the “e” string literal modifier will be removed before submission, as it is a separate discussion; it is there to show that this syntax can accommodate modifiers): Toolchain available here for testing: http://johnholdsworth.com/swift-LOCAL-2016-05-01-a-osx.tar.gz
@@ -1688,15 +1724,21 @@ void Lexer::lexImpl() {
  case '&': case '|': case '^': case '~': case '.':
    return lexOperatorIdentifier();

  case '_': case 'e':
Special-casing `e` and `_` is adequate to test the feel of these features, but it doesn't tell us much about how feasible it would be to parse the full feature, which would permit an arbitrary-length series of identifier characters in front of a string literal. How I envision this actually working is that:
- The lexer scans an identifier, then (perhaps around here) checks if the next character is a quote mark. If so:
  - It parses the just-scanned "identifier" as a set of modifiers. (For a first cut, this "parsing" might just take the first or last character of the identifier as a `char`, but eventually it would probably scan through the identifier, setting/clearing/incrementing flags in a `StringLiteralModifier` struct/class, and diagnosing an error if it encountered something it didn't recognize.)
  - It passes the parsed modifiers to `lexStringLiteral()` and returns, bypassing the rest of the identifier lexing.
Does that seem like a feasible approach to you? Is it something you could incorporate into the prototype?
@johnno1962 Wow, that's some great work. It makes these tests parse:
plan.eq(
e_"print("Hello, world!\n")"_,
"print(\"Hello, world!\\n\")",
"Alternate delimiter string without escapes",
todo: "`e` should change handling of escapes with known meanings"
)
plan.eq(
e_""[^"\\]*(\\.[^"\\]*)*+""_,
"\"[^\"\\\\]*(\\.[^\"\\\\]*)*+\"",
"Complex regex with alternate delimiter and no escapes",
todo: "`e` should change handling of escapes with known meanings"
)
And this new one pass:
plan.eq(
e_""\w+""_,
"\"\\w+\"",
"Simple regex with alternate delimiter and no escapes"
)
Thanks!
Toolchain supporting “string” with continuation quotes, “””strings””” that allow newlines, and <<“HEREDOC” or <<‘HEREDOC’ syntax can be downloaded here:
The swift-evolution proposal has not reached consensus. This can be reopened at any time - even for Swift 4 because it is an additive change. Thank you so much for your contribution.
What's in this pull request?
This PR is a proof of concept for a not-yet-formally-proposed evolution: having Swift support multi-line string literals, modeled on Python's “””string””” syntax.
Changes to Lexer.cpp are very minor, and the remainder of the toolchain seems unfazed. Testing shows the following functioning correctly: Xcode Source Editor, syntax highlighting, compilation errors and their line numbers, breakpoints (even inside the string), value interpolation, and accurate crash site reporting. SourceKit’s indenting code is also unaffected.
To keep documentation straightforward and to confine the change to the lexer, no attempt has been made to remove the first newline from the string or to strip indentation inside the string, both of which were mentioned as potentially desirable in the following thread:
https://lists.swift.org/pipermail/swift-evolution/Week-of-Mon-20151207/001565.html
If this approach seems acceptable, let me know and I’ll create a full evolution proposal so this can be discussed by the community.
Resolved bug number:
Resolves #42792.