[Parse] Adjust Lexer to allow Multi-line string literals #2275

Closed
wants to merge 13 commits into from

johnno1962
Contributor

@johnno1962 johnno1962 commented Apr 22, 2016

What's in this pull request?

This PR is a proof of concept for a not yet formally proposed evolution: having Swift support multi-line string literals modeled on Python's """string""" syntax.

Changes to Lexer.cpp are very minor, and the remainder of the toolchain seems unfazed. Testing shows the following functioning correctly: the Xcode source editor, syntax highlighting, compilation errors and their line numbers, breakpoints (even inside the string), value interpolation, and accurate crash site reporting. SourceKit's indenting code is not affected.

[Screenshot: multiline]

To keep documentation straightforward and to confine the change to the lexer, no attempt has been made to remove the first newline from the string or to strip indentation inside the string, both of which were mentioned as potentially desirable in the following thread:

https://lists.swift.org/pipermail/swift-evolution/Week-of-Mon-20151207/001565.html
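As a rough illustration of the behavior described above, here is a minimal, self-contained C++ sketch of scanning a """ delimited literal that allows newlines and keeps the first newline and all indentation verbatim. It is not the code from this PR's Lexer.cpp: `MultilineResult` and `scanTripleQuoted` are hypothetical names invented for the example, and escapes and interpolation are deliberately ignored.

```cpp
#include <cstddef>
#include <string>

// Hypothetical result type for this sketch (not part of the Swift lexer).
struct MultilineResult {
    bool ok;            // false if the opening or closing """ was not found
    std::string value;  // literal contents, kept verbatim
    unsigned newlines;  // line breaks consumed, so later line numbers stay accurate
};

// Scans a """...""" literal starting at src[pos] (pos must be <= src.size()).
// The first newline and any indentation are left in the value untouched,
// matching the "no stripping" choice described in the PR text.
MultilineResult scanTripleQuoted(const std::string &src, std::size_t &pos) {
    const std::string delim = "\"\"\"";
    MultilineResult r{false, "", 0};

    // The literal must begin with the triple-quote delimiter.
    if (src.compare(pos, delim.size(), delim) != 0)
        return r;
    pos += delim.size();

    while (pos < src.size()) {
        // A second delimiter closes the literal.
        if (src.compare(pos, delim.size(), delim) == 0) {
            pos += delim.size();
            r.ok = true;
            return r;
        }
        if (src[pos] == '\n')
            ++r.newlines;  // newlines are legal here; just count them
        r.value += src[pos++];
    }
    return r;  // ran off the end: unterminated literal
}
```

Counting the consumed newlines is what would let a surrounding lexer keep diagnostics and breakpoints on the right line after the literal ends.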

If this approach seems acceptable, let me know and I’ll create a full evolution proposal so this can be discussed by the community.

Resolved bug number:

Resolves #42792.

@jrose-apple jrose-apple added the swift evolution pending discussion Flag → feature: A feature that has a Swift evolution proposal currently in review label Apr 22, 2016
@lattner
Contributor

lattner commented Apr 24, 2016

This is very interesting, but we'd need an accepted swift-evolution proposal before we can take this.

@johnno1962
Contributor Author

johnno1962 commented Apr 24, 2016

Absolutely, it’s being discussed on the swift-evolution mail thread "multi-line string literals" at the moment, to refine the proposal before it is submitted. Apologies if I’ve got the process backwards.

A 3.0 toolchain supporting multiline strings can be installed as follows:

$ curl http://johnholdsworth.com/swift-LOCAL-2016-04-24-a-osx.tar.gz > multiline.tar.gz
$ sudo tar xfz multiline.tar.gz -C /

@lattner
Contributor

lattner commented Apr 24, 2016

No worries, I just wanted to explain why it wasn't going to get a lot of immediate review.

@johnno1962
Contributor Author

johnno1962 commented May 1, 2016

This PR has been modified to use “continuation quotes” as suggested on the thread https://lists.swift.org/pipermail/swift-evolution/Week-of-Mon-20160418/015596.html. This would make the following strings valid (the “e” string literal modifier will be removed before submission, as it is a separate discussion; it is there to show that this syntax can accommodate modifiers):

[Screenshot: multiline2]

Toolchain available here for testing: http://johnholdsworth.com/swift-LOCAL-2016-05-01-a-osx.tar.gz

@@ -1688,15 +1724,21 @@ void Lexer::lexImpl() {
case '&': case '|': case '^': case '~': case '.':
return lexOperatorIdentifier();

case '_': case 'e':
Contributor

@beccadax beccadax May 1, 2016

Special-casing e and _ is adequate to test the feel of these features, but it doesn't tell us much about how feasible parsing the full feature—which would permit an arbitrary-length series of identifier characters in front of a string literal—would be. How I envision this actually working is that:

  • The lexer scans an identifier, then (perhaps around here) checks if the next character is a quote mark. If so:
    • It parses the just-scanned "identifier" as a set of modifiers. (For a first cut, this "parsing" might just take the first or last character of the identifier as a char, but eventually it would probably scan through the identifier, setting/clearing/incrementing flags in a StringLiteralModifier struct/class, and diagnosing an error if it encountered something it didn't recognize.)
    • It passes the parsed modifiers to lexStringLiteral() and returns, bypassing the rest of the identifier lexing.

Does that seem like a feasible approach to you? Is it something you could incorporate into the prototype?
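The steps listed above can be sketched as a toy lexer in C++. This is only an illustration under stated assumptions, not Swift's Lexer.cpp: the fields of `StringLiteralModifier`, the meanings given to `e` and `_`, and the helpers `parseModifiers` and `lexToken` are all hypothetical, and `lexStringLiteral` here is a simplified stand-in that reads to the next unescaped quote.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical flag set; the comment names the type but not its fields.
struct StringLiteralModifier {
    bool rawEscapes = false;      // set by 'e' (meaning assumed for illustration)
    unsigned delimiterDepth = 0;  // incremented per '_' (meaning assumed)
};

struct Token {
    enum Kind { Identifier, StringLiteral, Error };
    Kind kind;
    std::string text;
    StringLiteralModifier mods;
};

// Step 2: reinterpret the just-scanned "identifier" as a set of modifiers.
// Returns false (where a real lexer would diagnose) on an unknown character.
bool parseModifiers(const std::string &ident, StringLiteralModifier &out) {
    for (char c : ident) {
        if (c == 'e') out.rawEscapes = true;
        else if (c == '_') ++out.delimiterDepth;
        else return false;
    }
    return true;
}

// Step 3: simplified stand-in for lexStringLiteral(), taking the modifiers.
Token lexStringLiteral(const std::string &src, std::size_t &pos,
                       StringLiteralModifier mods) {
    assert(pos < src.size() && src[pos] == '"');
    ++pos;  // skip the opening quote
    std::string body;
    while (pos < src.size() && src[pos] != '"') {
        if (src[pos] == '\\' && pos + 1 < src.size())
            body += src[pos++];  // keep the backslash, then the escaped char
        body += src[pos++];
    }
    if (pos >= src.size()) return {Token::Error, body, mods};  // unterminated
    ++pos;  // skip the closing quote
    return {Token::StringLiteral, body, mods};
}

// Step 1: scan identifier characters, then check for a following quote mark.
Token lexToken(const std::string &src, std::size_t &pos) {
    std::size_t start = pos;
    while (pos < src.size() &&
           (std::isalnum(static_cast<unsigned char>(src[pos])) || src[pos] == '_'))
        ++pos;
    std::string ident = src.substr(start, pos - start);

    if (pos < src.size() && src[pos] == '"') {
        StringLiteralModifier mods;
        if (!parseModifiers(ident, mods))
            return {Token::Error, ident, StringLiteralModifier{}};
        // Bypass the rest of identifier lexing.
        return lexStringLiteral(src, pos, mods);
    }
    return {Token::Identifier, ident, StringLiteralModifier{}};
}
```

With this shape, an arbitrary-length prefix such as `e_` is handled by the same path as a single character, and an unrecognized prefix such as `x"abc"` surfaces as an error token instead of silently lexing as an identifier followed by a string.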

@beccadax
Contributor

beccadax commented May 3, 2016

@johnno1962 Wow, that's some great work. It makes these tests parse:

plan.eq(
    e_"print("Hello, world!\n")"_,
    "print(\"Hello, world!\\n\")",

    "Alternate delimiter string without escapes",

    todo: "`e` should change handling of escapes with known meanings"
)

plan.eq(
    e_""[^"\\]*(\\.[^"\\]*)*+""_,
    "\"[^\"\\\\]*(\\.[^\"\\\\]*)*+\"",

    "Complex regex with alternate delimiter and no escapes",

    todo: "`e` should change handling of escapes with known meanings"
)

And this new one pass:

plan.eq(
    e_""\w+""_,
    "\"\\w+\"",

    "Simple regex with alternate delimiter and no escapes"
)

Thanks!

@johnno1962
Contributor Author

johnno1962 commented May 7, 2016

A toolchain supporting "string" with continuation quotes, """strings""" that allow newlines, and <<"HEREDOC" or <<'HEREDOC' syntax can be downloaded here:
http://johnholdsworth.com/swift-LOCAL-2016-05-09-a-osx.tar.gz

@CodaFi
Contributor

CodaFi commented Jul 28, 2016

The swift-evolution proposal has not reached consensus. This can be reopened at any time - even for Swift 4 because it is an additive change.

Thank you so much for your contribution.

@CodaFi CodaFi closed this Jul 28, 2016
MaxDesiatov pushed a commit that referenced this pull request Apr 19, 2021
@AnthonyLatsis AnthonyLatsis added compiler The Swift compiler itself feature A feature request or implementation literals Feature → expressions: Literals such as an integer or string literal labels Mar 22, 2023
@AnthonyLatsis AnthonyLatsis added multiline string literals Feature → expressions → literals → string literals: multiline string literals lexer Area → compiler: The legacy C++ lexer and removed swift evolution pending discussion Flag → feature: A feature that has a Swift evolution proposal currently in review labels Mar 22, 2023
Labels
compiler The Swift compiler itself feature A feature request or implementation lexer Area → compiler: The legacy C++ lexer literals Feature → expressions: Literals such as an integer or string literal multiline string literals Feature → expressions → literals → string literals: multiline string literals
7 participants