[DNM] Parse /.../ regex literals #41767

Closed
wants to merge 4 commits into from

Conversation

hamishknight
Contributor

@hamishknight hamishknight commented Mar 10, 2022

Try to lex regex literals with /.../ delimiters when in an expression context in the parser, with a rule that it may not start with a space, tab or ) character.
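
The starting-character rule above can be sketched as a small predicate. This is only an illustration of the rule as stated in this description, not the code in the PR: `mayStartRegexLiteral` is a hypothetical name, and the real lexer would also need to know that it is in an expression context.

```cpp
#include <string_view>

// Hypothetical helper, not the compiler's actual API.
// `afterSlash` is the source text immediately following an opening '/'.
// Returns false when the rule from the description forbids a regex
// literal from starting here.
bool mayStartRegexLiteral(std::string_view afterSlash) {
  if (afterSlash.empty())
    return false; // Nothing follows the '/', so there is no literal body.
  switch (afterSlash.front()) {
  case ' ':
  case '\t':
  case ')':
    return false; // Excluded first characters: space, tab, ')'.
  default:
    return true;
  }
}
```

Under this sketch, the `/` in `x / 2` is not treated as a regex opener because the next character is a space, while the `/` in `/abc/` is.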

@hamishknight
Contributor Author

@rintaro The lexing logic is still WIP as:

  • Sub-lexing still needs to be taken care of
  • The heuristic needs formalizing
  • The lexing loop needs to handle additional cases, e.g. invisible ASCII
  • Prefix operators on regex literals need to be implemented

But let me know if you have any initial thoughts

@hamishknight
Contributor Author

@swift-ci please test

@hamishknight
Contributor Author

@swift-ci please test source compatibility

@rintaro
Member

rintaro commented Mar 10, 2022

  • Sub-lexing still needs to be taken care of

What do you think are the issues sub-lexing might cause?

  • The lexing loop needs to handle additional cases, e.g. invisible ASCII

Could you elaborate? Any example?

@hamishknight
Contributor Author

What do you think are the issues sub-lexing might cause?

I don't think it currently causes any issues, e.g. for delayed body parsing we start on the opening {. But it seems like a sub-lexer should be able to handle the case where it starts on an opening /.

Could you elaborate? Any example?

String literals currently diagnose invisible ASCII characters they come across; we probably ought to do the same for regex literals.

https://github.com/apple/swift/blob/d3702bacbb418afd4d4915cbebd77265dad50a65/lib/Parse/Lexer.cpp#L1338-L1344
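
As a rough illustration of what such a check could look like for a regex literal body, here is a minimal sketch. It assumes that "invisible ASCII" means control characters (below 0x20, plus 0x7F) other than tab; that definition, and the name `findInvisibleASCII`, are assumptions for the example rather than something taken from the linked lexer code.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical helper: returns the byte offsets within `body` at which an
// invisible ASCII character occurs, so a caller could diagnose each one.
// Assumption: "invisible" = ASCII control characters except tab.
std::vector<std::size_t> findInvisibleASCII(std::string_view body) {
  std::vector<std::size_t> offsets;
  for (std::size_t i = 0; i < body.size(); ++i) {
    unsigned char c = static_cast<unsigned char>(body[i]);
    bool isControl = c < 0x20 || c == 0x7F;
    if (isControl && c != '\t')
      offsets.push_back(i);
  }
  return offsets;
}
```

Returning offsets instead of emitting errors directly keeps the scan separate from diagnostic emission, which would fit a lexer that queues its diagnostics.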

@hamishknight
Contributor Author

@swift-ci please test macOS

@hamishknight
Contributor Author

hamishknight commented Mar 11, 2022

Try to lex regex literals with `/.../` delimiters.
Queue up diagnostics when lexing, waiting until
`Lexer::lex` is called before emitting them. This
allows us to re-lex without having to deal with
previously invalid tokens.
Loudly error if `-enable-experimental-string-processing`
is not used. This allows us to easily check for
source compatibility conflicts.
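
A toy model of the queued-diagnostics idea described in this commit message might look like the following. All names here (`Diagnostic`, `DiagnosticQueue`, `enqueue`, `discard`, `flush`) are hypothetical and chosen for illustration; they are not the types or methods used in the Swift lexer.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical types for illustration only.
struct Diagnostic {
  std::size_t offset;  // Byte offset in the source buffer.
  std::string message;
};

class DiagnosticQueue {
  std::vector<Diagnostic> pending; // Queued while forming a token.
  std::vector<Diagnostic> emitted; // Stand-in for real emission.

public:
  // Record a diagnostic without emitting it yet.
  void enqueue(std::size_t offset, std::string message) {
    pending.push_back({offset, std::move(message)});
  }

  // The token is about to be re-lexed: drop diagnostics from the
  // abandoned attempt so they are never shown.
  void discard() { pending.clear(); }

  // The token is being handed out: emit everything queued for it.
  void flush() {
    for (auto &diag : pending)
      emitted.push_back(std::move(diag));
    pending.clear();
  }

  const std::vector<Diagnostic> &getEmitted() const { return emitted; }
};
```

In this sketch, an attempt that lexes `/` as an operator and queues an error can be discarded when the parser asks for the same position to be re-lexed as a regex literal, so only diagnostics for the token that is actually returned reach the user.
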
@hamishknight
Contributor Author

Updated to re-lex regex literals from the parser

@hamishknight
Contributor Author

@swift-ci please test

@hamishknight
Contributor Author

@swift-ci please test source compatibility

@hamishknight hamishknight deleted the regular-grammar branch March 31, 2022 19:08
@hamishknight
Contributor Author

Continuing work on #42119
