Skip to content

Explicitly model all supported contextual keywords in the contextualKeyword RawTokenKind #1173

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jan 11, 2023

Conversation

ahoppen
Copy link
Member

@ahoppen ahoppen commented Dec 28, 2022

The main change here is that the contextualKeyword TokenKind and RawTokenKind no longer have a string as an associated value but an enum that enumerates all known contextual keywords.

In follow-up commits, this allows us to rename contextualKeyword to keyword and merge all the other known keywords into the Keyword enum that currently enumerates all contextual keywords. That way we can remove the distinction between contextual keywords and keywords we currently have.

I’m also thinking that we can use this additional structure to guarantee a little more type safety in the parser, for example by generating RawTokenKindSubsets for children that have text_choices set, but I still need to look into that.

@ahoppen ahoppen requested review from rintaro and bnbarham December 28, 2022 18:37
Copy link
Contributor

@bnbarham bnbarham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool stuff, the simplification of consume* is really nice! I'm not a huge fan of the huge switch for SyntaxText and then the conversion to String in TokenKind, but given we need it for defaultText in places ... 🤔.

(.__setter_access, _)?,
(.reasync, _)?:
elements.append(self.parseSimpleModifier())
case (.final, let handle)?,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a few places where this is the case, but IMO it'd be better to assign kind/handle first and then switch over kind rather than having to do (.kind, let handle) or (.kind, _) everywhere. Not really relevant to this, I just noticed it while going through the changes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The nice thing about this approach is that it guarantees we have a handle if we match a kind without having to force-unwrap it (remember there’s a default case at the bottom that doesn’t have a handle).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no reason that has to be in the switch though:

guard let (kind, handle) = self.at(anyIn: DeclarationModifier.self) else {
  break MODIFIER_LOOP
}

switch kind {
...
}

Or am I missing something obvious?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, that would also work. I think that’s just a matter of style now. If there were fewer cases in the RawTokenKindSubset, I would prefer a single switch instead of the guard + switch combination. And then I think it’s nice to always use the same style when you’re switching over RawTokenKindSubsets.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, it's definitely style. I'll win you over eventually!

Comment on lines +266 to +268
} else if self.at(.contextualKeyword(.Type)) || self.at(.contextualKeyword(.Protocol)) {
let typeOrProtocol = self.consume(if: .contextualKeyword(.Type)) ?? self.consume(if: .contextualKeyword(.Protocol))!
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO we should add a 2 and 3 arg consume to avoid the array creation and then use that here. The only non-2/3 kind calls should probably be subsets anyway.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. I’ll re-consider it after I have had some time to play around automatic generation of RawTokenKindSubset from text_choices tomorrow.

Copy link
Contributor

@kimdv kimdv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some inline questions and notes to my self

@@ -26,7 +26,9 @@ public enum TokenKind: Hashable {
%
% # Tokens that don't have a set text have an associated value that
% # contains their text.
% if not token.text:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to self: Remember in #1158

@ahoppen ahoppen force-pushed the ahoppen/enumerate-contextual-keywords branch 2 times, most recently from b7b9576 to f275db8 Compare December 29, 2022 14:24
@ahoppen
Copy link
Member Author

ahoppen commented Dec 29, 2022

@swift-ci Please test

1 similar comment
@ahoppen
Copy link
Member Author

ahoppen commented Dec 30, 2022

@swift-ci Please test

@ahoppen
Copy link
Member Author

ahoppen commented Dec 30, 2022

@swift-ci Please test

@ahoppen
Copy link
Member Author

ahoppen commented Jan 2, 2023

@swift-ci Please test

@ahoppen ahoppen force-pushed the ahoppen/enumerate-contextual-keywords branch from d01b3e0 to 8ef8747 Compare January 3, 2023 10:43
@ahoppen
Copy link
Member Author

ahoppen commented Jan 3, 2023

@swift-ci Please test

@ahoppen
Copy link
Member Author

ahoppen commented Jan 3, 2023

@ahoppen ahoppen force-pushed the ahoppen/enumerate-contextual-keywords branch 2 times, most recently from db1d8a4 to 32e1b39 Compare January 3, 2023 16:39
@ahoppen
Copy link
Member Author

ahoppen commented Jan 3, 2023

…Keyword` `RawTokenKind`

The main change here is that the `contextualKeyword` `TokenKind` and `RawTokenKind` no longer have a string as an associated value but an enum that enumerates all known contextual keywords.

In follow-up commits, this allows us to rename `contextualKeyword` to `keyword` and merge all the other known keywords into the `Keyword` enum that currently enumerates all contextual keywords. That way we can remove the distinction between contextual keywords and keywords we currently have.

I’m also thinking that we can use this additional structure to guarantee a little more type safety in the parser, for example by generating `RawTokenKindSubset`s for children that have `text_choices` set, but I still need to look into that.
@ahoppen ahoppen force-pushed the ahoppen/enumerate-contextual-keywords branch from 32e1b39 to 2a1597f Compare January 11, 2023 06:55
@ahoppen
Copy link
Member Author

ahoppen commented Jan 11, 2023

@ahoppen ahoppen merged commit 8e6ae6e into swiftlang:main Jan 11, 2023
@ahoppen ahoppen deleted the ahoppen/enumerate-contextual-keywords branch January 11, 2023 09:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants