
Use TokenSpec (nee RawTokenKindMatch) in consume functions #1328


Merged

ahoppen merged 1 commit into swiftlang:main from ahoppen/consume-rawtokenkindmatch on Feb 9, 2023

Conversation

@ahoppen (Member) commented on Feb 8, 2023

This allows us to simplify the consume methods a little since they no longer need to take parameters for recovery precedence, `RawTokenKind` remapping, and `allowTokenAtStartOfLine`.

It also removes `RawTokenKind` from the API surface of all the consumption functions, which makes it easier to make `RawTokenKind` a trivial enum again.

And it renames `RawTokenKindSubset` to `TokenSpecSet` because that’s what it is now: a set of `TokenSpec`s.

I measured and there is no performance regression associated with this change; there might even be a slight performance improvement because of the added `@inline(__always)`, but the difference is too small to tell for certain.
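
To make the shape of the change concrete, here is a minimal, self-contained sketch of the idea, using hypothetical simplified types rather than the actual swift-syntax declarations: a single `TokenSpec` value bundles the information that the consume functions previously took as separate parameters.

```swift
// Hypothetical, simplified stand-ins for the real swift-syntax types.
enum RawTokenKind {
  case identifier
  case keyword
  case leftParen
}

struct TokenSpec {
  var rawTokenKind: RawTokenKind
  var remappedKind: RawTokenKind? = nil   // kind the token takes on once consumed
  var recoveryPrecedence: Int = 0         // how eagerly recovery may skip to this token
  var allowAtStartOfLine: Bool = true     // whether a token at the start of a line matches
}

struct Token {
  var kind: RawTokenKind
  var text: String
  var isAtStartOfLine: Bool = false
}

struct Parser {
  var tokens: [Token]
  var position = 0

  /// One `TokenSpec` parameter instead of kind + remapping + recovery precedence
  /// + `allowTokenAtStartOfLine`.
  @inline(__always)
  mutating func consume(if spec: TokenSpec) -> Token? {
    guard position < tokens.count else { return nil }
    var token = tokens[position]
    guard token.kind == spec.rawTokenKind,
          spec.allowAtStartOfLine || !token.isAtStartOfLine
    else { return nil }
    if let remapped = spec.remappedKind {
      token.kind = remapped
    }
    position += 1
    return token
  }
}
```

With this shape a call site reads like `parser.consume(if: TokenSpec(rawTokenKind: .leftParen))` instead of passing the recovery precedence, remapping, and start-of-line flag individually.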

@ahoppen requested review from rintaro and bnbarham on February 8, 2023 at 22:32
@ahoppen (Member, Author) commented on Feb 8, 2023

@swift-ci Please test

@ahoppen (Member, Author) commented on Feb 9, 2023

@swift-ci Please test macOS

@ahoppen force-pushed the ahoppen/consume-rawtokenkindmatch branch from 0f028a8 to b439add on February 9, 2023 at 15:02
@ahoppen (Member, Author) commented on Feb 9, 2023

@swift-ci Please test

@bnbarham (Contributor) left a comment


I still feel weird about how all this works though. In my head it seems like it would be so much simpler to just have the lexer convert all tokens to known keywords. That can be super optimized as much as we want. Then the subsets would be e.g. just an `OptionSet` that could be compared first and then a spec after that (if needed)... have to think it through more.

Anyway, regardless of any of that we still need to make `RawTokenKind` a trivial enum again. So I'm happy with this + what I assume your next PR is going to be. If all of this turns out to be too slow we can revisit later.
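
For illustration only, here is a rough sketch of the option-set idea described in the comment above; the types and class names are hypothetical and not taken from this repository.

```swift
// Hypothetical keyword classes; the real parser subsets are more fine-grained.
struct KeywordClasses: OptionSet {
  let rawValue: UInt64

  static let declarationModifier = KeywordClasses(rawValue: 1 << 0)
  static let effectSpecifier     = KeywordClasses(rawValue: 1 << 1)
  static let operatorLike        = KeywordClasses(rawValue: 1 << 2)
}

struct ClassifiedToken {
  var text: String
  var classes: KeywordClasses   // filled in once, by the lexer
}

/// Cheap first-level check: essentially a bitwise AND on the raw values.
/// A full `TokenSpec` comparison would only follow when this succeeds.
func belongs(_ token: ClassifiedToken, to subset: KeywordClasses) -> Bool {
  !token.classes.isDisjoint(with: subset)
}
```

Because `KeywordClasses` conforms to `OptionSet`, a subset can be written as an array literal, e.g. `belongs(token, to: [.declarationModifier, .effectSpecifier])`.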

@ahoppen merged commit 32188f6 into swiftlang:main on Feb 9, 2023
@ahoppen deleted the ahoppen/consume-rawtokenkindmatch branch on February 9, 2023 at 22:29
@ahoppen (Member, Author) commented on Feb 9, 2023

> I still feel weird about how all this works though. In my head it seems like it would be so much simpler to just have the lexer convert all tokens to known keywords. That can be super optimized as much as we want. Then the subsets would be e.g. just an `OptionSet` that could be compared first and then a spec after that (if needed)... have to think it through more.
>
> Anyway, regardless of any of that we still need to make `RawTokenKind` a trivial enum again. So I'm happy with this + what I assume your next PR is going to be. If all of this turns out to be too slow we can revisit later.

My plan is that eventually all the identifier -> keyword conversions are done in the parser and the lexer doesn’t even know about keywords. But last time I checked, making that switch will not be trivial if we don’t want to start accepting input that we currently reject.
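
A hypothetical sketch of that direction, with illustrative names only (not the planned swift-syntax API): the lexer emits plain identifiers, and the parser only promotes an identifier to a keyword when the grammar position asks for that specific keyword.

```swift
// Hypothetical, simplified types illustrating keyword recognition in the parser.
enum LexedKind { case identifier, leftBrace, integerLiteral }

enum Keyword: String { case `if`, `guard`, `actor` }

struct LexedToken {
  var kind: LexedKind
  var text: String
}

/// The lexer never classifies keywords; the parser promotes an identifier to a
/// keyword only when the current grammar position asks for that specific keyword.
func isKeyword(_ keyword: Keyword, _ token: LexedToken) -> Bool {
  token.kind == .identifier && token.text == keyword.rawValue
}
```

The subtlety mentioned above is that this contextual promotion must not become more lenient than the current lexer-side classification, otherwise previously rejected input would start to parse.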

@bnbarham (Contributor) commented on Feb 9, 2023

> My plan is that eventually all the identifier -> keyword conversions are done in the parser and the lexer doesn’t even know about keywords. But last time I checked, making that switch will not be trivial if we don’t want to start accepting input that we currently reject.

My problem with that is that it just seems harder to optimize, i.e. harder to not end up with an "if / else if / else if" chain for each at/consume/etc. Whereas if there's just one function that does this in the lexer, we can start with a switch on the count, add in some lookup tables for cases where we can convert the keyword to e.g. 8 bytes, and so on.

Having said all that, we could totally have that function regardless and just run it as needed in the parser (i.e. if we only need an identifier, don't bother). If we really wanted we could even do it for each subset, it'd just be way more code.

... So maybe my issue is actually with the pattern matching in the subsets 😅. But this PR doesn't change that and we could still do either with taking in a `TokenSpec`. And I also don't think this changes the trivial `RawTokenKind` update either. It's really just optimizing the subset matching.
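
Purely as an illustration of the "switch on count, compare packed bytes" lookup described above, here is a sketch with a handful of made-up keywords; it is not the lookup used by swift-syntax.

```swift
enum Keyword {
  case `if`, `in`, `else`, `case`
}

/// Hypothetical keyword lookup: dispatch on the byte count first, then compare the
/// candidate packed into a single UInt64 (works for keywords up to 8 bytes long).
func classifyKeyword(_ text: some StringProtocol) -> Keyword? {
  // Pack up to 8 UTF-8 bytes into one integer so short keywords compare as one word.
  func packed(_ s: some StringProtocol) -> UInt64 {
    var value: UInt64 = 0
    for byte in s.utf8.prefix(8) {
      value = (value << 8) | UInt64(byte)
    }
    return value
  }

  switch text.utf8.count {
  case 2:
    switch packed(text) {
    case packed("if"): return .`if`
    case packed("in"): return .`in`
    default: return nil
    }
  case 4:
    switch packed(text) {
    case packed("else"): return .`else`
    case packed("case"): return .`case`
    default: return nil
    }
  default:
    return nil
  }
}
```

Usage would look like `classifyKeyword("else")`, which returns `.else`; the point is that each length bucket compares whole machine words instead of walking an `if`/`else if` chain per keyword.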

@ahoppen (Member, Author) commented on Feb 10, 2023

> ... So maybe my issue is actually with the pattern matching in the subsets 😅. But this PR doesn't change that and we could still do either with taking in a `TokenSpec`. And I also don't think this changes the trivial `RawTokenKind` update either. It's really just optimizing the subset matching.

This might be just what you are asking for:

https://github.com/apple/swift-syntax/pull/1334/files#diff-f78f15946b1330eeb07d5625d092a98956c9f8a3931fd8e660df2bb19f4d11a7R15-R39

ahoppen added a commit that referenced this pull request Feb 10, 2023
…um-prep

Preparation to switch RawTokenKind to be a trivial enum again 🚥 #1328