Use TokenSpec (née RawTokenKindMatch) in consume functions #1328
Conversation
@swift-ci Please test
@swift-ci Please test macOS
This allows us to simplify the consume methods a little since they no longer need to take parameters for recovery precedence, a remapped `RawTokenKind`, or `allowTokenAtStartOfLine`. It also removes `RawTokenKind` from the API surface of all the consumption functions, which makes it easier to make `RawTokenKind` a trivial enum again.

This also renames `RawTokenKindSubset` to `TokenSpecSet` because that’s what it is now: a set of `TokenSpec`s.

I measured and there is no performance regression associated with this change.
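For orientation, here is a minimal sketch of the shape that description implies. The field names and the `matches` helper are my own illustrative assumptions, not the exact swift-syntax declarations:

```swift
// Minimal stand-ins so the sketch compiles on its own; the real types
// in swift-syntax are richer.
enum RawTokenKind { case identifier, keyword, leftBrace }
enum TokenPrecedence: Comparable { case low, high }

struct Lexeme {
  let rawTokenKind: RawTokenKind
  let isAtStartOfLine: Bool
}

/// Bundles everything a consume call previously took as separate
/// parameters: the kind to match, its recovery precedence, an optional
/// remapped kind, and whether a token at the start of a line counts.
struct TokenSpec {
  let rawTokenKind: RawTokenKind
  let recoveryPrecedence: TokenPrecedence
  let remappedKind: RawTokenKind?
  let allowAtStartOfLine: Bool

  /// A lexeme matches if its kind agrees and, unless explicitly
  /// allowed, it does not sit at the start of a new line.
  func matches(_ lexeme: Lexeme) -> Bool {
    lexeme.rawTokenKind == rawTokenKind
      && (allowAtStartOfLine || !lexeme.isAtStartOfLine)
  }
}
```

A `TokenSpecSet` is then just a collection of such specs, which is what the rename acknowledges.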
Force-pushed from 0f028a8 to b439add
@swift-ci Please test
I still feel weird about how all this works though. In my head it seems like it would be so much simpler to just have the lexer convert all tokens to known keywords. That can be super optimized as much as we want. Then the subsets would be e.g. just an OptionSet that could be compared first, with a spec check after that (if needed)... I have to think it through more.

Anyway, regardless of any of that, we still need to make RawTokenKind a trivial enum again. So I'm happy with this + what I assume your next PR is going to be. If all of this turns out to be too slow we can revisit later.
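To make the OptionSet idea concrete, here is a rough sketch under my own assumptions; none of these names exist in swift-syntax:

```swift
// Each subset keeps a bitmask of raw kinds so membership is a single
// mask test; the full spec comparison runs only when the bit matches.
struct TokenKindSet: OptionSet {
  let rawValue: UInt64
  static let identifier = TokenKindSet(rawValue: 1 << 0)
  static let keyword    = TokenKindSet(rawValue: 1 << 1)
  static let leftBrace  = TokenKindSet(rawValue: 1 << 2)
}

struct CheapLexeme {
  let kindBit: TokenKindSet
  let text: String
}

/// Cheap bit test first; the (potentially expensive) spec closure runs
/// only on tokens that pass the mask.
func matches(
  _ token: CheapLexeme,
  subset: TokenKindSet,
  spec: (CheapLexeme) -> Bool
) -> Bool {
  guard subset.contains(token.kindBit) else { return false }
  return spec(token)
}
```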
My plan is that eventually all the identifier -> keyword conversions are done in the parser and the lexer doesn’t even know about keywords. But last time I checked, making that switch will not be trivial if we don’t want to start accepting input that we currently reject.
My problem with that is that it just seems harder to optimize, i.e. harder to not end up with "if else if else if" for each at/consume/etc. Whereas if there's just one function that does this in the lexer, it can start with a switch on count, add in some lookup tables for cases where we can convert the keyword to e.g. 8 bytes, and so on.

Having said all that, we could totally have that function regardless and just run it as needed in the parser (i.e. if we only need an identifier, don't bother). If we really wanted we could even do it for each subset, it'd just be way more code.

... So maybe my issue is actually with the pattern matching in the subsets 😅. But this PR doesn't change that, and we could still do either while taking in a TokenSpec. And I also don't think this changes the trivial RawTokenKind update either. It's really just optimizing the subset matching.
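A toy version of that single lookup function, under my own naming (nothing here is actual swift-syntax or lexer code), might branch on byte count first and then compare short keywords as packed integers:

```swift
enum ExampleKeyword { case ifKeyword, inKeyword, elseKeyword, funcKeyword }

/// Packs up to 8 ASCII bytes into a UInt64 (little-endian) so short
/// keywords compare with a single integer equality.
func pack(_ bytes: [UInt8]) -> UInt64 {
  var value: UInt64 = 0
  for (i, b) in bytes.enumerated() { value |= UInt64(b) << (8 * i) }
  return value
}

/// Branch on length first; each arm then only compares packed
/// integers, avoiding a long if/else chain over string contents.
func keyword(for bytes: [UInt8]) -> ExampleKeyword? {
  switch bytes.count {
  case 2:
    switch pack(bytes) {
    case pack(Array("if".utf8)): return .ifKeyword
    case pack(Array("in".utf8)): return .inKeyword
    default: return nil
    }
  case 4:
    switch pack(bytes) {
    case pack(Array("else".utf8)): return .elseKeyword
    case pack(Array("func".utf8)): return .funcKeyword
    default: return nil
    }
  default:
    return nil
  }
}
```

In this sketch, `keyword(for: Array("func".utf8))` returns `.funcKeyword` after one length branch and one integer comparison.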
This might be just what you are asking for:
…um-prep Preparation to switch RawTokenKind to be a trivial enum again 🚥 #1328
This allows us to simplify the consume methods a little since they no longer need to take parameters for recovery precedence, remapping `RawTokenKind`, and `allowTokenAtStartOfLine`.

It also removes `RawTokenKind` from the API surface of all the consumption functions, which makes it easier to make `RawTokenKind` a trivial enum again.

And it renames `RawTokenKindSubset` to `TokenSpecSet` because that’s what it is now: a set of `TokenSpec`s.

I measured and there is no performance regression associated with this change, maybe even a performance increase because of the added `@inline(__always)`, but it’s too narrow to tell for certain.
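As a hypothetical illustration of where such an annotation pays off (the actual annotation site in this PR may differ), forcing inlining on a tiny matcher that sits on every consume call's hot path removes per-call overhead without much code growth:

```swift
// Hypothetical stand-ins, not swift-syntax declarations.
struct SpecLexeme {
  let rawKind: Int
  let isAtStartOfLine: Bool
}

struct SpecMatcher {
  let rawKind: Int
  let allowAtStartOfLine: Bool

  // The matcher body is a couple of comparisons, so forcing the
  // optimizer to inline it avoids a function call per consumed token.
  @inline(__always)
  func matches(_ lexeme: SpecLexeme) -> Bool {
    rawKind == lexeme.rawKind
      && (allowAtStartOfLine || !lexeme.isAtStartOfLine)
  }
}
```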