Skip to content

[swiftSyntax] Swift side syntax classifier #18251

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jul 31, 2018

Conversation

ahoppen
Copy link
Member

@ahoppen ahoppen commented Jul 26, 2018

This PR moves the SyntaxClassifier that previously only existed on the C++ side to the Swift side, so that classification of tokens can be performed on the basis of an incrementally transferred syntax tree.

This essentially removes the need for the C++ SyntaxClassifier and I plan to remove that one in a later PR.

@ahoppen ahoppen requested review from rintaro and nkcsgexi July 26, 2018 06:31
@ahoppen
Copy link
Member Author

ahoppen commented Jul 26, 2018

@swift-ci Please smoke test

@ahoppen ahoppen force-pushed the 02-swift-syntax-classifier branch from 7188c0e to 8e8580b Compare July 26, 2018 15:55
@ahoppen
Copy link
Member Author

ahoppen commented Jul 26, 2018

@swift-ci Please smoke test

1 similar comment
@ahoppen
Copy link
Member Author

ahoppen commented Jul 26, 2018

@swift-ci Please smoke test

int main() {
#define TOKEN(KW) printKeyword(#KW);
#define SIL_KEYWORD(KW)
#include "swift/Syntax/TokenKinds.def"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of adding another tool, we should add this print action into swift-ide-test; and printing the syntax token kinds in swift-swiftsyntax-test. The test should be a simple diff of two dumps.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking more about this, it's better if we put the print action into swift-syntax-test. It makes a lot of sense to compare dumps from swift-syntax-test and swift-swiftsyntax-test.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I read swift-syntax-test anyway ;-)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case we keep the generated python list below it also makes sense to consider keeping this tool, since swift-syntax-test currently always operates on an input file and putting this into swift-syntax-test thus feels a little awkward.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can provide a dummy input file to swift-syntax-test for this action (I think it is totally OK). Drawbacks of having these standalone tools are (1) it's conventional to separate test driver and test data, like the way swift-syntax-test and round_trip_parse_gen.swift are separated; (2) we won't visit these tools for other purposes in the future very often, leading to the cognitive burden of figuring out why they are there in a longer term. So I still prefer we move these testing logics to a more popular tool.

}%
% for token in SYNTAX_TOKENS:
${token.kind}
% end
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add an action print-token-kind in swift-swiftsyntax-test for this output.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually think that this way is better. This is not really swift-specific since we're just checking the list of declarations in the python declaration of gyb_syntax_support.

If we were to implement this in swift-swiftsyntax-test we would need to:
a) Make swift-swiftsyntax-test be a gyb tool (I don't like this at all)
b) Need to check that all the tokenKinds get generated in the TokenKind enum, but for that we'd need to have a list of all kinds define in TokenKind and we cannot use the autogenerated allCases property using CaseIterable since TokenKind has associated values.

And after all this file is not a real tool that needs to be compiled and is a standalone binary, but is just an input to gyb.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nkcsgexi and I discussed this in person and decided that it's probably the easiest and cleanest way to test this. In the future we might want to consider generation TokenKinds.def from gyb which would make the entire test obsolete.

//
// This source file is part of the Swift.org open source project
//
// Copyright (c) 2014 - 2017 Apple Inc. and the Swift project authors
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: 2018.


class _SyntaxClassifier: SyntaxVisitor {

private var contextStack: [(classification: SyntaxClassification, force: Bool)] = [(classification: .none, force: false)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: 80-columns in the entire file.

@@ -89,39 +93,46 @@ def __init__(self, name, text):
Keyword('__DSO_HANDLE__', '__DSO_HANDLE__'),
Keyword('Wildcard', '_'),
Token('PoundAvailable', 'pound_available', text='#available',
is_keyword=True),
is_keyword=True, classification='Keyword'),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we share the token classification information with the C++ side as well? We should specify them only once in here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since I'm planning to remove the C++ classifier, I don't think it's worth it to hook into this infrastructure from C++.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. Never mind then.

@ahoppen ahoppen force-pushed the 02-swift-syntax-classifier branch from 8e8580b to 8d40915 Compare July 26, 2018 21:23
@ahoppen
Copy link
Member Author

ahoppen commented Jul 26, 2018

@swift-ci Please smoke test

@ahoppen
Copy link
Member Author

ahoppen commented Jul 27, 2018

Let's make sure this still passes now that #18276 is merged.

@swift-ci Please smoke test

@ahoppen ahoppen force-pushed the 02-swift-syntax-classifier branch from 8d40915 to 7a92267 Compare July 27, 2018 16:32
@ahoppen
Copy link
Member Author

ahoppen commented Jul 27, 2018

@swift-ci Please smoke test

let swiftcRunner = try SwiftcRunner(sourceFile: url)
let result = try swiftcRunner.invoke()
guard result.wasSuccessful else {
if !result.wasSuccessful && !allowInvalid {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Syntax errors in source are quite normal, and they should not be treated as exception in SwiftSyntax IMO.
This should catch "invoked but didn't get parsed" error. But once it gets parsed, we might want to have a way to get both parsed syntax tree and diagnostics. Until we implement it, allowInvalid should be defaulted to true. What do you think?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes total sense to me. I just wanted to keep the current behaviour for now.

@nkcsgexi Do you have any opinion on this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should treat cases like missing file at the given url or bad encoding as exception. We shouldn't handle invalid syntax at this level since syntax tree can provide interpretation for invalid syntax too.

@ahoppen ahoppen force-pushed the 02-swift-syntax-classifier branch 2 times, most recently from 9d9fd0a to b6cd535 Compare July 30, 2018 20:29
@ahoppen
Copy link
Member Author

ahoppen commented Jul 30, 2018

@swift-ci Please smoke test

@ahoppen ahoppen force-pushed the 02-swift-syntax-classifier branch from b6cd535 to 179940b Compare July 30, 2018 22:26
@ahoppen
Copy link
Member Author

ahoppen commented Jul 30, 2018

Rebased because I merged #18314.

@swift-ci Please smoke test

@ahoppen ahoppen merged commit 9670a71 into swiftlang:master Jul 31, 2018
@ahoppen ahoppen deleted the 02-swift-syntax-classifier branch July 31, 2018 20:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants