[swiftSyntax] Swift side syntax classifier #18251

ahoppen · 2018-07-26T06:31:14Z

This PR moves the SyntaxClassifier that previously only existed on the C++ side to the Swift side, so that classification of tokens can be performed on the basis of an incrementally transferred syntax tree.

This essentially removes the need for the C++ SyntaxClassifier and I plan to remove that one in a later PR.

ahoppen · 2018-07-26T06:31:22Z

@swift-ci Please smoke test

ahoppen · 2018-07-26T15:55:20Z

@swift-ci Please smoke test

ahoppen · 2018-07-26T17:19:02Z

@swift-ci Please smoke test

nkcsgexi · 2018-07-26T17:43:56Z

test/SwiftSyntax/AllTokenKindsInSyntaxGybSupport.c

+int main() {
+#define TOKEN(KW) printKeyword(#KW);
+#define SIL_KEYWORD(KW)
+#include "swift/Syntax/TokenKinds.def"


Instead of adding another tool, we should add this print action to swift-ide-test and print the syntax token kinds in swift-swiftsyntax-test. The test should be a simple diff of the two dumps.

Thinking more about this, it's better if we put the print action into swift-syntax-test. It makes a lot of sense to compare dumps from swift-syntax-test and swift-swiftsyntax-test.

I read swift-syntax-test anyway ;-)

In case we keep the generated python list below it also makes sense to consider keeping this tool, since swift-syntax-test currently always operates on an input file and putting this into swift-syntax-test thus feels a little awkward.

We can provide a dummy input file to swift-syntax-test for this action (I think it is totally OK). The drawbacks of having these standalone tools are: (1) it's conventional to separate the test driver from the test data, the way swift-syntax-test and round_trip_parse_gen.swift are separated; (2) we won't revisit these tools for other purposes very often, which adds the cognitive burden of figuring out why they are there in the long run. So I still prefer that we move this testing logic to a more widely used tool.

nkcsgexi · 2018-07-26T17:44:42Z

test/SwiftSyntax/Inputs/TokenKindList.txt.gyb

+}%
+% for token in SYNTAX_TOKENS:
+${token.kind}
+% end


Add an action print-token-kind in swift-swiftsyntax-test for this output.

I actually think that this way is better. This is not really Swift-specific, since we're just checking the list of declarations in the Python definitions of gyb_syntax_support.

If we were to implement this in swift-swiftsyntax-test, we would need to:
a) Make swift-swiftsyntax-test a gyb tool (I don't like this at all)
b) Check that all the token kinds get generated in the TokenKind enum. For that we'd need a list of all kinds defined in TokenKind, and we cannot use the autogenerated allCases property from CaseIterable since TokenKind has associated values.

And after all, this file is not a real tool that needs to be compiled into a standalone binary; it is just an input to gyb.

@nkcsgexi and I discussed this in person and decided that the current approach is probably the easiest and cleanest way to test this. In the future we might want to consider generating TokenKinds.def from gyb, which would make the entire test obsolete.
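
The check discussed in this thread amounts to comparing two flat lists of token kind names: one dumped from `TokenKinds.def` on the C++ side and one generated from `SYNTAX_TOKENS` by gyb. A minimal Python sketch of that comparison follows, assuming each dump is plain text with one kind per line. The function name `diff_token_kinds` and the sample kinds are hypothetical and not part of the PR.

```python
def diff_token_kinds(cpp_dump, gyb_dump):
    """Return (missing_in_gyb, missing_in_cpp) for two one-kind-per-line dumps."""
    cpp_kinds = {line.strip() for line in cpp_dump.splitlines() if line.strip()}
    gyb_kinds = {line.strip() for line in gyb_dump.splitlines() if line.strip()}
    return sorted(cpp_kinds - gyb_kinds), sorted(gyb_kinds - cpp_kinds)


# Hypothetical sample dumps; the real lists would come from the two sources.
cpp_dump = "kw_func\nkw_let\npound_available\n"
gyb_dump = "kw_func\npound_available\n"

missing_in_gyb, missing_in_cpp = diff_token_kinds(cpp_dump, gyb_dump)
print(missing_in_gyb)  # ['kw_let']
print(missing_in_cpp)  # []
```

An empty result on both sides means the two declarations agree; anything else names the kinds that need to be added on one side.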

nkcsgexi · 2018-07-26T17:45:42Z

tools/SwiftSyntax/SyntaxClassifier.swift.gyb

+//
+// This source file is part of the Swift.org open source project
+//
+// Copyright (c) 2014 - 2017 Apple Inc. and the Swift project authors


nkcsgexi · 2018-07-26T17:45:56Z

tools/SwiftSyntax/SyntaxClassifier.swift.gyb

+
+class _SyntaxClassifier: SyntaxVisitor {
+
+  private var contextStack: [(classification: SyntaxClassification, force: Bool)] = [(classification: .none, force: false)]


Nit: keep to 80 columns in the entire file.

nkcsgexi · 2018-07-26T17:51:36Z

utils/gyb_syntax_support/Token.py

@@ -89,39 +93,46 @@ def __init__(self, name, text):
    Keyword('__DSO_HANDLE__', '__DSO_HANDLE__'),
    Keyword('Wildcard', '_'),
    Token('PoundAvailable', 'pound_available', text='#available',
-          is_keyword=True),
+          is_keyword=True, classification='Keyword'),


Can we share the token classification information with the C++ side as well? We should specify them only once in here.

Since I'm planning to remove the C++ classifier, I don't think it's worth it to hook into this infrastructure from C++.

OK. Never mind then.
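
To illustrate the idea of specifying each token's classification once in the Python token definitions, here is a simplified sketch. It is not the real `gyb_syntax_support/Token.py`: the constructor signature, the `'None'` default, the `LeftParen` entry, and the helper `classification_by_kind` are assumptions made for illustration. Only the `classification='Keyword'` argument and the `PoundAvailable` and `Wildcard` entries come from the diff above.

```python
class Token(object):
    """Simplified stand-in for a token definition (assumed shape)."""

    def __init__(self, name, kind, text=None, is_keyword=False,
                 classification='None'):
        self.name = name
        self.kind = kind
        self.text = text
        self.is_keyword = is_keyword
        # Assumed default: tokens without an explicit classification
        # get the string 'None'.
        self.classification = classification


class Keyword(Token):
    """In this sketch, keywords are always classified as 'Keyword'."""

    def __init__(self, name, text):
        Token.__init__(self, name, 'kw_' + text, text=text,
                       is_keyword=True, classification='Keyword')


SYNTAX_TOKENS = [
    Keyword('Wildcard', '_'),
    Token('PoundAvailable', 'pound_available', text='#available',
          is_keyword=True, classification='Keyword'),
    Token('LeftParen', 'l_paren', text='('),  # hypothetical extra entry
]


def classification_by_kind(tokens):
    """Build one lookup table that any generated code could share."""
    return {token.kind: token.classification for token in tokens}


table = classification_by_kind(SYNTAX_TOKENS)
print(table['pound_available'])  # Keyword
print(table['l_paren'])          # None
```

Keeping the classification next to each token definition means a gyb template can emit the mapping for any target language from a single source.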

ahoppen · 2018-07-26T21:25:37Z

@swift-ci Please smoke test

ahoppen · 2018-07-27T15:05:00Z

Let's make sure this still passes now that #18276 is merged.

@swift-ci Please smoke test

ahoppen · 2018-07-27T16:56:19Z

@swift-ci Please smoke test

rintaro · 2018-07-30T02:16:46Z

tools/SwiftSyntax/SwiftSyntax.swift

    let swiftcRunner = try SwiftcRunner(sourceFile: url)
    let result = try swiftcRunner.invoke()
-    guard result.wasSuccessful else {
+    if !result.wasSuccessful && !allowInvalid {


Syntax errors in source are quite normal, and they should not be treated as exceptions in SwiftSyntax IMO.
This should catch the "invoked but didn't get parsed" error. But once the source gets parsed, we might want a way to get both the parsed syntax tree and the diagnostics. Until we implement that, allowInvalid should default to true. What do you think?

Makes total sense to me. I just wanted to keep the current behaviour for now.

@nkcsgexi Do you have any opinion on this?

I think we should treat cases like a missing file at the given URL or a bad encoding as exceptions. We shouldn't handle invalid syntax at this level, since the syntax tree can provide an interpretation for invalid syntax too.
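
The policy proposed in this thread can be sketched as follows: failing to read the input at all raises an error, while syntax errors are reported alongside the tree. This is a language-agnostic illustration written in Python, not SwiftSyntax API. `ParseResult`, `parse_file`, and the toy "unbalanced braces" diagnostic are all hypothetical.

```python
import os
import tempfile


class ParseResult(object):
    """Hypothetical result type: a tree plus any diagnostics."""

    def __init__(self, tree, diagnostics):
        self.tree = tree
        self.diagnostics = diagnostics


def parse_file(path):
    """Raise for I/O or encoding problems; never raise for invalid syntax."""
    # A missing file raises OSError; bad UTF-8 raises UnicodeDecodeError.
    with open(path, 'rb') as f:
        source = f.read().decode('utf-8')
    diagnostics = []
    # Toy stand-in for a real parser: flag unbalanced braces.
    if source.count('{') != source.count('}'):
        diagnostics.append('unbalanced braces')
    # The "tree" here is just the list of lines; a real syntax tree
    # would also represent the invalid parts.
    return ParseResult(tree=source.splitlines(), diagnostics=diagnostics)


# Invalid syntax still yields a result with diagnostics attached.
with tempfile.NamedTemporaryFile('w', suffix='.swift', delete=False) as tmp:
    tmp.write('func f() {\n')
result = parse_file(tmp.name)
os.remove(tmp.name)
print(result.diagnostics)  # ['unbalanced braces']

# A missing file is a genuine error.
try:
    parse_file(tmp.name)
except (IOError, OSError):
    print('could not read input')
```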

ahoppen · 2018-07-30T20:30:04Z

@swift-ci Please smoke test

ahoppen · 2018-07-30T22:26:44Z

Rebased because I merged #18314.

@swift-ci Please smoke test

ahoppen requested review from rintaro and nkcsgexi July 26, 2018 06:31

ahoppen force-pushed the 02-swift-syntax-classifier branch from 7188c0e to 8e8580b Compare July 26, 2018 15:55

nkcsgexi reviewed Jul 26, 2018

ahoppen force-pushed the 02-swift-syntax-classifier branch from 8e8580b to 8d40915 Compare July 26, 2018 21:23

ahoppen force-pushed the 02-swift-syntax-classifier branch from 8d40915 to 7a92267 Compare July 27, 2018 16:32

ahoppen mentioned this pull request Jul 27, 2018

[libSyntax] Remove the C++ SyntaxClassifier #18314

Merged

rintaro reviewed Jul 30, 2018

ahoppen force-pushed the 02-swift-syntax-classifier branch 2 times, most recently from 9d9fd0a to b6cd535 Compare July 30, 2018 20:29

ahoppen added 2 commits July 30, 2018 14:54

[libSyntax] Add a swift token classifier for syntax highlighting

775beec

[swiftSyntax] Add test cases for the SyntaxClassifier

179940b

ahoppen force-pushed the 02-swift-syntax-classifier branch from b6cd535 to 179940b Compare July 30, 2018 22:26

nkcsgexi approved these changes Jul 31, 2018

ahoppen merged commit 9670a71 into swiftlang:master Jul 31, 2018

ahoppen deleted the 02-swift-syntax-classifier branch July 31, 2018 20:43

