[libSyntax] Syntax colouring based on the syntax tree #17621

ahoppen · 2018-06-29T04:33:29Z

This PR adds a way to perform syntax colouring based on the syntax tree instead of the collected tokens, and then converts that syntax colouring classification to the old syntax map. As a next step, we will be able to use the incrementally created syntax map to perform syntax colouring incrementally.

This is just a temporary bootstrapping step. In the future, we want to incrementally transfer the syntax tree and perform the classification for syntax colouring on the Swift side.

The classifier is currently not able to syntax colour semantics inside comments like markers or URLs.
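
For readers unfamiliar with the conversion step described above, here is a minimal sketch of turning per-token classifications into an offset-based syntax map. All names (`ClassifiedToken`, `SyntaxMapEntry`, `buildSyntaxMap`) are hypothetical and are not the types used in this PR; the sketch also assumes tokens carry only trailing whitespace.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical classification kinds; the real set is larger.
enum class Classification { None, Keyword, TypeIdentifier };

// A token as a tree walk might hand it over: its text, the whitespace that
// follows it, and the classification chosen from its position in the tree.
struct ClassifiedToken {
  std::string Text;
  std::string TrailingTrivia;
  Classification Kind;
};

// One entry of an offset-based syntax map.
struct SyntaxMapEntry {
  std::size_t Offset;
  std::size_t Length;
  Classification Kind;
};

// Convert tokens (in source order) into offset-based entries, skipping
// tokens that carry no classification.
std::vector<SyntaxMapEntry>
buildSyntaxMap(const std::vector<ClassifiedToken> &Tokens) {
  std::vector<SyntaxMapEntry> Map;
  std::size_t Offset = 0;
  for (const ClassifiedToken &Tok : Tokens) {
    if (Tok.Kind != Classification::None)
      Map.push_back({Offset, Tok.Text.size(), Tok.Kind});
    // Advance past the token and the whitespace that follows it.
    Offset += Tok.Text.size() + Tok.TrailingTrivia.size();
  }
  return Map;
}
```

Feeding it the tokens of `func f(_: Int)` would yield three entries, one each for `func`, `_` and `Int`.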

ahoppen · 2018-06-29T04:37:36Z

test/IDE/coloring.swift

@@ -592,7 +670,8 @@ func keywordAsLabel4(_: Int) {}
 func keywordAsLabel5(_: Int, for: Int) {}
 // CHECK: <kw>func</kw> keywordAsLabel5(<kw>_</kw>: <type>Int</type>, for: <type>Int</type>) {}
 func keywordAsLabel6(if let: Int) {}
-// CHECK: <kw>func</kw> keywordAsLabel6(if <kw>let</kw>: <type>Int</type>) {}
+// CHECK-OLD: <kw>func</kw> keywordAsLabel6(if <kw>let</kw>: <type>Int</type>) {}
+// CHECK-NEW: <kw>func</kw> keywordAsLabel6(<kw>if</kw> <kw>let</kw>: <type>Int</type>) {}


@rintaro Following up on your comment in #16636: This is however the way the syntax tree is currently built. We could look into optimising the syntax tree creation for this, but to be honest, I don't have strong enough feelings about how this gets coloured to put a lot of effort into it right now. Plus it's clearly marked as a divergence, so when we come back and try to make the new colouring behave exactly the same as the old one, it will come back up.

Ah, now I understand why this is considered a <kw>: the let keyword is not canBeArgumentLabel(). This func decl is currently an unknown decl.

<CodeBlockItem><UnknownDecl>func keywordAsLabel6<Unknown><Unknown>(<FunctionParameterList><FunctionParameter>if </FunctionParameter></FunctionParameterList></Unknown></Unknown></UnknownDecl></CodeBlockItem><CodeBlockItem><UnknownDecl>let<Unknown><Unknown><TypeAnnotation>: <SimpleTypeIdentifier>Int</SimpleTypeIdentifier></TypeAnnotation></Unknown></Unknown></UnknownDecl></CodeBlockItem><CodeBlockItem><NonEmptyTokenList>) </NonEmptyTokenList></CodeBlockItem><CodeBlockItem><ClosureExpr>{<CodeBlockItemList></CodeBlockItemList>}</ClosureExpr></CodeBlockItem>

I don't think that is what we wanted to test. Could you make this func keywordAsLabel6(if func: for example?

Updated the test case

ahoppen · 2018-06-29T05:03:33Z

@swift-ci Please test

swift-ci · 2018-06-29T05:56:38Z

Build failed
Swift Test OS X Platform
Git Sha - ff4c7a0510ae113f9b9dcd65ddff372008f99bb4

swift-ci · 2018-06-29T06:01:31Z

Build failed
Swift Test Linux Platform
Git Sha - ff4c7a0510ae113f9b9dcd65ddff372008f99bb4

rintaro · 2018-06-29T12:04:59Z

General question: why the SyntaxClassifier → SyntaxToSyntaxMapConverter indirection?
I think SyntaxToSyntaxMapConverter could classify tokens at the same time.

ahoppen · 2018-06-29T15:10:31Z

@rintaro There are multiple reasons:

I think it's more modular
This way I can use the classifier in combination with the ColoredSyntaxTreePrinter in swift-ide-test and with SyntaxToSyntaxMapConverter in SwiftEditor
I consider the SyntaxToSyntaxMapConverter a transitional step to bootstrap syntax colouring based on the syntax tree and later incremental syntax tree parsing and want to remove it later on
The SyntaxClassifier can fairly easily be ported to Swift to perform the syntax classification on the Swift side since it's so self-contained
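
The separation argued for in the list above can be sketched roughly as follows: one classification step produces a map, and independent consumers read it. This is only an illustration. `Token`, `classify`, `printColored` and `countClassified` are made-up names, and the stand-in classifier looks at token text rather than at the syntax tree.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

enum class Classification { None, Keyword, TypeIdentifier };

// Hypothetical token: an id plus its text and the whitespace after it.
struct Token {
  unsigned Id;
  std::string Text;
  std::string TrailingTrivia;
};

using ClassificationMap = std::map<unsigned, Classification>;

// Stand-in for the classifier. It only looks at token text here; a real
// classifier would derive the result from the token's place in the tree.
ClassificationMap classify(const std::vector<Token> &Tokens) {
  ClassificationMap Result;
  for (const Token &Tok : Tokens) {
    if (Tok.Text == "func")
      Result[Tok.Id] = Classification::Keyword;
    else if (Tok.Text == "Int")
      Result[Tok.Id] = Classification::TypeIdentifier;
  }
  return Result;
}

// Consumer 1: print the source with <kw>/<type> tags, like the CHECK lines.
std::string printColored(const std::vector<Token> &Tokens,
                         const ClassificationMap &Classes) {
  std::string Out;
  for (const Token &Tok : Tokens) {
    auto It = Classes.find(Tok.Id);
    Classification Kind =
        It == Classes.end() ? Classification::None : It->second;
    if (Kind == Classification::Keyword)
      Out += "<kw>" + Tok.Text + "</kw>";
    else if (Kind == Classification::TypeIdentifier)
      Out += "<type>" + Tok.Text + "</type>";
    else
      Out += Tok.Text;
    Out += Tok.TrailingTrivia;
  }
  return Out;
}

// Consumer 2: a trivial stand-in for a second client of the same map.
std::size_t countClassified(const ClassificationMap &Classes) {
  return Classes.size();
}
```

The cost of this shape is a second walk over the tokens per consumer, which is the point raised later in the thread.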

@swift-ci Please test

swift-ci · 2018-06-29T15:12:17Z

Build failed
Swift Test Linux Platform
Git Sha - ff4c7a0510ae113f9b9dcd65ddff372008f99bb4

swift-ci · 2018-06-29T15:12:17Z

Build failed
Swift Test OS X Platform
Git Sha - ff4c7a0510ae113f9b9dcd65ddff372008f99bb4

ahoppen · 2018-06-29T16:32:53Z

@swift-ci Please test

swift-ci · 2018-06-29T16:33:20Z

Build failed
Swift Test OS X Platform
Git Sha - d38421c5264b420121179ea3d0e65eb7d9dd0e9e

swift-ci · 2018-06-29T16:34:45Z

Build failed
Swift Test Linux Platform
Git Sha - d38421c5264b420121179ea3d0e65eb7d9dd0e9e

nkcsgexi

I've reviewed this in the previous PR and I've no more comments.

rintaro · 2018-07-02T12:48:11Z

Regarding the SyntaxClassifier indirection: I'm still not convinced that we want to do this.

I think visiting the whole tree is a fairly heavy operation. If we can make it one pass, we should, IMO.
Discarding the reusability of RawSyntax diminishes the benefit of the red-green tree.

For example, how about making ColoredSyntaxTreePrinter and SyntaxToSyntaxMapConverter subclasses of SyntaxClassifier that receive the classified tokens? Or making them "callbacks" of SyntaxClassifier?
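
A rough sketch of the callback variant suggested above, in which classification and consumption happen in a single pass. `Token`, `TokenCallback` and `classifyTokens` are hypothetical names, and the keyword test is a placeholder for real tree-based classification.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

enum class Classification { None, Keyword };

// Hypothetical token carrying only its text.
struct Token {
  std::string Text;
};

using TokenCallback = std::function<void(const Token &, Classification)>;

// One pass: classify each token and hand it straight to the consumer,
// without materialising an intermediate classification map.
void classifyTokens(const std::vector<Token> &Tokens,
                    const TokenCallback &Callback) {
  for (const Token &Tok : Tokens) {
    Classification Kind = (Tok.Text == "func" || Tok.Text == "let")
                              ? Classification::Keyword
                              : Classification::None;
    Callback(Tok, Kind);
  }
}
```

A printer or a syntax map builder would each pass its own lambda as `Callback`.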

ahoppen · 2018-07-11T20:36:41Z

Performance really isn't critical here. This is just a temporary path to be able to test the syntax classification based on the syntax tree. In future PRs, the entire classification will run on the Swift side. The only purpose for the SyntaxToSyntaxMapConverter is so that we are able to reuse the existing syntax highlighting test cases.

I don't have super strong opinions on going with the subclass approach. If you would like that a lot better, I can implement it that way.

Which red-green tree are you talking about?

ahoppen · 2018-07-12T00:28:53Z

@swift-ci Please test

swift-ci · 2018-07-12T00:30:40Z

Build failed
Swift Test OS X Platform
Git Sha - a5a7e9db93aa643a3903b2d7f35e3d02bb3bd313

swift-ci · 2018-07-12T00:30:43Z

Build failed
Swift Test Linux Platform
Git Sha - a5a7e9db93aa643a3903b2d7f35e3d02bb3bd313

rintaro

Alex and I discussed this in person and agreed on the general implementation strategy.

Let me review this again after you fix the build failure. I want to try this PR locally before merging :)

rintaro · 2018-07-12T10:00:47Z

tools/SourceKit/tools/sourcekitd-test/sourcekitd-test.cpp

@@ -647,6 +647,8 @@ static int handleTestInvocation(TestOptions Opts, TestOptions &InitOpts) {
    sourcekitd_request_dictionary_set_int64(Req, KeyEnableSyntaxMap, true);
    sourcekitd_request_dictionary_set_int64(Req, KeyEnableStructure, false);
    sourcekitd_request_dictionary_set_int64(Req, KeyEnableSyntaxTree, false);
+    sourcekitd_request_dictionary_set_int64(
+        Req, KeyForceLibSyntaxBasedProcessing, true);


Could you make this optional?

rintaro · 2018-07-12T10:01:34Z

tools/SourceKit/tools/sourcekitd-test/sourcekitd-test.cpp

@@ -1061,6 +1065,8 @@ static bool handleResponse(sourcekitd_response_t Resp, const TestOptions &Opts,
                                                EnableSubStructure);
        sourcekitd_request_dictionary_set_int64(EdReq, KeySyntacticOnly,
                                                !Opts.UsedSema);
+        sourcekitd_request_dictionary_set_int64(
+            EdReq, KeyForceLibSyntaxBasedProcessing, true);


I don't think this works for structure request (for now).

ahoppen · 2018-07-12T21:52:17Z

I cherry-picked a commit for a later PR that should fix both the build failure and your comments.

@swift-ci Please test

swift-ci · 2018-07-12T22:46:12Z

Build failed
Swift Test Linux Platform
Git Sha - 26bbd7045201668bb8e5f58b60e93d1f438fbf4f

swift-ci · 2018-07-12T23:11:23Z

Build failed
Swift Test OS X Platform
Git Sha - 26bbd7045201668bb8e5f58b60e93d1f438fbf4f

ahoppen · 2018-07-12T23:12:38Z

Cherry picked another commit over. Looks like I lost one or two of them on some rebase.

@swift-ci Please test

swift-ci · 2018-07-12T23:13:00Z

Build failed
Swift Test OS X Platform
Git Sha - 26bbd7045201668bb8e5f58b60e93d1f438fbf4f

swift-ci · 2018-07-12T23:14:27Z

Build failed
Swift Test Linux Platform
Git Sha - 26bbd7045201668bb8e5f58b60e93d1f438fbf4f

rintaro · 2018-07-12T22:58:42Z

tools/SourceKit/tools/sourcekitd/lib/API/Requests.cpp

@@ -473,16 +480,21 @@ void handleRequestImpl(sourcekitd_object_t ReqObj, ResponseReceiver Rec) {
    int64_t EnableDiagnostics = true;
    Req.getInt64(KeyEnableDiagnostics, EnableDiagnostics, /*isOptional=*/true);
    int64_t EnableSyntaxTree = false;
-    Req.getInt64(KeyEnableSyntaxTree, EnableSyntaxTree, /*isOptional=*/true);
+    Req.getInt64(KeyEnableSyntaxTree, EnableSyntaxTree, /*isedOptional=*/true);


rintaro · 2018-07-13T01:03:14Z

include/swift/Syntax/RawSyntax.h

@@ -300,14 +312,15 @@ class RawSyntax final
  /// Make a raw "layout" syntax node.
  static RC<RawSyntax> make(SyntaxKind Kind, ArrayRef<RC<RawSyntax>> Layout,
                            SourcePresence Presence,
-                            SyntaxArena *Arena = nullptr);
+                            SyntaxArena *Arena = nullptr,
+                            llvm::Optional<unsigned> NodeId = llvm::None);


If NodeId should always be greater than 0, I think this can be unsigned NodeId = 0, just like the token version. Am I missing something?

I initially used 0 to indicate that the NodeId should be picked automatically, but changed it to Optional<unsigned> because of @nkcsgexi's comment here: #16636 (comment). I'm fine with either.

I don't think 0 as the default is too unreadable. @nkcsgexi WDYT?
Either way, if there's no specific reason, please be consistent between token and layout.

rintaro · 2018-07-13T01:10:23Z

lib/Syntax/RawSyntax.cpp

+    NextFreeNodeId = std::max(this->NodeId + 1, NextFreeNodeId);
+  } else {
+    this->NodeId = NextFreeNodeId++;
+  }


If the above comment (Receive unsigned NodeId) is true, you could factor this out:

static unsigned claimNodeId(unsigned NodeId) {
  if (!NodeId)
    return NextFreeNodeId++;
  NextFreeNodeId = std::max(NodeId + 1, NextFreeNodeId);
  return NodeId;
}

then call it from here and from the token constructor.

this->NodeId = claimNodeId(NodeId);
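
To make the behaviour of the suggested helper concrete, here is a standalone sketch. It assumes that 0 is the "pick an id automatically" sentinel and that the counter starts at 1; both are assumptions of this example, not facts about the PR's final code.

```cpp
#include <algorithm>
#include <cassert>

// Next id to hand out automatically. Starts at 1 so that 0 can serve as the
// "pick one for me" sentinel (an assumption of this sketch).
static unsigned NextFreeNodeId = 1;

// Return the id a new node should use. An explicit id is honoured and pushes
// the counter past it, so later automatic ids cannot collide with it.
static unsigned claimNodeId(unsigned NodeId = 0) {
  if (NodeId == 0)
    return NextFreeNodeId++;
  NextFreeNodeId = std::max(NodeId + 1, NextFreeNodeId);
  return NodeId;
}
```

With a fresh counter, calling `claimNodeId()`, `claimNodeId(10)`, `claimNodeId()` in that order returns 1, 10 and 11. The sketch is not thread-safe.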

rintaro · 2018-07-13T01:27:48Z

include/swift/Syntax/SyntaxClassifier.h.gyb

+#ifndef SWIFT_SYNTAX_CLASSIFIER_H
+#define SWIFT_SYNTAX_CLASSIFIER_H
+
+#include "swift/AST/Identifier.h"


This is a layering violation. Could you move bool isEditorPlaceholder(StringRef name) to lib/Basic/EditorPlaceholder.cpp?
Also, if you don't use it in .h, move this to .cpp.

Is there any documentation of what the layering is supposed to be?

Good catch. Forgot to move it when I moved stuff from the .h file to .cpp.

Is there any documentation of what the layering is supposed to be?

I don't think so :(
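
As an illustration of what a dependency-free helper of this kind could look like, here is a sketch using only the standard library. It assumes a placeholder is text of the form `<#...#>` and uses `std::string_view` in place of `StringRef`; the actual function moved in this PR may apply a different check.

```cpp
#include <cassert>
#include <string_view>

// Assumption of this sketch: an editor placeholder is text of the form
// <#...#>. The real helper may use a looser check.
bool isEditorPlaceholder(std::string_view Name) {
  constexpr std::string_view Open = "<#";
  constexpr std::string_view Close = "#>";
  return Name.size() >= Open.size() + Close.size() &&
         Name.substr(0, Open.size()) == Open &&
         Name.substr(Name.size() - Close.size()) == Close;
}
```

Because it depends on nothing from the AST layer, a function like this can live in a basic support library and be called from the syntax library without inverting the layering.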

rintaro · 2018-07-13T01:33:09Z

include/swift/Syntax/SyntaxClassifier.h.gyb

+  template<typename T>
+  void visit(T Node, SyntaxClassification Classification,
+             bool ForceClassification) {
+    ContextStack.push({Classification, ForceClassification});


I prefer .emplace(Classification, ForceClassification)
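
The difference between the two calls can be sketched as follows. `ClassificationContext` and `pushContexts` are hypothetical stand-ins for the entry type and stack used in the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <stack>

enum class SyntaxClassification { None, Keyword };

// Hypothetical context entry. The constructor lets emplace() build it in
// place; before C++20 an aggregate without one cannot be emplaced.
struct ClassificationContext {
  SyntaxClassification Classification;
  bool Force;
  ClassificationContext(SyntaxClassification C, bool F)
      : Classification(C), Force(F) {}
};

// Add two entries, one per style, and report the resulting stack size.
std::size_t pushContexts(std::stack<ClassificationContext> &Stack) {
  // push: builds a temporary entry, then moves it into the stack.
  Stack.push({SyntaxClassification::None, false});
  // emplace: forwards the arguments and constructs the entry in the stack.
  Stack.emplace(SyntaxClassification::Keyword, true);
  return Stack.size();
}
```

For a small trivially copyable entry the two are likely equivalent after optimisation, so the preference is mostly stylistic.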

rintaro

LGTM!
#17621 (comment) can be a follow-up PR.

ahoppen · 2018-07-13T18:26:05Z

@swift-ci Please test and merge

ahoppen · 2018-07-13T20:24:12Z

Looks like CI didn't pick up the request

@swift-ci Please test

swift-ci · 2018-07-13T20:26:01Z

Build failed
Swift Test Linux Platform
Git Sha - 5466f3695742367aec958776bf31076bc089a052

swift-ci · 2018-07-13T20:27:39Z

Build failed
Swift Test OS X Platform
Git Sha - 5466f3695742367aec958776bf31076bc089a052

swift-ci · 2018-07-14T00:54:14Z

Build failed
Swift Test Linux Platform
Git Sha - dc4dd732e855e3df20dc17421eef25e27daf8b8d

swift-ci · 2018-07-14T00:54:17Z

Build failed
Swift Test OS X Platform
Git Sha - dc4dd732e855e3df20dc17421eef25e27daf8b8d

ahoppen · 2018-07-14T00:59:12Z

@swift-ci Please test

swift-ci · 2018-07-14T00:59:41Z

Build failed
Swift Test OS X Platform
Git Sha - 182fe76d352c2be0345c974f739473ade6e362f7

swift-ci · 2018-07-14T01:00:52Z

Build failed
Swift Test Linux Platform
Git Sha - 182fe76d352c2be0345c974f739473ade6e362f7

ahoppen · 2018-07-16T23:29:49Z

@swift-ci Please test and merge

ahoppen force-pushed the 002-sytnax-tree-based-coloring branch from 59ff6c5 to ff4c7a0 Compare June 29, 2018 04:55

ahoppen requested review from rintaro and nkcsgexi June 29, 2018 05:03

ahoppen force-pushed the 002-sytnax-tree-based-coloring branch from d38421c to a5a7e9d Compare June 29, 2018 16:32

nkcsgexi approved these changes Jun 29, 2018

ahoppen force-pushed the 002-sytnax-tree-based-coloring branch from a5a7e9d to cf53f16 Compare July 11, 2018 20:49

rintaro reviewed Jul 12, 2018

ahoppen force-pushed the 002-sytnax-tree-based-coloring branch from cf53f16 to 26bbd70 Compare July 12, 2018 21:51

ahoppen force-pushed the 002-sytnax-tree-based-coloring branch from 26bbd70 to 5466f36 Compare July 12, 2018 23:12

rintaro reviewed Jul 13, 2018

ahoppen force-pushed the 002-sytnax-tree-based-coloring branch from 5466f36 to fc1a1e2 Compare July 13, 2018 18:15

rintaro approved these changes Jul 13, 2018

ahoppen force-pushed the 002-sytnax-tree-based-coloring branch from fc1a1e2 to dc4dd73 Compare July 13, 2018 18:25

ahoppen added 5 commits July 13, 2018 16:56

[Basic] Extract isEditorPlaceholder from Identifier to standalone function

dfad6f7

[incrParse] Add a stable id to the syntax nodes

9d59cd2

The id is meant to be stable across incremental parses

[incrParse] Test utility: Compare syntax trees without IDs

4516873

IDs are not expected to be the same between incremental parsing and from-scratch parsing.

[libSyntax] Disable caching of token nodes

03a7042

Since nodes have unique IDs now, we cannot reuse the nodes at multiple locations in the source file

[libSyntax] Add syntax coloring based on the syntax tree

8430eff

ahoppen force-pushed the 002-sytnax-tree-based-coloring branch from dc4dd73 to 182fe76 Compare July 14, 2018 00:49

ahoppen added 3 commits July 13, 2018 17:57

[libSyntax] Enable tests for libSyntax based syntax coloring

5c22761

[SourceKit] Add option to force the SyntaxMap to be generated via the syntax tree

c8a3957

[libSyntax] Add test variants for building the syntax map via libSyntax

6bc1b5a

ahoppen force-pushed the 002-sytnax-tree-based-coloring branch from 182fe76 to 6bc1b5a Compare July 14, 2018 00:57

ahoppen merged commit 3bf94ab into swiftlang:master Jul 17, 2018

ahoppen deleted the 002-sytnax-tree-based-coloring branch July 19, 2018 16:19
