Skip to content

[libSyntax] Syntax colouring based on the syntax tree #17621

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Jul 17, 2018

Conversation

ahoppen
Copy link
Member

@ahoppen ahoppen commented Jun 29, 2018

This PR adds a way to perform syntax colouring based on the syntax tree instead of the collected tokens and then converts that syntax colouring classification to the old syntax map. As a next step, we will be able to use the incrementally created syntax map to incrementally performance syntax colouring.

This is just a temporary bootstrapping step. In the future, we want to incrementally transfer the syntax tree and perform the classification for syntax colouring on the Swift side.

The classifier is currently not able to syntax colour semantics inside comments like markers or URLs.

@@ -592,7 +670,8 @@ func keywordAsLabel4(_: Int) {}
func keywordAsLabel5(_: Int, for: Int) {}
// CHECK: <kw>func</kw> keywordAsLabel5(<kw>_</kw>: <type>Int</type>, for: <type>Int</type>) {}
func keywordAsLabel6(if let: Int) {}
// CHECK: <kw>func</kw> keywordAsLabel6(if <kw>let</kw>: <type>Int</type>) {}
// CHECK-OLD: <kw>func</kw> keywordAsLabel6(if <kw>let</kw>: <type>Int</type>) {}
// CHECK-NEW: <kw>func</kw> keywordAsLabel6(<kw>if</kw> <kw>let</kw>: <type>Int</type>) {}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rintaro Following up on your comment in #16636: This is however the way the syntax tree is currently built. We could look into optimising the syntax tree creation for this, but to be honest, I don't have strong enough feelings about how this gets coloured to put a lot of effort into it right now. Plus it's clearly marked as a divergence, so when we come back and try to make the new colouring behave exactly the same as the old one, it will come back up.

Copy link
Member

@rintaro rintaro Jul 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, now I understand why this is considered as <kw>, because let keyword is not canBeArgumentLabel(). This func decl is currently a unknown decl.

<CodeBlockItem><UnknownDecl>func keywordAsLabel6<Unknown><Unknown>(<FunctionParameterList><FunctionParameter>if </FunctionParameter></FunctionParameterList></Unknown></Unknown></UnknownDecl></CodeBlockItem><CodeBlockItem><UnknownDecl>let<Unknown><Unknown><TypeAnnotation>: <SimpleTypeIdentifier>Int</SimpleTypeIdentifier></TypeAnnotation></Unknown></Unknown></UnknownDecl></CodeBlockItem><CodeBlockItem><NonEmptyTokenList>) </NonEmptyTokenList></CodeBlockItem><CodeBlockItem><ClosureExpr>{<CodeBlockItemList></CodeBlockItemList>}</ClosureExpr></CodeBlockItem>

I don't think that is what we wanted to test. Could you make this func keywordAsLabel6(if func: for example?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the test case

@ahoppen ahoppen force-pushed the 002-sytnax-tree-based-coloring branch from 59ff6c5 to ff4c7a0 Compare June 29, 2018 04:55
@ahoppen
Copy link
Member Author

ahoppen commented Jun 29, 2018

@swift-ci Please test

@ahoppen ahoppen requested review from rintaro and nkcsgexi June 29, 2018 05:03
@swift-ci
Copy link
Contributor

Build failed
Swift Test OS X Platform
Git Sha - ff4c7a0510ae113f9b9dcd65ddff372008f99bb4

@swift-ci
Copy link
Contributor

Build failed
Swift Test Linux Platform
Git Sha - ff4c7a0510ae113f9b9dcd65ddff372008f99bb4

@rintaro
Copy link
Member

rintaro commented Jun 29, 2018

General question: why SyntaxClassifierSyntaxToSyntaxMapConverter indirection?
I think SyntaxToSyntaxMapConverter can classify tokens at the same time.

@ahoppen
Copy link
Member Author

ahoppen commented Jun 29, 2018

@rintaro There are multiple reasons:

  • I think it's more modular
  • This way I can use the classifier in combination with the ColoredSyntaxTreePrinter in swift-ide-test and with SyntaxToSyntaxMapConverter in SwiftEditor
  • I consider the SyntaxToSyntaxMapConverter a transitional step to bootstrap syntax colouring based on the syntax tree and later incremental syntax tree parsing and want to remove it later on
  • The SyntaxClassifier can fairly easily be ported to Swift to perform the syntax classification on the Swift side since it's so self-contained

@swift-ci Please test

@swift-ci
Copy link
Contributor

Build failed
Swift Test Linux Platform
Git Sha - ff4c7a0510ae113f9b9dcd65ddff372008f99bb4

@swift-ci
Copy link
Contributor

Build failed
Swift Test OS X Platform
Git Sha - ff4c7a0510ae113f9b9dcd65ddff372008f99bb4

@ahoppen ahoppen force-pushed the 002-sytnax-tree-based-coloring branch from d38421c to a5a7e9d Compare June 29, 2018 16:32
@ahoppen
Copy link
Member Author

ahoppen commented Jun 29, 2018

@swift-ci Please test

@swift-ci
Copy link
Contributor

Build failed
Swift Test OS X Platform
Git Sha - d38421c5264b420121179ea3d0e65eb7d9dd0e9e

@swift-ci
Copy link
Contributor

Build failed
Swift Test Linux Platform
Git Sha - d38421c5264b420121179ea3d0e65eb7d9dd0e9e

Copy link
Contributor

@nkcsgexi nkcsgexi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've reviewed this in the previous PR and I've no more comments.

@rintaro
Copy link
Member

rintaro commented Jul 2, 2018

Regarding SyntaxClassifier indirection: I'm still not convinced that we want to do this.

  • I think visiting whole tree is fairly heavy operation. If we can make it 1-pass, we should do it IMO.
  • Discarding re-usability of RawSyntax diminishes benefit of red-green tree.

For example, how about making ColoredSyntaxTreePrinter and SyntaxToSyntaxMapConverter subclass of SyntaxClassifier which receives classified token? or make them "callback" of SyntaxClassifier?

@ahoppen
Copy link
Member Author

ahoppen commented Jul 11, 2018

Performance really isn't critical here. This is just a temporary path to be able to test the syntax classification based on the syntax tree. In future PRs, the entire classification will run on the Swift side. The only purpose for the SyntaxToSyntaxMapConverter is so that we are able to reuse the existing syntax highlighting test cases.

I don't have super strong opinions on going with the subclass approach. If you would like that a lot better, I can implement it that way.

Which red-green tree are you talking about?

@ahoppen ahoppen force-pushed the 002-sytnax-tree-based-coloring branch from a5a7e9d to cf53f16 Compare July 11, 2018 20:49
@ahoppen
Copy link
Member Author

ahoppen commented Jul 12, 2018

@swift-ci Please test

@swift-ci
Copy link
Contributor

Build failed
Swift Test OS X Platform
Git Sha - a5a7e9db93aa643a3903b2d7f35e3d02bb3bd313

@swift-ci
Copy link
Contributor

Build failed
Swift Test Linux Platform
Git Sha - a5a7e9db93aa643a3903b2d7f35e3d02bb3bd313

Copy link
Member

@rintaro rintaro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alex and I discussed in person, and agreed about the general implementation strategy.

Let me review this again after you fix the build failure. I want to try this PR locally before merging :)

@@ -647,6 +647,8 @@ static int handleTestInvocation(TestOptions Opts, TestOptions &InitOpts) {
sourcekitd_request_dictionary_set_int64(Req, KeyEnableSyntaxMap, true);
sourcekitd_request_dictionary_set_int64(Req, KeyEnableStructure, false);
sourcekitd_request_dictionary_set_int64(Req, KeyEnableSyntaxTree, false);
sourcekitd_request_dictionary_set_int64(
Req, KeyForceLibSyntaxBasedProcessing, true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you make this optional?

@@ -1061,6 +1065,8 @@ static bool handleResponse(sourcekitd_response_t Resp, const TestOptions &Opts,
EnableSubStructure);
sourcekitd_request_dictionary_set_int64(EdReq, KeySyntacticOnly,
!Opts.UsedSema);
sourcekitd_request_dictionary_set_int64(
EdReq, KeyForceLibSyntaxBasedProcessing, true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this works for structure request (for now).

@ahoppen ahoppen force-pushed the 002-sytnax-tree-based-coloring branch from cf53f16 to 26bbd70 Compare July 12, 2018 21:51
@ahoppen
Copy link
Member Author

ahoppen commented Jul 12, 2018

I cherry-picked a commit for a later PR that should fix both the build failure and your comments.

@swift-ci Please test

@swift-ci
Copy link
Contributor

Build failed
Swift Test Linux Platform
Git Sha - 26bbd7045201668bb8e5f58b60e93d1f438fbf4f

@swift-ci
Copy link
Contributor

Build failed
Swift Test OS X Platform
Git Sha - 26bbd7045201668bb8e5f58b60e93d1f438fbf4f

@ahoppen ahoppen force-pushed the 002-sytnax-tree-based-coloring branch from 26bbd70 to 5466f36 Compare July 12, 2018 23:12
@ahoppen
Copy link
Member Author

ahoppen commented Jul 12, 2018

Cherry picked another commit over. Looks like I lost one or two of them on some rebase.

@swift-ci Please test

@swift-ci
Copy link
Contributor

Build failed
Swift Test OS X Platform
Git Sha - 26bbd7045201668bb8e5f58b60e93d1f438fbf4f

@swift-ci
Copy link
Contributor

Build failed
Swift Test Linux Platform
Git Sha - 26bbd7045201668bb8e5f58b60e93d1f438fbf4f

@@ -473,16 +480,21 @@ void handleRequestImpl(sourcekitd_object_t ReqObj, ResponseReceiver Rec) {
int64_t EnableDiagnostics = true;
Req.getInt64(KeyEnableDiagnostics, EnableDiagnostics, /*isOptional=*/true);
int64_t EnableSyntaxTree = false;
Req.getInt64(KeyEnableSyntaxTree, EnableSyntaxTree, /*isOptional=*/true);
Req.getInt64(KeyEnableSyntaxTree, EnableSyntaxTree, /*isedOptional=*/true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo

@@ -300,14 +312,15 @@ class RawSyntax final
/// Make a raw "layout" syntax node.
static RC<RawSyntax> make(SyntaxKind Kind, ArrayRef<RC<RawSyntax>> Layout,
SourcePresence Presence,
SyntaxArena *Arena = nullptr);
SyntaxArena *Arena = nullptr,
llvm::Optional<unsigned> NodeId = llvm::None);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If NodeId should be always greater than 0, I think this can be unsigned NodeId = 0 just like token version. Am I missing something?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I initially used 0 to indicate that the NodeId should be picked automatically, but changed it to Optional<unsigned> because of @nkcsgexi's comment here: #16636 (comment). I'm fine with either.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think 0 as default is too un-readable. @nkcsgexi WDYT?
Either way, if there's no specific reason, please be consistent between token and layout.

NextFreeNodeId = std::max(this->NodeId + 1, NextFreeNodeId);
} else {
this->NodeId = NextFreeNodeId++;
}
Copy link
Member

@rintaro rintaro Jul 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the above comment (Receive unsigned NodeId) is true, you could factor this out:

static unsigned claimNodeId(unsigned NodeId) {
  if (!NodeId)
    return NextFreeNodeId++;
    
  NextFreeNodeId = std::max(NodeId + 1, NextFreeNodeId);
  return NodeId;
}

then call it from here and token constructor.

this->NodeId = claimNodeId(NodeId);

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above.

#ifndef SWIFT_SYNTAX_CLASSIFIER_H
#define SWIFT_SYNTAX_CLASSIFIER_H

#include "swift/AST/Identifier.h"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is layer violation. Could you move bool isEditorPlaceholder(StringRef name) to lib/Basic/EditorPlaceholder.cpp?
Also, if you don't use it in .h, move this to .cpp.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any documentation of what the layering is supposed to be?

Good catch. Forgot to move it when I moved stuff from the .h file to .cpp.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any documentation of what the layering is supposed to be?

I don't think so :(

template<typename T>
void visit(T Node, SyntaxClassification Classification,
bool ForceClassification) {
ContextStack.push({Classification, ForceClassification});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer .emplace(Classification, ForceClassification)

@@ -592,7 +670,8 @@ func keywordAsLabel4(_: Int) {}
func keywordAsLabel5(_: Int, for: Int) {}
// CHECK: <kw>func</kw> keywordAsLabel5(<kw>_</kw>: <type>Int</type>, for: <type>Int</type>) {}
func keywordAsLabel6(if let: Int) {}
// CHECK: <kw>func</kw> keywordAsLabel6(if <kw>let</kw>: <type>Int</type>) {}
// CHECK-OLD: <kw>func</kw> keywordAsLabel6(if <kw>let</kw>: <type>Int</type>) {}
// CHECK-NEW: <kw>func</kw> keywordAsLabel6(<kw>if</kw> <kw>let</kw>: <type>Int</type>) {}
Copy link
Member

@rintaro rintaro Jul 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, now I understand why this is considered as <kw>, because let keyword is not canBeArgumentLabel(). This func decl is currently a unknown decl.

<CodeBlockItem><UnknownDecl>func keywordAsLabel6<Unknown><Unknown>(<FunctionParameterList><FunctionParameter>if </FunctionParameter></FunctionParameterList></Unknown></Unknown></UnknownDecl></CodeBlockItem><CodeBlockItem><UnknownDecl>let<Unknown><Unknown><TypeAnnotation>: <SimpleTypeIdentifier>Int</SimpleTypeIdentifier></TypeAnnotation></Unknown></Unknown></UnknownDecl></CodeBlockItem><CodeBlockItem><NonEmptyTokenList>) </NonEmptyTokenList></CodeBlockItem><CodeBlockItem><ClosureExpr>{<CodeBlockItemList></CodeBlockItemList>}</ClosureExpr></CodeBlockItem>

I don't think that is what we wanted to test. Could you make this func keywordAsLabel6(if func: for example?

@ahoppen ahoppen force-pushed the 002-sytnax-tree-based-coloring branch from 5466f36 to fc1a1e2 Compare July 13, 2018 18:15
Copy link
Member

@rintaro rintaro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!
#17621 (comment) can be followup PR.

@ahoppen ahoppen force-pushed the 002-sytnax-tree-based-coloring branch from fc1a1e2 to dc4dd73 Compare July 13, 2018 18:25
@ahoppen
Copy link
Member Author

ahoppen commented Jul 13, 2018

@swift-ci Please test and merge

@ahoppen
Copy link
Member Author

ahoppen commented Jul 13, 2018

Looks like CI didn't pick up the request

@swift-ci Please test

@swift-ci
Copy link
Contributor

Build failed
Swift Test Linux Platform
Git Sha - 5466f3695742367aec958776bf31076bc089a052

@swift-ci
Copy link
Contributor

Build failed
Swift Test OS X Platform
Git Sha - 5466f3695742367aec958776bf31076bc089a052

ahoppen added 5 commits July 13, 2018 16:56
The id is meant to be stable across incremental parses
IDs are not expected to be the same between incremental parsing and
from-scratch parsing.
Since nodes have unique IDs now, we cannot reuse the nodes at multiple
locations in the source file
@ahoppen ahoppen force-pushed the 002-sytnax-tree-based-coloring branch from dc4dd73 to 182fe76 Compare July 14, 2018 00:49
@swift-ci
Copy link
Contributor

Build failed
Swift Test Linux Platform
Git Sha - dc4dd732e855e3df20dc17421eef25e27daf8b8d

@swift-ci
Copy link
Contributor

Build failed
Swift Test OS X Platform
Git Sha - dc4dd732e855e3df20dc17421eef25e27daf8b8d

@ahoppen ahoppen force-pushed the 002-sytnax-tree-based-coloring branch from 182fe76 to 6bc1b5a Compare July 14, 2018 00:57
@ahoppen
Copy link
Member Author

ahoppen commented Jul 14, 2018

@swift-ci Please test

@swift-ci
Copy link
Contributor

Build failed
Swift Test OS X Platform
Git Sha - 182fe76d352c2be0345c974f739473ade6e362f7

@swift-ci
Copy link
Contributor

Build failed
Swift Test Linux Platform
Git Sha - 182fe76d352c2be0345c974f739473ade6e362f7

@ahoppen
Copy link
Member Author

ahoppen commented Jul 16, 2018

@swift-ci Please test and merge

@ahoppen ahoppen merged commit 3bf94ab into swiftlang:master Jul 17, 2018
@ahoppen ahoppen deleted the 002-sytnax-tree-based-coloring branch July 19, 2018 16:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants