[libSyntax] Incremental Syntax Parsing #16340

Merged
merged 45 commits into from
May 22, 2018
45 commits
b904194
[libSyntax] Add support for parsing #sourceLocation directives
ahoppen May 9, 2018
d832a38
[libSyntax] Documentation improvements
ahoppen Apr 30, 2018
f369d5d
[libSyntax] Add static_asserts for the size of RawSyntaxBits
ahoppen Apr 30, 2018
0eb16d6
[libSyntax] Update space needed for raw syntax bits
ahoppen Apr 30, 2018
f8cd1ca
[libSyntax] Compute the text length of every node on the fly
ahoppen Apr 30, 2018
ea5e4e2
Wrap the path to -sdk in quotes for %target-swift-frontend
ahoppen Apr 30, 2018
de9737c
[incrParse] Support incremental parsing for edited files
ahoppen May 3, 2018
60d11d2
[incrParse] Reparse a node if the next leaf node has been modified
ahoppen May 2, 2018
db52819
[swift-syntax-test] Fix formatting of command-line-argument descriptions
ahoppen May 2, 2018
186feb6
[incrParse] Allow information about node reused be outputted
ahoppen May 2, 2018
65ac4f5
[incrParse] Refactor node reusability into separate function
ahoppen May 3, 2018
3382fac
[incrParse] Allow line:column notation to specify edits
ahoppen May 3, 2018
92f8f34
[incrParse] Store reused regions and output them after parsing
ahoppen May 3, 2018
8998b27
[incrParse] Add coloured output indicating which code got reused
ahoppen May 3, 2018
d9fd523
[swift-syntax-test] Refactor to allow incremental parsing on all actions
ahoppen May 4, 2018
8c9e2e0
[incParse] Make the SytnaxParsingCache operate on the leading trivia'…
ahoppen May 4, 2018
a1ff223
[incrParse] Add utility to test incremental parsing
ahoppen May 4, 2018
bc5e4d7
[incrParse] Compute byte offsets of pre-edit file based on that file
ahoppen May 4, 2018
ec4a527
[incrParse] Reparse nodes if the next node's trailing trivia has changed
ahoppen May 7, 2018
ef69e41
[incrParse] Add some more passing tests
ahoppen May 7, 2018
6135f10
[incrParse] Outdated documentation fixes
ahoppen May 7, 2018
4da37b1
[incrParse] Add option to force coloured output
ahoppen May 7, 2018
723e2be
[incrParse] Add option to the test utility to print visual reuse info
ahoppen May 7, 2018
8044907
[incrParse] Test utility: Put swift-syntax-test args in quotes if needed
ahoppen May 7, 2018
d026b2d
[incrParse] Add verification of reparsed regions to swift-syntax-test
ahoppen May 7, 2018
6b3ac23
[incrParse] Allow testing of multiple edits per line
ahoppen May 7, 2018
9bdc54a
[incrParse] Add support for test cases that verify the reused regions
ahoppen May 8, 2018
c31a880
[incrParse] Allow whitespaces to be reparsed in test
ahoppen May 8, 2018
9d3233c
[incrParse] Allow reuse of MemberDeclListItems
ahoppen May 8, 2018
e1a99ef
[incrParse] Fix parsing of nodes covering no source text
ahoppen May 8, 2018
17c14a0
[incrParse] Add some more tests
ahoppen May 8, 2018
9a3ff5b
[libSyntax] Add a debug dump function to SyntaxParsingContext
ahoppen May 9, 2018
2c02b1e
[incrParse] Fix lexer offset issue when missing tokens get synthesized
ahoppen May 9, 2018
2fbb875
[incrParse] Minor improvements to the test utility
ahoppen May 9, 2018
a137e0d
[libSyntax] Omit unknown nodes if they do not have any children
ahoppen May 9, 2018
b26dd11
[incrParse] Fix swift-syntax-test not complaining about unexpected re…
ahoppen May 9, 2018
082086c
[libSyntax] Fix parsing of StringLiterals with invalid interpolation …
ahoppen May 9, 2018
b2ebc96
[incrParse] Reparse a node if the next leaf node has been modified
ahoppen May 10, 2018
1b3baf9
[incrParse] Add error nodes to the end of a CodeBlockItem
ahoppen May 10, 2018
15b2bae
[libSyntax] Improve syntax related dump functions
ahoppen May 10, 2018
c733b5c
[incrParse] Aesthetic improvements to the test utility
ahoppen May 10, 2018
bc52823
[libSyntax] Adjust tests for improved syntax parsing behaviour
ahoppen May 21, 2018
a791a3d
[libSyntax] Fix SwiftSyntax CodeBlockItemList test
ahoppen May 21, 2018
2decf8f
[libSyntax] Rename recordReuseInformation to setRecordReuseInformation
ahoppen May 21, 2018
4e44e68
[libSyntax] Store shared SyntaxParsingContext data in RootContextData
ahoppen May 21, 2018
5 changes: 5 additions & 0 deletions include/swift/AST/Module.h
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,7 @@
#include "swift/Basic/OptionSet.h"
#include "swift/Basic/STLExtras.h"
#include "swift/Basic/SourceLoc.h"
#include "swift/Parse/SyntaxParsingCache.h"
#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/DenseSet.h"
#include "llvm/ADT/STLExtras.h"
Expand Down Expand Up @@ -838,6 +839,10 @@ class SourceFile final : public FileUnit {
/// The list of top-level declarations in the source file.
std::vector<Decl*> Decls;

/// A cache of syntax nodes that can be reused when creating the syntax tree
/// for this file.
SyntaxParsingCache *SyntaxParsingCache = nullptr;

/// The list of local type declarations in the source file.
llvm::SetVector<TypeDecl *> LocalTypeDecls;

Expand Down
4 changes: 3 additions & 1 deletion include/swift/Basic/LangOptions.h
Original file line number Diff line number Diff line change
Expand Up @@ -261,7 +261,9 @@ namespace swift {
/// Whether to collect tokens during parsing for syntax coloring.
bool CollectParsedToken = false;

/// Whether to parse syntax tree.
/// Whether to parse syntax tree. If the syntax tree is built, the generated
/// AST may not be correct when syntax nodes are reused as part of
/// incremental parsing.
bool BuildSyntaxTree = false;

/// Whether to verify the parsed syntax tree and emit related diagnostics.
Expand Down
3 changes: 2 additions & 1 deletion include/swift/Basic/SourceLoc.h
Original file line number Diff line number Diff line change
Expand Up @@ -68,7 +68,8 @@ class SourceLoc {
void print(raw_ostream &OS, const SourceManager &SM,
unsigned &LastBufferID) const;

void printLineAndColumn(raw_ostream &OS, const SourceManager &SM) const;
void printLineAndColumn(raw_ostream &OS, const SourceManager &SM,
unsigned BufferID = 0) const;

void print(raw_ostream &OS, const SourceManager &SM) const {
unsigned Tmp = ~0U;
Expand Down
12 changes: 12 additions & 0 deletions include/swift/Frontend/Frontend.h
Original file line number Diff line number Diff line change
Expand Up @@ -34,6 +34,7 @@
#include "swift/Migrator/MigratorOptions.h"
#include "swift/Parse/CodeCompletionCallbacks.h"
#include "swift/Parse/Parser.h"
#include "swift/Parse/SyntaxParsingCache.h"
#include "swift/Sema/SourceLoader.h"
#include "swift/Serialization/Validation.h"
#include "swift/Subsystems.h"
Expand Down Expand Up @@ -68,6 +69,9 @@ class CompilerInvocation {
MigratorOptions MigratorOpts;
SILOptions SILOpts;
IRGenOptions IRGenOpts;
/// The \c SyntaxParsingCache to use when parsing the main file of this
/// invocation
SyntaxParsingCache *MainFileSyntaxParsingCache = nullptr;

llvm::MemoryBuffer *CodeCompletionBuffer = nullptr;

Expand Down Expand Up @@ -217,6 +221,14 @@ class CompilerInvocation {
IRGenOptions &getIRGenOptions() { return IRGenOpts; }
const IRGenOptions &getIRGenOptions() const { return IRGenOpts; }

void setMainFileSyntaxParsingCache(SyntaxParsingCache *Cache) {
MainFileSyntaxParsingCache = Cache;
}

SyntaxParsingCache *getMainFileSyntaxParsingCache() const {
return MainFileSyntaxParsingCache;
}

void setParseStdlib() {
FrontendOpts.ParseStdlib = true;
}
Expand Down
9 changes: 9 additions & 0 deletions include/swift/Parse/Lexer.h
Original file line number Diff line number Diff line change
Expand Up @@ -192,6 +192,15 @@ class Lexer {
lex(Result, LeadingTrivia, TrailingTrivia);
}

/// Reset the lexer's buffer pointer to \p Offset bytes after the buffer
/// start.
void resetToOffset(size_t Offset) {
assert(BufferStart + Offset <= BufferEnd && "Offset after buffer end");

CurPtr = BufferStart + Offset;
lexImpl();
}

bool isKeepingComments() const {
return RetainComments == CommentRetentionMode::ReturnAsTokens;
}
Expand Down
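To illustrate why `resetToOffset` calls `lexImpl()` after moving `CurPtr`, here is a minimal sketch with a toy lexer. `ToyLexer`, its whitespace-only tokenisation, and the `NextToken` member are assumptions made for illustration; only the body of `resetToOffset` mirrors the diff above. The sketch assumes the lexer keeps one token primed ahead of time, so moving the pointer alone would leave a stale token behind.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical stand-in for swift::Lexer. Only the pointer bookkeeping in
// resetToOffset follows the diff; everything else is simplified.
class ToyLexer {
  const char *BufferStart;
  const char *BufferEnd;
  const char *CurPtr;
  std::string NextToken; // token that the next call to lex() will return

  // Scan one whitespace-delimited token starting at CurPtr and store it.
  void lexImpl() {
    while (CurPtr != BufferEnd &&
           std::isspace(static_cast<unsigned char>(*CurPtr)))
      ++CurPtr;
    const char *TokStart = CurPtr;
    while (CurPtr != BufferEnd &&
           !std::isspace(static_cast<unsigned char>(*CurPtr)))
      ++CurPtr;
    NextToken.assign(TokStart, CurPtr);
  }

public:
  // The buffer must outlive the lexer, since only pointers are stored.
  explicit ToyLexer(const std::string &Buffer)
      : BufferStart(Buffer.data()), BufferEnd(Buffer.data() + Buffer.size()),
        CurPtr(BufferStart) {
    lexImpl();
  }

  // Return the primed token and prime the following one.
  std::string lex() {
    std::string Result = NextToken;
    lexImpl();
    return Result;
  }

  // Same shape as Lexer::resetToOffset in the diff: move the pointer, then
  // re-prime the next token so that it reflects the new position.
  void resetToOffset(std::size_t Offset) {
    assert(BufferStart + Offset <= BufferEnd && "Offset after buffer end");
    CurPtr = BufferStart + Offset;
    lexImpl();
  }
};
```

With the source `"let x = 1"`, lexing two tokens and then calling `resetToOffset(8)` makes the next `lex()` return the token that starts at byte 8, skipping `=` entirely.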
7 changes: 7 additions & 0 deletions include/swift/Parse/Parser.h
Original file line number Diff line number Diff line change
Expand Up @@ -521,6 +521,13 @@ class Parser {
/// \brief Skip until the next '#else', '#endif' or until eof.
void skipUntilConditionalBlockClose();

/// If the parser is generating only a syntax tree, try loading the current
/// node from a previously generated syntax tree.
/// Returns \c true if the node has been loaded and inserted into the current
/// syntax tree. In this case the parser should behave as if the node has
/// successfully been created.
bool loadCurrentSyntaxNodeFromCache();

/// Parse an #endif.
bool parseEndIfDirective(SourceLoc &Loc);

Expand Down
102 changes: 102 additions & 0 deletions include/swift/Parse/SyntaxParsingCache.h
Original file line number Diff line number Diff line change
@@ -0,0 +1,102 @@
//===----------- SyntaxParsingCache.h -================----------*- C++ -*-===//
//
// This source file is part of the Swift.org open source project
//
// Copyright (c) 2014 - 2018 Apple Inc. and the Swift project authors
// Licensed under Apache License v2.0 with Runtime Library Exception
//
// See https://swift.org/LICENSE.txt for license information
// See https://swift.org/CONTRIBUTORS.txt for the list of Swift project authors
//
//===----------------------------------------------------------------------===//

#ifndef SWIFT_PARSE_SYNTAXPARSINGCACHE_H
#define SWIFT_PARSE_SYNTAXPARSINGCACHE_H

#include "swift/Syntax/SyntaxNodes.h"
#include "llvm/Support/FileSystem.h"
#include "llvm/Support/raw_ostream.h"

namespace {

/// A single edit to the original source file in which a contiguous range of
/// characters has been replaced by a new string.
struct SourceEdit {
/// The byte offset at which the replaced range starts.
size_t Start;

/// The byte offset at which the replaced range ends (exclusive).
size_t End;

/// The length of the string that replaced the range described above.
size_t ReplacementLength;

/// The length of the range that has been replaced
size_t originalLength() { return End - Start; }

/// Check if the characters replaced by this edit fall into the given range
/// or are directly adjacent to it
bool intersectsOrTouchesRange(size_t RangeStart, size_t RangeEnd) {
return !(End <= RangeStart || Start >= RangeEnd);
}
};

} // anonymous namespace

namespace swift {

using namespace swift::syntax;

class SyntaxParsingCache {
/// The syntax tree prior to the edit
SourceFileSyntax OldSyntaxTree;

/// The edits that were made from the source file that created this cache to
/// the source file that is now parsed incrementally
llvm::SmallVector<SourceEdit, 4> Edits;

/// Whether or not information about reused nodes shall be recorded in
/// \c ReusedRanges
bool RecordReuseInformation = false;

/// If \c RecordReuseInformation is set, the buffer offsets of ranges that
/// have been successfully looked up in this cache are stored here.
std::vector<std::pair<unsigned, unsigned>> ReusedRanges;

public:
SyntaxParsingCache(SourceFileSyntax OldSyntaxTree)
: OldSyntaxTree(OldSyntaxTree) {}

/// Add an edit that transformed the source file which created this cache into
/// the source file that is now being parsed incrementally. The order in which
/// the edits are added using this method needs to be the same order in which
/// the edits were applied to the source file.
void addEdit(size_t Start, size_t End, size_t ReplacementLength) {
Edits.push_back({Start, End, ReplacementLength});
}

/// Check if a syntax node of the given kind at the given position can be
/// reused for a new syntax tree.
llvm::Optional<Syntax> lookUp(size_t NewPosition, SyntaxKind Kind);

/// Turn recording of reused ranges on
void setRecordReuseInformation() { RecordReuseInformation = true; }

/// Return the ranges of the new source file that have been successfully
/// looked up in this cache as a (start, end) pair of byte offsets in the
/// post-edit file.
std::vector<std::pair<unsigned, unsigned>> getReusedRanges() const {
return ReusedRanges;
}

private:
llvm::Optional<Syntax> lookUpFrom(const Syntax &Node, size_t Position,
SyntaxKind Kind);

bool nodeCanBeReused(const Syntax &Node, size_t Position,
SyntaxKind Kind) const;
};

} // namespace swift

#endif // SWIFT_PARSE_SYNTAXPARSINGCACHE_H
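The following sketch exercises the `SourceEdit` struct from the header above in isolation. The struct is copied from the diff (with `const` added to its methods); `translateToPreEdit` is a hypothetical helper that is not part of this PR and only shows, for a single edit, how a byte offset in the post-edit file could be mapped back to the pre-edit file. One thing the sketch makes visible: as written, `intersectsOrTouchesRange` returns `false` when the edit merely touches the range, even though its doc comment mentions adjacency. Whether that is intended is not clear from the diff.

```cpp
#include <cassert>
#include <cstddef>

// Copied from the diff, with const qualifiers added.
struct SourceEdit {
  std::size_t Start;             // start of the replaced range (pre-edit)
  std::size_t End;               // end of the replaced range (pre-edit)
  std::size_t ReplacementLength; // length of the inserted text

  std::size_t originalLength() const { return End - Start; }

  // Predicate exactly as written in the diff.
  bool intersectsOrTouchesRange(std::size_t RangeStart,
                                std::size_t RangeEnd) const {
    return !(End <= RangeStart || Start >= RangeEnd);
  }
};

// Hypothetical helper (an assumption, not in the PR): translate a post-edit
// byte offset into a pre-edit byte offset for one edit. Returns false if the
// position lies inside the replacement text and so has no pre-edit
// counterpart.
bool translateToPreEdit(const SourceEdit &Edit, std::size_t PostEditPos,
                        std::size_t &PreEditPos) {
  // Positions before the edit are unaffected.
  if (PostEditPos < Edit.Start) {
    PreEditPos = PostEditPos;
    return true;
  }
  // Positions at or after the end of the replacement shift by the
  // difference between the original and the replacement length.
  std::size_t ReplacementEnd = Edit.Start + Edit.ReplacementLength;
  if (PostEditPos >= ReplacementEnd) {
    PreEditPos = PostEditPos - Edit.ReplacementLength + Edit.originalLength();
    return true;
  }
  return false;
}
```

For an edit `{4, 7, 5}` (bytes 4 to 7 replaced by 5 bytes), post-edit offset 9 maps to pre-edit offset 7, and the range `[7, 10)` is reported as not intersecting because the edit only touches it.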
62 changes: 50 additions & 12 deletions include/swift/Parse/SyntaxParsingContext.h
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,7 @@

namespace swift {
class SourceFile;
class SyntaxParsingCache;
class Token;
class DiagnosticEngine;

Expand Down Expand Up @@ -74,9 +75,17 @@ class alignas(1 << SyntaxAlignInBits) SyntaxParsingContext {
// Storage for Collected parts.
std::vector<RC<RawSyntax>> Storage;

SyntaxArena &Arena;

/// A cache of nodes that can be reused when creating the current syntax
/// tree
SyntaxParsingCache *SyntaxCache = nullptr;

RootContextData(SourceFile &SF, DiagnosticEngine &Diags,
SourceManager &SourceMgr, unsigned BufferID)
: SF(SF), Diags(Diags), SourceMgr(SourceMgr), BufferID(BufferID) {}
SourceManager &SourceMgr, unsigned BufferID,
SyntaxArena &Arena, SyntaxParsingCache *SyntaxCache)
: SF(SF), Diags(Diags), SourceMgr(SourceMgr), BufferID(BufferID),
Arena(Arena), SyntaxCache(SyntaxCache) {}
};

private:
Expand All @@ -97,6 +106,9 @@ class alignas(1 << SyntaxAlignInBits) SyntaxParsingContext {
// Discard all parts in the context.
Discard,

// The node has been loaded from the cache and all parts shall be discarded.
LoadedFromCache,

// Construct SourceFile syntax to the specified SF.
Root,

Expand All @@ -112,9 +124,7 @@ class alignas(1 << SyntaxAlignInBits) SyntaxParsingContext {
// Reference to the
SyntaxParsingContext *&CtxtHolder;

SyntaxArena &Arena;

std::vector<RC<RawSyntax>> &Storage;
RootContextData *RootData;

// Offset for 'Storage' this context owns from.
const size_t Offset;
Expand All @@ -138,7 +148,7 @@ class alignas(1 << SyntaxAlignInBits) SyntaxParsingContext {
void createNodeInPlace(SyntaxKind Kind, size_t N);

ArrayRef<RC<RawSyntax>> getParts() const {
return makeArrayRef(Storage).drop_front(Offset);
return makeArrayRef(getStorage()).drop_front(Offset);
}

RC<RawSyntax> makeUnknownSyntax(SyntaxKind Kind,
Expand All @@ -154,11 +164,12 @@ class alignas(1 << SyntaxAlignInBits) SyntaxParsingContext {
/// Designated constructor for child context.
SyntaxParsingContext(SyntaxParsingContext *&CtxtHolder)
: RootDataOrParent(CtxtHolder), CtxtHolder(CtxtHolder),
Arena(CtxtHolder->Arena),
Storage(CtxtHolder->Storage), Offset(Storage.size()),
RootData(CtxtHolder->RootData), Offset(RootData->Storage.size()),
Enabled(CtxtHolder->isEnabled()) {
assert(CtxtHolder->isTopOfContextStack() &&
"SyntaxParsingContext cannot have multiple children");
assert(CtxtHolder->Mode != AccumulationMode::LoadedFromCache &&
"Cannot create child context for a node loaded from the cache");
CtxtHolder = this;
}
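The constructor change above replaces the per-context `Arena` and `Storage` references with a single `RootData` pointer that every context in the stack shares. A minimal sketch of that layout follows; `ToyRootData` and `ToyContext` are hypothetical simplifications in which strings stand in for `RC<RawSyntax>` and the destructor only pops the context, without building any node.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical simplification of RootContextData.
struct ToyRootData {
  std::vector<std::string> Storage; // parts collected by all contexts
};

class ToyContext {
  ToyContext *&CtxtHolder;  // refers to the innermost active context
  ToyContext *Parent;       // nullptr for the root context
  ToyRootData *RootData;    // shared by every context in the stack
  const std::size_t Offset; // first Storage index owned by this context

public:
  // Root context: owns everything pushed from now on.
  ToyContext(ToyContext *&Holder, ToyRootData &Root)
      : CtxtHolder(Holder), Parent(nullptr), RootData(&Root),
        Offset(Root.Storage.size()) {
    CtxtHolder = this;
  }

  // Child context: copies the parent's RootData pointer and records the
  // current storage size as its own starting offset.
  explicit ToyContext(ToyContext *&Holder)
      : CtxtHolder(Holder), Parent(Holder), RootData(Holder->RootData),
        Offset(RootData->Storage.size()) {
    CtxtHolder = this;
  }

  // Popping a context makes its parent the innermost context again.
  ~ToyContext() { CtxtHolder = Parent; }

  void addPart(std::string Part) {
    RootData->Storage.push_back(std::move(Part));
  }

  // Equivalent in spirit to makeArrayRef(getStorage()).drop_front(Offset).
  std::vector<std::string> getParts() const {
    return std::vector<std::string>(RootData->Storage.begin() + Offset,
                                    RootData->Storage.end());
  }
};
```

In this sketch a child context sees only the parts pushed after it was created, while the root context sees all of them, because both index into the same vector with different offsets.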

Expand All @@ -174,20 +185,41 @@ class alignas(1 << SyntaxAlignInBits) SyntaxParsingContext {

~SyntaxParsingContext();

/// Try loading the current node from the \c SyntaxParsingCache by looking up
/// if an unmodified node exists at \p LexerOffset of the same kind. If a node
/// is found, replace the node that is currently being constructed by the
/// parsing context with the node from the cache and return the number of
/// bytes the loaded node took up in the original source. The lexer should
/// pretend it has read these bytes and continue from the advanced offset.
/// If nothing is found \c 0 is returned.
size_t loadFromCache(size_t LexerOffset);

void disable() { Enabled = false; }
bool isEnabled() const { return Enabled; }
bool isRoot() const { return RootDataOrParent.is<RootContextData*>(); }
bool isTopOfContextStack() const { return this == CtxtHolder; }

SyntaxParsingContext *getParent() {
SyntaxParsingContext *getParent() const {
return RootDataOrParent.get<SyntaxParsingContext*>();
}

RootContextData &getRootData() {
return *getRoot()->RootDataOrParent.get<RootContextData*>();
RootContextData *getRootData() { return RootData; }

const RootContextData *getRootData() const { return RootData; }

std::vector<RC<RawSyntax>> &getStorage() { return getRootData()->Storage; }

const std::vector<RC<RawSyntax>> &getStorage() const {
return getRootData()->Storage;
}

SyntaxParsingCache *getSyntaxParsingCache() const {
return getRootData()->SyntaxCache;
}

SyntaxParsingContext *getRoot();
SyntaxArena &getArena() const { return getRootData()->Arena; }

const SyntaxParsingContext *getRoot() const;

/// Add RawSyntax to the parts.
void addRawSyntax(RC<RawSyntax> Raw);
Expand All @@ -201,6 +233,7 @@ class alignas(1 << SyntaxAlignInBits) SyntaxParsingContext {

template<typename SyntaxNode>
llvm::Optional<SyntaxNode> popIf() {
auto &Storage = getStorage();
assert(Storage.size() > Offset);
if (auto Node = make<Syntax>(Storage.back()).getAs<SyntaxNode>()) {
Storage.pop_back();
Expand All @@ -210,6 +243,7 @@ class alignas(1 << SyntaxAlignInBits) SyntaxParsingContext {
}

TokenSyntax popToken() {
auto &Storage = getStorage();
assert(Storage.size() > Offset);
assert(Storage.back()->getKind() == SyntaxKind::Token);
auto Node = make<TokenSyntax>(std::move(Storage.back()));
Expand Down Expand Up @@ -263,6 +297,10 @@ class alignas(1 << SyntaxAlignInBits) SyntaxParsingContext {
/// Make a missing node corresponding to the given node kind, and
/// push this node into the context.
void synthesize(SyntaxKind Kind);

/// Dump the nodes that are in the storage stack of the SyntaxParsingContext
LLVM_ATTRIBUTE_DEPRECATED(void dumpStorage() const LLVM_ATTRIBUTE_USED,
"Only meant for use in the debugger");
};

} // namespace swift
Expand Down
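The doc comment on `loadFromCache` describes a contract: return the byte length of the reused node, or `0` if nothing was found, and let the lexer continue from the advanced offset. The sketch below shows one way a caller could act on that contract. `ToyCache`, `ToyKind` and `tryReuse` are hypothetical names; the real `loadFromCache` takes only the lexer offset, and the parser-side implementation is not shown in this part of the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>

// Hypothetical node kinds standing in for SyntaxKind.
enum class ToyKind { FunctionDecl, StructDecl };

// Hypothetical cache: maps (offset, kind) to the byte length of a node that
// is assumed to be unmodified and therefore reusable.
class ToyCache {
  std::map<std::pair<std::size_t, ToyKind>, std::size_t> Nodes;

public:
  void add(std::size_t Offset, ToyKind Kind, std::size_t Length) {
    Nodes[std::make_pair(Offset, Kind)] = Length;
  }

  // Follows the documented contract: the byte length of the reused node, or
  // 0 if no node of this kind is cached at this offset.
  std::size_t loadFromCache(std::size_t LexerOffset, ToyKind Kind) const {
    auto It = Nodes.find(std::make_pair(LexerOffset, Kind));
    return It == Nodes.end() ? 0 : It->second;
  }
};

// Hypothetical parser step: on a cache hit, advance the lexer offset past
// the reused node and report success, so the caller can skip parsing it.
bool tryReuse(const ToyCache &Cache, ToyKind Kind, std::size_t &LexerOffset) {
  std::size_t Length = Cache.loadFromCache(LexerOffset, Kind);
  if (Length == 0)
    return false;
  LexerOffset += Length; // the lexer pretends it has read these bytes
  return true;
}
```

In the real parser the advanced offset would presumably be handed to `Lexer::resetToOffset`; here it is just an integer so that the control flow stays visible.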