Skip to content

[libSyntax] Don't create dedicated deferred nodes in SyntaxTreeCreator #36364

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Mar 10, 2021

Conversation

ahoppen
Copy link
Member

@ahoppen ahoppen commented Mar 9, 2021

We have finally reached our goal of optimising deferred node creation for SyntaxTreeCreator. Instead of creating dedicated deferred nodes and copying the data into a RawSyntax node when recording, we always create RawSyntax nodes. Recording a deferred node is thus a no-op, since we have already created a RawSyntax node. Should a deferred node not be recorded, it stays alive in the SyntaxArena without any reference to it. While this means, we are leaking some memory for such nodes.

Measured increased memory use

While compiling Alamofire, I took the following measurements:

  • 77.628 total nodes recorded
  • 6.009 total deferred nodes created (7.7% of recorded nodes)
  • 630 of the deferred nodes were later recorded (10.5% of the deferred nodes)

If we don’t clean up the deferred nodes, this will result in 5.379 more nodes being present in the SyntaxArena. If we (for simplicity) assume that all nodes have the same size, this will result in a 7% increase of memory usage in the SyntaxArena. Putting this into context with the 1.24MB memory footprint of the SyntaxArena measured in #36352, we get a 86KB increase in memory usage, which I think is pretty negligible compared to the >100MB overall memory footprint of swift-frontend.

Benefits

The reason why I put so much energy in optimizing the creation of deferred nodes, is because during the migration to libSyntax-based parsing, we will be enabling the SyntaxParsingContext for any migrated syntax element in deferred mode and will be discarding all the deferred nodes. Still, in order to run ASTGen, we create “fake” recorded nodes, which converts the deferred nodes into recorded nodes on the side.

I don’t have exact number at hand on how much this change actually speeds up parsing in the migrated state, but it was pretty significant (10% - 30% IIRC, can measure exactly if requested), because we essentially avoid creating each RawSyntax node twice (once as deferred and once as recorded).

@ahoppen ahoppen changed the title Pr/deferred rawsyntax nodes [libSyntax] Don't create dedicated deferred nodes in SyntaxTreeCreator Mar 9, 2021
@ahoppen ahoppen force-pushed the pr/deferred-rawsyntax-nodes branch 2 times, most recently from ada1e26 to 7d755d2 Compare March 9, 2021 22:04
@ahoppen ahoppen marked this pull request as ready for review March 9, 2021 22:04
@ahoppen
Copy link
Member Author

ahoppen commented Mar 9, 2021

@swift-ci Please smoke test

@ahoppen ahoppen requested a review from rintaro March 9, 2021 22:05
Comment on lines 192 to 194
auto childrenStore = ScratchAlloc.Allocate<OpaqueSyntaxNode>(children.size());
MutableArrayRef<OpaqueSyntaxNode> opaqueChildren =
llvm::makeMutableArrayRef(childrenStore, children.size());
Copy link
Member

@rintaro rintaro Mar 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to allocate a new memory region for each deferred layout creation.
SmallVector<OpaqueSyntaxNode, 0> doesn't work?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It’s one of these places where I didn't put too much energy in and just kept the memory allocation scheme that’s currently used because I’ve got a further optimization up my sleeve. In this case, I’ll reuse the memory of children to store the OpaqueNodes because OpaqueNodes are always guaranteed to be smaller than RecordedOrDeferredNodes.

Copy link
Member

@rintaro rintaro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall look good.

As I mentioned in #36235 (review)
some backtracking scope are never canceled. (e.g. isValidTrailingClosure(), isStartOfSwiftDecl())
In such backtracking scopes, creating deferred tokens are just a waste. When this PR lands, they just leak. Could you consider disabling SyntaxParsingContext in such backtracking scope?

// be required to record the deferred nodes.
// Should a deferred node not be recorded, its data stays alive in the
// SyntaxArena. This causes a small memory leak but since most nodes are
// being recorded, it is acceptable.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a TODO comments about creating deferred nodes in a different arena

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I decided against creating deferred nodes in a different arena, because I assume that creating a new arena is rather expensive (heap allocation of the arena itself plus another heap allocation for its bump allocator) and because I think the leakage is pretty small (see my description of the PR). But disabling deferred node creation in the backtracking scopes that are never cancelled sounds like a good idea.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After offline discussion, we decided to merge this as it and I’ll set up a follow-up PR that disables the SyntaxParsingContext when it’s not needed.

We have finally reached our goal of optimising deferred node creation
for SyntaxTreeCreator. Instead of creating dedicated deferred nodes and
copying the data into a RawSyntax node when recording, we always create
RawSyntax nodes. Recording a deferred node is thus a no-op, since we
have already created a RawSyntax node. Should a deferred node not be
recorded, it stays alive in the SyntaxArena without any reference to it.
While this means, we are leaking some memory for such nodes, most nodes
do get recorded, so the overhead should be fine compared to the
performance benefit.
@ahoppen ahoppen force-pushed the pr/deferred-rawsyntax-nodes branch from 7d755d2 to a47bd70 Compare March 10, 2021 07:49
@ahoppen
Copy link
Member Author

ahoppen commented Mar 10, 2021

@swift-ci Please smoke test

@ahoppen ahoppen merged commit 0d913e9 into swiftlang:main Mar 10, 2021
@ahoppen ahoppen deleted the pr/deferred-rawsyntax-nodes branch March 10, 2021 14:40
@@ -48,6 +48,8 @@ class SyntaxTreeCreator: public SyntaxParseActions {
/// tree.
SyntaxParsingCache *SyntaxCache;

llvm::BumpPtrAllocator ScratchAlloc;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unused.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops. I’ll delete it in the next PR. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants