[libSyntax] Don't cache token nodes #36352

Merged: 1 commit, Mar 9, 2021

Conversation

@ahoppen (Member) commented Mar 8, 2021

It turns out that caching is actually more expensive than just creating new nodes. I assume this is because memory allocation has become fast through the bump allocator, while the cache lookup requires computing a hash value, which is comparatively expensive.
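The trade-off described above can be illustrated with a minimal sketch. This is not the actual SyntaxArena implementation; `BumpAllocator`, `TokenNode`, and `makeToken` are simplified stand-ins showing why a fresh allocation (a pointer bump plus field initialization) can beat a hash-based cache lookup:

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>
#include <vector>

// Simplified stand-in for a SyntaxArena-style bump allocator:
// allocation is just an aligned pointer bump into a pre-sized slab.
struct BumpAllocator {
  std::vector<char> slab;
  std::size_t offset = 0;
  explicit BumpAllocator(std::size_t capacity) : slab(capacity) {}

  void *allocate(std::size_t size, std::size_t align) {
    std::size_t aligned = (offset + align - 1) & ~(align - 1);
    assert(aligned + size <= slab.size() && "slab exhausted");
    offset = aligned + size;
    return slab.data() + aligned;
  }
};

// Simplified token node; the real node also carries trivia, ranges, etc.
struct TokenNode {
  int kind;
  std::string text;
};

// Creating a fresh node is one pointer bump plus placement-new.
// (Arena caveat: destructors never run; nodes live as long as the arena.)
inline TokenNode *makeToken(BumpAllocator &arena, int kind,
                            const std::string &text) {
  void *mem = arena.allocate(sizeof(TokenNode), alignof(TokenNode));
  return new (mem) TokenNode{kind, text};
}
```

By contrast, a cache keyed on token kind plus text has to hash the text on every lookup, hit or miss, before it can hand back a node.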

In terms of memory, I measured the memory used by the SyntaxArena when parsing Alamofire: it increases from 1.07 MB to 1.24 MB (a 15% increase), which should be negligible compared to the >100 MB overall memory footprint of swift-frontend. I haven’t re-measured, but the performance increase when creating a libSyntax tree was significant (something like 15–40%; I can measure again if you’d like more accurate numbers).

Also, out of curiosity, I measured how many cache hits we were actually seeing: 75% of all tokens could be served from the cache. This tells me that the majority of the SyntaxArena's memory is used by layout nodes.

@ahoppen ahoppen requested review from akyrtzi and rintaro March 8, 2021 15:38
@ahoppen (Member, Author) commented Mar 8, 2021

@swift-ci Please smoke test

@akyrtzi (Contributor) commented Mar 8, 2021

IIRC I came to the same conclusion on the SwiftSyntax side and removed the caching there as well.

At the time I was thinking we could revisit caching by bulk-creating plain token nodes in advance and re-using them process-wide, but that might complicate memory management a bit, so I'm not sure how much it would be worth.

@ahoppen (Member, Author) commented Mar 8, 2021

> I was thinking at the time we could reconsider caching as bulk creating plain token nodes in advance and re-using them process-wide, but perhaps it would complicate memory management a bit so not sure how much it is worth.

As in, create tokens for the common keywords with common trivia (e.g. one trailing space) and have an efficient look-up structure for them?

@akyrtzi (Contributor) commented Mar 8, 2021

> > I was thinking at the time we could reconsider caching as bulk creating plain token nodes in advance and re-using them process-wide, but perhaps it would complicate memory management a bit so not sure how much it is worth.
>
> As in, create tokens for the common keywords with common trivia (e.g. one trailing space) and have an efficient look-up structure for them?

Yes, exactly.
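The idea discussed above could look something like the following sketch. All names here (`StaticToken`, `CommonTokenTable`, and the keyword list) are hypothetical illustrations, not actual libSyntax API; the point is a process-wide table of pre-created tokens for common keywords with their most common trivia (one trailing space), consulted before allocating anything:

```cpp
#include <initializer_list>
#include <string_view>
#include <unordered_map>

// Hypothetical pre-created token: text plus its common trailing trivia.
struct StaticToken {
  std::string_view text;     // e.g. "func"
  std::string_view trailing; // e.g. " " (one trailing space)
};

// Built once per process; lookups never allocate.
class CommonTokenTable {
  std::unordered_map<std::string_view, StaticToken> table;

public:
  CommonTokenTable() {
    // Illustrative subset, not the actual Swift keyword set.
    for (std::string_view kw :
         {"func", "var", "let", "return", "if", "else", "struct", "class"})
      table.emplace(kw, StaticToken{kw, " "});
  }

  // Returns the shared process-wide token, or nullptr if the token is
  // not common and must be created in the arena as usual.
  const StaticToken *lookup(std::string_view text) const {
    auto it = table.find(text);
    return it == table.end() ? nullptr : &it->second;
  }
};
```

As the comment above notes, the complication is memory management: process-wide nodes would have to outlive every arena that references them, which is exactly the part this sketch glosses over.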

Commit message: "It turns out that caching is actually more expensive than just creating new nodes."
@ahoppen force-pushed the pr/dont-cache-tokens branch from c0b2ab5 to 2e5c869 on March 9, 2021 08:55
@ahoppen (Member, Author) commented Mar 9, 2021

@swift-ci Please smoke test

@ahoppen ahoppen merged commit bdacb68 into swiftlang:main Mar 9, 2021
@ahoppen ahoppen deleted the pr/dont-cache-tokens branch March 9, 2021 12:09