[libSyntax] Don't cache token nodes #36352
Merged
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
It turns out that caching is actually more expensive than just creating new nodes. I assume this is because memory allocation has become fast through the bump allocator and the cache lookup requires computing a hash value which is rather expensive.
In terms of memory, I measured the memory used by the
SyntaxArena
when parsing Alamofire, which increases from 1.07MB to 1.24MB (15% increase), which should be negligible compared to the >100MB overall memory footprint ofswift-frontend
. I haven’t measured again, but the performance increase when creating a libSyntax tree was significant (something like 15-40%, I can measure again if you’d like more accurate numbers).Also, out of curiosity, I measured how many cache hits we were actually seeing and 75% of all tokens could be served from the cache. This tells me that the majority of the
SyntaxArena
's memory is used by layout nodes.