Fix performance regressions introduced by my recent changes #1257
Conversation
…d value for keywords

If we model `RawTokenKind` as an enum with associated values, equality comparisons call into `__derived_enum_equals`, which is less efficient than comparing the base and keyword of the `RawTokenKind` separately. I believe this is because `__derived_enum_equals` needs to handle the case where multiple cases have associated values, whereas we can special-case a single associated value.
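The split representation the commit describes can be sketched like this (a minimal sketch; `TokenKindWithPayload`, `SplitTokenKind`, and `Base` are illustrative names, not the actual swift-syntax declarations):

```swift
// Representation 1: associated value. `==` goes through the
// compiler-synthesized `__derived_enum_equals`.
enum TokenKindWithPayload: Equatable {
  case identifier
  case keyword(String)
}

// Representation 2: base and keyword stored separately, so `==` can
// compare the cheap discriminator first and only touch the string
// payload when both sides are actually keywords.
struct SplitTokenKind: Equatable {
  enum Base { case identifier, keyword }
  var base: Base
  var keyword: String?  // only meaningful when base == .keyword

  static func == (lhs: Self, rhs: Self) -> Bool {
    guard lhs.base == rhs.base else { return false }
    guard lhs.base == .keyword else { return true }
    return lhs.keyword == rhs.keyword
  }
}
```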
If we inline the matching calls, the compiler can constant evaluate the `self.rawTokenKind.base == .keyword` condition in `matches`, allowing it to avoid comparing string contents if we are not expecting a keyword.
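The inlining point can be sketched as follows (assumed names — `RawToken`, `matches` — not the real swift-syntax API): once the method is inlined at a call site where the expected kind is a compile-time constant, the `.keyword` check folds away and non-keyword call sites never compare string contents.

```swift
enum RawTokenBase { case identifier, keyword }

struct RawToken {
  var base: RawTokenBase
  var text: String

  @inline(__always)
  func matches(_ expected: RawTokenBase, keyword: String? = nil) -> Bool {
    guard base == expected else { return false }
    // When `expected` is a constant other than `.keyword` at an inlined
    // call site, this condition is statically false and the string
    // comparison below becomes dead code.
    guard expected == .keyword, let keyword = keyword else { return true }
    return text == keyword
  }
}
```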
The ARC traffic to keep track of the state stack was significant. Since most of the time we are in the `normal` state and the stack is effectively empty, we can store any large-ish stack in a bump allocator which outlives the cursor, avoiding ARC traffic.
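The idea can be sketched like this (illustrative only: swift-syntax has its own `BumpPtrAllocator`, and the real `State` and cursor types differ; this simplified allocator hands out a fresh allocation per request, whereas a real bump allocator carves them out of larger slabs). Because the stack lives in allocator-owned memory and `State` is trivial, copying or mutating the cursor never touches reference counts:

```swift
// Frees everything at once when destroyed; outlives every cursor copy.
final class StateAllocator {
  private var allocations: [UnsafeMutableRawPointer] = []

  func allocate<T>(_ type: T.Type, count: Int) -> UnsafeMutableBufferPointer<T> {
    let raw = UnsafeMutableRawPointer.allocate(
      byteCount: MemoryLayout<T>.stride * count,
      alignment: MemoryLayout<T>.alignment
    )
    allocations.append(raw)
    let typed = raw.bindMemory(to: T.self, capacity: count)
    return UnsafeMutableBufferPointer(start: typed, count: count)
  }

  deinit {
    for allocation in allocations { allocation.deallocate() }
  }
}

// Trivial payload: no retain/release when stored in raw memory.
enum State { case normal, stringInterpolation }

struct Cursor {
  var topState: State = .normal
  var stateStack: UnsafeBufferPointer<State>? = nil

  mutating func push(_ newState: State, allocator: StateAllocator) {
    if let stateStack = stateStack {
      // Copy the old stack into a larger buffer and append the old top.
      let grown = allocator.allocate(State.self, count: stateStack.count + 1)
      _ = grown.initialize(from: stateStack)
      grown[stateStack.count] = topState
      self.stateStack = UnsafeBufferPointer(grown)
    } else {
      let grown = allocator.allocate(State.self, count: 1)
      grown[0] = topState
      self.stateStack = UnsafeBufferPointer(grown)
    }
    topState = newState
  }
}
```

In the common case the cursor stays in `.normal` with a `nil` stack, so no allocation happens at all.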
If you call a non-mutating method from a mutating method, for some reason that results in a retain of `self`. Marking all methods (most importantly the `at` methods) as `mutating` avoids that ARC traffic.
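The pattern looks roughly like this (a sketch with made-up names; the real methods live on the lexer's cursor). A non-mutating callee receives its own copy of `self`, which can retain the reference-counted `buffer`; marking the callee `mutating` lets it borrow `self` in place instead:

```swift
struct ByteCursor {
  var buffer: [UInt8]
  var offset: Int = 0

  // `mutating` even though it only reads, purely to avoid the
  // retain/release of `buffer` on every call from `advance()`.
  mutating func peek(at delta: Int) -> UInt8? {
    let index = offset + delta
    return index < buffer.count ? buffer[index] : nil
  }

  mutating func advance() -> UInt8? {
    guard let byte = peek(at: 0) else { return nil }
    offset += 1
    return byte
  }
}
```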
This avoids some more ARC traffic.
@swift-ci Please test
case .push(newState: let newState):
  if let topState = topState {
    if let stateStack = stateStack {
      let newStateStack = stateAllocator.allocate(State.self, count: stateStack.count + 1)
It's a little unfortunate that we don't reuse a previously-allocated buffer. For example, a `push`, `push`, `pop`, `push` sequence should reuse the buffer allocated at the second `push` instead of allocating a whole new buffer.
Yes, but how do you know that you have already seen this exact stack? We could try to create a dictionary for it, but given that the bump allocator doesn’t allocate more than 250 bytes for any file in MovieSwiftUI, I’m not sure there’s much to be gained here.
I am convinced that we can't reuse the memory region, since another (copied) cursor might still have a reference to it.
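A small sketch of why reuse would be unsafe (types simplified to `Int`, names illustrative): the cursor is a value type that gets copied, e.g. for lookahead, and both copies alias the same allocator-owned buffer, so "popping" in one copy must not hand the buffer back for reuse while the other copy can still read it.

```swift
struct LookaheadCursor {
  var stateStack: UnsafeBufferPointer<Int>?
}

func copiedCursorStillSeesBuffer() -> Bool {
  let memory = UnsafeMutableBufferPointer<Int>.allocate(capacity: 1)
  defer { memory.deallocate() }
  memory[0] = 42

  var cursor = LookaheadCursor(stateStack: UnsafeBufferPointer(memory))
  let lookahead = cursor   // value copy; shares the underlying buffer
  cursor.stateStack = nil  // "pop" in the original cursor
  // `lookahead` still references the memory; recycling it at this
  // point would leave the copy with a dangling pointer.
  return lookahead.stateStack?[0] == 42
}
```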
This improves parsing performance by ~2.5x. Parsing is still about 1.5x slower compared to before #1173 but given that I introduced quite a bit of necessary state and the fact that we still parse all files in MovieSwiftUI in less than 15ms I hope that’s acceptable. If we want to improve performance further, we just need to do some more investigation.
It’s probably best to review this commit by commit. All commits are standalone and have a description of what they do. I can split any commit that’s controversial to its own PR.