
Delay processing fresh SCCs until needed #2110

Merged: 3 commits into python:master on Sep 8, 2016


Michael0x2a
Collaborator

Previously, the process_graph function would always process SCCs, fresh and stale, as they were encountered.

This commit changes that behavior: we now batch and defer processing of fresh SCCs until we encounter a stale SCC. If we never encounter a stale SCC, we throw away the queued fresh SCCs without loading any of their data files.

Since loading the data file of fresh SCCs does take a small amount of time, this change will make mypy slightly faster, particularly in the case where there are very few to no files to recheck, and the corresponding SCCs occur early in the sorted list of SCCs.

As an example, when running mypy against itself with a fully warm cache, the time it takes to finish typechecking is now approximately 0.3 to 0.4 seconds -- previously, it was 1.3 to 1.4 seconds.
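
For readers who want the shape of the change without opening the diff, here is a minimal sketch of the deferral described above. It is not the code from this PR: `process_sccs`, `is_fresh`, `load_fresh`, and `check_stale` are hypothetical names, and each SCC is modelled as a plain list of module names.

```python
from typing import Callable, List, Sequence

SCC = List[str]  # assumption: an SCC is just a list of module names


def process_sccs(
    sccs: Sequence[SCC],
    is_fresh: Callable[[SCC], bool],
    load_fresh: Callable[[SCC], None],
    check_stale: Callable[[SCC], None],
) -> int:
    """Walk SCCs in dependency order, deferring the fresh ones.

    Returns the number of fresh SCCs that were never loaded.
    """
    fresh_queue: List[SCC] = []
    for scc in sccs:
        if is_fresh(scc):
            # Defer: do not touch the cached data file yet.
            fresh_queue.append(scc)
            continue
        # A stale SCC may depend on anything queued so far,
        # so load every queued fresh SCC before checking it.
        for queued in fresh_queue:
            load_fresh(queued)
        fresh_queue.clear()
        check_stale(scc)
    # Whatever is still queued was never needed and is dropped.
    return len(fresh_queue)
```

With a fully warm cache every SCC is fresh, so `load_fresh` is never called and the whole queue is discarded at the end, which is where the speedup reported above would come from.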

@gvanrossum
Member

Nice! Can you add some extra logging at all the new branch points? E.g. when putting an SCC into the queue, and when the queue has to be processed. Also at the end, log the queue size so we know if it made any difference.

This also makes me think that sometimes we can keep things in the queue because the stale thing we've found doesn't depend on them. But that's probably better done as a TODO for now.
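
One possible reading of that idea, as a rough sketch only: when a stale SCC shows up, load just the queued SCCs it transitively depends on and leave the rest queued. The function `split_queue`, the string SCC ids, and the `deps` mapping are all hypothetical and do not correspond to mypy's actual data structures.

```python
from typing import Dict, List, Set, Tuple


def split_queue(
    queue: List[str],
    stale: str,
    deps: Dict[str, Set[str]],
) -> Tuple[List[str], List[str]]:
    """Split queued fresh SCC ids into (to_load, keep_queued).

    `deps` maps an SCC id to the ids it depends on directly
    (an assumed representation, for illustration only).
    """
    needed: Set[str] = set()
    todo = [stale]
    # Collect everything reachable from the stale SCC.
    while todo:
        node = todo.pop()
        for dep in deps.get(node, set()):
            if dep not in needed:
                needed.add(dep)
                todo.append(dep)
    # Preserve the original queue order in both halves.
    to_load = [scc for scc in queue if scc in needed]
    keep_queued = [scc for scc in queue if scc not in needed]
    return to_load, keep_queued
```

For example, with `queue = ["a", "b", "c"]` and a stale SCC `"d"` that depends on `"b"`, which in turn depends on `"a"`, only `"a"` and `"b"` would be loaded and `"c"` would stay queued.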

This commit adds some logging when fresh SCCs are queued and handled
and cleans up some of the logic associated with logging fresh SCCs.
@Michael0x2a
Collaborator Author

Ok, added some logging and a TODO note.

(I tried implementing something yesterday that would load only a subset of the queued SCCs, but I ran into a few bugs and ultimately decided to just go with this version rather than grappling with it more.)

@gvanrossum gvanrossum merged commit ee28565 into python:master Sep 8, 2016
@Michael0x2a Michael0x2a deleted the delay-processing-fresh-sccs branch September 9, 2016 18:58