Delay processing fresh SCCs until needed #2110

Michael0x2a · 2016-09-08T03:24:36Z

Previously, the process_graph function would always process SCCs, fresh and stale, as they were encountered.

This commit changes that: we now batch fresh SCCs and defer processing them until we encounter a stale SCC. If we never encounter a stale SCC, we throw away the queued fresh SCCs without loading any of their data files.

Since loading the data file of fresh SCCs does take a small amount of time, this change will make mypy slightly faster, particularly in the case where there are very few to no files to recheck, and the corresponding SCCs occur early in the sorted list of SCCs.

As an example, when running mypy against itself with a fully warm cache, the time it takes to finish typechecking is now approximately 0.3 to 0.4 seconds -- previously, it was 1.3 to 1.4 seconds.
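For readers skimming the thread, here is a minimal sketch of the batching idea described above. It is not the actual `process_graph` code: `process_sccs`, `is_fresh`, `load_fresh`, and `process_stale` are hypothetical names chosen for illustration, and SCCs are modelled as plain lists of module names.

```python
from typing import Callable, List

SCC = List[str]  # hypothetical stand-in: an SCC is a list of module names


def process_sccs(
    sccs: List[SCC],
    is_fresh: Callable[[SCC], bool],
    load_fresh: Callable[[SCC], None],
    process_stale: Callable[[SCC], None],
) -> int:
    """Walk SCCs in sorted order, deferring fresh ones until a stale one shows up.

    Returns the number of fresh SCCs still queued at the end, i.e. the ones
    whose data files never had to be loaded.
    """
    fresh_queue: List[SCC] = []
    for scc in sccs:
        if is_fresh(scc):
            # Do not load anything yet; just remember the SCC.
            fresh_queue.append(scc)
            continue
        # A stale SCC may depend on anything queued so far, so flush the
        # whole queue before processing it.
        for queued in fresh_queue:
            load_fresh(queued)
        fresh_queue = []
        process_stale(scc)
    return len(fresh_queue)


# Illustrative run: only module "b" is stale.
loaded: List[SCC] = []
checked: List[SCC] = []
stale_modules = {"b"}
leftover = process_sccs(
    [["a"], ["b"], ["c"], ["d"]],
    is_fresh=lambda scc: not (set(scc) & stale_modules),
    load_fresh=loaded.append,
    process_stale=checked.append,
)
```

In this run only `["a"]` is loaded before `["b"]` is rechecked, and the two fresh SCCs after it stay in the queue and are discarded. With a fully warm cache nothing is stale, so nothing is loaded at all.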


gvanrossum · 2016-09-08T17:51:00Z

Nice! Can you add some extra logging at all the new branch points? E.g. when putting an SCC into the queue, and when the queue has to be processed. Also at the end, log the queue size so we know if it made any difference.

This also makes me think that sometimes we can keep things in the queue because the stale thing we've found doesn't depend on them. But that's probably better done as a TODO for now.
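A rough sketch of what that TODO might look like, under stated assumptions: it is not part of this PR, `split_queue` is a hypothetical helper, SCCs are represented by string ids, and `deps` is an assumed map from each SCC id to the ids it depends on directly.

```python
from typing import Dict, List, Set, Tuple


def split_queue(
    queue: List[str], stale: str, deps: Dict[str, Set[str]]
) -> Tuple[List[str], List[str]]:
    """Split queued fresh SCCs into those the stale SCC needs and the rest.

    `deps` maps an SCC id to the ids it depends on directly (an assumed
    representation, not mypy's actual data structure).
    """
    needed: Set[str] = set()
    todo = list(deps.get(stale, ()))
    # Collect the transitive dependencies of the stale SCC.
    while todo:
        node = todo.pop()
        if node in needed:
            continue
        needed.add(node)
        todo.extend(deps.get(node, ()))
    # Preserve the original queue order in both halves.
    to_load = [scc for scc in queue if scc in needed]
    remaining = [scc for scc in queue if scc not in needed]
    return to_load, remaining


# Illustrative run: stale SCC "d" depends on "b", which depends on "a".
example_deps = {"d": {"b"}, "b": {"a"}, "c": set()}
to_load, remaining = split_queue(["a", "b", "c"], "d", example_deps)
```

Here `"a"` and `"b"` would be loaded, while `"c"` stays queued because the stale SCC does not reach it.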


Michael0x2a · 2016-09-08T18:37:39Z

Ok, added some logging and a TODO note.

(Yesterday I tried implementing something that would load only a subset of the queued SCCs, but I ran into a few bugs and ultimately decided to go with this version rather than grapple with it more.)

MichaelLeeDBX added 2 commits September 8, 2016 11:21

Add (and clean up) logging for processing SCCs

560e50d

This commit adds some logging when fresh SCCs are queued and handled and cleans up some of the logic associated with logging fresh SCCs.

Add TODO for possible improvements to queue processing

332d229

gvanrossum merged commit ee28565 into python:master Sep 8, 2016

Michael0x2a deleted the delay-processing-fresh-sccs branch September 9, 2016 18:58
