Commit ee28565

Michael0x2a authored and gvanrossum committed
Delay processing fresh SCCs until needed (#2110)
Previously, the process_graph function processed every SCC, fresh or stale, as soon as it was encountered. This commit instead queues fresh SCCs and defers processing them until a stale SCC is encountered. If no stale SCC is ever encountered, the queued fresh SCCs are discarded without loading any of their data files. Because loading the data files of fresh SCCs takes a small amount of time, this change makes mypy slightly faster, particularly when there are few or no files to recheck and the corresponding SCCs occur early in the sorted list of SCCs. As an example, when running mypy against itself with a fully warm cache, typechecking now finishes in approximately 0.3 to 0.4 seconds; previously, it took 1.3 to 1.4 seconds.
1 parent b8abcaf commit ee28565

1 file changed: +29 -6 lines changed

mypy/build.py

Lines changed: 29 additions & 6 deletions
@@ -1550,6 +1550,9 @@ def process_graph(graph: Graph, manager: BuildManager) -> None:
     sccs = sorted_components(graph)
     manager.log("Found %d SCCs; largest has %d nodes" %
                 (len(sccs), max(len(scc) for scc in sccs)))
+
+    fresh_scc_queue = []  # type: List[List[str]]
+
     # We're processing SCCs from leaves (those without further
     # dependencies) to roots (those from which everything else can be
     # reached).
@@ -1627,16 +1630,36 @@ def process_graph(graph: Graph, manager: BuildManager) -> None:
                 fresh_msg += " with stale deps (%s)" % " ".join(sorted(stale_deps))
         else:
             fresh_msg = "stale due to deps (%s)" % " ".join(sorted(stale_deps))
-        if len(scc) == 1:
-            manager.log("Processing SCC singleton (%s) as %s" % (" ".join(scc), fresh_msg))
-        else:
-            manager.log("Processing SCC of size %d (%s) as %s" %
-                        (len(scc), " ".join(scc), fresh_msg))
+
+        scc_str = " ".join(scc)
         if fresh:
-            process_fresh_scc(graph, scc)
+            manager.log("Queuing fresh SCC (%s)" % scc_str)
+            fresh_scc_queue.append(scc)
         else:
+            if len(fresh_scc_queue) > 0:
+                manager.log("Processing the last {} queued SCCs".format(len(fresh_scc_queue)))
+                # Defer processing fresh SCCs until we actually run into a stale SCC
+                # and need the earlier modules to be loaded.
+                #
+                # Note that `process_graph` may end with us not having processed every
+                # single fresh SCC. This is intentional -- we don't need those modules
+                # loaded if there are no more stale SCCs to be rechecked.
+                #
+                # TODO: see if it's possible to determine if we need to process only a
+                # _subset_ of the past SCCs instead of having to process them all.
+                for prev_scc in fresh_scc_queue:
+                    process_fresh_scc(graph, prev_scc)
+                fresh_scc_queue = []
+            size = len(scc)
+            if size == 1:
+                manager.log("Processing SCC singleton (%s) as %s" % (scc_str, fresh_msg))
+            else:
+                manager.log("Processing SCC of size %d (%s) as %s" % (size, scc_str, fresh_msg))
             process_stale_scc(graph, scc)

+    sccs_left = len(fresh_scc_queue)
+    manager.log("{} fresh SCCs left in queue (and will remain unprocessed)".format(sccs_left))
+


 def order_ascc(graph: Graph, ascc: AbstractSet[str], pri_max: int = PRI_ALL) -> List[str]:
     """Come up with the ideal processing order within an SCC.

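The queue-and-flush control flow described in the commit message can be sketched outside mypy as follows. This is a minimal standalone illustration, not mypy's code: `process_sccs_lazily`, `is_fresh`, `load_fresh`, and `recheck_stale` are hypothetical names standing in for the loop in `process_graph`, the freshness check, `process_fresh_scc`, and `process_stale_scc`. It assumes the SCCs are already sorted from leaves to roots.

```python
from typing import Callable, List


def process_sccs_lazily(
    sccs: List[List[str]],
    is_fresh: Callable[[List[str]], bool],
    load_fresh: Callable[[List[str]], None],
    recheck_stale: Callable[[List[str]], None],
) -> int:
    """Walk SCCs in dependency order, loading fresh ones only on demand.

    Returns the number of fresh SCCs that were never loaded.
    """
    fresh_queue: List[List[str]] = []
    for scc in sccs:
        if is_fresh(scc):
            # Loading cached data costs time, so only remember the SCC for now.
            fresh_queue.append(scc)
            continue
        # A stale SCC may depend on anything queued before it, so flush
        # the whole queue first (the TODO in the diff is about narrowing this).
        for queued in fresh_queue:
            load_fresh(queued)
        fresh_queue = []
        recheck_stale(scc)
    # Whatever is still queued has no later stale SCC that needs it.
    return len(fresh_queue)


# Toy run: module "c" is stale, everything else is fresh (hypothetical data).
loaded: List[str] = []
rechecked: List[str] = []
stale = {"c"}
left = process_sccs_lazily(
    [["a"], ["b"], ["c"], ["d"], ["e"]],
    is_fresh=lambda scc: not (set(scc) & stale),
    load_fresh=lambda scc: loaded.extend(scc),
    recheck_stale=lambda scc: rechecked.extend(scc),
)
print(loaded, rechecked, left)
```

In this toy run only `a` and `b` are loaded before `c` is rechecked, while `d` and `e` stay in the queue and are never loaded. With no stale SCC at all, nothing would be loaded, which is the fully warm cache case the commit message measures.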