Skip to content

bpo-17005: Add a topological sort algorithm #11583

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 29 commits into from
Jan 23, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
29 commits
Select commit Hold shift + click to select a range
b615c54
bpo-17005: Add a topological sort algorithm
pablogsal Jan 16, 2020
e911459
Adress first round of Tim feedback and adjust docstring lenght
pablogsal Jan 17, 2020
493f6a5
Add a set to avoid O(n2) behaviour in the check for cycles
pablogsal Jan 17, 2020
cac6cf2
Use Tim's cycle-search algorithm
pablogsal Jan 17, 2020
e1255dd
Add another test for the cycle-find algorithm
pablogsal Jan 17, 2020
301a459
Uncouple if confition and restore original code
pablogsal Jan 17, 2020
6f33c71
Add __bool__ as alias of is_active
pablogsal Jan 17, 2020
1efe994
Improve docs and allow multiple nodes in done()
pablogsal Jan 17, 2020
aa56145
Add 'static_order' method, docs and tests and some cleanups
pablogsal Jan 17, 2020
d0ba05b
Misc improvements to the docs
pablogsal Jan 17, 2020
2309642
Apply suggestions from code review
pablogsal Jan 18, 2020
3980ef8
Apply suggestions from code review
pablogsal Jan 18, 2020
050c2bb
Update Doc/library/functools.rst
pablogsal Jan 18, 2020
1149685
Update Doc/library/functools.rst
pablogsal Jan 18, 2020
cf8a729
Remove stable from the docs
pablogsal Jan 18, 2020
26c52a6
Update Doc/library/functools.rst
pablogsal Jan 18, 2020
4648f50
Update Doc/library/functools.rst
pablogsal Jan 18, 2020
3e2f2d9
Fix quotes
pablogsal Jan 18, 2020
d82f697
Add more test cases
pablogsal Jan 18, 2020
c0d2ff6
Add even more test cases
pablogsal Jan 18, 2020
ed8c4ee
Use strings for the hash test and compare groups for insertion order
pablogsal Jan 18, 2020
6229ad5
Add example on insertion order affecting static_order()
pablogsal Jan 18, 2020
618a791
Simplify test
pablogsal Jan 18, 2020
1d926ab
Apply suggestions from code review
pablogsal Jan 18, 2020
064209a
Improvements to the documentation formatting
pablogsal Jan 18, 2020
cc7aacd
Apply suggestions from code review
pablogsal Jan 20, 2020
c82d41b
Correct typo
pablogsal Jan 20, 2020
81c1833
Document CycleError
pablogsal Jan 23, 2020
89d07ce
fixup! Document CycleError
pablogsal Jan 23, 2020
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
208 changes: 208 additions & 0 deletions Doc/library/functools.rst
Original file line number Diff line number Diff line change
Expand Up @@ -8,10 +8,16 @@
.. moduleauthor:: Raymond Hettinger <[email protected]>
.. moduleauthor:: Nick Coghlan <[email protected]>
.. moduleauthor:: Łukasz Langa <[email protected]>
.. moduleauthor:: Pablo Galindo <[email protected]>
.. sectionauthor:: Peter Harris <[email protected]>

**Source code:** :source:`Lib/functools.py`

.. testsetup:: default

import functools
from functools import *

--------------

The :mod:`functools` module is for higher-order functions: functions that act on
Expand Down Expand Up @@ -512,6 +518,192 @@ The :mod:`functools` module defines the following functions:
.. versionadded:: 3.8


.. class:: TopologicalSorter(graph=None)

Provides functionality to topologically sort a graph of hashable nodes.

A topological order is a linear ordering of the vertices in a graph such that for
every directed edge u -> v from vertex u to vertex v, vertex u comes before vertex
v in the ordering. For instance, the vertices of the graph may represent tasks to
be performed, and the edges may represent constraints that one task must be
performed before another; in this example, a topological ordering is just a valid
sequence for the tasks. A complete topological ordering is possible if and only if
the graph has no directed cycles, that is, if it is a directed acyclic graph.

If the optional *graph* argument is provided it must be a dictionary representing
a directed acyclic graph where the keys are nodes and the values are iterables of
all predecessors of that node in the graph (the nodes that have edges that point
to the value in the key). Additional nodes can be added to the graph using the
:meth:`~TopologicalSorter.add` method.

In the general case, the steps required to perform the sorting of a given graph
are as follows:

* Create an instance of the :class:`TopologicalSorter` with an optional initial graph.
* Add additional nodes to the graph.
* Call :meth:`~TopologicalSorter.prepare` on the graph.
* While :meth:`~TopologicalSorter.is_active` is ``True``, iterate over the
nodes returned by :meth:`~TopologicalSorter.get_ready` and process them.
Call :meth:`~TopologicalSorter.done` on each node as it finishes processing.

In case just an immediate sorting of the nodes in the graph is required and
no parallelism is involved, the convenience method :meth:`TopologicalSorter.static_order`
can be used directly. For example, this method can be used to implement a simple
version of the C3 linearization algorithm used by Python to calculate the Method
Resolution Order (MRO) of a derived class:

.. doctest::

>>> class A: pass
>>> class B(A): pass
>>> class C(A): pass
>>> class D(B, C): pass

>>> D.__mro__
(<class 'D'>, <class 'B'>, <class 'C'>, <class 'A'>, <class 'object'>)

>>> graph = {D: {B, C}, C: {A}, B: {A}, A:{object}}
>>> ts = TopologicalSorter(graph)
>>> topological_order = tuple(ts.static_order())
>>> tuple(reversed(topological_order))
(<class 'D'>, <class 'B'>, <class 'C'>, <class 'A'>, <class 'object'>)

The class is designed to easily support parallel processing of the nodes as they
become ready. For instance::

topological_sorter = TopologicalSorter()

# Add nodes to 'topological_sorter'...

topological_sorter.prepare()
while topological_sorter.is_active():
for node in topological_sorter.get_ready():
# Worker threads or processes take nodes to work on off the
# 'task_queue' queue.
task_queue.put(node)

# When the work for a node is done, workers put the node in
# 'finalized_tasks_queue' so we can get more nodes to work on.
# The definition of 'is_active()' guarantees that, at this point, at
# least one node has been placed on 'task_queue' that hasn't yet
# been passed to 'done()', so this blocking 'get()' must (eventually)
# succeed. After calling 'done()', we loop back to call 'get_ready()'
# again, so put newly freed nodes on 'task_queue' as soon as
# logically possible.
node = finalized_tasks_queue.get()
topological_sorter.done(node)

.. method:: add(node, *predecessors)

Add a new node and its predecessors to the graph. Both the *node* and
all elements in *predecessors* must be hashable.

If called multiple times with the same node argument, the set of dependencies
will be the union of all dependencies passed in.

It is possible to add a node with no dependencies (*predecessors* is not
provided) or to provide a dependency twice. If a node that has not been
provided before is included among *predecessors* it will be automatically added
to the graph with no predecessors of its own.

Raises :exc:`ValueError` if called after :meth:`~TopologicalSorter.prepare`.

.. method:: prepare()

Mark the graph as finished and check for cycles in the graph. If any cycle is
detected, :exc:`CycleError` will be raised, but
:meth:`~TopologicalSorter.get_ready` can still be used to obtain as many nodes
as possible until cycles block more progress. After a call to this function,
the graph cannot be modified, and therefore no more nodes can be added using
:meth:`~TopologicalSorter.add`.

.. method:: is_active()

Returns ``True`` if more progress can be made and ``False`` otherwise. Progress
can be made if cycles do not block the resolution and either there are still
nodes ready that haven't yet been returned by
:meth:`TopologicalSorter.get_ready` or the number of nodes marked
:meth:`TopologicalSorter.done` is less than the number that have been returned
by :meth:`TopologicalSorter.get_ready`.

The :meth:`~TopologicalSorter.__bool__` method of this class defers to this
function, so instead of::

if ts.is_active():
...

if possible to simply do::

if ts:
...

Raises :exc:`ValueError` if called without calling :meth:`~TopologicalSorter.prepare`
previously.

.. method:: done(*nodes)

Marks a set of nodes returned by :meth:`TopologicalSorter.get_ready` as
processed, unblocking any successor of each node in *nodes* for being returned
in the future by a call to :meth:`TopologicalSorter.get_ready`.

Raises :exc:`ValueError` if any node in *nodes* has already been marked as
processed by a previous call to this method or if a node was not added to the
graph by using :meth:`TopologicalSorter.add`, if called without calling
:meth:`~TopologicalSorter.prepare` or if node has not yet been returned by
:meth:`~TopologicalSorter.get_ready`.

.. method:: get_ready()

Returns a ``tuple`` with all the nodes that are ready. Initially it returns all
nodes with no predecessors, and once those are marked as processed by calling
:meth:`TopologicalSorter.done`, further calls will return all new nodes that
have all their predecessors already processed. Once no more progress can be
made, empty tuples are returned.
made.

Raises :exc:`ValueError` if called without calling
:meth:`~TopologicalSorter.prepare` previously.

.. method:: static_order()

Returns an iterable of nodes in a topological order. Using this method
does not require to call :meth:`TopologicalSorter.prepare` or
:meth:`TopologicalSorter.done`. This method is equivalent to::

def static_order(self):
self.prepare()
while self.is_active():
node_group = self.get_ready()
yield from node_group
self.done(*node_group)

The particular order that is returned may depend on the specific order in
which the items were inserted in the graph. For example:

.. doctest::

>>> ts = TopologicalSorter()
>>> ts.add(3, 2, 1)
>>> ts.add(1, 0)
>>> print([*ts.static_order()])
[2, 0, 1, 3]

>>> ts2 = TopologicalSorter()
>>> ts2.add(1, 0)
>>> ts2.add(3, 2, 1)
>>> print([*ts2.static_order()])
[0, 2, 1, 3]

This is due to the fact that "0" and "2" are in the same level in the graph (they
would have been returned in the same call to :meth:`~TopologicalSorter.get_ready`)
and the order between them is determined by the order of insertion.


If any cycle is detected, :exc:`CycleError` will be raised.

.. versionadded:: 3.9


.. function:: update_wrapper(wrapper, wrapped, assigned=WRAPPER_ASSIGNMENTS, updated=WRAPPER_UPDATES)

Update a *wrapper* function to look like the *wrapped* function. The optional
Expand Down Expand Up @@ -621,3 +813,19 @@ differences. For instance, the :attr:`~definition.__name__` and :attr:`__doc__`
are not created automatically. Also, :class:`partial` objects defined in
classes behave like static methods and do not transform into bound methods
during instance attribute look-up.


Exceptions
----------
The :mod:`functools` module defines the following exception classes:

.. exception:: CycleError

Subclass of :exc:`ValueError` raised by :meth:`TopologicalSorter.prepare` if cycles exist
in the working graph. If multiple cycles exist, only one undefined choice among them will
be reported and included in the exception.

The detected cycle can be accessed via the second element in the :attr:`~CycleError.args`
attribute of the exception instance and consists in a list of nodes, such that each node is,
in the graph, an immediate predecessor of the next node in the list. In the reported list,
the first and the last node will be the same, to make it clear that it is cyclic.
Binary file added Doc/myfile.bz2
Binary file not shown.
7 changes: 7 additions & 0 deletions Doc/whatsnew/3.9.rst
Original file line number Diff line number Diff line change
Expand Up @@ -166,6 +166,13 @@ ftplib
if the given timeout for their constructor is zero to prevent the creation of
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pablogsal the file Doc/myfile.bz2 above does not seem referenced from anywhere.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That has been corrected since this PR :)

Copy link
Member Author

@pablogsal pablogsal Sep 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Particulary in 67673b0 in #21286

a non-blocking socket. (Contributed by Dong-hee Na in :issue:`39259`.)

functools
---------

Add the :class:`functools.TopologicalSorter` class to offer functionality to perform
topological sorting of graphs. (Contributed by Pablo Galindo, Tim Peters and Larry
Hastings in :issue:`17005`.)

gc
--

Expand Down
Loading