Add disjoint set #1194

Merged
merged 4 commits into from
Sep 23, 2019

Conversation

luoheng23
Contributor

No description provided.

Member

@cclauss cclauss left a comment

This is supercool! Congratulations and thanks for sharing.

A nice addition would be a function like:
def as_python_set(node: Node) -> set:

And an additional test: as_python_set(vertex[0]).isdisjoint(as_python_set(vertex[3]))
https://docs.python.org/3.7/library/stdtypes.html#frozenset.isdisjoint

@luoheng23
Contributor Author

Thank you for your advice. I have updated the code.
However, I have a problem with the additional function: as_python_set is hard to implement. A disjoint set is a tree structure in which each node only has a pointer to its parent, so it is hard to recover the whole set from a single node. The suggested test does the same job as assert find_set(vertex[2]) != find_set(vertex[3]), because each node has only one pointer.

@cclauss
Member

cclauss commented Sep 22, 2019

We should be able to get the whole set by starting at the leaf node.

Put this towards the bottom of your file:

def as_python_set(node: Node) -> set:
    python_set = set()
    while True:
        python_set.add(node.data)
        if node == node.p:
            # assert node.rank == 0, f"{python_set} root node has zero rank: {node.rank}"
            return python_set
        node = node.p


if __name__ == "__main__":
    test_disjoint_set()
    vertex = [Node(i) for i in range(6)]
    for v in vertex:
        make_set(v)
    union_set(vertex[0], vertex[1])
    union_set(vertex[1], vertex[2])
    union_set(vertex[3], vertex[4])
    union_set(vertex[4], vertex[5])
    for node in vertex:
        print(node.data, as_python_set(node))

This might help to see issues to fix. Ideally, we would like as_python_set(vertex[2]) to be {0, 1, 2}.

The output currently is:

0 {0, 1}
1 {1}
2 {1, 2}
3 {3, 4}
4 {4}
5 {4, 5}
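
The PR's own implementation is not shown in this thread, so the following is only a minimal sketch under an assumption: that make_set, find_set and union_set follow the textbook union-by-rank scheme with path compression, using the `p` attribute seen in the snippet above. Under that assumption it produces the same sets as the listed output, which suggests that walking parent pointers from a node collects only the path from that node to its root, not the whole set.

```python
class Node:
    def __init__(self, data):
        self.data = data


def make_set(x):
    # Assumed textbook initialisation: every node starts as its own root.
    x.rank = 0
    x.p = x


def find_set(x):
    # Path compression: re-point x directly at its root.
    if x.p is not x:
        x.p = find_set(x.p)
    return x.p


def union_set(x, y):
    # Union by rank: the root with the smaller rank goes under the other.
    x, y = find_set(x), find_set(y)
    if x is y:
        return
    if x.rank > y.rank:
        y.p = x
    else:
        x.p = y
        if x.rank == y.rank:
            y.rank += 1


def as_python_set(node):
    # Same walk as in the comment above: follow parents up to the root.
    python_set = set()
    while True:
        python_set.add(node.data)
        if node.p is node:
            return python_set
        node = node.p


vertex = [Node(i) for i in range(6)]
for v in vertex:
    make_set(v)
union_set(vertex[0], vertex[1])
union_set(vertex[1], vertex[2])
union_set(vertex[3], vertex[4])
union_set(vertex[4], vertex[5])

paths = [as_python_set(v) for v in vertex]
for v, path in zip(vertex, paths):
    print(v.data, path)
```

In this sketch nodes 1 and 4 end up as roots, so their "set" is just themselves, while every other node sees itself plus its root.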

@cclauss
Member

cclauss commented Sep 22, 2019

Perhaps we are not creating parents with the current code but are creating children instead. ;-)

My sense is that rank is not really helping us and could safely be dropped.

Goal: Build a Python set from either the root node of a union_set or the leaf node of a union_set.
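
On the question of whether rank can be dropped, a small experiment may help. This is a hedged sketch, not the PR's code: `union_naive`, `union_by_rank`, `depth` and `max_depth_after_chain` are hypothetical helpers, and path compression is deliberately left out so that the effect of rank alone is visible. It links eight nodes pairwise in order and measures the deepest node in the resulting tree.

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.parent = self  # each node starts as its own root
        self.rank = 0


def find_root(node):
    # Plain walk to the root, no path compression (on purpose).
    while node.parent is not node:
        node = node.parent
    return node


def depth(node):
    # Number of parent hops from node to its root.
    steps = 0
    while node.parent is not node:
        node = node.parent
        steps += 1
    return steps


def union_naive(x, y):
    # Always hang x's root under y's root, ignoring tree height.
    x, y = find_root(x), find_root(y)
    if x is not y:
        x.parent = y


def union_by_rank(x, y):
    # Hang the root with the smaller rank under the other one.
    x, y = find_root(x), find_root(y)
    if x is y:
        return
    if x.rank > y.rank:
        y.parent = x
    else:
        x.parent = y
        if x.rank == y.rank:
            y.rank += 1


def max_depth_after_chain(union, n):
    # Union neighbours 0-1, 1-2, ... and report the deepest node.
    nodes = [Node(i) for i in range(n)]
    for i in range(n - 1):
        union(nodes[i], nodes[i + 1])
    return max(depth(v) for v in nodes)


naive_depth = max_depth_after_chain(union_naive, 8)
ranked_depth = max_depth_after_chain(union_by_rank, 8)
print("naive:", naive_depth, "by rank:", ranked_depth)
```

For this particular union order the naive version builds a chain, while the ranked version keeps every node one hop from the root. With path compression added, later find calls would flatten the chain as well, so the gap seen here is specific to this setup.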

@luoheng23
Contributor Author

I don't think we need a node-to-set method.
The purpose of a disjoint set is to quickly determine whether two nodes are in the same set. The idea is to choose a root and have all nodes point to it.
For example, take a, b, c, d. The final disjoint set does not look like a -> b -> c -> d, but like a -> d, b -> d, c -> d, d -> d. Then find_set can return d in a single step, so we can quickly tell whether two nodes are in the same set.
So it is not possible to get the whole set starting from a single node, and that is not the purpose of this data structure.
Rank is an optimization for disjoint sets: it keeps the tree flatter, which reduces the time complexity. See the reference here; with this method, a disjoint set can be faster than a normal set.
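
If the whole set is ever needed, one option is to keep the full list of nodes and group them by their root. The sketch below assumes a textbook union-by-rank implementation with path compression and the `parent` attribute name; `collect_sets` is a hypothetical helper, not part of the PR.

```python
from collections import defaultdict


class Node:
    def __init__(self, data):
        self.data = data


def make_set(x):
    x.rank = 0
    x.parent = x


def find_set(x):
    # Path compression: after this call x points straight at its root.
    if x.parent is not x:
        x.parent = find_set(x.parent)
    return x.parent


def union_set(x, y):
    x, y = find_set(x), find_set(y)
    if x is y:
        return
    if x.rank > y.rank:
        y.parent = x
    else:
        x.parent = y
        if x.rank == y.rank:
            y.rank += 1


def collect_sets(nodes):
    # Hypothetical helper: bucket every node's data under its root's data.
    groups = defaultdict(set)
    for node in nodes:
        groups[find_set(node).data].add(node.data)
    return list(groups.values())


vertex = [Node(i) for i in range(6)]
for v in vertex:
    make_set(v)
union_set(vertex[0], vertex[1])
union_set(vertex[1], vertex[2])
union_set(vertex[3], vertex[4])
union_set(vertex[3], vertex[5])

sets = collect_sets(vertex)
print(sets)
```

This needs access to all nodes, which is consistent with the point above: a single node only knows its parent, so the set cannot be rebuilt from one node alone. As a side effect of path compression, every node points directly at its root once collect_sets has run, which is the a -> d, b -> d, c -> d shape described in the comment.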

@luoheng23 luoheng23 requested a review from cclauss September 23, 2019 01:45
@cclauss
Member

cclauss commented Sep 23, 2019

Sorry for being so dense. You are correct. It works perfectly. Here is how I tested it.

def find_python_set(node: Node) -> set:
    """
    Return a Python Standard Library set that contains node.data.
    """
    sets = ({0, 1, 2}, {3, 4, 5})
    for s in sets:
        if node.data in s:
            return s
    raise ValueError(f"{node.data} is not in {sets}")


def test_disjoint_set():
    """
    >>> test_disjoint_set()
    """
    vertex = [Node(i) for i in range(6)]
    for v in vertex:
        make_set(v)

    union_set(vertex[0], vertex[1])
    union_set(vertex[1], vertex[2])
    union_set(vertex[3], vertex[4])
    union_set(vertex[3], vertex[5])

    for node0 in vertex:
        for node1 in vertex:
            if find_python_set(node0).isdisjoint(find_python_set(node1)):
                assert find_set(node0) != find_set(node1)
            else:
                assert find_set(node0) == find_set(node1)

Please put a link to the Uncyclopedia article in a comment at the top of the file and then we can land this. Thanks for your patience.

@luoheng23
Contributor Author

Thank you for your reply.
Your test is better; I have updated the test and the reference.

@cclauss cclauss merged commit 01601e6 into TheAlgorithms:master Sep 23, 2019
Raj1998 added a commit to Raj1998/Python that referenced this pull request Sep 25, 2019
@luoheng23 luoheng23 deleted the disjoint_set branch September 28, 2019 10:19
stokhos pushed a commit to stokhos/Python that referenced this pull request Jan 3, 2021
* Add disjoint set

* disjoint set: add doctest, make code more Pythonic

* disjoint set: replace x.p with x.parent

* disjoint set: add test and refercence