Skip to content

Adding random_permutation function to math.py #594

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jan 28, 2025

Conversation

mohitbhasin
Copy link
Contributor

No description provided.

@hx2A
Copy link
Collaborator

hx2A commented Jan 23, 2025

Hi, @mohitbhasin ! Thank you for your interest in working on this.

For this implementation, I'd rather not call random_sample(). Numpy has its own shuffle() method and we can call that directly. Also, you can see in math.py, you can see the self._rng object is used instead of calling np.random.something.

The signature of the shuffle() method should be something like:

def shuffle(self, objects: Sequence[Any]) -> Sequence[Any]:

And I just noticed that random_sample() and random_choice() should have Sequence in their signatures, not list.

Like random_sample(), it should check if objects is a generator and if so convert it to a list. Numpy's shuffle() method does the shuffling in place which will also happen here. I'd like this method to return Sequence[Any] anyway because that's the only way to get an output if the user passes a generator.

This method should work for generators, lists, and numpy arrays, returning the same data type except when the user passes a generator.

Finally, note the docstring. That docstring is a key that links to a reference file. Here is the file for random_sample().

https://github.com/py5coding/py5generator/blob/26fd1e2576b41014e514dd4bc5167290e1ded0ce/py5_docs/Reference/api_en/Sketch_random_sample.txt

You'll need to create a file for that also, in the same format, with example code.

@mohitbhasin
Copy link
Contributor Author

Got it. I'll work on it and let you know if I have any questions.

@hx2A
Copy link
Collaborator

hx2A commented Jan 23, 2025

Thank you!

@hx2A
Copy link
Collaborator

hx2A commented Jan 23, 2025

Also, let's call it random_shuffle() instead of shuffle().

@GoToLoop
Copy link

GoToLoop commented Jan 23, 2025

  1. The parameter name "objects" should be renamed to a singular name, something like "sequence" or "container".
  2. Instead of [Any], maybe we should attempt a generic type [T].
  3. Processing's array methods always return a new array. Maybe this function should have a parameter "in_place=False".
  4. Also, Processing's random-related static methods, such as Pvector.random3D(), allow us to pass a PApplet instance, so it uses that "seeded" Random instance internally.

@mohitbhasin
Copy link
Contributor Author

What are your thoughts on this:

def random_shuffle(self, container: Sequence[Any]) -> Sequence[Any]:
    if isinstance(container, types.GeneratorType):
        container = list(container)
    self._rng.shuffle(container)
    return container

@villares
Copy link
Collaborator

def shuffle(self, objects: Sequence[Any]) -> Sequence[Any]:

I have a question here... Sequence allows for immutable sequences, but you mentioned "in place" shuffling...
... so this works fine for choice and sample, but maybe not for shuffle.

So I went to have a look at the standard library's random.shuffle() to see if I had any hints:

image

And nothing at the official Python type shed either.

@hx2A
Copy link
Collaborator

hx2A commented Jan 23, 2025

Sequence allows for immutable sequences, but you mentioned "in place" shuffling...
... so this works fine for choice and sample, but maybe not for shuffle.

Right. I think I've changed my mind here. None of the other methods in math.py change user inputs in-place, and in general, I don't like it when there inconsistencies or confusion about if something does or does not change something in place. This is one of the things I don't like about Processing's PVector class.

How about naming this random_permutation() and it returns a shuffled copy of the input data?

@hx2A
Copy link
Collaborator

hx2A commented Jan 24, 2025

@mohitbhasin , we've talked it over, both here and on Mastodon, and we'd changed our mind about two things:

  • Let's call the new method random_permutation()
  • Internally it can call numpy's permutation() method, which returns a copy of the user's data, but randomized. The code should be similar to random_sample() except it will use numpy's permutation() method. This doesn't change the input data in place like shuffle() does.

@GoToLoop
Copy link

How about using Numpy's types?

import numpy as np
from numpy.typing import ArrayLike, NDArray
from collections.abc import Generator

def random_permutation(seq: ArrayLike | Generator) -> NDArray:
    if isinstance(seq, Generator): seq = tuple(seq)
    return np.random.permutation(seq)

@mohitbhasin
Copy link
Contributor Author

How about using Numpy's types?

import numpy as np
from numpy.typing import ArrayLike, NDArray
from collections.abc import Generator

def random_permutation(seq: ArrayLike | Generator) -> NDArray:
    if isinstance(seq, Generator): seq = tuple(seq)
    return np.random.permutation(seq)
  1. Yea the random.permutation will return NDArray so the random_permutation() can return NDArray.
  2. Do we need Generator in the signature? Generator can be passed to the function without it as well.
def random_permutation(self, seq: ArrayLike) -> NDArray:
    if isinstance(seq, Generator): seq = tuple(seq)
    return np.random.permutation(seq)

@GoToLoop
Copy link

2. Do we need Generator in the signature?

Signature type hints are for IDE linters, not for the execution itself.

@mohitbhasin
Copy link
Contributor Author

Got it. Thanks.

@mohitbhasin
Copy link
Contributor Author

@GoToLoop Is there a reason for not using np.random.default_rng() which is used across Math.py?

@hx2A
Copy link
Collaborator

hx2A commented Jan 24, 2025

For the input param, you can use Sequence as the type. That makes it consistent with what we are doing (or should be doing) in other places.

If the input type is a list, the output type should also be a list, not a numpy array. You can follow what random_sample() is doing. I think this new method will be almost the same as random_sample(), I think you can just replace the call to choice() with a call to permutation().

@hx2A
Copy link
Collaborator

hx2A commented Jan 24, 2025

@GoToLoop Is there a reason for not using np.random.default_rng() which is used across Math.py?

You can use self._rng so the output is controlled by the random_seed().

@GoToLoop
Copy link

GoToLoop commented Jan 24, 2025

For the input param, you can use Sequence as the type.

That was my 1st try! However, method permutation() complained about that parameter type inside VSCode. 😞

@mohitbhasin
Copy link
Contributor Author

mohitbhasin commented Jan 24, 2025

@hx2A Thanks for the feedback. Does this look good?

def random_permutation(self, seq: Sequence[Any]) -> Sequence[Any]:
    """$class_Sketch_random_permutation"""
    if isinstance(seq, types.GeneratorType):
        seq = list(seq)
    indices = self._rng.permutation(range(len(seq)))
    if not isinstance(seq, list):
        try:
            return seq[indices]
        except:
            pass
    return [seq[idx] for idx in indices]

Do you thinkif not isinstance(seq, list): needs any consideration, specifically return seq[indices]

…string for the same.

random_permutation() utilizes permutation() of numpy and return a sequence.
@hx2A
Copy link
Collaborator

hx2A commented Jan 24, 2025

For the input param, you can use Sequence as the type.

That was my 1st try! However, method permutation() complained about that parameter type inside VSCode. 😞

@GoToLoop , Really? Interesting. Why would it see a conflict between the Sequence type and a numpy function? What kind of linting do you have set up?

@mohitbhasin , I think that code looks good to me. Did you test it with a list, numpy array, and generator to make sure they all work?

The next step is to write that documentation file.

Do you thinkif not isinstance(seq, list): needs any consideration, specifically return seq[indices]

I'm not sure what you are saying. What do you mean?

@GoToLoop
Copy link

Maybe b/c updating after a long time, VSCodium was glitching somehow.
But now after re-opening the IDE, it doesn't complain about Sequence type anymore:

import numpy as np
from numpy.typing import NDArray
from collections.abc import Generator, Sequence

def random_permutation(seq: Sequence | Generator) -> NDArray:
    if isinstance(seq, Generator): seq = tuple(seq)
    return np.random.permutation(seq)

Just for reference, this is permutation()'s signature according to addon "BasedPyright":
def permutation(x: ArrayLike) -> NDArray[Any]

@mohitbhasin
Copy link
Contributor Author

@mohitbhasin , I think that code looks good to me. Did you test it with a list, numpy array, and generator to make sure they all work?

Yes, I tested list, empty list, generator, tuple, numpy array. All are working as expected.

The next step is to write that documentation file.

I have pushed the code and documentation in the pull request, please review.

Do you thinkif not isinstance(seq, list): needs any consideration, specifically return seq[indices]

I'm not sure what you are saying. What do you mean?

I was not sure for which data type return seq[indices] is used. Now, I understand it is for numpy array. Thanks.

subcategory = random

@@ signatures
random_permutation(seq: Sequence[Any] -> Sequence[Any]
Copy link
Collaborator

@hx2A hx2A Jan 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The file looks good. Only thing is this signature is missing the final ) and :

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing that out. I added the missing ). I checked other docstrings files, none of them ends with :, so I have not added it.

However, If it should be there, I can add it. Just let me know.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hx2A What do you think about the above change?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was good, thank you! I just merged the PR.

You were correct about the :, thank you for paying attention to details.

@hx2A
Copy link
Collaborator

hx2A commented Jan 26, 2025

Almost there, @mohitbhasin !

Can you also add a line to this csv file:

https://github.com/py5coding/py5generator/blob/main/py5_resources/data/sketch.csv

It will be just like random_sample.

That's a data file that py5generator uses to analyze the code.

@mohitbhasin
Copy link
Contributor Author

Almost there, @mohitbhasin !

Can you also add a line to this csv file:

https://github.com/py5coding/py5generator/blob/main/py5_resources/data/sketch.csv

It will be just like random_sample.

That's a data file that py5generator uses to analyze the code.

I have added the entry in sketch.csv for random_permutation()

@hx2A hx2A merged commit 7e655ec into py5coding:main Jan 28, 2025
@mohitbhasin mohitbhasin deleted the add-shuffle-function branch January 28, 2025 17:51
@hx2A hx2A changed the title Adding shuffle function to math.py Adding random_permutation function to math.py Feb 6, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants