Skip to content

bpo-33144: random.Random and subclasses: split _randbelow implementation #6291

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Apr 17, 2018

Conversation

wm75
Copy link

@wm75 wm75 commented Mar 28, 2018

Splits the getrandbits-dependent and -independent branches of
random.Random._randbelow into separate methods and selects the
implementation to be used by Random and its subclasses at class
creation time for increased performance.

https://bugs.python.org/issue33144

Lib/random.py Outdated
def __init_subclass__(cls, **kwargs):
# Only call self.getrandbits if the original random() builtin method
# has not been overridden or if a new getrandbits() was supplied.
if type(cls.__dict__.get('getrandbits')) is _FunctionType:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just 'getrandbits' in cls.__dict__?

Copy link
Author

@wm75 wm75 Mar 28, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to preserve the existing behavior as much as possible; if getrandbits gets overridden with just an attribute instead of a method, the fallback mechanism stays activated

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already change the behavior by allowing getrandbits() to win even if random() is overridden in a parent. I think it would be better to allow overriding getrandbits() not only with Python functions.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

meaning that when you override it with something that isn't callable _randbelow will fail with an exception; it doesn't do that now, but I agree it may actually be the right thing to do because such a situation would probably only indicate a mistake.

Lib/random.py Outdated
# so we can only use random() from here.
cls._randbelow = cls._randbelow_without_getrandbits
else:
cls._randbelow = getattr(cls, cls._randbelow.__name__)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this needed?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for the exotic case that a subclass overrides one of the private _randbelow implementations without providing random or getrandbits methods. Even though that would be considered a hack, inheritance mechanisms should still work and _randbelow_with_getrandbits or _randbelow_without_getrandbits get replaced. The test with Subclass5 is for this situation.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It deserves a comment in the source though. I'll add it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_randbelow is an implementation detail. It isn't designed for overriding. The code will be simpler if don't support this hack.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think that one extra line (I'm not really counting the else because if/elif without else looks questionable anyway) is worth it. Hack or not, our class-level manipulation of things should not make inheritance unpredictable.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Serhiy is correct in pointing out the _randbelow() is an implementation detail. We really don't want people using (overriding) this non public api.

Let's not feature-creep this optimization patch. Ideally, it should be the smallest possible change that moves algorithmic decision to class instantiation time.

Also, any possible "bug fix" to change the exception raised should be in a separate PR. Unless we've guaranteed a particular exception (which is not usually the case), the C code is allowed to differ in small ways from the Python code. That isn't to say that it shouldn't be changes, but that it is a separate discussion.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok: separate PR #6338 and corresponding https://bugs.python.org/issue33203 for the ValueError fix. Once that's settled I'll rebase this PR.

Speaking of not "feature-creeping" this simple optimization: would you prefer this patch then to retain the old behaviour that as soon as getrandbits was overridden once in a subclass, overriding random later again did not have an effect anymore on _randbelow (see Serhiy's Case 2 in https://bugs.python.org/msg314534 )?

random.Random.__init__(self)
Subclass(newarg=1)

def test_overriding_random(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test uses implementation details. It is possible to test this behavior using only public API. Make random() and getrandbits() in subclasses logging their calls, and check what methods were used when call randrange(). This will make the test valid even if the implementation be changed.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea! I'll look into it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But maybe we need to keep a simple white-box test for random.Random._randbelow and a subclass that don't override random or getrandbits, because there is no way to test this as a black box. It is worth to mark this test as CPython-only.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds reasonable: I'm going to work on all the changes tomorrow, thanks.

@serhiy-storchaka serhiy-storchaka added the performance Performance or resource usage label Mar 28, 2018
Copy link
Contributor

@rhettinger rhettinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a blurb entry and resolve the conflict caused by the separate patch for the ZeroDivisionError.

@bedevere-bot
Copy link

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

Splits the getrandbits-dependent and -independent branches of
random.Random._randbelow into separate methods and selects the
implementation to be used by Random and its subclasses at class
creation time for increased performance.
@wm75 wm75 force-pushed the random-improve branch from 2f77a64 to cc1a10c Compare April 9, 2018 16:30
@wm75
Copy link
Author

wm75 commented Apr 10, 2018

I have made the requested changes; please review again.

@bedevere-bot
Copy link

Thanks for making the requested changes!

@rhettinger: please review the changes made to this pull request.

@@ -221,22 +242,23 @@ def randint(self, a, b):

return self.randrange(a, b+1)

def _randbelow(self, n, int=int, maxsize=1<<BPF, type=type,
Method=_MethodType, BuiltinMethod=_BuiltinMethodType):
def _randbelow_with_getrandbits(self, n):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just call this _randbelow and only patch the without getrandbits case?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to keep a reference to the "with getrandbits" implementation to be able to return to using it in the case:

class Rand1(Random):
    def random(self): ...
        # _randbelow should use random()

class Rand2(Rand1):
    def getrandbits(self): ...
         # _randbelow should use getrandbits() now again

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I see what you're trying to do.

ranges.
"""

if (cls.random is _random.Random.random) or (
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would write this as:

if 'getrandbits' in cls.__dict__:
    cls._randbelow = cls._randbelow_with_getrandbits
elif 'random' in cls.__dict__:
    cls._randbelow = cls._randbelow_without_getrandbits
#else inherits from the parent

return super().random()

def getrandbits(self, n):
logging.getLogger('getrandbits').info('used getrandbits')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using logging for testing looks very... strange. You could just set a nonlocal variable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Performance or resource usage
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants