-
-
Notifications
You must be signed in to change notification settings - Fork 32.2k
bpo-33144: random.Random and subclasses: split _randbelow implementation #6291
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Lib/random.py
Outdated
def __init_subclass__(cls, **kwargs): | ||
# Only call self.getrandbits if the original random() builtin method | ||
# has not been overridden or if a new getrandbits() was supplied. | ||
if type(cls.__dict__.get('getrandbits')) is _FunctionType: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just 'getrandbits' in cls.__dict__
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to preserve the existing behavior as much as possible; if getrandbits gets overridden with just an attribute instead of a method, the fallback mechanism stays activated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already change the behavior by allowing getrandbits() to win even if random() is overridden in a parent. I think it would be better to allow overriding getrandbits() not only with Python functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
meaning that when you override it with something that isn't callable _randbelow will fail with an exception; it doesn't do that now, but I agree it may actually be the right thing to do because such a situation would probably only indicate a mistake.
Lib/random.py
Outdated
# so we can only use random() from here. | ||
cls._randbelow = cls._randbelow_without_getrandbits | ||
else: | ||
cls._randbelow = getattr(cls, cls._randbelow.__name__) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is for the exotic case that a subclass overrides one of the private _randbelow implementations without providing random or getrandbits methods. Even though that would be considered a hack, inheritance mechanisms should still work and _randbelow_with_getrandbits or _randbelow_without_getrandbits get replaced. The test with Subclass5 is for this situation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It deserves a comment in the source though. I'll add it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_randbelow
is an implementation detail. It isn't designed for overriding. The code will be simpler if don't support this hack.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still think that one extra line (I'm not really counting the else because if/elif without else looks questionable anyway) is worth it. Hack or not, our class-level manipulation of things should not make inheritance unpredictable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Serhiy is correct in pointing out the _randbelow() is an implementation detail. We really don't want people using (overriding) this non public api.
Let's not feature-creep this optimization patch. Ideally, it should be the smallest possible change that moves algorithmic decision to class instantiation time.
Also, any possible "bug fix" to change the exception raised should be in a separate PR. Unless we've guaranteed a particular exception (which is not usually the case), the C code is allowed to differ in small ways from the Python code. That isn't to say that it shouldn't be changes, but that it is a separate discussion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok: separate PR #6338 and corresponding https://bugs.python.org/issue33203 for the ValueError fix. Once that's settled I'll rebase this PR.
Speaking of not "feature-creeping" this simple optimization: would you prefer this patch then to retain the old behaviour that as soon as getrandbits was overridden once in a subclass, overriding random later again did not have an effect anymore on _randbelow (see Serhiy's Case 2 in https://bugs.python.org/msg314534 )?
Lib/test/test_random.py
Outdated
random.Random.__init__(self) | ||
Subclass(newarg=1) | ||
|
||
def test_overriding_random(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test uses implementation details. It is possible to test this behavior using only public API. Make random()
and getrandbits()
in subclasses logging their calls, and check what methods were used when call randrange()
. This will make the test valid even if the implementation be changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea! I'll look into it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But maybe we need to keep a simple white-box test for random.Random._randbelow and a subclass that don't override random or getrandbits, because there is no way to test this as a black box. It is worth to mark this test as CPython-only.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds reasonable: I'm going to work on all the changes tomorrow, thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a blurb entry and resolve the conflict caused by the separate patch for the ZeroDivisionError.
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
Splits the getrandbits-dependent and -independent branches of random.Random._randbelow into separate methods and selects the implementation to be used by Random and its subclasses at class creation time for increased performance.
I have made the requested changes; please review again. |
Thanks for making the requested changes! @rhettinger: please review the changes made to this pull request. |
@@ -221,22 +242,23 @@ def randint(self, a, b): | |||
|
|||
return self.randrange(a, b+1) | |||
|
|||
def _randbelow(self, n, int=int, maxsize=1<<BPF, type=type, | |||
Method=_MethodType, BuiltinMethod=_BuiltinMethodType): | |||
def _randbelow_with_getrandbits(self, n): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just call this _randbelow and only patch the without getrandbits case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to keep a reference to the "with getrandbits" implementation to be able to return to using it in the case:
class Rand1(Random):
def random(self): ...
# _randbelow should use random()
class Rand2(Rand1):
def getrandbits(self): ...
# _randbelow should use getrandbits() now again
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I see what you're trying to do.
ranges. | ||
""" | ||
|
||
if (cls.random is _random.Random.random) or ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would write this as:
if 'getrandbits' in cls.__dict__:
cls._randbelow = cls._randbelow_with_getrandbits
elif 'random' in cls.__dict__:
cls._randbelow = cls._randbelow_without_getrandbits
#else inherits from the parent
return super().random() | ||
|
||
def getrandbits(self, n): | ||
logging.getLogger('getrandbits').info('used getrandbits') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using logging for testing looks very... strange. You could just set a nonlocal variable.
Splits the getrandbits-dependent and -independent branches of
random.Random._randbelow into separate methods and selects the
implementation to be used by Random and its subclasses at class
creation time for increased performance.
https://bugs.python.org/issue33144