Skip to content

bpo-37863: Optimize Fraction.__hash__() #15298

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Aug 16, 2019
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
26 changes: 11 additions & 15 deletions Lib/fractions.py
Original file line number Diff line number Diff line change
Expand Up @@ -556,23 +556,19 @@ def __round__(self, ndigits=None):
def __hash__(self):
"""hash(self)"""

# XXX since this method is expensive, consider caching the result

# In order to make sure that the hash of a Fraction agrees
# with the hash of a numerically equal integer, float or
# Decimal instance, we follow the rules for numeric hashes
# outlined in the documentation. (See library docs, 'Built-in
# Types').

# dinv is the inverse of self._denominator modulo the prime
# _PyHASH_MODULUS, or 0 if self._denominator is divisible by
# _PyHASH_MODULUS.
dinv = pow(self._denominator, _PyHASH_MODULUS - 2, _PyHASH_MODULUS)
if not dinv:
# To make sure that the hash of a Fraction agrees with the hash
# of a numerically equal integer, float or Decimal instance, we
# follow the rules for numeric hashes outlined in the
# documentation. (See library docs, 'Built-in Types').

try:
dinv = pow(self._denominator, -1, _PyHASH_MODULUS)
except ValueError:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In what case ValueError is raised?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ValueError is raised if and only if _PyHASH_MODULUS divides self._denominator. There is no modular inverse if and only if the gcd of the base and the modulus is non-trivial (not 1), but the modulus is a prime in this case, so in this case that's the same as the base being a multiple of the modulus.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you. We need a test for this.

# ValueError means there is no modular inverse
hash_ = _PyHASH_INF
else:
hash_ = abs(self._numerator) * dinv % _PyHASH_MODULUS
result = hash_ if self >= 0 else -hash_
hash_ = hash(abs(self._numerator)) * dinv % _PyHASH_MODULUS
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another possibility is to write this as

hash_ = hash(hash(abs(self._numerator)) * dinv)

to avoid the (in general) 122-by-61 bit % division. The int hash should compute the same thing, but doesn't do any division (it's all bit rotations and adds, one pair per internal int "digit"). (While the docs don't point this out, reduction by _PyHASH_MODULUS in binary is akin to "casting out nines" in decimal to find a remainder mod 9 - no division is really needed.)

result = hash_ if self._numerator >= 0 else -hash_
return -2 if result == -1 else result

def __eq__(a, b):
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Optimizations for Fraction.__hash__ suggested by Tim Peters.