Skip to content

bpo-29882: Add an efficient popcount method for integers #771

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
May 29, 2020
Merged
Show file tree
Hide file tree
Changes from 11 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
21 changes: 21 additions & 0 deletions Doc/library/stdtypes.rst
Original file line number Diff line number Diff line change
Expand Up @@ -478,6 +478,27 @@ class`. In addition, it provides a few more methods:

.. versionadded:: 3.1

.. method:: int.bit_count()

Return the number of ones in the binary representation of the absolute
value of the integer. This is also known as the population count.
Example::

>>> n = 19
>>> bin(n)
'0b10011'
>>> n.bit_count()
3
>>> (-n).bit_count()
3

Equivalent to::

def bit_count(self):
return bin(self).count("1")

.. versionadded:: 3.10

.. method:: int.to_bytes(length, byteorder, \*, signed=False)

Return an array of bytes representing an integer.
Expand Down
3 changes: 3 additions & 0 deletions Doc/whatsnew/3.10.rst
Original file line number Diff line number Diff line change
Expand Up @@ -70,6 +70,9 @@ Summary -- Release highlights
New Features
============

* The :class:`int` type has a new method :meth:`int.bit_count`, returning the
number of ones in the binary expansion of a given integer, also known
as the population count. (Contributed by Niklas Fiekas in :issue:`29882`.)


Other Language Changes
Expand Down
3 changes: 2 additions & 1 deletion Lib/test/test_doctest.py
Original file line number Diff line number Diff line change
Expand Up @@ -669,7 +669,7 @@ def non_Python_modules(): r"""
True
>>> real_tests = [t for t in tests if len(t.examples) > 0]
>>> len(real_tests) # objects that actually have doctests
13
14
>>> for t in real_tests:
... print('{} {}'.format(len(t.examples), t.name))
...
Expand All @@ -682,6 +682,7 @@ def non_Python_modules(): r"""
1 builtins.hex
1 builtins.int
3 builtins.int.as_integer_ratio
2 builtins.int.bit_count
2 builtins.int.bit_length
5 builtins.memoryview.hex
1 builtins.oct
Expand Down
11 changes: 11 additions & 0 deletions Lib/test/test_long.py
Original file line number Diff line number Diff line change
Expand Up @@ -1016,6 +1016,17 @@ def test_bit_length(self):
self.assertEqual((a+1).bit_length(), i+1)
self.assertEqual((-a-1).bit_length(), i+1)

def test_bit_count(self):
for a in range(-1000, 1000):
self.assertEqual(a.bit_count(), bin(a).count("1"))

for exp in [10, 17, 63, 64, 65, 1009, 70234, 1234567]:
a = 2**exp
self.assertEqual(a.bit_count(), 1)
self.assertEqual((a - 1).bit_count(), exp)
self.assertEqual((a ^ 63).bit_count(), 7)
self.assertEqual(((a - 1) ^ 510).bit_count(), exp - 8)

def test_round(self):
# check round-half-even algorithm. For round to nearest ten;
# rounding map is invariant under adding multiples of 20
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
Add :meth:`int.bit_count()`, counting the number of ones in the binary
representation of an integer. Patch by Niklas Fiekas.
27 changes: 26 additions & 1 deletion Objects/clinic/longobject.c.h

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

70 changes: 70 additions & 0 deletions Objects/longobject.c
Original file line number Diff line number Diff line change
Expand Up @@ -5304,6 +5304,75 @@ int_bit_length_impl(PyObject *self)
return NULL;
}

static int
popcount_digit(digit d)
{
/* 32bit SWAR popcount. */
uint32_t u = d;
u -= (u >> 1) & 0x55555555;
u = (u & 0x33333333) + ((u >> 2) & 0x33333333);
u = (u + (u >> 4)) & 0x0f0f0f0f;
return (u * 0x01010101) >> 24;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just realised that this line isn't quite portable. The code is relying on the multiplication being a multiplication of 32-bit integers, so that the result is implicitly reduced modulo 2^32.

However, if we're on a platform with a 64-bit int then 0x01010101 will have type int (C99 6.4.4.1p5), u will be promoted to int (following the "usual arithmetic conversions" in C99 6.3.1.8p1), the multiplication will be a multiplication of 64-bit integers and we'll end up with the wrong result.

Copy link
Member

@mdickinson mdickinson May 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix implemented, in the shape of an extra (uint32_t) cast on the multiplication result along with some U suffixes on the integer constants.

Note that the U suffix on 0x01010101U is actually necessary to avoid undefined behaviour: consider a (hypothetical, but permissible under the C standard) machine with a 48-bit int. If we do u * 0x01010101 on that machine, then first the integer promotions are applied to both arguments, and they both end up being of type int. Then we do an int-by-int multiplication. With the right (wrong?) value of d, this could overflow, giving undefined behaviour.

But with the U suffix, we end up after the integer promotions multiplying an int by an unsigned int, and then the usual arithmetic conversions kick in and the multiplication is performed as type unsigned int, which has well-defined wraparound behaviour for the result.

The other U suffixes aren't strictly necessary; they're just there to satisfy my OCD and make the code easier to think about.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Python/hamt.c has basically a copy of this function, it's just written a little bit differently, but its return type is uint32_t. It exists since Python 3.8 and we didn't get any bug report :-)

I added pycore_byteswap.h. Maybe we could have a pycore_bitutils.h for such function and declare it as static inline, to reuse it here and in hamt.c. Or it can be done later ;-)

@pitrou also asked if "pycore_byteswap.h" filename was not too specific, maybe we could pick a more generic name to put more functions there. Like "bits and bytes utilities" :-)

Note: _testinternalcapi has tests for the byte swap functions ;-) We could have tests on this popcount function as well to detect if a compiler "miscompiles" this function.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I'm volunteer to move this function as static inline function and add the new test ;-) As I wrote, it can be done later.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It exists since Python 3.8 and we didn't get any bug report :-)

That's probably because it was never run on a machine with a 64-bit C int. Which is not surprising, because it's very hard to find such a machine these days. But who knows what could happen in the future. :-)

The version of the code in hamt.c has the same issues: it'll return the wrong result with a 64-bit C int, and it can technically invoke undefined behaviour. The return type of uint32_t on that function doesn't help, because you already have the wrong number / undefined behaviour before you hit the return.

Realistically, the undefined behaviour is unlikely to ever be an issue in practice. But the wrong result with a 64-bit int isn't entirely implausible.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll merge this, then we can look at combining the implementations or fixing the hamt.c code as necessary.

}

/*[clinic input]
int.bit_count

Number of ones in the binary representation of the absolute value of self.

Also known as the population count.

>>> bin(13)
'0b1101'
>>> (13).bit_count()
3
[clinic start generated code]*/

static PyObject *
int_bit_count_impl(PyObject *self)
/*[clinic end generated code: output=2e571970daf1e5c3 input=7e0adef8e8ccdf2e]*/
{
assert(self != NULL);
assert(PyLong_Check(self));

PyLongObject *z = (PyLongObject *)self;
Py_ssize_t ndigits = Py_ABS(Py_SIZE(z));
Py_ssize_t bit_count = 0;

/* Each digit has up to PyLong_SHIFT ones, so the accumulated bit count
from the first PY_SSIZE_T_MAX/PyLong_SHIFT digits can't overflow a
Py_ssize_t. */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice comment, thanks! Before the comment, I was surprised by the two loops.

Py_ssize_t ndigits_fast = Py_MIN(ndigits, PY_SSIZE_T_MAX/PyLong_SHIFT);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I proposed to work use size_t for bit_count, but then I realzied that i is a Py_ssize_t, so it's better to leave the code as it is: work on Py_ssize_t for i and bit_count.

for (Py_ssize_t i = 0; i < ndigits_fast; i++) {
bit_count += popcount_digit(z->ob_digit[i]);
}

PyObject *result = PyLong_FromSsize_t(bit_count);
if (result == NULL) {
return NULL;
}

/* Use Python integers if bit_count would overflow. */
for (Py_ssize_t i = ndigits_fast; i < ndigits; i++) {
PyObject *x = PyLong_FromLong(popcount_digit(z->ob_digit[i]));
if (x == NULL) {
goto error;
}
PyObject *y = long_add((PyLongObject *)result, (PyLongObject *)x);
Py_DECREF(x);
if (y == NULL) {
goto error;
}
Py_DECREF(result);
result = y;
}

return result;

error:
Py_DECREF(result);
return NULL;
}

/*[clinic input]
int.as_integer_ratio
Expand Down Expand Up @@ -5460,6 +5529,7 @@ static PyMethodDef long_methods[] = {
{"conjugate", long_long_meth, METH_NOARGS,
"Returns self, the complex conjugate of any int."},
INT_BIT_LENGTH_METHODDEF
INT_BIT_COUNT_METHODDEF
INT_TO_BYTES_METHODDEF
INT_FROM_BYTES_METHODDEF
INT_AS_INTEGER_RATIO_METHODDEF
Expand Down