API: Handle pow & rpow special cases #30097

Merged
merged 7 commits into from
Dec 8, 2019
Changes from 1 commit
30 changes: 27 additions & 3 deletions doc/source/reference/arrays.rst
@@ -2,9 +2,9 @@

.. _api.arrays:

-=============
-Pandas arrays
-=============
+=========================
+Pandas arrays and scalars
+=========================

.. currentmodule:: pandas

@@ -28,6 +28,30 @@ Strings :class:`StringDtype` :class:`str` :ref:`api.array
Boolean (with NA) :class:`BooleanDtype` :class:`bool` :ref:`api.arrays.bool`
=================== ========================= ================== =============================

As the table shows, each extension type is associated with an array class. Pandas may define
a dedicated scalar for the type (for example, :class:`arrays.IntervalArray` uses :class:`Interval`)
or it may re-use Python's scalars (for example, :class:`StringArray` uses Python's :class:`str`).

Additionally, pandas defines a singleton scalar missing value :class:`pandas.NA`. This
value is distinct from ``float('nan')``, :attr:`numpy.nan` and Python's :class:`None`.

.. autosummary::
   :toctree: api/

   NA

In binary operations, :class:`NA` is treated as numeric. Generally, ``NA`` propagates, so
the result of ``op(NA, other)`` will be ``NA``. There are a few special cases when the
result is known, even when one of the operands is ``NA``.

* ``pd.NA ** 0`` is always 1.
* ``1 ** pd.NA`` is always 1.
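
The propagation rule and its two exceptions can be sketched in plain Python. This is a minimal illustration, not the pandas implementation: `_Missing` and `MISSING` are hypothetical stand-ins for `NAType` and `pd.NA`, only `+` and `**` are modelled, and the zero-exponent case is assumed to return 1, as `x ** 0` does for ordinary numbers.

```python
import numbers


class _Missing:
    """Hypothetical stand-in for a missing-value singleton (not pandas' NAType)."""

    def __repr__(self):
        return "<MISSING>"

    # Ordinary arithmetic: an unknown operand makes the result unknown.
    def __add__(self, other):
        if other is self or isinstance(other, numbers.Number):
            return self
        return NotImplemented

    __radd__ = __add__

    # MISSING ** other: the result is known only when the exponent is zero.
    def __pow__(self, other):
        if other is self:
            return self
        if isinstance(other, numbers.Number):
            # Assumption: x ** 0 == 1 for every numeric x, so return 1
            # in the exponent's own type.
            return type(other)(1) if other == 0 else self
        return NotImplemented

    # other ** MISSING: the result is known only when the base is one.
    def __rpow__(self, other):
        if isinstance(other, numbers.Number):
            return other if other == 1 else self
        return NotImplemented


MISSING = _Missing()

print(MISSING + 1)     # <MISSING>
print(2 ** MISSING)    # <MISSING>
print(MISSING ** 0)    # 1
print(1.0 ** MISSING)  # 1.0
```

Returning `NotImplemented` for unsupported operands lets Python fall back to the other operand's reflected method, which is why `2 ** MISSING` ends up in `__rpow__`.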

In logical operations, :class:`NA` uses Kleene logic.
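
Kleene (three-valued) logic treats the missing value as "unknown": the result is known only when the other operand decides it on its own. A minimal sketch, using `None` as a stand-in for `NA`; the helper names `kleene_and` and `kleene_or` are hypothetical and not part of pandas.

```python
from typing import Optional


def kleene_and(a: Optional[bool], b: Optional[bool]) -> Optional[bool]:
    """Three-valued AND; None stands in for the unknown value."""
    if a is False or b is False:
        return False  # False decides the result whatever the other side is
    if a is None or b is None:
        return None   # otherwise an unknown operand leaves the result unknown
    return True


def kleene_or(a: Optional[bool], b: Optional[bool]) -> Optional[bool]:
    """Three-valued OR; None stands in for the unknown value."""
    if a is True or b is True:
        return True   # True decides the result whatever the other side is
    if a is None or b is None:
        return None
    return False


print(kleene_and(None, False))  # False
print(kleene_and(None, True))   # None
print(kleene_or(None, True))    # True
print(kleene_or(None, False))   # None
```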

Creating Arrays
---------------

Pandas and third-party libraries can extend NumPy's type system (see :ref:`extending.extension-types`).
The top-level :func:`array` function can be used to create a new array, which may be
stored in a :class:`Series`, :class:`Index`, or as a column in a :class:`DataFrame`.
25 changes: 23 additions & 2 deletions pandas/_libs/missing.pyx
@@ -365,8 +365,6 @@ class NAType(C_NAType):
    __rmod__ = _create_binary_propagating_op("__rmod__")
    __divmod__ = _create_binary_propagating_op("__divmod__", divmod=True)
    __rdivmod__ = _create_binary_propagating_op("__rdivmod__", divmod=True)
-   __pow__ = _create_binary_propagating_op("__pow__")
-   __rpow__ = _create_binary_propagating_op("__rpow__")
    # __lshift__ and __rshift__ are not implemented

    __eq__ = _create_binary_propagating_op("__eq__")
@@ -383,6 +381,29 @@ class NAType(C_NAType):
    __abs__ = _create_unary_propagating_op("__abs__")
    __invert__ = _create_unary_propagating_op("__invert__")

    # pow has special cases
    def __pow__(self, other):
        if other is C_NA:
            return NA
        elif isinstance(other, (numbers.Number, np.bool_)):
            if other == 0:
                # x ** 0 is 1 for any x (including -0.0), so the result is known
                return type(other)(1)
            else:
                return NA

        return NotImplemented

    def __rpow__(self, other):
        if other is C_NA:
            return NA
        elif isinstance(other, (numbers.Number, np.bool_)):
            if other == 1:
                # 1 ** x is 1 for any x, so the result is known
                return other
            else:
                return NA

        return NotImplemented

    # Logical ops using Kleene logic

    def __and__(self, other):
24 changes: 23 additions & 1 deletion pandas/tests/scalar/test_na_scalar.py
@@ -38,11 +38,14 @@ def test_arithmetic_ops(all_arithmetic_functions):
    op = all_arithmetic_functions

    for other in [NA, 1, 1.0, "a", np.int64(1), np.nan]:
-       if op.__name__ == "rmod" and isinstance(other, str):
+       if op.__name__ in ("pow", "rpow", "rmod") and isinstance(other, str):
            continue
        if op.__name__ in ("divmod", "rdivmod"):
            assert op(NA, other) is (NA, NA)
        else:
            if op.__name__ == "rpow":
                # avoid special case
                other += 1
            assert op(NA, other) is NA


@@ -69,6 +72,25 @@ def test_comparison_ops():
assert (other <= NA) is NA


@pytest.mark.parametrize(
Member

There are fixtures defined for zero / one values in pandas/tests/indexing/conftest.py - can these be combined with that?

Contributor Author

That seems to include arrays. We just want a scalar here I think.

"value", [0, 0.0, False, np.bool_(False), np.int_(0), np.float_(0)]
)
def test_pow_special(value):
    result = pd.NA ** value
    assert isinstance(result, type(value))
    # x ** 0 is 1 for any x, so the result is known
    assert result == 1


@pytest.mark.parametrize(
"value", [1, 1.0, True, np.bool_(True), np.int_(1), np.float_(1)]
)
Member

corner case to check -0.0

Contributor Author

Good catch.

In [4]: -1 ** np.nan
Out[4]: -1.0

In [5]: np.nan ** -0
Out[5]: 1.0

Will match the NumPy behavior here.
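
Two details of the session above are easy to misread. Unary minus binds more loosely than `**`, so `-1 ** x` is evaluated as `-(1 ** x)`, and `-0` is the integer zero rather than the float `-0.0`. A small sketch using Python's built-in floats in place of NumPy scalars (an assumption made to keep the example dependency-free):

```python
import math

nan = float("nan")

# -1 ** nan parses as -(1 ** nan); 1 ** nan is 1.0, so the result is -1.0.
print(-1 ** nan)                # -1.0

# With the base parenthesised, the base really is -1 and the result is nan.
print(math.isnan((-1) ** nan))  # True

# A zero exponent gives 1.0 for either sign of zero, even with a nan base.
print(nan ** 0)                 # 1.0
print(nan ** -0.0)              # 1.0
```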

def test_rpow_special(value):
    result = value ** pd.NA
    assert result == 1
    if not isinstance(value, (np.float_, np.bool_, np.int_)):
Contributor Author

Somebody (Cython?) is converting these to Python scalars.

Member

have you tried with/without numexpr?

Member

How can numexpr be involved in this?

Member

I don't know, but if we're brainstorming things that can do surprising conversions, it comes to mind

        assert isinstance(result, type(value))


def test_unary_ops():
    assert +NA is NA
    assert -NA is NA