Sync Fork from Upstream Repo #173


Merged: 14 commits, Apr 12, 2021
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -51,7 +51,7 @@ repos:
hooks:
- id: isort
- repo: https://github.com/asottile/pyupgrade
-  rev: v2.11.0
+  rev: v2.12.0
hooks:
- id: pyupgrade
args: [--py37-plus]
27 changes: 4 additions & 23 deletions doc/source/whatsnew/v1.2.4.rst
@@ -1,7 +1,7 @@
.. _whatsnew_124:

-What's new in 1.2.4 (April ??, 2021)
----------------------------------------
+What's new in 1.2.4 (April 12, 2021)
+------------------------------------

These are the changes in pandas 1.2.4. See :ref:`release` for a full changelog
including other versions of pandas.
@@ -20,27 +20,8 @@ Fixed regressions
- Fixed regression in (in)equality comparison of ``pd.NaT`` with a non-datetimelike numpy array returning a scalar instead of an array (:issue:`40722`)
- Fixed regression in :meth:`DataFrame.where` not returning a copy in the case of an all True condition (:issue:`39595`)
- Fixed regression in :meth:`DataFrame.replace` raising ``IndexError`` when ``regex`` was a multi-key dictionary (:issue:`39338`)
-

.. ---------------------------------------------------------------------------

.. _whatsnew_124.bug_fixes:

Bug fixes
~~~~~~~~~

-
-

.. ---------------------------------------------------------------------------

.. _whatsnew_124.other:

Other
~~~~~

-
-
- Fixed regression in repr of floats in an ``object`` column not respecting ``float_format`` when printed in the console or outputted through :meth:`DataFrame.to_string`, :meth:`DataFrame.to_html`, and :meth:`DataFrame.to_latex` (:issue:`40024`)
- Fixed regression in NumPy ufuncs such as ``np.add`` not passing through all arguments for :class:`DataFrame` (:issue:`40662`)

.. ---------------------------------------------------------------------------

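The ``float_format`` regression noted in the whatsnew above (GH 40024) is easy to exercise. A minimal sketch, assuming pandas with this fix is installed; the frame contents are made up for illustration:

```python
import pandas as pd

# Floats stored in an object-dtype column should honor ``float_format``
# again when rendered via ``to_string`` (regression fixed in 1.2.4, GH 40024).
df = pd.DataFrame({"x": [0.123456, 9.87654]}, dtype=object)
rendered = df.to_string(float_format=lambda v: f"{v:.2f}")
print(rendered)
```

The same formatter is consulted by ``to_html`` and ``to_latex``, per the changelog entry.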
3 changes: 3 additions & 0 deletions doc/source/whatsnew/v1.3.0.rst
@@ -584,6 +584,7 @@ Performance improvements
- Performance improvement in :class:`Styler` where render times are more than 50% reduced (:issue:`39972` :issue:`39952`)
- Performance improvement in :meth:`core.window.ewm.ExponentialMovingWindow.mean` with ``times`` (:issue:`39784`)
- Performance improvement in :meth:`.GroupBy.apply` when requiring the python fallback implementation (:issue:`40176`)
- Performance improvement for concatenation of data with type :class:`CategoricalDtype` (:issue:`40193`)

.. ---------------------------------------------------------------------------

@@ -786,6 +787,7 @@ Reshaping
^^^^^^^^^
- Bug in :func:`merge` raising error when performing an inner join with partial index and ``right_index`` when no overlap between indices (:issue:`33814`)
- Bug in :meth:`DataFrame.unstack` with missing levels led to incorrect index names (:issue:`37510`)
- Bug in :func:`merge_asof` propagating the right Index with ``left_index=True`` and ``right_on`` specification instead of left Index (:issue:`33463`)
- Bug in :func:`join` over :class:`MultiIndex` returned wrong result, when one of both indexes had only one level (:issue:`36909`)
- :meth:`merge_asof` raises ``ValueError`` instead of cryptic ``TypeError`` in case of non-numerical merge columns (:issue:`29130`)
- Bug in :meth:`DataFrame.join` not assigning values correctly when having :class:`MultiIndex` where at least one dimension is from dtype ``Categorical`` with non-alphabetically sorted categories (:issue:`38502`)
@@ -813,6 +815,7 @@ ExtensionArray

- Bug in :meth:`DataFrame.where` when ``other`` is a :class:`Series` with :class:`ExtensionArray` dtype (:issue:`38729`)
- Fixed bug where :meth:`Series.idxmax`, :meth:`Series.idxmin` and ``argmax/min`` fail when the underlying data is :class:`ExtensionArray` (:issue:`32749`, :issue:`33719`, :issue:`36566`)
- Fixed a bug where some properties of subclasses of :class:`PandasExtensionDtype` where improperly cached (:issue:`40329`)
-

Other
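The ``merge_asof`` entry above (GH 33463) can be sketched as follows; the frames are made up for illustration, assuming a pandas build containing the fix:

```python
import pandas as pd

# With ``left_index=True`` and ``right_on``, the result should carry the
# *left* index, not the right one (GH 33463).
left = pd.DataFrame({"v": [1, 2]}, index=[2, 6])
right = pd.DataFrame({"t": [1, 5], "w": [10, 20]})
out = pd.merge_asof(left, right, left_index=True, right_on="t")
print(out.index.tolist())
```

Before the fix, the index of ``right`` leaked into the result when this argument combination was used.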
6 changes: 4 additions & 2 deletions pandas/core/arraylike.py
@@ -357,15 +357,17 @@ def reconstruct(result):
# * len(inputs) > 1 is doable when we know that we have
# aligned blocks / dtypes.
inputs = tuple(np.asarray(x) for x in inputs)
-result = getattr(ufunc, method)(*inputs)
+result = getattr(ufunc, method)(*inputs, **kwargs)
elif self.ndim == 1:
# ufunc(series, ...)
inputs = tuple(extract_array(x, extract_numpy=True) for x in inputs)
result = getattr(ufunc, method)(*inputs, **kwargs)
else:
# ufunc(dataframe)
-if method == "__call__":
+if method == "__call__" and not kwargs:
# for np.<ufunc>(..) calls
# kwargs cannot necessarily be handled block-by-block, so only
# take this path if there are no kwargs
mgr = inputs[0]._mgr
result = mgr.apply(getattr(ufunc, method))
else:
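The ``arraylike.py`` change above forwards ufunc keyword arguments on the DataFrame path instead of dropping them (GH 40662). A minimal sketch, assuming NumPy and a pandas build with the fix:

```python
import numpy as np
import pandas as pd

# Keyword arguments such as ``dtype`` are now passed through when a ufunc
# is called on a DataFrame; the block-by-block fast path is only taken
# when there are no kwargs (GH 40662).
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
result = np.add(df, df, dtype="float64")
print(result["a"].tolist())
```

The same mechanism applies to any NumPy ufunc dispatched through ``__array_ufunc__``, per the whatsnew entry.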
6 changes: 4 additions & 2 deletions pandas/core/dtypes/concat.py
@@ -30,11 +30,13 @@
)


-def _cast_to_common_type(arr: ArrayLike, dtype: DtypeObj) -> ArrayLike:
+def cast_to_common_type(arr: ArrayLike, dtype: DtypeObj) -> ArrayLike:
"""
Helper function for `arr.astype(common_dtype)` but handling all special
cases.
"""
if is_dtype_equal(arr.dtype, dtype):
return arr
if (
is_categorical_dtype(arr.dtype)
and isinstance(dtype, np.dtype)
@@ -121,7 +123,7 @@ def is_nonempty(x) -> bool:
# for axis=0
if not single_dtype:
target_dtype = find_common_type([x.dtype for x in to_concat])
-to_concat = [_cast_to_common_type(arr, target_dtype) for arr in to_concat]
+to_concat = [cast_to_common_type(arr, target_dtype) for arr in to_concat]

if isinstance(to_concat[0], ExtensionArray):
cls = type(to_concat[0])
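The renamed ``cast_to_common_type`` helper now returns early when an array already has the target dtype, which is where the categorical-concatenation speedup comes from (GH 40193). A sketch of the user-visible behavior:

```python
import pandas as pd

# When all inputs already share one CategoricalDtype, concatenation can
# skip the per-array cast entirely (the early ``is_dtype_equal`` exit).
a = pd.Series(["x", "y"], dtype="category")
b = pd.Series(["y", "x"], dtype="category")
out = pd.concat([a, b], ignore_index=True)
print(out.dtype)
```

The result stays categorical because both inputs resolve to the same dtype; mismatched categories would instead fall back to object.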
30 changes: 17 additions & 13 deletions pandas/core/dtypes/dtypes.py
@@ -15,6 +15,7 @@
import pytz

from pandas._libs.interval import Interval
from pandas._libs.properties import cache_readonly
from pandas._libs.tslibs import (
BaseOffset,
NaT,
@@ -81,7 +82,7 @@ class PandasExtensionDtype(ExtensionDtype):
base: DtypeObj | None = None
isbuiltin = 0
isnative = 0
-_cache: dict[str_type, PandasExtensionDtype] = {}
+_cache_dtypes: dict[str_type, PandasExtensionDtype] = {}

def __str__(self) -> str_type:
"""
@@ -105,7 +106,7 @@ def __getstate__(self) -> dict[str_type, Any]:
@classmethod
def reset_cache(cls) -> None:
""" clear the cache """
-cls._cache = {}
+cls._cache_dtypes = {}


class CategoricalDtypeType(type):
@@ -177,7 +178,7 @@ class CategoricalDtype(PandasExtensionDtype, ExtensionDtype):
str = "|O08"
base = np.dtype("O")
_metadata = ("categories", "ordered")
-_cache: dict[str_type, PandasExtensionDtype] = {}
+_cache_dtypes: dict[str_type, PandasExtensionDtype] = {}

def __init__(self, categories=None, ordered: Ordered = False):
self._finalize(categories, ordered, fastpath=False)
@@ -355,7 +356,7 @@ def __hash__(self) -> int:
else:
return -2
# We *do* want to include the real self.ordered here
-return int(self._hash_categories(self.categories, self.ordered))
+return int(self._hash_categories)

def __eq__(self, other: Any) -> bool:
"""
@@ -429,14 +430,17 @@ def __repr__(self) -> str_type:
data = data.rstrip(", ")
return f"CategoricalDtype(categories={data}, ordered={self.ordered})"

-@staticmethod
-def _hash_categories(categories, ordered: Ordered = True) -> int:
+@cache_readonly
+def _hash_categories(self) -> int:
from pandas.core.util.hashing import (
combine_hash_arrays,
hash_array,
hash_tuples,
)

categories = self.categories
ordered = self.ordered

if len(categories) and isinstance(categories[0], tuple):
# assumes if any individual category is a tuple, then all are. ATM
# I don't really want to support just some of the categories being
@@ -671,7 +675,7 @@ class DatetimeTZDtype(PandasExtensionDtype):
na_value = NaT
_metadata = ("unit", "tz")
_match = re.compile(r"(datetime64|M8)\[(?P<unit>.+), (?P<tz>.+)\]")
-_cache: dict[str_type, PandasExtensionDtype] = {}
+_cache_dtypes: dict[str_type, PandasExtensionDtype] = {}

def __init__(self, unit: str_type | DatetimeTZDtype = "ns", tz=None):
if isinstance(unit, DatetimeTZDtype):
@@ -837,7 +841,7 @@ class PeriodDtype(dtypes.PeriodDtypeBase, PandasExtensionDtype):
num = 102
_metadata = ("freq",)
_match = re.compile(r"(P|p)eriod\[(?P<freq>.+)\]")
-_cache: dict[str_type, PandasExtensionDtype] = {}
+_cache_dtypes: dict[str_type, PandasExtensionDtype] = {}

def __new__(cls, freq=None):
"""
@@ -859,12 +863,12 @@ def __new__(cls, freq=None):
freq = cls._parse_dtype_strict(freq)

try:
-return cls._cache[freq.freqstr]
+return cls._cache_dtypes[freq.freqstr]
except KeyError:
dtype_code = freq._period_dtype_code
u = dtypes.PeriodDtypeBase.__new__(cls, dtype_code)
u._freq = freq
-cls._cache[freq.freqstr] = u
+cls._cache_dtypes[freq.freqstr] = u
return u

def __reduce__(self):
@@ -1042,7 +1046,7 @@ class IntervalDtype(PandasExtensionDtype):
_match = re.compile(
r"(I|i)nterval\[(?P<subtype>[^,]+)(, (?P<closed>(right|left|both|neither)))?\]"
)
-_cache: dict[str_type, PandasExtensionDtype] = {}
+_cache_dtypes: dict[str_type, PandasExtensionDtype] = {}

def __new__(cls, subtype=None, closed: str_type | None = None):
from pandas.core.dtypes.common import (
@@ -1099,12 +1103,12 @@ def __new__(cls, subtype=None, closed: str_type | None = None):

key = str(subtype) + str(closed)
try:
-return cls._cache[key]
+return cls._cache_dtypes[key]
except KeyError:
u = object.__new__(cls)
u._subtype = subtype
u._closed = closed
-cls._cache[key] = u
+cls._cache_dtypes[key] = u
return u

@property
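The ``_cache`` → ``_cache_dtypes`` rename across these classes avoids a name clash with the per-instance attribute that ``cache_readonly`` uses for memoization (GH 40329), which matters now that ``_hash_categories`` is a cached property. A minimal sketch of the collision, with hypothetical names standing in for the pandas internals:

```python
# A memoizing descriptor that, like pandas's ``cache_readonly``, stores
# computed values on the instance in a dict attribute named ``_cache``.
class cache_readonly_sketch:
    def __init__(self, func):
        self.func = func

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        # Creates/uses an instance-level ``_cache`` dict for memoization.
        cache = obj.__dict__.setdefault("_cache", {})
        if self.func.__name__ not in cache:
            cache[self.func.__name__] = self.func(obj)
        return cache[self.func.__name__]


class Dtype:
    _cache = {}  # intended as a class-level registry of interned dtypes

    @cache_readonly_sketch
    def _hash(self):
        return 42


d = Dtype()
_ = d._hash
# The memoized value now lives in an instance ``_cache`` that shadows the
# class-level registry, which is why the registry was renamed.
print("_cache" in d.__dict__)
```

Renaming the class-level registry (here, to something like ``_cache_dtypes``) keeps the two uses of the name from interfering.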
116 changes: 112 additions & 4 deletions pandas/core/generic.py
@@ -7352,16 +7352,124 @@ def _clip_with_one_bound(self, threshold, method, axis, inplace):
threshold = align_method_FRAME(self, threshold, axis, flex=None)[1]
return self.where(subset, threshold, axis=axis, inplace=inplace)

@overload
def clip(
self: FrameOrSeries,
lower=...,
upper=...,
axis: Axis | None = ...,
inplace: Literal[False] = ...,
*args,
**kwargs,
) -> FrameOrSeries:
...

@overload
def clip(
self: FrameOrSeries,
lower,
*,
axis: Axis | None,
inplace: Literal[True],
**kwargs,
) -> None:
...

@overload
def clip(
self: FrameOrSeries,
lower,
*,
inplace: Literal[True],
**kwargs,
) -> None:
...

@overload
def clip(
self: FrameOrSeries,
*,
upper,
axis: Axis | None,
inplace: Literal[True],
**kwargs,
) -> None:
...

@overload
def clip(
self: FrameOrSeries,
*,
upper,
inplace: Literal[True],
**kwargs,
) -> None:
...

@overload
def clip(
self: FrameOrSeries,
*,
axis: Axis | None,
inplace: Literal[True],
**kwargs,
) -> None:
...

@overload
def clip(
self: FrameOrSeries,
lower,
upper,
axis: Axis | None,
inplace: Literal[True],
*args,
**kwargs,
) -> None:
...

@overload
def clip(
self: FrameOrSeries,
lower,
upper,
*,
inplace: Literal[True],
**kwargs,
) -> None:
...

@overload
def clip(
self: FrameOrSeries,
*,
inplace: Literal[True],
**kwargs,
) -> None:
...

@overload
def clip(
self: FrameOrSeries,
lower=...,
upper=...,
axis: Axis | None = ...,
inplace: bool_t = ...,
*args,
**kwargs,
) -> FrameOrSeries | None:
...

@final
def clip(
self: FrameOrSeries,
lower=None,
upper=None,
-axis=None,
+axis: Axis | None = None,
inplace: bool_t = False,
*args,
**kwargs,
-) -> FrameOrSeries:
+) -> FrameOrSeries | None:
"""
Trim values at input threshold(s).

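The stack of ``@overload`` declarations above encodes one rule: ``inplace=True`` returns ``None``, everything else returns the object. A standalone sketch of the pattern, using a hypothetical ``clip`` over plain floats rather than the pandas method:

```python
from typing import Literal, Optional, overload


@overload
def clip(x: float, lo: float, hi: float, *, inplace: Literal[True]) -> None: ...
@overload
def clip(x: float, lo: float, hi: float, *, inplace: Literal[False] = ...) -> float: ...
def clip(x: float, lo: float, hi: float, *, inplace: bool = False) -> Optional[float]:
    # The runtime implementation is one function; the overloads exist purely
    # so a type checker can narrow the return type based on ``inplace``.
    result = min(max(x, lo), hi)
    return None if inplace else result


print(clip(5.0, 0.0, 2.0))
```

pandas needs many more overloads than this sketch because ``lower``, ``upper``, and ``axis`` may each be present, absent, or keyword-only, and every combination that fixes ``inplace`` to a literal needs its own signature.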
@@ -10843,7 +10951,7 @@ def median(
@doc(
_num_doc,
desc="Return the maximum of the values over the requested axis.\n\n"
-"If you want the *index* of the maximum, use ``idxmax``. This is"
+"If you want the *index* of the maximum, use ``idxmax``. This is "
"the equivalent of the ``numpy.ndarray`` method ``argmax``.",
name1=name1,
name2=name2,
@@ -10860,7 +10968,7 @@ def max(self, axis=None, skipna=None, level=None, numeric_only=None, **kwargs):
@doc(
_num_doc,
desc="Return the minimum of the values over the requested axis.\n\n"
-"If you want the *index* of the minimum, use ``idxmin``. This is"
+"If you want the *index* of the minimum, use ``idxmin``. This is "
"the equivalent of the ``numpy.ndarray`` method ``argmin``.",
name1=name1,
name2=name2,