ENH: Handle extension arrays in algorithms.diff #31025
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -578,6 +578,11 @@ def dropna(self): | |
""" | ||
return self[~self.isna()] | ||
|
||
def diff(self, periods: int = 1): | ||
if hasattr(self, "__sub__"): | ||
I'm not thrilled about this, but it may be unavoidable. Happy to hear of an alternative.
But if it doesn't support
Right. I meant the need for the hasattr just to please the type checker. If you do a diff on an EA that doesn't implement |
||
return self - self.shift(periods) | ||
raise TypeError() | ||
|
||
def shift(self, periods: int = 1, fill_value: object = None) -> ABCExtensionArray: | ||
""" | ||
Shift values by desired number. | ||
|
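The shift-and-subtract pattern used by the new `diff` method can be sketched outside pandas. This is a minimal illustration only: `ToyArray` is a hypothetical stand-in written for this example, not a pandas class, and it uses `None` for missing values as an assumption made for simplicity.

```python
from typing import List, Optional


class ToyArray:
    """Hypothetical stand-in for an extension array; not a pandas class."""

    def __init__(self, values: List[Optional[float]]):
        self.values = list(values)

    def shift(self, periods: int = 1) -> "ToyArray":
        # Move values by `periods` positions, padding the gap with None (missing).
        n = len(self.values)
        if periods == 0 or n == 0:
            return ToyArray(self.values)
        k = min(abs(periods), n)
        pad = [None] * k
        if periods > 0:
            return ToyArray(pad + self.values[: n - k])
        return ToyArray(self.values[k:] + pad)

    def __sub__(self, other: "ToyArray") -> "ToyArray":
        # Element-wise subtraction; a missing operand gives a missing result.
        return ToyArray(
            [None if a is None or b is None else a - b
             for a, b in zip(self.values, other.values)]
        )

    def diff(self, periods: int = 1) -> "ToyArray":
        # Same shape as the method in the diff above: subtract the shifted array.
        return self - self.shift(periods)


arr = ToyArray([1.0, 4.0, 9.0, 16.0])
print(arr.diff().values)    # [None, 3.0, 5.0, 7.0]
print(arr.diff(-1).values)  # [-3.0, -5.0, -7.0, None]
```

A class without `__sub__` would fail at the subtraction step, which is the case the `hasattr` check and `TypeError` in the diff are guarding against.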
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -15,7 +15,7 @@ | |
from pandas.core.dtypes.missing import isna | ||
|
||
from pandas import compat | ||
from pandas.core import nanops | ||
from pandas.core import algorithms, nanops | ||
from pandas.core.algorithms import searchsorted, take, unique | ||
from pandas.core.arrays.base import ExtensionArray, ExtensionOpsMixin | ||
import pandas.core.common as com | ||
|
@@ -164,6 +164,10 @@ def _from_sequence(cls, scalars, dtype=None, copy=False): | |
result = result.copy() | ||
return cls(result) | ||
|
||
def diff(self, periods: int = 1): | ||
result = algorithms.diff(com.values_from_object(self._ndarray), periods) | ||
Why do you need to override the base class version? Doesn't that work as well for numpy? |
||
return type(self)(result) | ||
|
||
@classmethod | ||
def _from_factorized(cls, values, original): | ||
return cls(values) | ||
|
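For plain float data, the two routes discussed in the review comment above can be compared directly. The sketch below is an assumption-laden illustration: `shift` is a hypothetical helper written for this example, and `np.diff` stands in for a dedicated difference routine; it is not `pandas.core.algorithms.diff` and says nothing about how the two routes compare for other dtypes.

```python
import numpy as np


def shift(values: np.ndarray, periods: int = 1) -> np.ndarray:
    """Hypothetical helper: shift a 1-D float array, filling gaps with NaN."""
    out = np.full(values.shape, np.nan)
    if periods > 0:
        out[periods:] = values[:-periods]
    elif periods < 0:
        out[:periods] = values[-periods:]
    else:
        out[:] = values
    return out


values = np.array([1.0, 4.0, 9.0, 16.0])

# Route 1: the generic shift-and-subtract pattern.
generic = values - shift(values, 1)

# Route 2: a dedicated difference routine, here np.diff with a NaN prepended
# so that the result keeps the input length.
dedicated = np.concatenate([[np.nan], np.diff(values)])

print(generic)
print(np.allclose(generic, dedicated, equal_nan=True))  # True
```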
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1962,6 +1962,14 @@ class ObjectValuesExtensionBlock(ExtensionBlock): | |
Series[T].values is an ndarray of objects. | ||
""" | ||
|
||
def diff(self, n: int, axis: int = 1) -> List["Block"]: | ||
# Block.shape vs. Block.values.shape mismatch | ||
More technical debt from the 1D arrays inside 2D blocks :(
Oh, and this is on ObjectValuesExtensionBlock, but is only useful for PeriodArray. IntervalArray is the only other array to use this, and doesn't implement
Why is this only needed for ObjectValuesExtensionBlock, and not for ExtensionBlock?
I suppose that in principle, we can hit this from ExtensionBlock. We hit the problem when going from a NonConsolidatable block type (like period) to a consolidatable one (like object). In that case, the values passed to In practice, I think that for most EAs,
It certainly seems fine to ignore this corner case for now.
TomAugspurger marked this conversation as resolved.
|
||
# Do the op, get the object-dtype ndarray, and reshape | ||
# to put into an ObjectBlock | ||
new_values = algos.diff(self.values, n, axis=axis) | ||
new_values = np.atleast_2d(new_values) | ||
return [self.make_block(values=new_values)] | ||
|
||
def external_values(self, dtype=None): | ||
return self.values.astype(object) | ||
|
||
|
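The reshape step in the block-level `diff` above relies on `np.atleast_2d`, which turns a 1-D array into a single-row 2-D array. The following is a small sketch of that behavior alone; the array contents are made up for illustration and do not come from the pull request.

```python
import numpy as np

# A 1-D object-dtype result, standing in for what a diff over the block's
# values might return (the values here are made up for illustration).
new_values = np.array([None, 1, 1], dtype=object)
print(new_values.shape)   # (3,)

# np.atleast_2d prepends an axis, giving one row of three columns.
as_2d = np.atleast_2d(new_values)
print(as_2d.shape)        # (1, 3)

# An input that is already 2-D passes through with its shape unchanged.
print(np.atleast_2d(as_2d).shape)  # (1, 3)
```

This is the shape change that lets a 1-D result be placed in a block that expects 2-D values, which is the mismatch noted in the code comment.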