-
-
Notifications
You must be signed in to change notification settings - Fork 18.6k
TST: base test for ExtensionArray.astype to its own type + copy keyword #35116
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 18 commits
2b8f738
6cc98c9
ecbc762
4fa6d92
6ff48ea
1470c6b
9e96547
2f287ca
c7baa62
b68f4fb
462704d
609b255
b761926
72cb49e
ced29ed
7762eb9
4ff47fb
d5095d4
5f464a6
7f45370
f134cf4
fbbe28a
dcd884d
a4a32db
b5b1ccc
5720f67
7155e0f
047de22
3dcef56
854d2bb
42cd940
10f0f64
3f34d28
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -5,7 +5,7 @@ | |
from numpy.lib.mixins import NDArrayOperatorsMixin | ||
|
||
from pandas._libs import lib | ||
from pandas._typing import Scalar | ||
from pandas._typing import Scalar, ArrayLike | ||
from pandas.compat.numpy import function as nv | ||
from pandas.util._decorators import doc | ||
from pandas.util._validators import validate_fillna_kwargs | ||
|
@@ -276,6 +276,15 @@ def __setitem__(self, key, value) -> None: | |
|
||
self._ndarray[key] = value | ||
|
||
def astype(self, dtype, copy: bool = True) -> ArrayLike: | ||
if dtype == self.dtype: | ||
if not copy: | ||
return self | ||
elif copy: | ||
return self.copy() | ||
else: | ||
return super().astype(dtype, copy) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is it possible to move this to the base class? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yep, tried it and it worked, seems logically consistent. Going to also look to see if I can drop the checks elsewhere by leaving it to the base class. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Doesn't seem so, without doing more restructuring than I'm comfortable with ATM. |
||
|
||
def isna(self) -> np.ndarray: | ||
return isna(self._ndarray) | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -263,6 +263,11 @@ def fillna(self, value=None, method=None, limit=None): | |
return super().fillna(value, method, limit) | ||
|
||
def astype(self, dtype, copy=True): | ||
if dtype == self.dtype: | ||
if not copy: | ||
return self | ||
elif copy: | ||
return self.copy() | ||
dtype = pandas_dtype(dtype) | ||
if isinstance(dtype, StringDtype): | ||
if copy: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Here you can remove the new code, as you can see on this line and the two lines below, the logic is already present There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Good point; must've not been paying attention. |
||
|
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -131,6 +131,11 @@ def copy(self): | |||||
return type(self)(self._data.copy()) | ||||||
|
||||||
def astype(self, dtype, copy=True): | ||||||
if dtype == self._dtype: | ||||||
if not copy: | ||||||
return self | ||||||
elif copy: | ||||||
return self.copy() | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @jorisvandenbossche Is this implemented correctly? I'm concerned because of the way returning a copy is implemented on line 141, which seems to take into account There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think it is correct, yes. The first There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ah, but we might need to pass through the (anyway, this array implementation is only for testing, it's not actually used, so it's not super important) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Easy enough to check, so I changed it to pass through the |
||||||
dtype = pandas_dtype(dtype) | ||||||
if isinstance(dtype, type(self.dtype)): | ||||||
return type(self)(self._data, context=dtype.context) | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the problem is that
np.dtype(object) == ExtensionDtype
raises an error, which is an error coming from numpy and not directly solvable in pandas.So to workaround it, can you do
is_dtype_equal(dtype, self.dtype):
, withis_dtype_equal
imported frompandas.core.dtypes.common
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed and moved checks to base class. This just calls super().astype now.