BUG: Coerce to object for mixed concat #20799
Changes from 2 commits
@@ -8,6 +8,7 @@
 from pandas.core.dtypes.common import (
     is_categorical_dtype,
     is_sparse,
+    is_extension_array_dtype,
     is_datetimetz,
     is_datetime64_dtype,
     is_timedelta64_dtype,
@@ -173,6 +174,10 @@ def is_nonempty(x):
     elif 'sparse' in typs:
         return _concat_sparse(to_concat, axis=axis, typs=typs)

+    extensions = [is_extension_array_dtype(x) for x in to_concat]
+    if any(extensions) and not all(extensions):
+        to_concat = [np.atleast_2d(x.astype('object')) for x in to_concat]

Hmm, this is not correct. What about categorical, which is an EA? You need much more comprehensive tests here.

Both categorical and datetimes cases are already filtered out before (couple of lines above) and use the

Ahh, ok. Some extra comments would be helpful.

+
     if not nonempty:
         # we have all empties, but may need to coerce the result dtype to
         # object if we have non-numeric type operands (numpy would otherwise
@@ -210,7 +215,7 @@ def _concat_categorical(to_concat, axis=0):

     def _concat_asobject(to_concat):
         to_concat = [x.get_values() if is_categorical_dtype(x.dtype)
-                     else x.ravel() for x in to_concat]
+                     else np.asarray(x).ravel() for x in to_concat]
         res = _concat_compat(to_concat)
         if axis == 1:
             return res.reshape(1, len(res))
@@ -548,6 +553,8 @@ def convert_sparse(x, axis):
         # coerce to native type
         if isinstance(x, SparseArray):
             x = x.get_values()
+        else:
+            x = np.asarray(x)
         x = x.ravel()
         if axis > 0:
             x = np.atleast_2d(x)
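
The last two hunks pass non-ndarray inputs through `np.asarray` before calling `ravel`. A minimal sketch of why that can matter, assuming a hypothetical array-like class `ListBacked` (not part of pandas) that supports conversion through `__array__` but has no `ravel` method. The helper `to_native` is also a made-up name that only mirrors the patched steps:

```python
import numpy as np


class ListBacked:
    """Hypothetical array-like: convertible via __array__, but has no ravel()."""

    def __init__(self, values):
        self._values = list(values)

    def __array__(self, dtype=None, copy=None):
        # Build a fresh ndarray from the stored values.
        return np.array(self._values, dtype=dtype)


def to_native(x, axis):
    # Coerce to an ndarray first, so ravel() is always available,
    # then restore a 2-D shape when concatenating along axis > 0.
    x = np.asarray(x).ravel()
    if axis > 0:
        x = np.atleast_2d(x)
    return x


flat = to_native(ListBacked([1, 2, 3]), axis=0)
row = to_native(ListBacked([1, 2, 3]), axis=1)
```

Calling `ListBacked([1, 2, 3]).ravel()` directly would raise `AttributeError`, which is the kind of failure the added `np.asarray` call avoids in this sketch.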
Do you have an actual case where the "and not all(extensions)" is hit? Because if that is the case, I also think the `np.concatenate(to_concat, axis=axis)` will fail, as it has the wrong dimensions to both work at the same time?

Uhmm, I suppose I don't have an example, because the all-extension-dtype case should never get here, right? It should be going down the `is_uniform_join_units` path. I'll change it to remove the `all(extensions)` condition, so that it makes sense locally.
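
The condition discussed in this thread can be checked in isolation. A minimal sketch in plain Python, where `needs_object_coercion` and its `strict` flag are hypothetical names used only for illustration, and the boolean lists stand in for the results of `is_extension_array_dtype` on each input:

```python
def needs_object_coercion(extension_flags, strict=True):
    """Return True when the inputs would be coerced to object dtype.

    strict=True  uses `any(flags) and not all(flags)` (the condition in the diff).
    strict=False uses `any(flags)` (the simplification proposed in the thread).
    """
    if strict:
        return any(extension_flags) and not all(extension_flags)
    return any(extension_flags)


cases = {
    "no extension arrays": [False, False],
    "mixed": [True, False],
    "all extension arrays": [True, True],
}

# Map each case to (result with strict check, result with simplified check).
results = {
    name: (needs_object_coercion(flags), needs_object_coercion(flags, strict=False))
    for name, flags in cases.items()
}
```

The two variants disagree only for the all-extension input, which, per the reply above, is expected to take the `is_uniform_join_units` path and never reach this code.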