-
-
Notifications
You must be signed in to change notification settings - Fork 18.6k
CLN: groupby.ops follow-up cleanup #41204
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
... | ||
|
||
@overload | ||
def _get_result_dtype(self, dtype: ExtensionDtype) -> ExtensionDtype: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would a TypeVar work here? Might end up being useful elsewhere too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i dont think that would be accurate here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be misunderstanding overloads, but is this expressing ExtensionDtype -> ExtensionDtype
and np.dtype -> np.dtype
?
So was thinking a TypeVar like TypeVar('DtypeObjT', np.dtype, ExtensionDtype)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do TypeVars not preserve subclasses?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Running mypy
on something like
from typing import TypeVar
AnyStr = TypeVar('AnyStr', str, bytes)
class Upper(str):
def __new__(cls, val):
return str.__new__(cls, val.upper())
class Lower(str):
def __new__(cls, val):
return str.__new__(cls, val.lower())
def swap_lower_upper(s: AnyStr) -> AnyStr:
if isinstance(s, bytes):
return s
elif isinstance(s, Lower):
return Upper(s)
else:
return Lower(s)
swap_lower_upper(Lower("hello"))
swap_lower_upper(Upper("hello"))
gives no error, so I don't think subclasses are preserved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cool. will make a separate PR to implement DtypeObjT
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
implenting this with DtypeObjT and getting rid of the overloads im seeing
error: Incompatible return value type (got "dtype[signedinteger[_64Bit]]", expected "ExtensionDtype") [return-value]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will have time to look in more detail later, initial guess would be that this return is the problem
pandas/pandas/core/groupby/ops.py
Lines 288 to 290 in 59b2db1
if how in ["add", "cumsum", "sum", "prod"]: | |
if dtype == np.dtype(bool): | |
return np.dtype(np.int64) |
IIRC mypy can't narrow based on type equality, so it's thinking that an ExtensionDtype
can hit that return. But then not sure why it wouldn't complain with the overload, but those are confusing :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks fine
cc @simonjayhawkins if comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @jbrockmendel lgtm. TypeVars with a union of types do have different semantics than the bound=
variant that we normally use, so can address @mzeitlin11 comments here or as follow-up.
get the namespaces/privateness "right"
avoid runtime imports
fix "type: ignore"s