Use TypeAlias in code where types are declared #61504

Open
wants to merge 4 commits into
base: main

Dr-Irv
Contributor

@Dr-Irv Dr-Irv commented May 27, 2025

After introducing TypeAlias in _typing.py, the goal of this PR is to use it in any other source files that create types in this way. It also makes all of those types private.

I believe I caught them all by searching, but I may have missed a few. I couldn't find a lint rule that enforces the use of TypeAlias. Note that as of Python 3.12 the recommendation is to use a type statement instead, which is probably why there isn't such a rule.

@mroeschke mroeschke added the Typing type annotations, mypy/pyright type checking label May 27, 2025
@@ -71,7 +72,7 @@
from pandas.core.resample import Resampler
from pandas.core.window.rolling import BaseWindow

-ResType = dict[int, Any]
+_ResType: TypeAlias = dict[int, Any]
Member

Is the idea of making these lead with a starting _ so that they are marked as "private"? Should they instead just be moved to _typing.py with the other defined "private" annotations?

Contributor Author

> Is the idea of making these lead with a starting _ so that they are marked as "private"? Should they instead just be moved to _typing.py with the other defined "private" annotations?

The difference with the annotations in _typing.py is that many (but not all) are used in more than one pandas module. For the ones that are marked private in this PR, they are only used locally within that module.

I'm also going to be doing an MR to make more of the ones in _typing.py public.

Member

We were planning on moving core to _core, but ran into some bikeshedding issues. I think we should still operate under the assumption we will eventually do that (I believe there is consensus that we really want to do it). Assuming that is done, what will we want to do about private variables like this? I think we should strive for consistency, either all with a leading underscore or none.

Contributor Author

I still like the idea of having a TypeAlias being private within a module. If someone is making a change to pandas and then wants to use that alias in another module, they then have to think about moving it to _typing.py, which is where it belongs.

Let's say that we had already changed core to _core. If you're down deep in the pandas code, e.g., in pandas/_core/arrays/datetimelike.py, you shouldn't have to remember that you are two levels down from _core. IMHO, anything within that file should be private unless it is meant to be imported from somewhere else.

Member

@rhshadrach rhshadrach Jun 4, 2025

> IMHO, anything within that file should be private unless it is meant to be imported from somewhere else.

This seems to me to be a very large change. I count 2432 functions (not merely callables) at the top level of modules across pandas that do not start with an _. Now some are undoubtedly public, but many are not. On the other hand, there are 438 functions at the top level of a module that do start with an _.

Edit: I misread your comment on the first pass. This means that as soon as you want to import something you need to remove the _ from the name? That will mean constantly changing top-level functions in modules. I'm strongly opposed to this.
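
As a rough illustration of how a count like the one above could be produced, here is a minimal sketch using the standard-library `ast` module. It is not the script used for the numbers in this comment; the function name `count_top_level_functions` and the `example` source are hypothetical, and it only looks at direct children of the module, so functions defined inside `if` blocks or classes are not counted.

```python
import ast


def count_top_level_functions(source: str) -> tuple[int, int]:
    """Return (public, private) counts of top-level functions in source."""
    tree = ast.parse(source)
    public = private = 0
    # Only direct children of the module: skip methods and nested functions.
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            if node.name.startswith("_"):
                private += 1
            else:
                public += 1
    return public, private


# Hypothetical module source, used only to exercise the function.
example = '''
def apply(x): ...
def _helper(x): ...
class Foo:
    def method(self): ...
'''
print(count_top_level_functions(example))
```

Running this over every `.py` file under a package and summing the pairs would give totals comparable in spirit to the ones quoted.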

Member

> this PR is just keeping along with that inconsistency.

This PR is changing the names of identifiers to mark them as private when they already reside in modules that are documented as private. This is code churn with no benefit.

Contributor Author

> This PR is changing the names of identifiers to mark them as private when they already reside in modules that are documented as private. This is code churn with no benefit.

Disagree. Right now, someone can do from pandas.core.apply import ResType, which is "valid" from a code perspective but invalid from a documentation perspective. If we make the change as I suggested, then whenever pyright implements its "disallow private imports" check, or this ruff rule https://docs.astral.sh/ruff/rules/import-private-name/ comes out of preview, from pandas.core.apply import _ResType would get flagged.
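
To make the argument concrete, here is a minimal sketch of the kind of check being described: flag any `from module import _name`. This is an assumption-laden illustration built on the standard-library `ast` module, not the implementation of the ruff rule or of anything in pyright, and the real rule's behaviour may differ. The function name `find_private_imports` is hypothetical.

```python
import ast


def find_private_imports(source: str) -> list[str]:
    """List names imported via `from module import _name`."""
    flagged = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ImportFrom):
            # node.module is None for relative imports such as `from . import _x`.
            module = node.module or "."
            for alias in node.names:
                # Leading underscore, but not a dunder such as __version__.
                if alias.name.startswith("_") and not alias.name.startswith("__"):
                    flagged.append(f"{module}.{alias.name}")
    return flagged


print(find_private_imports("from pandas.core.apply import ResType\n"))
print(find_private_imports("from pandas.core.apply import _ResType\n"))
```

Under this sketch the first import passes and the second is reported, which is the distinction a leading underscore would make visible to such a tool.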

Member

I think the sustainable change to accomplish what you want is to make the import path from pandas._core.apply, not to change the hundreds of symbols across pandas.core.

Contributor Author

OK - then I'll change this to just add TypeAlias and keep the symbol names the same.

Contributor Author

@rhshadrach @mroeschke I've kept all the type aliases with the same names as before - now this just introduces TypeAlias and removes any unused aliases.

@Dr-Irv Dr-Irv changed the title from "Change internal types in individual files to be private. Use TypeAlias in code where types are declared." to "Use TypeAlias in code where types are declared" Jun 7, 2025
3 participants