Skip to content

Fine-grained: Treat files changed only if md5 changes #4505

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jan 24, 2018

Conversation

JukkaL
Copy link
Collaborator

@JukkaL JukkaL commented Jan 24, 2018

Previously the daemon could do a lot of extra work if you switched
to another git branch and immediately back to the original branch,
since many file timestamps would change. Now this isn't sufficient
to consider a file as changed.

Also add a new cached file system abstraction (that is currently only
used in dmypy) to avoid redundant file system operations and to make
file system state easier to reason about. The idea is that we cache
output of previous file system operations during a single increment,
for both consistency and performance. My plan is to eventually use it
about everywhere.

This should also make it slightly easier to switch to listening to
file system events instead of stat()ing everything, but that's not
a priority yet.

Fixes #4499.

Previously the daemon could do a lot of extra work if you switched
to another git branch and immediately back to the original branch,
since many file timestamps would change. Now this isn't sufficient
to consider a file as changed.

Also add a new cached file system abstraction (that is currently only
used in dmypy) to avoid redundant file system operations and to make
file system state easier to reason about. The idea is that we cache
output of previous file system operations during a single increment,
for both consistency and performance.  My plan is to eventually use it
everywhere.

This should also make it slightly easier to switch to listening to
file system events instead of stat()ing everything, but that's not
a priority yet.

Fixes #4499.
@JukkaL
Copy link
Collaborator Author

JukkaL commented Jan 24, 2018

Travis is failing on master as well -- the failures seem unrelated to this.

Copy link
Collaborator

@msullivan msullivan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This all seems basically reasonable.

My only concern is that now we have two separate mechanisms for tracking the hashes of files to see if they have changed, but they are at very different places in the abstraction stack, so it isn't too concerning.

@JukkaL JukkaL merged commit 71d85d7 into master Jan 24, 2018
@JukkaL
Copy link
Collaborator Author

JukkaL commented Jan 24, 2018

My only concern is that now we have two separate mechanisms for tracking the hashes of files to see if they have changed, but they are at very different places in the abstraction stack, so it isn't too concerning.

This is a concern, but this shouldn't be significantly worse now than it was previously. In the near/medium term I want to get the regular build using the new abstraction as well. This should give a minor perf boost and also make things a bit cleaner in general.

@MakonnenMak
Copy link
Contributor

@JukkaL is this item from the docs still relevant with this change?

https://mypy.readthedocs.io/en/stable/additional_features.html#remote-cache

If you use the mypy daemon, you may want to restart the daemon each time after the merge base or local branch has changed to avoid processing a potentially large number of changes in an incremental build, as this can be much slower than downloading cache data and restarting the daemon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fine-grained: Redundant work after git operations
3 participants