-
Notifications
You must be signed in to change notification settings - Fork 534
dotgit: rewrite the way references are looked up #563
Conversation
Codecov Report
@@ Coverage Diff @@
## master #563 +/- ##
==========================================
- Coverage 78.1% 77.45% -0.65%
==========================================
Files 129 129
Lines 9807 9822 +15
==========================================
- Hits 7660 7608 -52
- Misses 1316 1393 +77
+ Partials 831 821 -10
Continue to review full report at Codecov.
|
If the latency is because It seems to me that putting a cache around one of the few data types in git that is mutable is a little tricky to get right. It is certainly more difficult to reason about during troubleshooting. Perhaps the troubles with getting the caching working for Rather than trading off memory to save CPU we should make sure our current CPU usage is not unnecessarily inefficient. |
@orirawlings nope, that does not cut it, sadly :(. All the refs are not in all the different formats. All the packed-refs are not in under the Cache the packed refs (we can check that modtime and have them cached), just check for name in There is another option I've been working on and it's caching everything without checking any modtime (which makes the operation take ~50ns) and just updating the cache when calling |
Now there's only two ways of getting a reference, by checking under refs/ directory or in packed-refs. refs/ directory is checked using a direct read by reference name and packed refs are cached until they have been changed. Signed-off-by: Miguel Molina <[email protected]>
00c51f1
to
5a7b7af
Compare
@orirawlings I've just rewritten the |
Yeah, I think just caching everything and only updating on This updated patch looks really good to me. It also changes the implementation to be much more consistent with the documentation on the standard git tools. Here is a snippet from the man page for
I think the caching based on the mod time should be sufficient too, and not introduce any data consistency issues. |
This change introduces a cache of already visited references for future visits.
Right now, using
Ref
method of DotGit fetches all the references and thenreturns the correct one, this is a very expensive operation if executed a lot
of times. Adding the cache, makes this orders of mangnitude faster on successive
calls at the expense of a little bit more of memory consumption.
Results of the added benchmark before:
Results of the benchmark now:
Caveats:
Refs
is not cached. Can't make it passRemoteSuite.TestPushNewReference
andRemoteSuite.TestPushRejectNonFastForward
even checking the modification time of.git
. The cached results are not the same as the ones obtained with the usual process, even if the modtime of the .git is the same as it was.Ref
the first step is try to look directly for the file named after the ref. Same reason as above, only for other tests.I've been trying another few implementations of this to try to make it work with
References
and not require that I/O operation at the beginning ofRef
, but it's been proving really hard to cache at this level if we cannot rely on theModTime
.This change makes a push that took ~30 mins with 0 changes take just a few seconds.