Skip to content

Commit 8a30a1e

Browse files
jonathantanmygitster
authored andcommitted
index-pack: prefetch missing REF_DELTA bases
When fetching, the client sends "have" commit IDs indicating that the server does not need to send any object referenced by those commits, reducing network I/O. When the client is a partial clone, the client still sends "have"s in this way, even if it does not have every object referenced by a commit it sent as "have". If a server omits such an object, it is fine: the client could lazily fetch that object before this fetch, and it can still do so after. The issue is when the server sends a thin pack containing an object that is a REF_DELTA against such a missing object: index-pack fails to fix the thin pack. When support for lazily fetching missing objects was added in 8b4c010 ("sha1_file: support lazily fetching missing objects", 2017-12-08), support in index-pack was turned off in the belief that it accesses the repo only to do hash collision checks. However, this is not true: it also needs to access the repo to resolve REF_DELTA bases. Support for lazy fetching should still generally be turned off in index-pack because it is used as part of the lazy fetching process itself (if not, infinite loops may occur), but we do need to fetch the REF_DELTA bases. (When fetching REF_DELTA bases, it is unlikely that those are REF_DELTA themselves, because we do not send "have" when making such fetches.) To resolve this, prefetch all missing REF_DELTA bases before attempting to resolve them. This both ensures that all bases are attempted to be fetched, and ensures that we make only one request per index-pack invocation, and not one request per missing object. Signed-off-by: Jonathan Tan <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
1 parent 385d1bf commit 8a30a1e

File tree

2 files changed

+85
-2
lines changed

2 files changed

+85
-2
lines changed

builtin/index-pack.c

Lines changed: 24 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -14,6 +14,7 @@
1414
#include "thread-utils.h"
1515
#include "packfile.h"
1616
#include "object-store.h"
17+
#include "fetch-object.h"
1718

1819
static const char index_pack_usage[] =
1920
"git index-pack [-v] [-o <index-file>] [--keep | --keep=<msg>] [--verify] [--strict] (<pack-file> | --stdin [--fix-thin] [<pack-file>])";
@@ -1351,6 +1352,25 @@ static void fix_unresolved_deltas(struct hashfile *f)
13511352
sorted_by_pos[i] = &ref_deltas[i];
13521353
QSORT(sorted_by_pos, nr_ref_deltas, delta_pos_compare);
13531354

1355+
if (repository_format_partial_clone) {
1356+
/*
1357+
* Prefetch the delta bases.
1358+
*/
1359+
struct oid_array to_fetch = OID_ARRAY_INIT;
1360+
for (i = 0; i < nr_ref_deltas; i++) {
1361+
struct ref_delta_entry *d = sorted_by_pos[i];
1362+
if (!oid_object_info_extended(the_repository, &d->oid,
1363+
NULL,
1364+
OBJECT_INFO_FOR_PREFETCH))
1365+
continue;
1366+
oid_array_append(&to_fetch, &d->oid);
1367+
}
1368+
if (to_fetch.nr)
1369+
fetch_objects(repository_format_partial_clone,
1370+
to_fetch.oid, to_fetch.nr);
1371+
oid_array_clear(&to_fetch);
1372+
}
1373+
13541374
for (i = 0; i < nr_ref_deltas; i++) {
13551375
struct ref_delta_entry *d = sorted_by_pos[i];
13561376
enum object_type type;
@@ -1650,8 +1670,10 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
16501670
int report_end_of_input = 0;
16511671

16521672
/*
1653-
* index-pack never needs to fetch missing objects, since it only
1654-
* accesses the repo to do hash collision checks
1673+
* index-pack never needs to fetch missing objects except when
1674+
* REF_DELTA bases are missing (which are explicitly handled). It only
1675+
* accesses the repo to do hash collision checks and to check which
1676+
* REF_DELTA bases need to be fetched.
16551677
*/
16561678
fetch_if_missing = 0;
16571679

t/t5616-partial-clone.sh

Lines changed: 61 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -339,4 +339,65 @@ test_expect_success 'when partial cloning, tolerate server not sending target of
339339
! test -e "$HTTPD_ROOT_PATH/one-time-sed"
340340
'
341341

342+
test_expect_success 'tolerate server sending REF_DELTA against missing promisor objects' '
343+
SERVER="$HTTPD_DOCUMENT_ROOT_PATH/server" &&
344+
rm -rf "$SERVER" repo &&
345+
test_create_repo "$SERVER" &&
346+
test_config -C "$SERVER" uploadpack.allowfilter 1 &&
347+
test_config -C "$SERVER" uploadpack.allowanysha1inwant 1 &&
348+
349+
# Create a commit with a blob to be used as a delta base.
350+
for i in $(test_seq 10)
351+
do
352+
echo "this is a line" >>"$SERVER/foo.txt"
353+
done &&
354+
git -C "$SERVER" add foo.txt &&
355+
git -C "$SERVER" commit -m bar &&
356+
git -C "$SERVER" rev-parse HEAD:foo.txt >deltabase &&
357+
358+
git -c protocol.version=2 clone --no-checkout \
359+
--filter=blob:none $HTTPD_URL/one_time_sed/server repo &&
360+
361+
# Sanity check to ensure that the client does not have that blob.
362+
git -C repo rev-list --objects --exclude-promisor-objects \
363+
-- $(cat deltabase) >objlist &&
364+
test_line_count = 0 objlist &&
365+
366+
# Another commit. This commit will be fetched by the client.
367+
echo "abcdefghijklmnopqrstuvwxyz" >>"$SERVER/foo.txt" &&
368+
git -C "$SERVER" add foo.txt &&
369+
git -C "$SERVER" commit -m baz &&
370+
371+
# Pack a thin pack containing, among other things, HEAD:foo.txt
372+
# delta-ed against HEAD^:foo.txt.
373+
printf "%s\n--not\n%s\n" \
374+
$(git -C "$SERVER" rev-parse HEAD) \
375+
$(git -C "$SERVER" rev-parse HEAD^) |
376+
git -C "$SERVER" pack-objects --thin --stdout >thin.pack &&
377+
378+
# Ensure that the pack contains one delta against HEAD^:foo.txt. Since
379+
# the delta contains at least 26 novel characters, the size cannot be
380+
# contained in 4 bits, so the object header will take up 2 bytes. The
381+
# most significant nybble of the first byte is 0b1111 (0b1 to indicate
382+
# that the header continues, and 0b111 to indicate REF_DELTA), followed
383+
# by any 3 nybbles, then the OID of the delta base.
384+
git -C "$SERVER" rev-parse HEAD^:foo.txt >deltabase &&
385+
printf "f.,..%s" $(intersperse "," <deltabase) >want &&
386+
hex_unpack <thin.pack | intersperse "," >have &&
387+
grep $(cat want) have &&
388+
389+
replace_packfile thin.pack &&
390+
391+
# Use protocol v2 because the sed command looks for the "packfile"
392+
# section header.
393+
test_config -C "$SERVER" protocol.version 2 &&
394+
395+
# Fetch the thin pack and ensure that index-pack is able to handle the
396+
# REF_DELTA object with a missing promisor delta base.
397+
git -C repo -c protocol.version=2 fetch &&
398+
399+
# Ensure that the one-time-sed script was used.
400+
! test -e "$HTTPD_ROOT_PATH/one-time-sed"
401+
'
402+
342403
test_done

0 commit comments

Comments
 (0)