
Commit c523035

derrickstolee authored and gitster committed
commit-graph: allow cross-alternate chains
In an environment like a fork network, it is helpful to have a commit-graph chain that spans both the base repo and the fork repo. The fork is usually a small set of data on top of the large repo, but sometimes the fork is much larger. For example, git-for-windows/git has almost twice as many commits as git/git because it rebases its commits on every major version update.

To allow cross-alternate commit-graph chains, we need a few pieces:

1. When looking for a graph-{hash}.graph file, check all alternates.
2. When merging commit-graph chains, do not merge across alternates.
3. When writing a new commit-graph chain based on a commit-graph file in another object directory, do not allow success if the base file has the name "commit-graph" instead of "commit-graphs/graph-{hash}.graph".

Signed-off-by: Derrick Stolee <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
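
The first piece (checking every alternate for a graph-{hash}.graph file) can be illustrated with a minimal sketch. It is not the code from this commit: `struct odb_sketch`, `odb_has_graph` and `find_graph_dir` are hypothetical names, and the set of files "present" in each object directory is simulated with an in-memory list instead of the filesystem. The only behavior it mirrors is the search order, local object directory first and then each alternate, stopping at the first hit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an object directory and its alternates. */
struct odb_sketch {
	const char *path;                /* object directory path */
	const char *const *graph_hashes; /* NULL-terminated list of graph hashes stored here */
	const struct odb_sketch *next;   /* next alternate, or NULL */
};

/* Simulated existence check for graph-{hash}.graph in one directory. */
static int odb_has_graph(const struct odb_sketch *odb, const char *hash)
{
	const char *const *h;

	for (h = odb->graph_hashes; h && *h; h++)
		if (!strcmp(*h, hash))
			return 1;
	return 0;
}

/*
 * Walk the local directory and then each alternate; return the path of
 * the first directory that holds the graph file, or NULL if none does.
 */
static const char *find_graph_dir(const struct odb_sketch *first, const char *hash)
{
	const struct odb_sketch *odb;

	assert(hash);
	for (odb = first; odb; odb = odb->next)
		if (odb_has_graph(odb, hash))
			return odb->path;
	return NULL;
}
```

A NULL result corresponds to the case where the chain cannot be fully loaded, which the patch reports with the "unable to find all commit-graph files" warning.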
1 parent 1771be9 commit c523035

4 files changed, 123 insertions(+), 11 deletions(-)

Documentation/technical/commit-graph.txt

Lines changed: 40 additions & 0 deletions
@@ -266,6 +266,42 @@ The merge strategy values (2 for the size multiple, 64,000 for the maximum
 number of commits) could be extracted into config settings for full
 flexibility.
 
+## Chains across multiple object directories
+
+In a repo with alternates, we look for the `commit-graph-chain` file starting
+in the local object directory and then in each alternate. The first file that
+exists defines our chain. As we look for the `graph-{hash}` files for
+each `{hash}` in the chain file, we follow the same pattern for the host
+directories.
+
+This allows commit-graphs to be split across multiple forks in a fork network.
+The typical case is a large "base" repo with many smaller forks.
+
+As the base repo advances, it will likely update and merge its commit-graph
+chain more frequently than the forks. If a fork updates their commit-graph after
+the base repo, then it should "reparent" the commit-graph chain onto the new
+chain in the base repo. When reading each `graph-{hash}` file, we track
+the object directory containing it. During a write of a new commit-graph file,
+we check for any changes in the source object directory and read the
+`commit-graph-chain` file for that source and create a new file based on those
+files. During this "reparent" operation, we necessarily need to collapse all
+levels in the fork, as all of the files are invalid against the new base file.
+
+It is crucial to be careful when cleaning up "unreferenced" `graph-{hash}.graph`
+files in this scenario. It falls to the user to define the proper settings for
+their custom environment:
+
+ 1. When merging levels in the base repo, the unreferenced files may still be
+    referenced by chains from fork repos.
+
+ 2. The expiry time should be set to a length of time such that every fork has
+    time to recompute their commit-graph chain to "reparent" onto the new base
+    file(s).
+
+ 3. If the commit-graph chain is updated in the base, the fork will not have
+    access to the new chain until its chain is updated to reference those files.
+    (This may change in the future [5].)
+
 Related Links
 -------------
 [0] https://bugs.chromium.org/p/git/issues/detail?id=8
@@ -292,3 +328,7 @@ Related Links
 
 [4] https://public-inbox.org/git/[email protected]/T/#u
 	A patch to remove the ahead-behind calculation from 'status'.
+
+[5] https://public-inbox.org/git/[email protected]/
+	A discussion of a "two-dimensional graph position" that can allow reading
+	multiple commit-graph chains at the same time.
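
The "reparent" operation described in the documentation above can be sketched as follows. This is an illustration under stated assumptions, not the implementation: `build_reparented_chain` is a hypothetical helper, the hashes are placeholders, and the sketch only models the resulting `commit-graph-chain` contents, where the fork lists every hash from the base repo's current chain followed by a single collapsed fork-level hash, one per line.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write the base chain's hashes followed by the one
 * collapsed fork hash, one per line, into out. Returns the number of
 * bytes written (excluding the trailing NUL), or -1 if out is too small.
 */
static int build_reparented_chain(char *out, size_t out_len,
				  const char *const *base_hashes, size_t nr_base,
				  const char *fork_hash)
{
	size_t used = 0, i;

	assert(out && fork_hash);
	for (i = 0; i <= nr_base; i++) {
		/* The last iteration appends the fork's single level. */
		const char *hash = i < nr_base ? base_hashes[i] : fork_hash;
		int n = snprintf(out + used, out_len - used, "%s\n", hash);

		if (n < 0 || (size_t)n >= out_len - used)
			return -1;
		used += (size_t)n;
	}
	return (int)used;
}
```

This matches the shape checked by the 'create fork and chain across alternate' test in this commit, where the fork's chain file has three lines but only one graph file lives in the fork's own object directory.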

commit-graph.c

Lines changed: 45 additions & 11 deletions
@@ -320,6 +320,9 @@ static struct commit_graph *load_commit_graph_v1(struct repository *r, const cha
 	struct commit_graph *g = load_commit_graph_one(graph_name);
 	free(graph_name);
 
+	if (g)
+		g->obj_dir = obj_dir;
+
 	return g;
 }

@@ -379,9 +382,10 @@ static struct commit_graph *load_commit_graph_chain(struct repository *r, const
 	count = st.st_size / (the_hash_algo->hexsz + 1);
 	oids = xcalloc(count, sizeof(struct object_id));
 
-	for (i = 0; i < count && valid; i++) {
-		char *graph_name;
-		struct commit_graph *g;
+	prepare_alt_odb(r);
+
+	for (i = 0; i < count; i++) {
+		struct object_directory *odb;
 
 		if (strbuf_getline_lf(&line, fp) == EOF)
 			break;
@@ -393,14 +397,29 @@ static struct commit_graph *load_commit_graph_chain(struct repository *r, const
 			break;
 		}
 
-		graph_name = get_split_graph_filename(obj_dir, line.buf);
-		g = load_commit_graph_one(graph_name);
-		free(graph_name);
+		valid = 0;
+		for (odb = r->objects->odb; odb; odb = odb->next) {
+			char *graph_name = get_split_graph_filename(odb->path, line.buf);
+			struct commit_graph *g = load_commit_graph_one(graph_name);
 
-		if (g && add_graph_to_chain(g, graph_chain, oids, i))
-			graph_chain = g;
-		else
-			valid = 0;
+			free(graph_name);
+
+			if (g) {
+				g->obj_dir = odb->path;
+
+				if (add_graph_to_chain(g, graph_chain, oids, i)) {
+					graph_chain = g;
+					valid = 1;
+				}
+
+				break;
+			}
+		}
+
+		if (!valid) {
+			warning(_("unable to find all commit-graph files"));
+			break;
+		}
 	}
 
 	free(oids);
@@ -1418,7 +1437,7 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx)
 
 	if (ctx->split && ctx->base_graph_name && ctx->num_commit_graphs_after > 1) {
 		char *new_base_hash = xstrdup(oid_to_hex(&ctx->new_base_graph->oid));
-		char *new_base_name = get_split_graph_filename(ctx->obj_dir, new_base_hash);
+		char *new_base_name = get_split_graph_filename(ctx->new_base_graph->obj_dir, new_base_hash);
 
 		free(ctx->commit_graph_filenames_after[ctx->num_commit_graphs_after - 2]);
 		free(ctx->commit_graph_hash_after[ctx->num_commit_graphs_after - 2]);
@@ -1493,6 +1512,9 @@ static void split_graph_merge_strategy(struct write_commit_graph_context *ctx)
 
 	while (g && (g->num_commits <= split_strategy_size_mult * num_commits ||
 		    num_commits > split_strategy_max_commits)) {
+		if (strcmp(g->obj_dir, ctx->obj_dir))
+			break;
+
 		num_commits += g->num_commits;
 		g = g->base_graph;

@@ -1501,6 +1523,18 @@ static void split_graph_merge_strategy(struct write_commit_graph_context *ctx)
 
 	ctx->new_base_graph = g;
 
+	if (ctx->num_commit_graphs_after == 2) {
+		char *old_graph_name = get_commit_graph_filename(g->obj_dir);
+
+		if (!strcmp(g->filename, old_graph_name) &&
+		    strcmp(g->obj_dir, ctx->obj_dir)) {
+			ctx->num_commit_graphs_after = 1;
+			ctx->new_base_graph = NULL;
+		}
+
+		free(old_graph_name);
+	}
+
 	ALLOC_ARRAY(ctx->commit_graph_filenames_after, ctx->num_commit_graphs_after);
 	ALLOC_ARRAY(ctx->commit_graph_hash_after, ctx->num_commit_graphs_after);
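
The effect of the new `strcmp(g->obj_dir, ctx->obj_dir)` check in `split_graph_merge_strategy` can be seen in a standalone sketch. It reuses the loop condition from the hunk above, but `struct layer_sketch` and `pick_new_base` are hypothetical names, and the size multiple and commit cap are passed as parameters rather than read from the static variables the real code uses.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced view of one level in a commit-graph chain. */
struct layer_sketch {
	unsigned num_commits;
	const char *obj_dir;              /* object directory holding this level */
	const struct layer_sketch *base;  /* next level down, or NULL */
};

/*
 * Decide how far down the chain to merge. Returns the level that stays
 * as the new base (NULL if everything is merged) and stores the number
 * of commits in the merged tip in *total. Merging stops as soon as a
 * level lives in a different object directory than local_dir.
 */
static const struct layer_sketch *pick_new_base(const struct layer_sketch *tip,
						const char *local_dir,
						unsigned new_commits,
						unsigned size_mult,
						unsigned max_commits,
						unsigned *total)
{
	const struct layer_sketch *g = tip;
	unsigned num = new_commits;

	assert(local_dir && total);
	while (g && (g->num_commits <= size_mult * num || num > max_commits)) {
		if (strcmp(g->obj_dir, local_dir))
			break; /* never merge across alternates */
		num += g->num_commits;
		g = g->base;
	}
	*total = num;
	return g;
}
```

With a size multiple of 2, a 30-commit base level would normally be merged into a 20-commit tip. In the sketch that merge happens only when the base level sits in the same object directory as the writer, which is the behavior the second piece of the commit message asks for.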

commit-graph.h

Lines changed: 1 addition & 0 deletions
@@ -48,6 +48,7 @@ struct commit_graph {
 	uint32_t num_commits;
 	struct object_id oid;
 	char *filename;
+	const char *obj_dir;
 
 	uint32_t num_commits_in_base;
 	struct commit_graph *base_graph;

t/t5324-split-commit-graph.sh

Lines changed: 37 additions & 0 deletions
@@ -90,6 +90,21 @@ test_expect_success 'add more commits, and write a new base graph' '
 	graph_read_expect 12
 '
 
+test_expect_success 'fork and fail to base a chain on a commit-graph file' '
+	test_when_finished rm -rf fork &&
+	git clone . fork &&
+	(
+		cd fork &&
+		rm .git/objects/info/commit-graph &&
+		echo "$(pwd)/../.git/objects" >.git/objects/info/alternates &&
+		test_commit new-commit &&
+		git commit-graph write --reachable --split &&
+		test_path_is_file $graphdir/commit-graph-chain &&
+		test_line_count = 1 $graphdir/commit-graph-chain &&
+		verify_chain_files_exist $graphdir
+	)
+'
+
 test_expect_success 'add three more commits, write a tip graph' '
 	git reset --hard commits/3 &&
 	git merge merge/1 &&
@@ -132,4 +147,26 @@ test_expect_success 'add one commit, write a merged graph' '
 
 graph_git_behavior 'merged commit-graph: commit 12 vs 6' commits/12 commits/6
 
+test_expect_success 'create fork and chain across alternate' '
+	git clone . fork &&
+	(
+		cd fork &&
+		git config core.commitGraph true &&
+		rm -rf $graphdir &&
+		echo "$(pwd)/../.git/objects" >.git/objects/info/alternates &&
+		test_commit 13 &&
+		git branch commits/13 &&
+		git commit-graph write --reachable --split &&
+		test_path_is_file $graphdir/commit-graph-chain &&
+		test_line_count = 3 $graphdir/commit-graph-chain &&
+		ls $graphdir/graph-*.graph >graph-files &&
+		test_line_count = 1 graph-files &&
+		git -c core.commitGraph=true rev-list HEAD >expect &&
+		git -c core.commitGraph=false rev-list HEAD >actual &&
+		test_cmp expect actual
+	)
+'
+
+graph_git_behavior 'alternate: commit 13 vs 6' commits/13 commits/6
+
 test_done
