Skip to content

Commit 8b09a90

Browse files
newrengitster
authored andcommitted
merge-ort: restart merge with cached renames to reduce process entry cost
The merge algorithm mostly consists of the following three functions: collect_merge_info() detect_and_process_renames() process_entries() Prior to the trivial directory resolution optimization of the last half dozen commits, process_entries() was consistently the slowest, followed by collect_merge_info(), then detect_and_process_renames(). When the trivial directory resolution applies, it often dramatically decreases the amount of time spent in the two slower functions. Looking at the performance results in the previous commit, the trivial directory resolution optimization helps amazingly well when there are no relevant renames. It also helps really well when reapplying a long series of linear commits (such as in a rebase or cherry-pick), since the relevant renames may well be cached from the first reapplied commit. But when there are any relevant renames that are not cached (represented by the just-one-mega testcase), then the optimization does not help at all. Often, I noticed that when the optimization does not apply, it is because there are a handful of relevant sources -- maybe even only one. It felt frustrating to need to recurse into potentially hundreds or even thousands of directories just for a single rename, but it was needed for correctness. However, staring at this list of functions and noticing that process_entries() is the most expensive and knowing I could avoid it if I had cached renames suggested a simple idea: change collect_merge_info() detect_and_process_renames() process_entries() into collect_merge_info() detect_and_process_renames() <cache all the renames, and restart> collect_merge_info() detect_and_process_renames() process_entries() This may seem odd and look like more work. However, note that although we run collect_merge_info() twice, the second time we get to employ trivial directory resolves, which makes it much faster, so the increased time in collect_merge_info() is small. While we run detect_and_process_renames() again, all renames are cached so it's nearly a no-op (we don't call into diffcore_rename_extended() but we do have a little bit of data structure checking and fixing up). And the big payoff comes from the fact that process_entries(), will be much faster due to having far fewer entries to process. This restarting only makes sense if we can save recursing into enough directories to make it worth our while. Introduce a simple heuristic to guide this. Note that this heuristic uses a "wanted_factor" that I have virtually no actual real world data for, just some back-of-the-envelope quasi-scientific calculations that I included in some comments and then plucked a simple round number out of thin air. It could be that tweaking this number to make it either higher or lower improves the optimization. (There's slightly more here; when I first introduced this optimization, I used a factor of 10, because I was completely confident it was big enough to not cause slowdowns in special cases. I was certain it was higher than needed. Several months later, I added the rough calculations which make me think the optimal number is close to 2; but instead of pushing to the limit, I just bumped it to 3 to reduce the risk that there are special cases where this optimization can result in slowing down the code a little. If the ratio of path counts is below 3, we probably will only see minor performance improvements at best anyway.) Also, note that while the diffstat looks kind of long (nearly 100 lines), more than half of it is in two comments explaining how things work. For the testcases mentioned in commit 557ac03 ("merge-ort: begin performance work; instrument with trace2_region_* calls", 2020-10-28), this change improves the performance as follows: Before After no-renames: 205.1 ms ± 3.8 ms 204.2 ms ± 3.0 ms mega-renames: 1.564 s ± 0.010 s 1.076 s ± 0.015 s just-one-mega: 479.5 ms ± 3.9 ms 364.1 ms ± 7.0 ms Signed-off-by: Elijah Newren <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
1 parent 7bee6c1 commit 8b09a90

File tree

2 files changed

+87
-7
lines changed

2 files changed

+87
-7
lines changed

merge-ort.c

Lines changed: 86 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -213,6 +213,7 @@ struct rename_info {
213213
* MERGE_SIDE2: cached data from side2 can be reused
214214
* MERGE_SIDE1: cached data from side1 can be reused
215215
* 0: no cached data can be reused
216+
* -1: See redo_after_renames; both sides can be reused.
216217
*/
217218
int cached_pairs_valid_side;
218219

@@ -258,6 +259,28 @@ struct rename_info {
258259
*/
259260
struct strset cached_irrelevant[3];
260261

262+
/*
263+
* redo_after_renames: optimization flag for "restarting" the merge
264+
*
265+
* Sometimes it pays to detect renames, cache them, and then
266+
* restart the merge operation from the beginning. The reason for
267+
* this is that when we know where all the renames are, we know
268+
* whether a certain directory has any paths under it affected --
269+
* and if a directory is not affected then it permits us to do
270+
* trivial tree merging in more cases. Doing trivial tree merging
271+
* prevents the need to run process_entry() on every path
272+
* underneath trees that can be trivially merged, and
273+
* process_entry() is more expensive than collect_merge_info() --
274+
* plus, the second collect_merge_info() will be much faster since
275+
* it doesn't have to recurse into the relevant trees.
276+
*
277+
* Values for this flag:
278+
* 0 = don't bother, not worth it (or conditions not yet checked)
279+
* 1 = conditions for optimization met, optimization worthwhile
280+
* 2 = we already did it (don't restart merge yet again)
281+
*/
282+
unsigned redo_after_renames;
283+
261284
/*
262285
* needed_limit: value needed for inexact rename detection to run
263286
*
@@ -541,7 +564,8 @@ static void clear_or_reinit_internal_opts(struct merge_options_internal *opti,
541564
strintmap_func(&renames->relevant_sources[i]);
542565
if (!reinitialize)
543566
assert(renames->cached_pairs_valid_side == 0);
544-
if (i != renames->cached_pairs_valid_side) {
567+
if (i != renames->cached_pairs_valid_side &&
568+
-1 != renames->cached_pairs_valid_side) {
545569
strset_func(&renames->cached_target_names[i]);
546570
strmap_func(&renames->cached_pairs[i], 1);
547571
strset_func(&renames->cached_irrelevant[i]);
@@ -1249,7 +1273,9 @@ static int handle_deferred_entries(struct merge_options *opt,
12491273
struct hashmap_iter iter;
12501274
struct strmap_entry *entry;
12511275
int side, ret = 0;
1276+
int path_count_before, path_count_after = 0;
12521277

1278+
path_count_before = strmap_get_size(&opt->priv->paths);
12531279
for (side = MERGE_SIDE1; side <= MERGE_SIDE2; side++) {
12541280
unsigned optimization_okay = 1;
12551281
struct strintmap copy;
@@ -1385,7 +1411,32 @@ static int handle_deferred_entries(struct merge_options *opt,
13851411
path));
13861412
resolve_trivial_directory_merge(ci, side);
13871413
}
1414+
if (!optimization_okay || path_count_after)
1415+
path_count_after = strmap_get_size(&opt->priv->paths);
13881416
}
1417+
if (path_count_after) {
1418+
/*
1419+
* The choice of wanted_factor here does not affect
1420+
* correctness, only performance. When the
1421+
* path_count_after / path_count_before
1422+
* ratio is high, redoing after renames is a big
1423+
* performance boost. I suspect that redoing is a wash
1424+
* somewhere near a value of 2, and below that redoing will
1425+
* slow things down. I applied a fudge factor and picked
1426+
* 3; see the commit message when this was introduced for
1427+
* back of the envelope calculations for this ratio.
1428+
*/
1429+
const int wanted_factor = 3;
1430+
1431+
/* We should only redo collect_merge_info one time */
1432+
assert(renames->redo_after_renames == 0);
1433+
1434+
if (path_count_after / path_count_before >= wanted_factor) {
1435+
renames->redo_after_renames = 1;
1436+
renames->cached_pairs_valid_side = -1;
1437+
}
1438+
} else if (renames->redo_after_renames == 2)
1439+
renames->redo_after_renames = 0;
13891440
return ret;
13901441
}
13911442

@@ -2828,8 +2879,8 @@ static int compare_pairs(const void *a_, const void *b_)
28282879
}
28292880

28302881
/* Call diffcore_rename() to update deleted/added pairs into rename pairs */
2831-
static void detect_regular_renames(struct merge_options *opt,
2832-
unsigned side_index)
2882+
static int detect_regular_renames(struct merge_options *opt,
2883+
unsigned side_index)
28332884
{
28342885
struct diff_options diff_opts;
28352886
struct rename_info *renames = &opt->priv->renames;
@@ -2842,7 +2893,7 @@ static void detect_regular_renames(struct merge_options *opt,
28422893
* side had directory renames.
28432894
*/
28442895
resolve_diffpair_statuses(&renames->pairs[side_index]);
2845-
return;
2896+
return 0;
28462897
}
28472898

28482899
partial_clear_dir_rename_count(&renames->dir_rename_count[side_index]);
@@ -2868,6 +2919,8 @@ static void detect_regular_renames(struct merge_options *opt,
28682919
trace2_region_leave("diff", "diffcore_rename", opt->repo);
28692920
resolve_diffpair_statuses(&diff_queued_diff);
28702921

2922+
if (diff_opts.needed_rename_limit > 0)
2923+
renames->redo_after_renames = 0;
28712924
if (diff_opts.needed_rename_limit > renames->needed_limit)
28722925
renames->needed_limit = diff_opts.needed_rename_limit;
28732926

@@ -2877,6 +2930,8 @@ static void detect_regular_renames(struct merge_options *opt,
28772930
diff_queued_diff.nr = 0;
28782931
diff_queued_diff.queue = NULL;
28792932
diff_flush(&diff_opts);
2933+
2934+
return 1;
28802935
}
28812936

28822937
/*
@@ -2966,14 +3021,32 @@ static int detect_and_process_renames(struct merge_options *opt,
29663021
struct diff_queue_struct combined;
29673022
struct rename_info *renames = &opt->priv->renames;
29683023
int need_dir_renames, s, clean = 1;
3024+
unsigned detection_run = 0;
29693025

29703026
memset(&combined, 0, sizeof(combined));
29713027
if (!possible_renames(renames))
29723028
goto cleanup;
29733029

29743030
trace2_region_enter("merge", "regular renames", opt->repo);
2975-
detect_regular_renames(opt, MERGE_SIDE1);
2976-
detect_regular_renames(opt, MERGE_SIDE2);
3031+
detection_run |= detect_regular_renames(opt, MERGE_SIDE1);
3032+
detection_run |= detect_regular_renames(opt, MERGE_SIDE2);
3033+
if (renames->redo_after_renames && detection_run) {
3034+
int i, side;
3035+
struct diff_filepair *p;
3036+
3037+
/* Cache the renames, we found */
3038+
for (side = MERGE_SIDE1; side <= MERGE_SIDE2; side++) {
3039+
for (i = 0; i < renames->pairs[side].nr; ++i) {
3040+
p = renames->pairs[side].queue[i];
3041+
possibly_cache_new_pair(renames, p, side, NULL);
3042+
}
3043+
}
3044+
3045+
/* Restart the merge with the cached renames */
3046+
renames->redo_after_renames = 2;
3047+
trace2_region_leave("merge", "regular renames", opt->repo);
3048+
goto cleanup;
3049+
}
29773050
use_cached_pairs(opt, &renames->cached_pairs[1], &renames->pairs[1]);
29783051
use_cached_pairs(opt, &renames->cached_pairs[2], &renames->pairs[2]);
29793052
trace2_region_leave("merge", "regular renames", opt->repo);
@@ -4403,6 +4476,7 @@ static void merge_ort_nonrecursive_internal(struct merge_options *opt,
44034476
opt->subtree_shift);
44044477
}
44054478

4479+
redo:
44064480
trace2_region_enter("merge", "collect_merge_info", opt->repo);
44074481
if (collect_merge_info(opt, merge_base, side1, side2) != 0) {
44084482
/*
@@ -4422,6 +4496,12 @@ static void merge_ort_nonrecursive_internal(struct merge_options *opt,
44224496
result->clean = detect_and_process_renames(opt, merge_base,
44234497
side1, side2);
44244498
trace2_region_leave("merge", "renames", opt->repo);
4499+
if (opt->priv->renames.redo_after_renames == 2) {
4500+
trace2_region_enter("merge", "reset_maps", opt->repo);
4501+
clear_or_reinit_internal_opts(opt->priv, 1);
4502+
trace2_region_leave("merge", "reset_maps", opt->repo);
4503+
goto redo;
4504+
}
44254505

44264506
trace2_region_enter("merge", "process_entries", opt->repo);
44274507
process_entries(opt, &working_tree_oid);

t/t6423-merge-rename-directories.sh

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -4797,7 +4797,7 @@ test_setup_12f () {
47974797
)
47984798
}
47994799

4800-
test_expect_merge_algorithm failure failure '12f: Trivial directory resolve, caching, all kinds of fun' '
4800+
test_expect_merge_algorithm failure success '12f: Trivial directory resolve, caching, all kinds of fun' '
48014801
test_setup_12f &&
48024802
(
48034803
cd 12f &&

0 commit comments

Comments
 (0)