Commit c30fe25

Merge branch 'tg/rerere' into pu
* tg/rerere:
  rerere: recalculate conflict ID when unresolved conflict is committed
  rerere: teach rerere to handle nested conflicts
  rerere: return strbuf from handle path
  rerere: factor out handle_conflict function
  rerere: only return whether a path has conflicts or not
  rerere: fix crash when conflict goes unresolved
  rerere: add documentation for conflict normalization
  rerere: mark strings for translation
  rerere: wrap paths in output in sq
  rerere: lowercase error messages
  rerere: unify error messages when read_cache fails
2 parents 8fb522c + 88f905e commit c30fe25

4 files changed: +366 -129 lines changed

Documentation/technical/rerere.txt

Lines changed: 182 additions & 0 deletions
@@ -0,0 +1,182 @@
Rerere
======

This document describes the rerere logic.

Conflict normalization
----------------------

To ensure recorded conflict resolutions can be looked up in the rerere
database, even when branches are merged in a different order,
different branches are merged that result in the same conflict, or
when different conflict style settings are used, rerere normalizes the
conflicts before writing them to the rerere database.

Different conflict styles and branch names are normalized by stripping
the labels from the conflict markers, and removing extraneous
information from the `diff3` conflict style. Branches that are merged
in different order are normalized by sorting the conflict hunks. More
on each of those steps in the following sections.

Once these two normalization operations are applied, a conflict ID is
calculated based on the normalized conflict, which is later used by
rerere to look up the conflict in the rerere database.

Stripping extraneous information
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Say we have three branches AB, AC and AC2. The common ancestor of
these branches has a file with a line containing the string "A" (for
brevity this is called "line A" in the rest of the document). In
branch AB this line is changed to "B", in AC, this line is changed to
"C", and branch AC2 is forked off of AC, after the line was changed to
"C".

Forking a branch ABAC off of branch AB and then merging AC into it, we
get a conflict like the following:

    <<<<<<< HEAD
    B
    =======
    C
    >>>>>>> AC

Doing the same with AC2 (forking a branch ABAC2 off of branch AB
and then merging branch AC2 into it), using the diff3 conflict style,
we get a conflict like the following:

    <<<<<<< HEAD
    B
    ||||||| merged common ancestors
    A
    =======
    C
    >>>>>>> AC2

By resolving this conflict to leave line D, the user declares:

    After examining what branches AB and AC did, I believe that making
    line A into line D is the best thing to do that is compatible with
    what AB and AC wanted to do.

As branch AC2 refers to the same commit as AC, the above implies that
this is also compatible with what AB and AC2 wanted to do.

By extension, this means that rerere should recognize that the above
conflicts are the same. To do this, the labels on the conflict
markers are stripped, and the diff3 output is removed. The above
examples would both result in the following normalized conflict:

    <<<<<<<
    B
    =======
    C
    >>>>>>>

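The stripping step can be illustrated with a small Python sketch. It is not
the code from rerere.c; the function name `normalize_hunk`, the list-of-lines
input, and the simple prefix test for markers are assumptions made for
illustration only (in particular, a content line that happens to start with
seven marker characters would be misread by this sketch).

```python
def normalize_hunk(lines, marker_size=7):
    """Strip marker labels and drop the diff3 base section of one conflict.

    Hypothetical helper: `lines` is a list of lines without trailing newlines.
    """
    out = []
    in_base = False  # True while inside the "|||||||" (common ancestor) part
    for line in lines:
        if line.startswith("<" * marker_size):
            out.append("<" * marker_size)
        elif line.startswith("|" * marker_size):
            in_base = True
        elif line.startswith("=" * marker_size):
            in_base = False
            out.append("=" * marker_size)
        elif line.startswith(">" * marker_size):
            out.append(">" * marker_size)
        elif not in_base:
            out.append(line)
    return out


merge_style = ["<<<<<<< HEAD", "B", "=======", "C", ">>>>>>> AC"]
diff3_style = [
    "<<<<<<< HEAD", "B",
    "||||||| merged common ancestors", "A",
    "=======", "C",
    ">>>>>>> AC2",
]
```

Under these assumptions, `normalize_hunk(merge_style)` and
`normalize_hunk(diff3_style)` return the same list, which corresponds to the
normalized conflict described in this section.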
Sorting hunks
~~~~~~~~~~~~~

As before, let's imagine that a common ancestor had a file with line A
in its early part, and line X in its late part. And then four branches
are forked that do these things:

    - AB: changes A to B
    - AC: changes A to C
    - XY: changes X to Y
    - XZ: changes X to Z

Now, forking a branch ABAC off of branch AB and then merging AC into
it, and forking a branch ACAB off of branch AC and then merging AB
into it, would yield the conflict in a different order. The former
would say "A became B or C, what now?" while the latter would say "A
became C or B, what now?"

As a reminder, the act of merging AC into ABAC and resolving the
conflict to leave line D means that the user declares:

    After examining what branches AB and AC did, I believe that
    making line A into line D is the best thing to do that is
    compatible with what AB and AC wanted to do.

So the conflict we would see when merging AB into ACAB should be
resolved the same way; it is the resolution that is in line with that
declaration.

Similarly, imagine that a branch XYXZ was previously forked from XY,
XZ was merged into it, and the conflict "X became Y or Z" was resolved
into "X became W".

Now, if a branch ABXY was forked from AB and then merged XY, then ABXY
would have line B in its early part and line Y in its later part.
Such a merge would be quite clean. We can construct 4 combinations
using these four branches ((AB, AC) x (XY, XZ)).

Merging ABXY and ACXZ would make "an early A became B or C, a late X
became Y or Z" conflict, while merging ACXY and ABXZ would make "an
early A became C or B, a late X became Y or Z". We can see there are
4 combinations of ("B or C", "C or B") x ("Y or Z", "Z or Y").

By sorting, the conflict is given its canonical name, namely, "an
early part became B or C, a late part became Y or Z", and whenever
any of these four patterns appears, we can get to the same conflict
and resolution that we saw earlier.

Without the sorting, we would have to somehow find a previous
resolution among a combinatorial explosion of possible orderings.
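To make the effect of sorting concrete, here is a minimal Python sketch. Each
hunk is modelled as a pair of strings (the two sides of the conflict); the
names `sort_hunk` and `canonical` are hypothetical, and plain Python string
comparison stands in for whatever ordering rerere actually applies. Note that
only the two sides within each hunk are sorted; the hunks keep their order in
the file.

```python
def sort_hunk(side_one, side_two):
    """Return the two sides of one conflict hunk in sorted order."""
    return tuple(sorted((side_one, side_two)))


def canonical(hunks):
    """Sort the sides of every hunk, keeping the hunks in file order."""
    return [sort_hunk(one, two) for one, two in hunks]


# The four orderings in which the early and late conflicts can show up.
orderings = [
    [("B", "C"), ("Y", "Z")],
    [("B", "C"), ("Z", "Y")],
    [("C", "B"), ("Y", "Z")],
    [("C", "B"), ("Z", "Y")],
]

# All four collapse to a single canonical form.
canonical_forms = {tuple(canonical(hunks)) for hunks in orderings}
```

With this model, `canonical_forms` contains exactly one entry, so a resolution
recorded for any one of the four orderings can be found again for the others.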

Conflict ID calculation
~~~~~~~~~~~~~~~~~~~~~~~

Once the conflict normalization is done, the conflict ID is calculated
as the sha1 hash of the conflict hunks appended to each other,
separated by <NUL> characters. The conflict markers are stripped out
before the sha1 is calculated. So in the example above, where we
merge branch AC, which changes line A to line C, into branch AB, which
changes line A to line B, the conflict ID would be
SHA1('B<NUL>C<NUL>').

If there are multiple conflicts in one file, the sha1 is calculated
the same way with all hunks appended to each other, in the order in
which they appear in the file, separated by a <NUL> character.
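The calculation can be sketched in Python with the standard `hashlib` module.
The function name `conflict_id` and the pair-of-strings representation of a
hunk are assumptions for illustration. The sketch follows the notation used in
this document literally, so it hashes the bare side contents; whether line
terminators are part of the hashed bytes is not stated here, and the sketch
assumes they are not.

```python
import hashlib


def conflict_id(hunks):
    """Hash the already normalized and sorted hunks of one file.

    Hypothetical helper: `hunks` is a list of (side_one, side_two) strings
    in the order in which the conflicts appear in the file.
    """
    digest = hashlib.sha1()
    for side_one, side_two in hunks:
        for side in (side_one, side_two):
            # Each side is followed by a single NUL byte.
            digest.update(side.encode() + b"\0")
    return digest.hexdigest()


single = conflict_id([("B", "C")])
multiple = conflict_id([("B", "C"), ("Y", "Z")])
```

Here `single` corresponds to SHA1('B<NUL>C<NUL>'), and `multiple` simply
continues the same byte stream with the second hunk.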

Nested conflicts
~~~~~~~~~~~~~~~~

Nested conflicts are handled very similarly to "simple" conflicts.
Similar to simple conflicts, the conflict is first normalized by
stripping the labels from conflict markers, stripping the diff3
output, and then sorting the conflict hunks, both for the outer and
the inner conflict. This is done recursively, so any number of nested
conflicts can be handled.

The only difference is in how the conflict ID is calculated. For the
inner conflict, the conflict markers themselves are not stripped out
before calculating the sha1.

Say we have the following conflict for example:

    <<<<<<< HEAD
    1
    =======
    <<<<<<< HEAD
    3
    =======
    2
    >>>>>>> branch-2
    >>>>>>> branch-3~

After stripping out the labels of the conflict markers, and sorting
the hunks, the conflict would look as follows:

    <<<<<<<
    1
    =======
    <<<<<<<
    2
    =======
    3
    >>>>>>>
    >>>>>>>

and finally the conflict ID would be calculated as:
`sha1('1<NUL><<<<<<<\n2\n=======\n3\n>>>>>>><NUL>')`
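The recursion can be sketched in Python as follows. This is an illustration
under stated assumptions, not the implementation in rerere.c: the names
`parse_conflict` and `nested_conflict_id` are hypothetical, markers are
detected with a simple prefix test, sides are compared with Python's list
ordering, and the inner conflict is hashed in its normalized (label-free,
sorted) form with its markers kept.

```python
import hashlib

MARKER = 7  # assumed conflict marker width


def parse_conflict(lines, pos):
    """Parse one conflict whose '<' marker is at lines[pos - 1].

    Returns (side_one, side_two, next_pos). Both sides are lists of lines,
    sorted, with any nested conflict already normalized in place.
    """
    sides = [[], [], []]  # side one, side two, diff3 base (discarded)
    current = 0
    while pos < len(lines):
        line = lines[pos]
        if line.startswith("<" * MARKER):
            # Nested conflict: normalize it recursively and keep its markers.
            one, two, pos = parse_conflict(lines, pos + 1)
            sides[current].extend(
                ["<" * MARKER, *one, "=" * MARKER, *two, ">" * MARKER]
            )
            continue
        if line.startswith("|" * MARKER):
            current = 2
        elif line.startswith("=" * MARKER):
            current = 1
        elif line.startswith(">" * MARKER):
            one, two = sorted(sides[:2])
            return one, two, pos + 1
        else:
            sides[current].append(line)
        pos += 1
    raise ValueError("unterminated conflict")


def nested_conflict_id(lines):
    """Return (side_one, side_two, hex digest) for one outer conflict."""
    if not lines or not lines[0].startswith("<" * MARKER):
        raise ValueError("expected a conflict start marker")
    one, two, _ = parse_conflict(lines, 1)
    digest = hashlib.sha1()
    for side in (one, two):
        digest.update("\n".join(side).encode() + b"\0")
    return one, two, digest.hexdigest()


example = [
    "<<<<<<< HEAD", "1", "=======",
    "<<<<<<< HEAD", "3", "=======", "2", ">>>>>>> branch-2",
    ">>>>>>> branch-3~",
]
one, two, cid = nested_conflict_id(example)
```

For the example conflict, `one` is the line "1" and `two` is the inner
conflict with labels stripped and its sides in sorted order, so the outer
markers are dropped from the hashed data while the inner ones are kept.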

builtin/rerere.c

Lines changed: 2 additions & 2 deletions
@@ -75,7 +75,7 @@ int cmd_rerere(int argc, const char **argv, const char *prefix)
 	if (!strcmp(argv[0], "forget")) {
 		struct pathspec pathspec;
 		if (argc < 2)
-			warning("'git rerere forget' without paths is deprecated");
+			warning(_("'git rerere forget' without paths is deprecated"));
 		parse_pathspec(&pathspec, 0, PATHSPEC_PREFER_CWD,
 			       prefix, argv + 1);
 		return rerere_forget(&pathspec);
@@ -107,7 +107,7 @@ int cmd_rerere(int argc, const char **argv, const char *prefix)
 			const char *path = merge_rr.items[i].string;
 			const struct rerere_id *id = merge_rr.items[i].util;
 			if (diff_two(rerere_path(id, "preimage"), path, path, path))
-				die("unable to generate diff for %s", rerere_path(id, NULL));
+				die(_("unable to generate diff for '%s'"), rerere_path(id, NULL));
 		}
 	} else
 		usage_with_options(rerere_usage, options);
