Commit bdcc737

chriscool authored and gitster committed
partial-clone: add multiple remotes in the doc
While at it, let's remove a reference to ODB effort as the ODB effort
has been replaced by directly enhancing partial clone and promisor
remote features.

Signed-off-by: Christian Couder <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
1 parent 5eea123 commit bdcc737

1 file changed: +84 -33 lines changed

Documentation/technical/partial-clone.txt

Lines changed: 84 additions & 33 deletions
@@ -30,12 +30,20 @@ advance* during clone and fetch operations and thereby reduce download
 times and disk usage. Missing objects can later be "demand fetched"
 if/when needed.

+A remote that can later provide the missing objects is called a
+promisor remote, as it promises to send the objects when
+requested. Initially Git supported only one promisor remote, the origin
+remote from which the user cloned and that was configured in the
+"extensions.partialClone" config option. Later support for more than
+one promisor remote has been implemented.
+
 Use of partial clone requires that the user be online and the origin
-remote be available for on-demand fetching of missing objects. This may
-or may not be problematic for the user. For example, if the user can
-stay within the pre-selected subset of the source tree, they may not
-encounter any missing objects. Alternatively, the user could try to
-pre-fetch various objects if they know that they are going offline.
+remote or other promisor remotes be available for on-demand fetching
+of missing objects. This may or may not be problematic for the user.
+For example, if the user can stay within the pre-selected subset of
+the source tree, they may not encounter any missing objects.
+Alternatively, the user could try to pre-fetch various objects if they
+know that they are going offline.


 Non-Goals
@@ -100,18 +108,18 @@ or commits that reference missing trees.
 Handling Missing Objects
 ------------------------

-- An object may be missing due to a partial clone or fetch, or missing due
-to repository corruption. To differentiate these cases, the local
-repository specially indicates such filtered packfiles obtained from the
-promisor remote as "promisor packfiles".
+- An object may be missing due to a partial clone or fetch, or missing
+due to repository corruption. To differentiate these cases, the
+local repository specially indicates such filtered packfiles
+obtained from promisor remotes as "promisor packfiles".
 +
 These promisor packfiles consist of a "<name>.promisor" file with
 arbitrary contents (like the "<name>.keep" files), in addition to
 their "<name>.pack" and "<name>.idx" files.

 - The local repository considers a "promisor object" to be an object that
-it knows (to the best of its ability) that the promisor remote has promised
-that it has, either because the local repository has that object in one of
+it knows (to the best of its ability) that promisor remotes have promised
+that they have, either because the local repository has that object in one of
 its promisor packfiles, or because another promisor object refers to it.
 +
 When Git encounters a missing object, Git can see if it is a promisor object
@@ -123,12 +131,12 @@ expensive-to-modify list of missing objects.[a]
 - Since almost all Git code currently expects any referenced object to be
 present locally and because we do not want to force every command to do
 a dry-run first, a fallback mechanism is added to allow Git to attempt
-to dynamically fetch missing objects from the promisor remote.
+to dynamically fetch missing objects from promisor remotes.
 +
 When the normal object lookup fails to find an object, Git invokes
-fetch-object to try to get the object from the server and then retry
-the object lookup. This allows objects to be "faulted in" without
-complicated prediction algorithms.
+promisor_remote_get_direct() to try to get the object from a promisor
+remote and then retry the object lookup. This allows objects to be
+"faulted in" without complicated prediction algorithms.
 +
 For efficiency reasons, no check as to whether the missing object is
 actually a promisor object is performed.
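
As a reading aid for the hunk above, the "fault in" fallback can be modeled in a few lines of Python. This is only a toy sketch, not Git's C implementation: `ObjectStore`, `lookup`, and the `fetch_from_promisors` callback are hypothetical names, with the callback standing in for promisor_remote_get_direct().

```python
from typing import Callable, Dict, Iterable, Optional


class ObjectStore:
    """Toy local object store with a lazy-fetch fallback (hypothetical model)."""

    def __init__(
        self,
        fetch_from_promisors: Callable[[Iterable[str]], Dict[str, bytes]],
    ) -> None:
        self.objects: Dict[str, bytes] = {}
        # Stand-in for promisor_remote_get_direct(): returns whatever
        # objects the promisor remotes were able to provide.
        self._fetch = fetch_from_promisors

    def lookup(self, oid: str) -> Optional[bytes]:
        # Normal object lookup comes first.
        if oid in self.objects:
            return self.objects[oid]
        # Fallback: ask the promisor remotes without first checking that
        # `oid` really is a promisor object, then retry the lookup once.
        self.objects.update(self._fetch([oid]))
        return self.objects.get(oid)
```

A caller never needs to know whether an object was already present or was fetched on demand; a lookup that still fails after the retry simply returns `None` in this model.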
@@ -157,8 +165,7 @@ and prefetch those objects in bulk.
 +
 We are not happy with this global variable and would like to remove it,
 but that requires significant refactoring of the object code to pass an
-additional flag. We hope that concurrent efforts to add an ODB API can
-encompass this.
+additional flag.


 Fetching Missing Objects
@@ -182,21 +189,63 @@ has been updated to not use any object flags when the corresponding argument
 though they are not necessary.


+Using many promisor remotes
+---------------------------
+
+Many promisor remotes can be configured and used.
+
+This allows for example a user to have multiple geographically-close
+cache servers for fetching missing blobs while continuing to do
+filtered `git-fetch` commands from the central server.
+
+When fetching objects, promisor remotes are tried one after the other
+until all the objects have been fetched.
+
+Remotes that are considered "promisor" remotes are those specified by
+the following configuration variables:
+
+- `extensions.partialClone = <name>`
+
+- `remote.<name>.promisor = true`
+
+- `remote.<name>.partialCloneFilter = ...`
+
+Only one promisor remote can be configured using the
+`extensions.partialClone` config variable. This promisor remote will
+be the last one tried when fetching objects.
+
+We decided to make it the last one we try, because it is likely that
+someone using many promisor remotes is doing so because the other
+promisor remotes are better for some reason (maybe they are closer or
+faster for some kind of objects) than the origin, and the origin is
+likely to be the remote specified by extensions.partialClone.
+
+This justification is not very strong, but one choice had to be made,
+and anyway the long term plan should be to make the order somehow
+fully configurable.
+
+For now though the other promisor remotes will be tried in the order
+they appear in the config file.
+
 Current Limitations
 -------------------

-- The remote used for a partial clone (or the first partial fetch
-following a regular clone) is marked as the "promisor remote".
+- It is not possible to specify the order in which the promisor
+remotes are tried in other ways than the order in which they appear
+in the config file.
 +
-We are currently limited to a single promisor remote and only that
-remote may be used for subsequent partial fetches.
+It is also not possible to specify an order to be used when fetching
+from one remote and a different order when fetching from another
+remote.
+
+- It is not possible to push only specific objects to a promisor
+remote.
 +
-We accept this limitation because we believe initial users of this
-feature will be using it on repositories with a strong single central
-server.
+It is not possible to push at the same time to multiple promisor
+remotes in a specific order.

-- Dynamic object fetching will only ask the promisor remote for missing
-objects. We assume that the promisor remote has a complete view of the
+- Dynamic object fetching will only ask promisor remotes for missing
+objects. We assume that promisor remotes have a complete view of the
 repository and can satisfy all such requests.

 - Repack essentially treats promisor and non-promisor packfiles as 2
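
The ordering rule added in the hunk above (promisor remotes in config-file order, with the `extensions.partialClone` remote tried last) can be illustrated with a small Python model. This is a simplified sketch based only on the text of the diff, not on Git's source: `promisor_order`, `fetch_missing`, and the `fetchers` mapping are hypothetical names. It accepts only the literal value `true` as a boolean, and it assumes the config is already available as (key, value) pairs in file order.

```python
from typing import Callable, Dict, FrozenSet, Iterable, List, Optional, Set, Tuple


def promisor_order(config: Iterable[Tuple[str, str]]) -> List[str]:
    """Return promisor remote names in the order they would be tried."""
    order: List[str] = []
    last: Optional[str] = None
    for key, value in config:
        lowered = key.lower()
        if lowered == "extensions.partialclone":
            last = value
            continue
        if not lowered.startswith("remote."):
            continue
        # Remote names may contain dots, so split off only the final part.
        name, _, var = key[len("remote."):].rpartition(".")
        var = var.lower()
        is_promisor = (
            (var == "promisor" and value.lower() == "true")
            or var == "partialclonefilter"
        )
        if name and is_promisor and name not in order:
            order.append(name)
    # The extensions.partialClone remote is always tried last.
    if last is not None:
        if last in order:
            order.remove(last)
        order.append(last)
    return order


def fetch_missing(
    oids: Iterable[str],
    order: List[str],
    fetchers: Dict[str, Callable[[FrozenSet[str]], Set[str]]],
) -> Set[str]:
    """Try each remote in turn; return the object IDs still missing."""
    remaining = set(oids)
    for name in order:
        if not remaining:
            break  # everything has been fetched, stop early
        remaining -= fetchers[name](frozenset(remaining))
    return remaining
```

In this model, a config that lists `origin` first and two cache remotes afterwards still yields the caches first and `origin` last, provided `extensions.partialClone` names `origin`.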
@@ -218,15 +267,17 @@ server.
 Future Work
 -----------

-- Allow more than one promisor remote and define a strategy for fetching
-missing objects from specific promisor remotes or of iterating over the
-set of promisor remotes until a missing object is found.
+- Improve the way to specify the order in which promisor remotes are
+tried.
 +
-A user might want to have multiple geographically-close cache servers
-for fetching missing blobs while continuing to do filtered `git-fetch`
-commands from the central server, for example.
+For example this could allow to specify explicitly something like:
+"When fetching from this remote, I want to use these promisor remotes
+in this order, though, when pushing or fetching to that remote, I want
+to use those promisor remotes in that order."
+
+- Allow pushing to promisor remotes.
 +
-Or the user might want to work in a triangular work flow with multiple
+The user might want to work in a triangular work flow with multiple
 promisor remotes that each have an incomplete view of the repository.

 - Allow repack to work on promisor packfiles (while keeping them distinct
