Commit 1618bd6

Merge branch 'tb/incremental-midx-part-2' into seen
Incremental updates of multi-pack index files.

* tb/incremental-midx-part-2:
  fixup! pack-bitmap.c: open and store incremental bitmap layers
  fixup! midx: implement writing incremental MIDX bitmaps
  midx: implement writing incremental MIDX bitmaps
  pack-bitmap.c: use `ewah_or_iterator` for type bitmap iterators
  pack-bitmap.c: keep track of each layer's type bitmaps
  ewah: implement `struct ewah_or_iterator`
  pack-bitmap.c: apply pseudo-merge commits with incremental MIDXs
  pack-bitmap.c: compute disk-usage with incremental MIDXs
  pack-bitmap.c: teach `rev-list --test-bitmap` about incremental MIDXs
  pack-bitmap.c: support bitmap pack-reuse with incremental MIDXs
  pack-bitmap.c: teach `show_objects_for_type()` about incremental MIDXs
  pack-bitmap.c: teach `bitmap_for_commit()` about incremental MIDXs
  pack-bitmap.c: open and store incremental bitmap layers
  pack-revindex: prepare for incremental MIDX bitmaps
  Documentation: describe incremental MIDX bitmaps
2 parents aa6a093 + d9ab1cd commit 1618bd6

10 files changed: +549 -112 lines changed

Documentation/technical/multi-pack-index.txt

Lines changed: 64 additions & 0 deletions
@@ -164,6 +164,70 @@ objects_nr($H2) + objects_nr($H1) + i
 (in the C implementation, this is often computed as `i +
 m->num_objects_in_base`).

+=== Pseudo-pack order for incremental MIDXs
+
+The original implementation of multi-pack reachability bitmaps defined
+the pseudo-pack order in linkgit:gitformat-pack[5] (see the section
+titled "multi-pack-index reverse indexes") roughly as follows:
+
+____
+In short, a MIDX's pseudo-pack is the de-duplicated concatenation of
+objects in packs stored by the MIDX, laid out in pack order, and the
+packs arranged in MIDX order (with the preferred pack coming first).
+____
+
+In the incremental MIDX design, we extend this definition to include
+objects from multiple layers of the MIDX chain. The pseudo-pack order
+for incremental MIDXs is determined by concatenating the pseudo-pack
+ordering for each layer of the MIDX chain in order. Formally, two objects
+`o1` and `o2` are compared as follows:
+
+1. If `o1` appears in an earlier layer of the MIDX chain than `o2`, then
+   `o1` is considered less than `o2`.
+2. Otherwise, if `o1` and `o2` appear in the same MIDX layer, and that
+   MIDX layer has no base, then if one of `pack(o1)` and `pack(o2)` is
+   preferred and the other is not, the preferred one sorts first. If
+   there is a base layer (i.e. the MIDX layer is not the first layer in
+   the chain), and `pack(o1)` appears earlier in that MIDX layer's
+   pack order, then `o1` is less than `o2`. Likewise, if `pack(o2)`
+   appears earlier, then the opposite is true.
+3. Otherwise, `o1` and `o2` appear in the same pack, and thus in the
+   same MIDX layer. Sort `o1` and `o2` by their offset within their
+   containing packfile.
+
+=== Reachability bitmaps and incremental MIDXs
+
+Each layer of an incremental MIDX chain may have its objects (and the
+objects from any previous layer in the same MIDX chain) represented in
+its own `*.bitmap` file.
+
+The structure of a `*.bitmap` file belonging to an incremental MIDX
+chain is identical to that of a non-incremental MIDX bitmap, or a
+classic single-pack bitmap. Since objects are added to the end of the
+incremental MIDX's pseudo-pack order (see above), it is possible to
+extend a bitmap when appending to the end of a MIDX chain.
+
+(Note: it is likewise possible to compress a contiguous sequence of
+incremental MIDX layers, and their `*.bitmap` files, into a single layer
+and `*.bitmap`, but this is not yet implemented.)
+
+The object positions used are global within the pseudo-pack order, so
+subsequent layers will have, for example, `m->num_objects_in_base`
+leading `0` bits in each of their four type bitmaps. This follows from
+the fact that we only write type bitmap entries for objects present in
+the layer immediately corresponding to the bitmap.
+
+Note also that only the bitmap pertaining to the most recent layer in an
+incremental MIDX chain is used to store reachability information about
+the interesting and uninteresting objects in a reachability query.
+Earlier bitmap layers are only used to look up commit and pseudo-merge
+bitmaps from that layer, as well as the type-level bitmaps for objects
+in that layer.
+
+To simplify the implementation, type-level bitmaps are iterated
+simultaneously, and their results are OR'd together to avoid recursively
+calling internal bitmap functions.
+
 Future Work
 -----------
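
The three comparison rules in the documentation hunk above can be sketched as a small comparator. This is only an illustration, not code from the commit: `struct obj_loc`, `pseudo_pack_cmp()` and all of their fields are hypothetical names, and the sketch assumes that two non-preferred packs in a layer without a base fall back to plain pack order.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified description of where one object lives. */
struct obj_loc {
	uint32_t layer;     /* position of the MIDX layer in the chain, 0 = oldest */
	uint32_t pack;      /* position of the pack in that layer's pack order */
	int pack_preferred; /* non-zero if that pack is the layer's preferred pack */
	uint64_t offset;    /* offset of the object within its pack */
};

static int cmp_u64(uint64_t a, uint64_t b)
{
	return (a > b) - (a < b);
}

/* Returns <0 if a sorts before b, >0 if after, 0 if they are the same spot. */
static int pseudo_pack_cmp(const struct obj_loc *a, const struct obj_loc *b)
{
	/* Rule 1: objects in earlier layers sort first. */
	if (a->layer != b->layer)
		return cmp_u64(a->layer, b->layer);

	if (a->pack != b->pack) {
		/* Rule 2: a preferred pack only matters in a layer with no base. */
		if (a->layer == 0 && !a->pack_preferred != !b->pack_preferred)
			return a->pack_preferred ? -1 : 1;
		/* Assumption: otherwise the layer's pack order decides. */
		return cmp_u64(a->pack, b->pack);
	}

	/* Rule 3: same pack (and therefore same layer), so compare offsets. */
	return cmp_u64(a->offset, b->offset);
}
```

Because the layer is compared first, every object of a base layer sorts before every object of a later layer, which is what lets a new layer append to the existing pseudo-pack order instead of reshuffling it.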

builtin/pack-objects.c

Lines changed: 2 additions & 1 deletion
@@ -1370,7 +1370,8 @@ static void write_pack_file(void)

 			if (write_bitmap_index) {
 				bitmap_writer_init(&bitmap_writer,
-						   the_repository, &to_pack);
+						   the_repository, &to_pack,
+						   NULL);
 				bitmap_writer_set_checksum(&bitmap_writer, hash);
 				bitmap_writer_build_type_index(&bitmap_writer,
 							       written_list);

ewah/ewah_bitmap.c

Lines changed: 33 additions & 0 deletions
@@ -372,6 +372,39 @@ void ewah_iterator_init(struct ewah_iterator *it, struct ewah_bitmap *parent)
 	read_new_rlw(it);
 }

+void ewah_or_iterator_init(struct ewah_or_iterator *it,
+			   struct ewah_bitmap **parents, size_t nr)
+{
+	size_t i;
+
+	memset(it, 0, sizeof(*it));
+
+	ALLOC_ARRAY(it->its, nr);
+	for (i = 0; i < nr; i++)
+		ewah_iterator_init(&it->its[it->nr++], parents[i]);
+}
+
+int ewah_or_iterator_next(eword_t *next, struct ewah_or_iterator *it)
+{
+	eword_t buf, out = 0;
+	size_t i;
+	int ret = 0;
+
+	for (i = 0; i < it->nr; i++)
+		if (ewah_iterator_next(&buf, &it->its[i])) {
+			out |= buf;
+			ret = 1;
+		}
+
+	*next = out;
+	return ret;
+}
+
+void ewah_or_iterator_free(struct ewah_or_iterator *it)
+{
+	free(it->its);
+}
+
 void ewah_xor(
 	struct ewah_bitmap *ewah_i,
 	struct ewah_bitmap *ewah_j,
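
To make the behaviour of `ewah_or_iterator_next()` easier to follow, here is a minimal sketch of the same OR-while-any-iterator-has-words loop over plain, uncompressed word arrays. It does not use the EWAH API at all: `struct word_iter`, `word_iter_next()` and `or_iter_next()` are hypothetical stand-ins invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an iterator over one layer's bitmap words. */
struct word_iter {
	const uint64_t *words;
	size_t nr;
	size_t pos;
};

/* Yields the next word and returns 1, or returns 0 once exhausted. */
static int word_iter_next(uint64_t *next, struct word_iter *it)
{
	if (it->pos >= it->nr)
		return 0;
	*next = it->words[it->pos++];
	return 1;
}

/*
 * Advances every iterator by one word and ORs together the words that
 * were produced. Returns 1 while at least one iterator still had a word,
 * so shorter (older) layers simply stop contributing bits.
 */
static int or_iter_next(uint64_t *next, struct word_iter *its, size_t nr)
{
	uint64_t buf, out = 0;
	size_t i;
	int ret = 0;

	for (i = 0; i < nr; i++)
		if (word_iter_next(&buf, &its[i])) {
			out |= buf;
			ret = 1;
		}

	*next = out;
	return ret;
}
```

The loop keeps going until all inputs are exhausted rather than stopping at the shortest one, which matters here because a later layer's type bitmap is longer than that of any layer beneath it.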

ewah/ewok.h

Lines changed: 12 additions & 0 deletions
@@ -148,6 +148,18 @@ void ewah_iterator_init(struct ewah_iterator *it, struct ewah_bitmap *parent);
  */
 int ewah_iterator_next(eword_t *next, struct ewah_iterator *it);

+struct ewah_or_iterator {
+	struct ewah_iterator *its;
+	size_t nr;
+};
+
+void ewah_or_iterator_init(struct ewah_or_iterator *it,
+			   struct ewah_bitmap **parents, size_t nr);
+
+int ewah_or_iterator_next(eword_t *next, struct ewah_or_iterator *it);
+
+void ewah_or_iterator_free(struct ewah_or_iterator *it);
+
 void ewah_xor(
 	struct ewah_bitmap *ewah_i,
 	struct ewah_bitmap *ewah_j,

midx-write.c

Lines changed: 23 additions & 12 deletions
@@ -827,20 +827,26 @@ static struct commit **find_commits_for_midx_bitmap(uint32_t *indexed_commits_nr
 	return cb.commits;
 }

-static int write_midx_bitmap(const char *midx_name,
+static int write_midx_bitmap(struct write_midx_context *ctx,
+			     const char *object_dir,
 			     const unsigned char *midx_hash,
 			     struct packing_data *pdata,
 			     struct commit **commits,
 			     uint32_t commits_nr,
-			     uint32_t *pack_order,
 			     unsigned flags)
 {
 	int ret, i;
 	uint16_t options = 0;
 	struct bitmap_writer writer;
 	struct pack_idx_entry **index;
-	char *bitmap_name = xstrfmt("%s-%s.bitmap", midx_name,
-				    hash_to_hex(midx_hash));
+	struct strbuf bitmap_name = STRBUF_INIT;
+
+	if (ctx->incremental)
+		get_split_midx_filename_ext(&bitmap_name, object_dir, midx_hash,
+					    MIDX_EXT_BITMAP);
+	else
+		get_midx_filename_ext(&bitmap_name, object_dir, midx_hash,
+				      MIDX_EXT_BITMAP);

 	trace2_region_enter("midx", "write_midx_bitmap", the_repository);

@@ -859,7 +865,8 @@ static int write_midx_bitmap(const char *midx_name,
 	for (i = 0; i < pdata->nr_objects; i++)
 		index[i] = &pdata->objects[i].idx;

-	bitmap_writer_init(&writer, the_repository, pdata);
+	bitmap_writer_init(&writer, the_repository, pdata,
+			   ctx->incremental ? ctx->base_midx : NULL);
 	bitmap_writer_show_progress(&writer, flags & MIDX_PROGRESS);
 	bitmap_writer_build_type_index(&writer, index);

@@ -877,19 +884,19 @@ static int write_midx_bitmap(const char *midx_name,
 	 * bitmap_writer_finish().
 	 */
 	for (i = 0; i < pdata->nr_objects; i++)
-		index[pack_order[i]] = &pdata->objects[i].idx;
+		index[ctx->pack_order[i]] = &pdata->objects[i].idx;

 	bitmap_writer_select_commits(&writer, commits, commits_nr);
 	ret = bitmap_writer_build(&writer);
 	if (ret < 0)
 		goto cleanup;

 	bitmap_writer_set_checksum(&writer, midx_hash);
-	bitmap_writer_finish(&writer, index, bitmap_name, options);
+	bitmap_writer_finish(&writer, index, bitmap_name.buf, options);

 cleanup:
 	free(index);
-	free(bitmap_name);
+	strbuf_release(&bitmap_name);
 	bitmap_writer_free(&writer);

 	trace2_region_leave("midx", "write_midx_bitmap", the_repository);
@@ -1073,8 +1080,6 @@ static int write_midx_internal(const char *object_dir,
 	trace2_region_enter("midx", "write_midx_internal", the_repository);

 	ctx.incremental = !!(flags & MIDX_WRITE_INCREMENTAL);
-	if (ctx.incremental && (flags & MIDX_WRITE_BITMAP))
-		die(_("cannot write incremental MIDX with bitmap"));

 	if (ctx.incremental)
 		strbuf_addf(&midx_name,
@@ -1116,6 +1121,12 @@ static int write_midx_internal(const char *object_dir,
 	if (ctx.incremental) {
 		struct multi_pack_index *m = ctx.base_midx;
 		while (m) {
+			if (flags & MIDX_WRITE_BITMAP && load_midx_revindex(m)) {
+				error(_("could not load reverse index for MIDX %s"),
+				      hash_to_hex(get_midx_checksum(m)));
+				result = 1;
+				goto cleanup;
+			}
 			ctx.num_multi_pack_indexes_before++;
 			m = m->base_midx;
 		}
@@ -1405,8 +1416,8 @@ static int write_midx_internal(const char *object_dir,
 		FREE_AND_NULL(ctx.entries);
 		ctx.entries_nr = 0;

-		if (write_midx_bitmap(midx_name.buf, midx_hash, &pdata,
-				      commits, commits_nr, ctx.pack_order,
+		if (write_midx_bitmap(&ctx, object_dir,
+				      midx_hash, &pdata, commits, commits_nr,
 				      flags) < 0) {
 			error(_("could not write multi-pack bitmap"));
 			result = 1;
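
The new check in `write_midx_internal()` walks the existing chain from the tip down to the oldest layer and refuses to continue if a bitmap was requested but some layer's reverse index cannot be loaded. A minimal sketch of that walk follows, using a hypothetical `struct toy_midx` whose `has_revindex` flag stands in for the outcome of loading a reverse index; none of these names come from the commit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one layer of an incremental MIDX chain. */
struct toy_midx {
	struct toy_midx *base; /* next older layer, or NULL for the first layer */
	int has_revindex;      /* non-zero if its reverse index could be loaded */
};

/*
 * Counts the layers that already exist beneath a new layer. When a bitmap
 * is wanted, every existing layer needs a usable reverse index; otherwise
 * the walk fails with -1 and leaves *nr_before untouched.
 */
static int count_layers_before(const struct toy_midx *tip, int want_bitmap,
			       size_t *nr_before)
{
	const struct toy_midx *m;
	size_t nr = 0;

	for (m = tip; m; m = m->base) {
		if (want_bitmap && !m->has_revindex)
			return -1;
		nr++;
	}

	*nr_before = nr;
	return 0;
}
```

Failing early on the walk is a reasonable design here, since positions in base layers can only be resolved through those reverse indexes when the bitmap is written later.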

pack-bitmap-write.c

Lines changed: 49 additions & 16 deletions
@@ -25,6 +25,8 @@
 #include "alloc.h"
 #include "refs.h"
 #include "strmap.h"
+#include "midx.h"
+#include "pack-revindex.h"

 struct bitmapped_commit {
 	struct commit *commit;
@@ -42,14 +44,16 @@ static inline int bitmap_writer_nr_selected_commits(struct bitmap_writer *writer
 }

 void bitmap_writer_init(struct bitmap_writer *writer, struct repository *r,
-			struct packing_data *pdata)
+			struct packing_data *pdata,
+			struct multi_pack_index *midx)
 {
 	memset(writer, 0, sizeof(struct bitmap_writer));
 	if (writer->bitmaps)
 		BUG("bitmap writer already initialized");
 	writer->bitmaps = kh_init_oid_map();
 	writer->pseudo_merge_commits = kh_init_oid_map();
 	writer->to_pack = pdata;
+	writer->midx = midx;

 	string_list_init_dup(&writer->pseudo_merge_groups);

@@ -112,6 +116,11 @@ void bitmap_writer_build_type_index(struct bitmap_writer *writer,
 				    struct pack_idx_entry **index)
 {
 	uint32_t i;
+	uint32_t base_objects = 0;
+
+	if (writer->midx)
+		base_objects = writer->midx->num_objects +
+			writer->midx->num_objects_in_base;

 	writer->commits = ewah_new();
 	writer->trees = ewah_new();
@@ -141,19 +150,19 @@ void bitmap_writer_build_type_index(struct bitmap_writer *writer,

 		switch (real_type) {
 		case OBJ_COMMIT:
-			ewah_set(writer->commits, i);
+			ewah_set(writer->commits, i + base_objects);
 			break;

 		case OBJ_TREE:
-			ewah_set(writer->trees, i);
+			ewah_set(writer->trees, i + base_objects);
 			break;

 		case OBJ_BLOB:
-			ewah_set(writer->blobs, i);
+			ewah_set(writer->blobs, i + base_objects);
 			break;

 		case OBJ_TAG:
-			ewah_set(writer->tags, i);
+			ewah_set(writer->tags, i + base_objects);
 			break;

 		default:
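
The `i + base_objects` change above is what makes bit positions global across the chain. The toy sketch below illustrates the effect with a fixed-size, uncompressed bitmap; `struct toy_bitmap`, `toy_set()`, `toy_get()` and `toy_build_type_bitmap()` are hypothetical helpers, not part of the EWAH or bitmap-writer APIs, and object types are reduced to plain integers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_WORDS 4 /* 4 * 64 = 256 bit positions, enough for the example */

struct toy_bitmap {
	uint64_t words[TOY_WORDS];
};

static void toy_set(struct toy_bitmap *b, uint32_t pos)
{
	if (pos >= TOY_WORDS * 64)
		return; /* out of range for this toy bitmap */
	b->words[pos / 64] |= (uint64_t)1 << (pos % 64);
}

static int toy_get(const struct toy_bitmap *b, uint32_t pos)
{
	if (pos >= TOY_WORDS * 64)
		return 0;
	return (int)((b->words[pos / 64] >> (pos % 64)) & 1);
}

/*
 * Builds one type bitmap for a single layer. types[i] is the type of the
 * layer's i-th object; its bit lands at the global position
 * i + base_objects, so the first base_objects bits remain zero.
 */
static void toy_build_type_bitmap(struct toy_bitmap *out, const int *types,
				  uint32_t nr, int want, uint32_t base_objects)
{
	uint32_t i;

	memset(out, 0, sizeof(*out));
	for (i = 0; i < nr; i++)
		if (types[i] == want)
			toy_set(out, i + base_objects);
}
```

With 70 objects in the layers below, for instance, the first object of the new layer is recorded at bit 70, and bits 0 through 69 of that layer's type bitmaps stay clear; they are covered by the type bitmaps of the earlier layers.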
@@ -206,19 +215,37 @@ void bitmap_writer_push_commit(struct bitmap_writer *writer,
 static uint32_t find_object_pos(struct bitmap_writer *writer,
 				const struct object_id *oid, int *found)
 {
-	struct object_entry *entry = packlist_find(writer->to_pack, oid);
+	struct object_entry *entry;
+
+	entry = packlist_find(writer->to_pack, oid);
+	if (entry) {
+		uint32_t base_objects = 0;
+		if (writer->midx)
+			base_objects = writer->midx->num_objects +
+				writer->midx->num_objects_in_base;

-	if (!entry) {
 		if (found)
-			*found = 0;
-		warning("Failed to write bitmap index. Packfile doesn't have full closure "
-			"(object %s is missing)", oid_to_hex(oid));
-		return 0;
+			*found = 1;
+		return oe_in_pack_pos(writer->to_pack, entry) + base_objects;
+	} else if (writer->midx) {
+		uint32_t at, pos;
+
+		if (!bsearch_midx(oid, writer->midx, &at))
+			goto missing;
+		if (midx_to_pack_pos(writer->midx, at, &pos) < 0)
+			goto missing;
+
+		if (found)
+			*found = 1;
+		return pos;
 	}

+missing:
 	if (found)
-		*found = 1;
-	return oe_in_pack_pos(writer->to_pack, entry);
+		*found = 0;
+	warning("Failed to write bitmap index. Packfile doesn't have full closure "
+		"(object %s is missing)", oid_to_hex(oid));
+	return 0;
 }

 static void compute_xor_offsets(struct bitmap_writer *writer)
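
The lookup order in the new `find_object_pos()` can be summarised as: try the objects being written first (shifting by the size of the base), then fall back to the base MIDX (whose positions are already global), and only then report the object as missing. The sketch below mirrors that order with linear searches over small integer IDs; `toy_find_pos()` and its parameters are hypothetical, with `base_nr` playing the role of `base_objects`.

```c
#include <assert.h>
#include <stdint.h>

/*
 * new_ids lists the objects of the layer being written, in layer-local
 * order; base_ids lists the objects of all earlier layers, already in
 * global pseudo-pack order. Returns the global position of id and sets
 * *found, or returns 0 with *found cleared when the object is absent.
 */
static uint32_t toy_find_pos(const uint32_t *new_ids, uint32_t new_nr,
			     const uint32_t *base_ids, uint32_t base_nr,
			     uint32_t id, int *found)
{
	uint32_t i;

	/* Objects in the new layer come after everything in the base. */
	for (i = 0; i < new_nr; i++)
		if (new_ids[i] == id) {
			*found = 1;
			return i + base_nr;
		}

	/* Base positions are already global, so no offset is applied. */
	for (i = 0; i < base_nr; i++)
		if (base_ids[i] == id) {
			*found = 1;
			return i;
		}

	*found = 0;
	return 0;
}
```

Since a return value of 0 is also a valid position, callers of such a function have to consult `found` rather than the return value to detect a miss, just as the real function reports absence through its `found` out-parameter.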
@@ -585,7 +612,7 @@ int bitmap_writer_build(struct bitmap_writer *writer)
 	struct prio_queue queue = { compare_commits_by_gen_then_commit_date };
 	struct prio_queue tree_queue = { NULL };
 	struct bitmap_index *old_bitmap;
-	uint32_t *mapping;
+	uint32_t *mapping = NULL;
 	int closed = 1; /* until proven otherwise */

 	if (writer->show_progress)
@@ -1018,7 +1045,7 @@ void bitmap_writer_finish(struct bitmap_writer *writer,
 	struct strbuf tmp_file = STRBUF_INIT;
 	struct hashfile *f;
 	off_t *offsets = NULL;
-	uint32_t i;
+	uint32_t i, base_objects;

 	struct bitmap_disk_header header;

@@ -1044,6 +1071,12 @@ void bitmap_writer_finish(struct bitmap_writer *writer,
 	if (options & BITMAP_OPT_LOOKUP_TABLE)
 		CALLOC_ARRAY(offsets, writer->to_pack->nr_objects);

+	if (writer->midx)
+		base_objects = writer->midx->num_objects +
+			writer->midx->num_objects_in_base;
+	else
+		base_objects = 0;
+
 	for (i = 0; i < bitmap_writer_nr_selected_commits(writer); i++) {
 		struct bitmapped_commit *stored = &writer->selected[i];
 		int commit_pos = oid_pos(&stored->commit->object.oid, index,
@@ -1052,7 +1085,7 @@ void bitmap_writer_finish(struct bitmap_writer *writer,

 		if (commit_pos < 0)
 			BUG(_("trying to write commit not in index"));
-		stored->commit_pos = commit_pos;
+		stored->commit_pos = commit_pos + base_objects;
 	}

 	write_selected_commits_v1(writer, f, offsets);
