Commit e51217e

peff authored and gitster committed
t5000: test tar files that overflow ustar headers
The ustar format only has room for 11 (or 12, depending on
some implementations) octal digits for the size and mtime of
each file. For values larger than this, we have to add pax
extended headers to specify the real data, and git does not
yet know how to do so.

Before fixing that, let's start off with some test
infrastructure, as designing portable and efficient tests
for this is non-trivial.

We want to use the system tar to check our output (because
what we really care about is interoperability), but we can't
rely on it:

  1. being able to read pax headers

  2. being able to handle huge sizes or mtimes

  3. supporting a "t" format we can parse

So as a prerequisite, we can feed the system tar a reference
tarball to make sure it can handle these features. The
reference tar here was created with:

  dd if=/dev/zero seek=64G bs=1 count=1 of=huge
  touch -d @68719476737 huge
  tar cf - --format=pax | head -c 2048

using GNU tar. Note that this is not a complete tarfile, but
it's enough to contain the headers we want to examine.

Likewise, we need to convince git that it has a 64GB blob to
output. Running "git add" on that 64GB file takes many
minutes of CPU, and even compressed, the result is 64MB. So
again, I pre-generated that loose object, and then took only
the first 2k of it. That should be enough to generate 2MB of
data before hitting an inflate error, which is plenty for us
to generate the tar header (and then die of SIGPIPE while
streaming the rest out).

The tests are split so that we test as much as we can even
with an uncooperative system tar. This actually catches the
current breakage (which is that we die("BUG") trying to
write the ustar header) on every system, and then on systems
where we can, we go farther and actually verify the result.

Helped-by: Robin H. Johnson <[email protected]>
Signed-off-by: Jeff King <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
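
To make the overflow described in the commit message concrete, the sketch below checks whether a decimal value fits in an octal field of a given width. The helper name `fits_octal` is hypothetical and is not part of git or its test suite; it only illustrates the arithmetic behind the 11- and 12-digit limits.

```shell
#!/bin/sh
# Hypothetical helper (not part of git): report whether the decimal value $1
# fits in an octal field of $2 digits, like a ustar size or mtime field.
fits_octal () {
	digits=$(printf '%o' "$1")
	if test "${#digits}" -le "$2"
	then
		echo yes
	else
		echo no
	fi
}

# The reference size and mtime used by the tests in this commit.
value=68719476737

printf '%o\n' "$value"      # octal form of the value
fits_octal "$value" 11      # strict 11-digit field
fits_octal "$value" 12      # lenient 12-digit field
fits_octal 8589934591 11    # 8^11 - 1, the largest 11-digit octal value
```

The value is 8^12 + 1, so its octal form needs 13 digits and it overflows both field widths; the script prints `1000000000001`, `no`, `no`, and `yes`.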
1 parent 4886081 commit e51217e

3 files changed: +74 -0 lines changed

t/t5000-tar-tree.sh

Lines changed: 74 additions & 0 deletions
@@ -319,4 +319,78 @@ test_expect_success 'catch non-matching pathspec' '
 	test_must_fail git archive -v HEAD -- "*.abc" >/dev/null
 '
 
+# Pull the size and date of each entry in a tarfile using the system tar.
+#
+# We'll pull out only the year from the date; that avoids any question of
+# timezones impacting the result (as long as we keep our test times away from a
+# year boundary; our reference times are all in August).
+#
+# The output of tar_info is expected to be "<size> <year>", both in decimal. It
+# ignores the return value of tar. We have to do this, because some of our test
+# input is only partial (the real data is 64GB in some cases).
+tar_info () {
+	"$TAR" tvf "$1" |
+	awk '{
+		split($4, date, "-")
+		print $3 " " date[1]
+	}'
+}
+
+# See if our system tar can handle a tar file with huge sizes and dates far in
+# the future, and that we can actually parse its output.
+#
+# The reference file was generated by GNU tar, and the magic time and size are
+# both octal 01000000000001, which overflows normal ustar fields.
+test_lazy_prereq TAR_HUGE '
+	echo "68719476737 4147" >expect &&
+	tar_info "$TEST_DIRECTORY"/t5000/huge-and-future.tar >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'set up repository with huge blob' '
+	obj_d=19 &&
+	obj_f=f9c8273ec45a8938e6999cb59b3ff66739902a &&
+	obj=${obj_d}${obj_f} &&
+	mkdir -p .git/objects/$obj_d &&
+	cp "$TEST_DIRECTORY"/t5000/$obj .git/objects/$obj_d/$obj_f &&
+	rm -f .git/index &&
+	git update-index --add --cacheinfo 100644,$obj,huge &&
+	git commit -m huge
+'
+
+# We expect git to die with SIGPIPE here (otherwise we
+# would generate the whole 64GB).
+test_expect_failure 'generate tar with huge size' '
+	{
+		git archive HEAD
+		echo $? >exit-code
+	} | test_copy_bytes 4096 >huge.tar &&
+	echo 141 >expect &&
+	test_cmp expect exit-code
+'
+
+test_expect_failure TAR_HUGE 'system tar can read our huge size' '
+	echo 68719476737 >expect &&
+	tar_info huge.tar | cut -d" " -f1 >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'set up repository with far-future commit' '
+	rm -f .git/index &&
+	echo content >file &&
+	git add file &&
+	GIT_COMMITTER_DATE="@68719476737 +0000" \
+		git commit -m "tempori parendum"
+'
+
+test_expect_failure 'generate tar with future mtime' '
+	git archive HEAD >future.tar
+'
+
+test_expect_failure TAR_HUGE 'system tar can read our future mtime' '
+	echo 4147 >expect &&
+	tar_info future.tar | cut -d" " -f2 >actual &&
+	test_cmp expect actual
+'
+
 test_done
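
The 'generate tar with huge size' test expects exit code 141 because shells report death by signal as 128 plus the signal number, and SIGPIPE is commonly signal 13. The sketch below reproduces the same pattern outside the test suite. It is an illustration only: `yes` stands in for `git archive HEAD`, `head -c` stands in for `test_copy_bytes`, the function name `producer_status` is hypothetical, and it assumes SIGPIPE is not ignored in the calling environment.

```shell
#!/bin/sh
# Hypothetical stand-in for the test: an endless producer is cut off after
# 4096 bytes, and its exit status is recorded from inside the pipeline,
# because the pipeline's own status only reflects the last command.
producer_status () {
	{
		yes
		echo $? >exit-code
	} | head -c 4096 >truncated.out
	cat exit-code
}

producer_status
```

Writing the status to a file from inside the braces is what lets the test inspect the producer rather than the consumer, which exits successfully after copying its 4096 bytes.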
Binary file not shown.

t/t5000/huge-and-future.tar

2 KB
Binary file not shown.
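
The `TAR_HUGE` prerequisite reads this reference tarball through `tar_info`, which assumes that `tar tvf` prints the size in the third column and a year-first, dash-separated date in the fourth. The sketch below runs the same awk program over a hand-written sample line instead of real tar output. The function name `parse_tar_listing` and the sample line (mode, owner, time, and name) are assumptions modeled on GNU tar's verbose listing, not captured output.

```shell
#!/bin/sh
# Same awk program as tar_info in t5000, but reading a listing from stdin
# instead of invoking the system tar. The function name is hypothetical.
parse_tar_listing () {
	awk '{
		split($4, date, "-")
		print $3 " " date[1]
	}'
}

# Hand-written sample in the style of GNU "tar tvf" output (assumed layout:
# mode, owner/group, size, date, time, name).
sample='-rw-r--r-- user/group 68719476737 4147-08-20 07:32 huge'

echo "$sample" | parse_tar_listing                   # "<size> <year>"
echo "$sample" | parse_tar_listing | cut -d" " -f1   # size only
echo "$sample" | parse_tar_listing | cut -d" " -f2   # year only
```

A tar whose listing uses a different column layout would yield different fields here, which is the situation the prerequisite is meant to detect before the later tests trust the parsed values.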
