Skip to content

Commit faececa

Browse files
avargitster
authored andcommitted
test-lib: have the "check" mode for SANITIZE=leak consider leak logs
As noted in previous on-list discussions[1] we have various tests that will falsely report being leak-free because we're missing the relevant exit code from LSAN as summarized below. We should fix those issues, but in the meantime and as an additional sanity check we can and should consider our own ASAN logs before reporting that a test is leak-free. Before this compiling with SANITIZE=leak and running: ./t6407-merge-binary.sh Will exit successfully, now we'll get an error and an informative message on: GIT_TEST_SANITIZE_LEAK_LOG=true ./t6407-merge-binary.sh Even better, as noted in the updated t/README we'll now error out when combined with the "check" mode: GIT_TEST_PASSING_SANITIZE_LEAK=check \ GIT_TEST_SANITIZE_LEAK_LOG=true \ ./t4058-diff-duplicates.sh Why do we miss these leaks? Because: * We have leaks inside "test_expect_failure" blocks, which by design will not distinguish a "normal" failure from an abort() or segfault. See [1] for a discussion of it shortcomings. * We have "git" invocations outside of "test_expect_success", e.g. setup code in the main body of the test, or in test helper functions that don't use &&-chaining. * Our tests will otherwise catch segfaults and abort(), but if we invoke a command that invokes another command it needs to ferry the exit code up to us. Notably a command that e.g. might invoke "git pack-objects" might itself exit with status 128 if that "pack-objects" segfaults or abort()'s. If the test invoking the parent command(s) is using "test_must_fail" we'll consider it an expected "ok" failure. * run-command.c doesn't (but probably should) ferry up such exit codes, so for e.g. "git push" tests where we expect a failure and an underlying "git" command fails we won't ferry up the segfault or abort exit code. * We have gitweb.perl and some other perl code ignoring return values from close(), i.e. ignoring exit codes from "git rev-parse" et al. * We have in-tree shellscripts like "git-merge-one-file.sh" invoking git commands, they'll usually return their own exit codes on "git" failure, rather then ferrying up segfault or abort() exit code. E.g. these invocations in git-merge-one-file.sh leak, but aren't reflected in the "git merge" exit code: src1=$(git unpack-file $2) src2=$(git unpack-file $3) That case would be easily "fixed" by adding a line like this after each assignment: test $? -ne 0 && exit $? But we'd then in e.g. "t6407-merge-binary.sh" run into write_tree_trivial() in "builtin/merge.c" calling die() instead of ferrying up the relevant exit code. Let's remove "TEST_PASSES_SANITIZE_LEAK=true" from tests we were falsely marking as leak-free. In the case of t6407-merge-binary.sh it was marked as leak-free in 9081a42 (checkout: fix "branch info" memory leaks, 2021-11-16). I'd previously removed other bad "TEST_PASSES_SANITIZE_LEAK=true" opt-ins in the series merged in ea05fd5 (Merge branch 'ab/keep-git-exit-codes-in-tests', 2022-03-16). The case of t1060-object-corruption.sh is more subtle, and will be discussed in a subsequent commit. 1. https://lore.kernel.org/git/[email protected]/ Signed-off-by: Ævar Arnfjörð Bjarmason <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
1 parent e92684e commit faececa

File tree

4 files changed

+95
-2
lines changed

4 files changed

+95
-2
lines changed

t/README

Lines changed: 15 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -385,6 +385,12 @@ GIT_TEST_SANITIZE_LEAK_LOG=true will log memory leaks to
385385
"dedup_token" (see +"ASAN_OPTIONS=help=1 ./git") and other options to
386386
make logs +machine-readable.
387387

388+
With GIT_TEST_SANITIZE_LEAK_LOG=true we'll look at the leak logs
389+
before exiting and exit on failure if the logs showed that we had a
390+
memory leak, even if the test itself would have otherwise passed. This
391+
allows us to catch e.g. missing &&-chaining. This is especially useful
392+
when combined with "GIT_TEST_PASSING_SANITIZE_LEAK", see below.
393+
388394
GIT_TEST_PASSING_SANITIZE_LEAK=check when combined with "--immediate"
389395
will run to completion faster, and result in the same failing
390396
tests. The only practical reason to run
@@ -393,6 +399,15 @@ combine it with "GIT_TEST_SANITIZE_LEAK_LOG=true". If we stop at the
393399
first failing test case our leak logs won't show subsequent leaks we
394400
might have run into.
395401

402+
GIT_TEST_PASSING_SANITIZE_LEAK=(true|check) will not catch all memory
403+
leaks unless combined with GIT_TEST_SANITIZE_LEAK_LOG=true. Some tests
404+
run "git" (or "test-tool" etc.) without properly checking the exit
405+
code, or git will invoke itself and fail to ferry the abort() exit
406+
code to the original caller. When the two modes are combined we'll
407+
look at the "test-results/$TEST_NAME.leak/trace.*" files at the end of
408+
the test run to see if had memory leaks which the test itself didn't
409+
catch.
410+
396411
GIT_TEST_PROTOCOL_VERSION=<n>, when set, makes 'protocol.version'
397412
default to n.
398413

t/t1060-object-corruption.sh

Lines changed: 0 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -2,7 +2,6 @@
22

33
test_description='see how we handle various forms of corruption'
44

5-
TEST_PASSES_SANITIZE_LEAK=true
65
. ./test-lib.sh
76

87
# convert "1234abcd" to ".git/objects/12/34abcd"

t/t6407-merge-binary.sh

Lines changed: 0 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -5,7 +5,6 @@ test_description='ask merge-recursive to merge binary files'
55
GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
66
export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
77

8-
TEST_PASSES_SANITIZE_LEAK=true
98
. ./test-lib.sh
109

1110
test_expect_success setup '

t/test-lib.sh

Lines changed: 80 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -309,13 +309,24 @@ TEST_RESULTS_SAN_FILE_PFX=trace
309309
TEST_RESULTS_SAN_DIR_SFX=leak
310310
TEST_RESULTS_SAN_FILE=
311311
TEST_RESULTS_SAN_DIR="$TEST_RESULTS_DIR/$TEST_NAME.$TEST_RESULTS_SAN_DIR_SFX"
312+
TEST_RESULTS_SAN_DIR_NR_LEAKS_STARTUP=
312313
TRASH_DIRECTORY="trash directory.$TEST_NAME$TEST_STRESS_JOB_SFX"
313314
test -n "$root" && TRASH_DIRECTORY="$root/$TRASH_DIRECTORY"
314315
case "$TRASH_DIRECTORY" in
315316
/*) ;; # absolute path is good
316317
*) TRASH_DIRECTORY="$TEST_OUTPUT_DIRECTORY/$TRASH_DIRECTORY" ;;
317318
esac
318319

320+
# Utility functions using $TEST_RESULTS_* variables
321+
nr_san_dir_leaks_ () {
322+
# stderr piped to /dev/null because the directory may have
323+
# been "rmdir"'d already.
324+
find "$TEST_RESULTS_SAN_DIR" \
325+
-type f \
326+
-name "$TEST_RESULTS_SAN_FILE_PFX.*" 2>/dev/null |
327+
wc -l
328+
}
329+
319330
# If --stress was passed, run this test repeatedly in several parallel loops.
320331
if test "$GIT_TEST_STRESS_STARTED" = "done"
321332
then
@@ -1191,6 +1202,66 @@ test_atexit_handler () {
11911202
teardown_malloc_check
11921203
}
11931204

1205+
sanitize_leak_log_message_ () {
1206+
local new="$1" &&
1207+
local old="$2" &&
1208+
local file="$3" &&
1209+
1210+
printf "With SANITIZE=leak at exit we have %d leak logs, but started with %d
1211+
1212+
This means that we have a blindspot where git is leaking but we're
1213+
losing the exit code somewhere, or not propagating it appropriately
1214+
upwards!
1215+
1216+
See the logs at \"%s.*\";
1217+
those logs are reproduced below." \
1218+
"$new" "$old" "$file"
1219+
}
1220+
1221+
check_test_results_san_file_ () {
1222+
if test -z "$TEST_RESULTS_SAN_FILE"
1223+
then
1224+
return
1225+
fi &&
1226+
local old="$TEST_RESULTS_SAN_DIR_NR_LEAKS_STARTUP" &&
1227+
local new="$(nr_san_dir_leaks_)" &&
1228+
1229+
if test $new -le $old
1230+
then
1231+
return
1232+
fi &&
1233+
local out="$(sanitize_leak_log_message_ "$new" "$old" "$TEST_RESULTS_SAN_FILE")" &&
1234+
say_color error "$out" &&
1235+
if test "$old" != 0
1236+
then
1237+
echo &&
1238+
say_color error "The logs include output from past runs to avoid" &&
1239+
say_color error "that remove 'test-results' between runs."
1240+
fi &&
1241+
say_color error "$(cat "$TEST_RESULTS_SAN_FILE".*)" &&
1242+
1243+
if test -n "$passes_sanitize_leak" && test "$test_failure" = 0
1244+
then
1245+
say "As TEST_PASSES_SANITIZE_LEAK=true and our logs show we're leaking, exit non-zero!" &&
1246+
invert_exit_code=t
1247+
elif test -n "$passes_sanitize_leak"
1248+
then
1249+
say "As TEST_PASSES_SANITIZE_LEAK=true and our logs show we're leaking, and we're failing for other reasons too..." &&
1250+
invert_exit_code=
1251+
elif test -n "$sanitize_leak_check" && test "$test_failure" = 0
1252+
then
1253+
say "As TEST_PASSES_SANITIZE_LEAK=true isn't set the above leak is 'ok' with GIT_TEST_PASSING_SANITIZE_LEAK=check" &&
1254+
invert_exit_code=
1255+
elif test -n "$sanitize_leak_check"
1256+
then
1257+
say "As TEST_PASSES_SANITIZE_LEAK=true isn't set the above leak is 'ok' with GIT_TEST_PASSING_SANITIZE_LEAK=check" &&
1258+
invert_exit_code=t
1259+
else
1260+
say "With GIT_TEST_SANITIZE_LEAK_LOG=true our logs revealed a memory leak, exit non-zero!" &&
1261+
invert_exit_code=t
1262+
fi
1263+
}
1264+
11941265
test_done () {
11951266
# Run the atexit commands _before_ the trash directory is
11961267
# removed, so the commands can access pidfiles and socket files.
@@ -1270,6 +1341,8 @@ test_done () {
12701341
error "Tests passed but test cleanup failed; aborting"
12711342
fi
12721343

1344+
check_test_results_san_file_ "$test_failure"
1345+
12731346
if test -z "$skip_all" && test -n "$invert_exit_code"
12741347
then
12751348
say_color warn "# faking up non-zero exit with --invert-exit-code"
@@ -1286,6 +1359,8 @@ test_done () {
12861359
say_color error "# failed $test_failure among $msg"
12871360
say "1..$test_count"
12881361

1362+
check_test_results_san_file_ "$test_failure"
1363+
12891364
if test -n "$invert_exit_code"
12901365
then
12911366
_invert_exit_code_failure_end_blurb
@@ -1464,6 +1539,7 @@ then
14641539

14651540
if test "$GIT_TEST_PASSING_SANITIZE_LEAK" = "check"
14661541
then
1542+
sanitize_leak_check=t
14671543
if test -n "$invert_exit_code"
14681544
then
14691545
BAIL_OUT "cannot use --invert-exit-code under GIT_TEST_PASSING_SANITIZE_LEAK=check"
@@ -1489,6 +1565,10 @@ then
14891565
fi &&
14901566
TEST_RESULTS_SAN_FILE="$TEST_RESULTS_SAN_DIR/$TEST_RESULTS_SAN_FILE_PFX"
14911567

1568+
# In case "test-results" is left over from a previous
1569+
# run: Only report if new leaks show up.
1570+
TEST_RESULTS_SAN_DIR_NR_LEAKS_STARTUP=$(nr_san_dir_leaks_)
1571+
14921572
# Don't litter *.leak dirs if there was nothing to report
14931573
test_atexit "rmdir \"$TEST_RESULTS_SAN_DIR\" 2>/dev/null || :"
14941574

0 commit comments

Comments
 (0)