Race in remote fetcher can cause parallel tests to fail #96

Closed
@vanzin

Description

This code in remote_fetch.go has a race that can be triggered when the cache hasn't been populated yet. If you have multiple tests running in parallel, multiple processes will try to download the remote archive and write it to the cache location.

This can lead to errors like this:

--- FAIL: TestSuite (0.59s)
    database.go:49: could not start database: &{%!e(string=unable to extract postgres archive: xz: data is truncated or corrupt)}
FAIL

That can happen when one test has successfully downloaded the archive into the cache location and opened the file; at the same time, another test starts writing its own copy to the same cache path, so the first test ends up reading a partially written file and fails to decompress it.

The usual way to fix this is to write the data to a temporary file, and move it into the final location, which is an atomic operation (except on Windows, if that's a worry). If Windows support is desired, you can try the move, and if it fails, check if the error is because the target file exists (which means some other process "won" the race).
