Output Interleaving

Aaron Riekenberg edited this page May 1, 2023 · 31 revisions

No parallel execution command would be complete without a long discussion of output interleaving. :)

What does the competition do?

xargs (no protection against interleaving)

From the xargs man page, when run with the -P option to enable parallel processes:

Please note that it is up to the called processes to
properly manage parallel access to shared resources.  For
example, if more than one of them tries to print to
stdout, the output will be produced in an indeterminate
order (and very likely mixed up) unless the processes
collaborate in some way to prevent this.  Using some kind
of locking scheme is one way to prevent such problems.  In
general, using a locking scheme will help ensure correct
output but reduce performance. 

Short summary: with xargs, all child processes inherit the stdout and stderr file descriptors from the parent process. This is relatively fast, but when multiple parallel child processes write to stdout or stderr, nothing protects against interleaved output.
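The inheritance described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how xargs itself is implemented: the function names `run_sharing_output` and `demo_shared` are made up for this example, and a temporary file stands in for the shared stdout so the result can be inspected.

```python
import subprocess
import sys
import tempfile


def run_sharing_output(num_jobs, shared_out):
    """Start num_jobs children that all write to the same open file.

    No locking or buffering is done by the parent, so the order in
    which the children's output lands in the file is not defined.
    """
    children = [
        subprocess.Popen(
            [sys.executable, "-c", f"print('job {i} done')"],
            stdout=shared_out,  # every child receives the same descriptor
        )
        for i in range(num_jobs)
    ]
    for child in children:
        child.wait()


def demo_shared(num_jobs=4):
    """Run the children against one temporary file and return its lines."""
    with tempfile.TemporaryFile(mode="w+") as shared_out:
        run_sharing_output(num_jobs, shared_out)
        shared_out.seek(0)
        return shared_out.read().splitlines()
```

Each child here writes one short line, so the lines usually arrive whole but in an arbitrary order. With larger or repeated writes, output from different children can end up mixed together, which is the behavior the man page warns about.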

GNU parallel (slow but does not interleave by default)

From the GNU parallel man page for the --group option:

--group

Group output.

Output from each job is grouped together and is only printed when the command is finished. Stdout (standard output) first followed by stderr (standard error).

This takes in the order of 0.5ms CPU time per job and depends on the speed of your disk for larger output.

--group is the default.

Short summary: GNU parallel writes temporary output to disk. This prevents interleaved output, but it's slow. See benchmarks.
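A rough sketch of the grouping idea in Python, assuming one temporary file per stream per job. This is not GNU parallel's actual code: the name `run_grouped` is hypothetical, and for simplicity the jobs are collected in the order they were started, whereas `--group` prints each job's output when that job finishes.

```python
import subprocess
import tempfile


def run_grouped(commands):
    """Run all commands in parallel and return (stdout, stderr) per job.

    Each job writes to its own temporary files, so output from
    different jobs can never be mixed together.
    """
    jobs = []
    for command in commands:
        out = tempfile.TemporaryFile()
        err = tempfile.TemporaryFile()
        proc = subprocess.Popen(command, stdout=out, stderr=err)
        jobs.append((proc, out, err))

    results = []
    for proc, out, err in jobs:
        proc.wait()  # output is only read back once the job has finished
        out.seek(0)
        err.seek(0)
        results.append((out.read().decode(), err.read().decode()))
        out.close()
        err.close()
    return results
```

To mimic the grouped display, a caller could loop over the returned pairs and write each job's stdout text followed by its stderr text. The extra trip through temporary files is where the per-job cost mentioned in the man page comes from.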

What does rust-parallel do?
