Command-line utility to execute commands in parallel and aggregate their output.
Similar interface to GNU Parallel or xargs but implemented in rust and tokio.
* Supports running commands read from stdin or input files similar to xargs.
* Supports `:::` syntax to run all combinations of argument groups similar to GNU Parallel.
Prevents output interleaving and is very fast.
See the demos for example usage.
```
$ rust-parallel --help
Execute commands in parallel

By Aaron Riekenberg aaron.riekenberg@gmail.com

https://github.com/aaronriekenberg/rust-parallel
https://crates.io/crates/rust-parallel

Usage: rust-parallel [OPTIONS] [COMMAND_AND_INITIAL_ARGUMENTS]...

Arguments:
  [COMMAND_AND_INITIAL_ARGUMENTS]...
          Optional command and initial arguments to run for each input line

Options:
  -c, --commands-from-args
          Run commands from arguments only.

          In this mode the ::: separator is used to run the cartesian product of argument groups.

  -d, --discard-output
          Possible values:
          - stdout: Redirect stdout for commands to /dev/null
          - stderr: Redirect stderr for commands to /dev/null
          - all:    Redirect stdout and stderr for commands to /dev/null

  -i, --input-file

  -j, --jobs
          [default: 8]

  -0, --null-separator
          Use null separator for reading input instead of newline

  -s, --shell
          Use shell for running commands.

          If $SHELL environment variable is set use it else use /bin/bash.

          Each input line is passed to $SHELL -c <line> as a single argument.

      --channel-capacity <CHANNEL_CAPACITY>
          Input and output channel capacity, defaults to num cpus * 2

          [default: 16]

  -h, --help
          Print help (see a summary with '-h')

  -V, --version
          Print version
```
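The help text for `-c` says that `:::` separates argument groups and that the cartesian product of those groups is run. As a rough illustration of what that expansion means, here is a minimal sketch using only the Rust standard library. It is not taken from rust-parallel's source: `split_groups`, `cartesian_product` and `expand_commands` are hypothetical names chosen for this example.

```rust
/// Split arguments on the `:::` separator. The first group is the command
/// and its initial arguments; the remaining groups are argument groups.
fn split_groups(args: &[&str]) -> Vec<Vec<String>> {
    args.split(|a| *a == ":::")
        .map(|group| group.iter().map(|s| s.to_string()).collect())
        .collect()
}

/// Build every combination that takes exactly one argument from each group.
fn cartesian_product(groups: &[Vec<String>]) -> Vec<Vec<String>> {
    // Start with a single empty combination and extend it group by group.
    let mut combos: Vec<Vec<String>> = vec![Vec::new()];
    for group in groups {
        let mut next = Vec::with_capacity(combos.len() * group.len());
        for prefix in &combos {
            for arg in group {
                let mut combo = prefix.clone();
                combo.push(arg.clone());
                next.push(combo);
            }
        }
        combos = next;
    }
    combos
}

/// Expand `cmd ::: g1 ::: g2 ...` into one full argument list per combination.
fn expand_commands(args: &[&str]) -> Vec<Vec<String>> {
    let groups = split_groups(args);
    let (command, arg_groups) = match groups.split_first() {
        Some(parts) => parts,
        None => return Vec::new(),
    };
    cartesian_product(arg_groups)
        .into_iter()
        .map(|combo| command.iter().cloned().chain(combo).collect())
        .collect()
}
```

With this sketch, `expand_commands(&["echo", ":::", "A", "B", ":::", "C", "D"])` produces four argument lists: `echo A C`, `echo A D`, `echo B C` and `echo B D`. The sketch generates them in a fixed order; when the resulting commands run in parallel, their output order depends on which command finishes first.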
Recommended:
For manual installation/update:
1. Install Rust
2. Install the latest version of this app from crates.io:
$ cargo install rust-parallel
3. The same `cargo install rust-parallel` command will also update to the latest version after initial installation.
Here a file `test` is created and piped to stdin of `rust-parallel`.

With `-j5` all 5 commands are run in parallel. With `-j1` commands are run sequentially.

```
$ cat >./test <

$ cat test | rust-parallel -j5
are
hi
there
how
you

$ cat test | rust-parallel -j1
hi
there
how
are
you
```

The `:::` syntax is exactly equivalent and does not need the `test` input file:

```
$ rust-parallel -j5 -c echo ::: hi there how are you
how
there
you
are
hi

$ rust-parallel -j1 -c echo ::: hi there how are you
hi
there
how
are
you
```

Debug logging.

Set environment variable `RUST_LOG=debug` to see debug output.

This logs structured information about command line arguments and commands being run.

Enabling debug logging is recommended for all demos to see in more detail what is happening.

```
$ RUST_LOG=debug rust-parallel -c echo ::: hi there how are you | grep command_line_args
2023-06-12T15:06:14.616870Z DEBUG rust_parallel::command_line_args: command_line_args = CommandLineArgs { commands_from_args: true, discard_output: None, input_file: [], jobs: 8, null_separator: false, shell: false, channel_capacity: 16, command_and_initial_arguments: ["echo", ":::", "hi", "there", "how", "are", "you"] }

$ RUST_LOG=debug rust-parallel -c echo ::: hi there how are you | grep 'command_line_args:1'
2023-06-12T15:06:30.713226Z DEBUG Command::run{cmd_args=["echo", "there"] line=command_line_args:1}: rust_parallel::command: begin run
2023-06-12T15:06:30.714028Z DEBUG Command::run{cmd_args=["echo", "there"] line=command_line_args:1 child_pid=2838}: rust_parallel::command: spawned child process, awaiting output
2023-06-12T15:06:30.716566Z DEBUG Command::run{cmd_args=["echo", "there"] line=command_line_args:1 child_pid=2838}: rust_parallel::command: command exit status = exit status: 0
2023-06-12T15:06:30.716597Z DEBUG Command::run{cmd_args=["echo", "there"] line=command_line_args:1 child_pid=2838}: rust_parallel::command: end run
```

Specifying command and initial arguments on command line:

Here `md5 -s` will be prepended to each input line to form a command like `md5 -s aal`.

```
$ head -100 /usr/share/dict/words | rust-parallel md5 -s
MD5 ("aal") = ff45e881572ca2c987460932660d320c
MD5 ("A") = 7fc56270e7a70fa81a5935b72eacbe29
MD5 ("aardvark") = 88571e5d5e13a4a60f82cea7802f6255
MD5 ("aalii") = 0a1ea2a8d75d02ae052f8222e36927a5
MD5 ("aam") = 35c2d90f7c06b623fe763d0a4e5b7ed9
MD5 ("aa") = 4124bc0a9335c27f086f24ba207a4912
MD5 ("a") = 0cc175b9c0f1b6a831c399e269772661
MD5 ("Aani") = e9b22dd6213c3d29648e8ad7a8642f2f
MD5 ("Aaron") = 1c0a11cc4ddc0dbd3fa4d77232a4e22e
MD5 ("aardwolf") = 66a4a1a2b442e8d218e8e99100069877
```

Using `awk` to form complete commands:

```
$ head -100 /usr/share/dict/words | awk '{printf "md5 -s %s\n", $1}' | rust-parallel
MD5 ("Abba") = 5fa1e1f6e07a6fea3f2bb098e90a8de2
MD5 ("abaxial") = ac3a53971d52d9ce3277eadf03f13a5e
MD5 ("abaze") = 0b08c52aa63d947b6a5601ee975bc3a4
MD5 ("abaxile") = 21f5fc27d7d34117596e41d8c001087e
MD5 ("abbacomes") = 76640eb0c929bc97d016731bfbe9a4f8
MD5 ("abbacy") = 08aeac72800adc98d2aba540b6195921
MD5 ("Abbadide") = 7add1d6f008790fa6783bc8798d8c803
MD5 ("abb") = ea01e5fd8e4d8832825acdd20eac5104
```

Using as part of a shell pipeline.

stdout and stderr from each command run are copied to stdout/stderr of the rust-parallel process.

```
$ head -100 /usr/share/dict/words | rust-parallel md5 -s | grep -i abba
MD5 ("Abba") = 5fa1e1f6e07a6fea3f2bb098e90a8de2
MD5 ("abbacomes") = 76640eb0c929bc97d016731bfbe9a4f8
MD5 ("abbacy") = 08aeac72800adc98d2aba540b6195921
MD5 ("Abbadide") = 7add1d6f008790fa6783bc8798d8c803
```

Working on a set of files from `find` command.

The `-0` option works nicely with `find -print0` to handle filenames with newline or whitespace characters:

```
$ mkdir testdir

$ touch 'testdir/a b' 'testdir/b c' 'testdir/c d'

$ find testdir -type f -print0 | rust-parallel -0 gzip -f -k

$ ls testdir
'a b'  'a b.gz'  'b c'  'b c.gz'  'c d'  'c d.gz'
```

Reading multiple inputs.

By default `rust-parallel` reads input from stdin only. The `-i` option can be used 1 or more times to override this behavior. `-i -` means read from stdin, `-i ./test` means read from the file `./test`:

```
$ cat >./test <

$ head -5 /usr/share/dict/words | rust-parallel -i - -i ./test echo
A
aalii
aa
a
aal
bar
foo
baz
```

Calling a bash function.

Use `-s` shell mode so that each input line is passed to `/bin/bash -c` as a single argument:

```
$ doit() {
  echo Doing it for $1
  sleep 2
  echo Done with $1
}

$ export -f doit

$ cat >./test <

$ cat test | rust-parallel -s
Doing it for 1
Done with 1
Doing it for 3
Done with 3
Doing it for 2
Done with 2
```

Commands from arguments mode.

When `-c/--commands-from-args` is specified, the `:::` separator can be used to run the Cartesian Product of command line arguments. This is similar to the `:::` behavior in GNU Parallel.

```
$ rust-parallel -c echo ::: A B ::: C D ::: E F G
B C F
A D E
A C G
A D F
A D G
A C F
B C E
A C E
B D F
B D E
B D G
B C G
```

Commands from arguments mode bash function.

Commands from arguments mode can be used to invoke a bash function.

```
$ logargs() {
  echo "logargs got $@"
}

$ export -f logargs

$ rust-parallel -c -s logargs ::: A B C ::: D E F
logargs got A F
logargs got A D
logargs got B E
logargs got C E
logargs got B D
logargs got B F
logargs got A E
logargs got C D
logargs got C F
```

Benchmarks:

See the wiki page for benchmarks.

Features:
* Only safe Rust is used: the crate declares `#![forbid(unsafe_code)]`.
* Avoids `O(number of input lines)` memory usage. In support of this:
  * `tokio::sync::Semaphore` is used carefully to limit the number of commands that run concurrently. Tasks are not spawned for all input lines immediately, which limits memory usage.

Tech Stack:
* `async` / `await` functions (aka coroutines)
* `CommandLineArgs` instance using `tokio::sync::OnceCell`.
* `tokio::process::Command`
* `tokio::sync::Semaphore` used to limit the number of commands that run concurrently.
* `tokio::sync::mpsc::channel` used to send command output to an output writer task.
* `tracing::Instrument` is used to provide structured debug logs.
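
The lists above describe two ideas: a semaphore caps how many commands run at once, and command output is sent over a channel to a single writer task. The sketch below shows the same pattern using only the Rust standard library. It is an illustration under stated assumptions, not rust-parallel's code: threads stand in for tokio tasks, a small hand-written `Semaphore` stands in for `tokio::sync::Semaphore`, `std::sync::mpsc` stands in for `tokio::sync::mpsc`, and `run_all` is a hypothetical name. For simplicity it collects output into a `Vec`, whereas a real tool would write it to stdout as it arrives.

```rust
use std::process::Command;
use std::sync::{mpsc, Arc, Condvar, Mutex};
use std::thread;

/// Minimal counting semaphore built from a Mutex and a Condvar.
struct Semaphore {
    permits: Mutex<usize>,
    available: Condvar,
}

impl Semaphore {
    fn new(permits: usize) -> Self {
        Semaphore { permits: Mutex::new(permits), available: Condvar::new() }
    }

    /// Block until a permit is free, then take it.
    fn acquire(&self) {
        let mut permits = self.permits.lock().unwrap();
        while *permits == 0 {
            permits = self.available.wait(permits).unwrap();
        }
        *permits -= 1;
    }

    /// Return a permit and wake one waiter.
    fn release(&self) {
        *self.permits.lock().unwrap() += 1;
        self.available.notify_one();
    }
}

/// Run each command (program followed by its arguments, no shell) with at
/// most `jobs` running at once. Returns each command's captured stdout in
/// completion order.
fn run_all(commands: Vec<Vec<String>>, jobs: usize) -> Vec<String> {
    let semaphore = Arc::new(Semaphore::new(jobs.max(1)));
    let (output_tx, output_rx) = mpsc::channel::<String>();

    // Single writer: each command's output arrives as one whole message,
    // so output from different commands is never interleaved.
    let writer = thread::spawn(move || output_rx.into_iter().collect::<Vec<String>>());

    for cmd_args in commands {
        // Acquire before spawning, so at most `jobs` workers exist at a time
        // instead of one thread per input line.
        semaphore.acquire();
        let semaphore = Arc::clone(&semaphore);
        let output_tx = output_tx.clone();
        thread::spawn(move || {
            if let Some((program, args)) = cmd_args.split_first() {
                if let Ok(output) = Command::new(program).args(args).output() {
                    let text = String::from_utf8_lossy(&output.stdout).into_owned();
                    let _ = output_tx.send(text);
                }
            }
            semaphore.release();
        });
    }

    // The writer finishes once every sender (ours and each worker's) is dropped.
    drop(output_tx);
    writer.join().unwrap()
}
```

Calling `run_all` with `jobs` set to 1 returns output in input order, because each worker sends its output before releasing the only permit. With a larger `jobs` value the order depends on which command finishes first, which matches the `-j1` and `-j5` demos.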