Print a random subset of lines from a line-oriented file. Picks up where shuf gets killed.
$ cargo install randlines
```shell
$ randlines -h
randlines 0.1.1
Emit a random subset of lines from a file. This is a probabilistic program, you
will not get exactly n lines.

Typically, you can use shuf(1) which uses reservoir sampling and is very efficient. However, if we want to extract 10M random lines from a file of 100M lines, shuf(1) might be killed. However, randlines will not shuffle lines, just skip over random number of lines.

USAGE:
    randlines [OPTIONS] [input]

FLAGS:
    -h, --help       Prints help information
    -V, --version    Prints version information

OPTIONS:
    -n

ARGS:
```
randlines emits a random subset of lines from a file. It is a probabilistic program, so you will not get exactly n lines.

Typically you can use shuf(1), which uses reservoir sampling and is very efficient. However, when extracting 10M random lines from a file of 100M lines, shuf(1) might be killed. randlines does not shuffle lines; it just skips over a random number of lines between the ones it emits, so the output keeps the input order.
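
The "skip over a random number of lines" idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the crate's actual source: it assumes each line is kept independently with probability `p` (for example n divided by the total line count), and it draws the gap between kept lines from a geometric distribution. The names `sample_lines`, `gap`, and `XorShift64` are hypothetical, and the small xorshift generator is only a stand-in so the sketch needs no external crate.

```rust
use std::io::{self, BufRead, Write};

/// Tiny xorshift64* generator: a stand-in PRNG, not what randlines necessarily uses.
struct XorShift64(u64);

impl XorShift64 {
    fn new(seed: u64) -> Self {
        // The xorshift state must be nonzero.
        XorShift64(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    /// Uniform float in the half-open interval (0, 1].
    fn next_f64(&mut self) -> f64 {
        self.0 ^= self.0 >> 12;
        self.0 ^= self.0 << 25;
        self.0 ^= self.0 >> 27;
        let bits = self.0.wrapping_mul(0x2545_F491_4F6C_DD1D) >> 11; // 53 random bits
        (bits + 1) as f64 / (1u64 << 53) as f64
    }
}

/// Number of lines to skip before the next kept line: a geometric draw,
/// P(gap = k) = (1 - p)^k * p, obtained by inverting a uniform sample.
fn gap(rng: &mut XorShift64, p: f64) -> u64 {
    if p >= 1.0 {
        return 0; // keep every line
    }
    (rng.next_f64().ln() / (1.0 - p).ln()).floor() as u64
}

/// Copy lines from `input` to `out`, keeping each with probability `p`.
/// Lines stay in input order and only the current line is held in memory.
/// Returns the number of lines written.
fn sample_lines<R: BufRead, W: Write>(
    input: R,
    mut out: W,
    p: f64,
    seed: u64,
) -> io::Result<usize> {
    if p <= 0.0 {
        return Ok(0);
    }
    let mut rng = XorShift64::new(seed);
    let mut emitted = 0;
    let mut skip = gap(&mut rng, p);
    for line in input.lines() {
        let line = line?;
        if skip > 0 {
            skip -= 1; // still inside the current random gap
            continue;
        }
        writeln!(out, "{}", line)?;
        emitted += 1;
        skip = gap(&mut rng, p); // draw the next gap
    }
    Ok(emitted)
}
```

For instance, `sample_lines(io::stdin().lock(), io::stdout().lock(), 0.1, 42)` would keep roughly one line in ten. Under this model the number of emitted lines is binomially distributed around `p` times the total line count, which is consistent with the note that you will not get exactly n lines.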