
fastar
A faster equivalent of `tar -cT <(find . -type f)`, optimized for tarring many small files stored on HDDs.
Optimizations compared to GNU tar:
- directory traversal based on physical disk layout (see the platter-walk crate)
- readaheads across multiple files at once to keep the drive's command queue filled (see the reapfrog crate)
- drops disk caches for files once they have been read, to prevent disk buffer thrashing
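
To give a rough feel for why visiting order matters on spinning disks, here is a minimal std-only sketch. It does not use platter-walk and is not how fastar orders its traversal. As a stand-in it sorts files by inode number, a much cruder heuristic that only loosely correlates with on-disk position on some filesystems. The helper names `collect_files` and `files_in_inode_order` are hypothetical.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::MetadataExt;
use std::path::{Path, PathBuf};

/// Recursively collect regular files under `root` with their inode numbers.
fn collect_files(root: &Path, out: &mut Vec<(u64, PathBuf)>) -> io::Result<()> {
    for entry in fs::read_dir(root)? {
        let path = entry?.path();
        // symlink_metadata does not follow symlinks, so links are neither
        // descended into nor recorded (only regular files are kept).
        let meta = fs::symlink_metadata(&path)?;
        if meta.is_dir() {
            collect_files(&path, out)?;
        } else if meta.is_file() {
            out.push((meta.ino(), path));
        }
    }
    Ok(())
}

/// Hypothetical helper: regular files under `root`, ordered by inode number.
/// ASSUMPTION: inode order is only an approximation of physical layout.
fn files_in_inode_order(root: &Path) -> io::Result<Vec<PathBuf>> {
    let mut files = Vec::new();
    collect_files(root, &mut files)?;
    files.sort_by_key(|entry| entry.0);
    Ok(files.into_iter().map(|(_, path)| path).collect())
}
```

Calling `files_in_inode_order(Path::new("."))` yields a file list that could be fed to an archiver in that order. A layout-aware walk would use actual extent positions instead of inode numbers.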
Current limitations:
- arguments must be directories
- only archives regular files, not symlinks
- filenames longer than 100 bytes are not supported (implementation isn't difficult, just the tar-rs API being clunky)
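
The 100-byte limit matches the size of the `name` field in a classic tar header. A small pre-check like the sketch below could flag paths that would hit it before archiving. The helper `fits_tar_name_field` and the constant name are hypothetical and not part of fastar or tar-rs.

```rust
use std::os::unix::ffi::OsStrExt;
use std::path::Path;

/// Size of the `name` field in a classic tar header, in bytes.
const TAR_NAME_FIELD_LEN: usize = 100;

/// Hypothetical helper: true if the path's raw bytes fit into the name field.
/// The length is measured in bytes, not characters, so multi-byte UTF-8
/// names reach the limit sooner than their character count suggests.
fn fits_tar_name_field(path: &Path) -> bool {
    path.as_os_str().as_bytes().len() <= TAR_NAME_FIELD_LEN
}
```

Note that the path as stored in the archive is what counts, so the check should be applied to the relative path that ends up in the header.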
Benchmarks
```
ffcnt . -s
files: 6680901
bytes: 245271028476
echo 3 > /proc/sys/vm/drop_caches ; tar -c . | pv -at > /dev/null
^C0:02:45 [ 2.4MiB/s]
echo 3 > /proc/sys/vm/drop_caches ; tar -cT <(ffcnt --ls --type f --leaf-order content .) | pv -at > /dev/null
^C0:02:50 [4.11MiB/s]
echo 3 > /proc/sys/vm/drop_caches ; fastar . | pv -at > /dev/null
^C0:02:51 [9.28MiB/s]
```