Next-generation sequencing instruments generate millions to billions of reads per run. To reach such throughput in a cost-efficient manner, it is common practice to barcode individual sequences so that multiple lines or species can be sequenced together.
Sabreur is a tool that demultiplexes barcoded reads into separate files. It supports both fasta and fastq files. Input and output files can be gzip, bzip2 or xz compressed (thanks to the awesome niffler crate). If an uncompressed file is provided, the output is uncompressed by default; this behaviour can be changed by setting the `--format` option to the desired compression format. If `--format` is specified while the input files are compressed, the output files are written in the specified compression format instead.

At its core, sabreur compares the provided barcodes with each read, then writes the read to its appropriate file. A read without a recognized barcode is written to an unknown file.
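The matching step described above can be sketched in a few lines of Rust. This is only an illustration, not sabreur's actual code: the function name `assign_read` is hypothetical, and the sketch assumes the barcode sits at the very start of the read and must match exactly, which the text above does not specify.

```rust
/// Returns the output file name a read should be written to.
///
/// Assumption (not confirmed by the README): the barcode is an exact
/// prefix of the read sequence. The first matching barcode wins.
/// Reads that match no barcode are sent to `unknown`.
fn assign_read<'a>(
    sequence: &[u8],
    barcodes: &[(&[u8], &'a str)],
    unknown: &'a str,
) -> &'a str {
    for (barcode, output) in barcodes {
        if sequence.starts_with(*barcode) {
            return *output;
        }
    }
    unknown
}
```

Calling `assign_read` once per record and appending the record to the returned file name is enough to reproduce the behaviour described: every read ends up either in a barcode-specific file or in the unknown file.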
sabreur barcode.txt input_R1.fq.gz input_R2.fq.gz
sabreur barcode.txt input.fa --format xz
Input sequence files can be fasta or fastq, compressed (gzip, bzip2 or xz) or not. Just give the sequences; sabreur knows how to handle them!
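For readers curious how this kind of automatic detection can work, here is a minimal sketch using only the Rust standard library. It is not the niffler API and not necessarily how sabreur does it: all names are hypothetical. It relies on the well-known magic bytes of gzip, bzip2 and xz, and on the fact that fasta records start with `>` and fastq records with `@`. The last function assumes that, without `--format`, the output simply mirrors the input compression.

```rust
/// Compression formats mentioned in this README.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Compression {
    Gzip,
    Bzip2,
    Xz,
    Uncompressed,
}

/// Sequence formats mentioned in this README.
#[derive(Debug, Clone, Copy, PartialEq)]
enum SeqFormat {
    Fasta,
    Fastq,
    Unknown,
}

// Leading "magic" bytes of each compressed format.
const GZIP_MAGIC: &[u8] = &[0x1f, 0x8b];
const BZIP2_MAGIC: &[u8] = b"BZh";
const XZ_MAGIC: &[u8] = &[0xfd, 0x37, 0x7a, 0x58, 0x5a, 0x00];

/// Guesses the compression format from the first bytes of a file.
fn detect_compression(head: &[u8]) -> Compression {
    if head.starts_with(GZIP_MAGIC) {
        Compression::Gzip
    } else if head.starts_with(BZIP2_MAGIC) {
        Compression::Bzip2
    } else if head.starts_with(XZ_MAGIC) {
        Compression::Xz
    } else {
        Compression::Uncompressed
    }
}

/// Guesses the sequence format from the first decompressed byte.
fn detect_seq_format(decompressed_head: &[u8]) -> SeqFormat {
    match decompressed_head.first().copied() {
        Some(b'>') => SeqFormat::Fasta,
        Some(b'@') => SeqFormat::Fastq,
        _ => SeqFormat::Unknown,
    }
}

/// Picks the output compression: an explicit `--format` value wins,
/// otherwise the input format is reused (an assumption, see above).
fn output_compression(input: Compression, requested: Option<Compression>) -> Compression {
    requested.unwrap_or(input)
}
```

Because detection only looks at the first few bytes, file extensions do not matter in this sketch: a gzipped file named `reads.fq` would still be recognized as gzip.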
```
USAGE:
    sabreur [FLAGS] [OPTIONS]

FLAGS:
        --force      Force reuse of output directory
    -h, --help       Prints help information
    -q, --quiet      Decrease program verbosity
    -V, --version    Prints version information

OPTIONS:
    -f, --format

ARGS:
```

If you already have a functional Rust installation, run:

```
cargo install sabreur
```

To build from source instead:

```
git clone https://github.com/Ebedthan/sabreur.git
cd sabreur
cargo build --release
cargo test
cargo install --path .
```
We used hyperfine for benchmarking with this dataset.
| Tool | Single-end uncompressed output | Single-end compressed output | Paired-end uncompressed output | Paired-end compressed output |
| :--- | :----: | :----: | :----: | :----: |
| idemp | - | 211.571 ± 3.718 | - | 366.247 ± 10.482 |
| sabre | 32.911 ± 2.411 | - | 109.470 ± 49.909 | - |
| sabreur | 10.843 ± 0.531 | 93.840 ± 0.446 | 40.878 ± 13.743 | 187.533 ± 0.572 |
Sabreur uses colored output in its help message; nevertheless, it honors the [NO_COLOR](https://no-color.org/) environment variable.
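As an illustration of what honoring this variable means, the sketch below follows the convention published on no-color.org: color is disabled when `NO_COLOR` is present and not an empty string. The function names are hypothetical and this is not taken from sabreur's source.

```rust
use std::env;
use std::ffi::OsStr;

/// Decides whether colored output is allowed, given the value of
/// `NO_COLOR` (or `None` when the variable is not set).
/// Per no-color.org, any non-empty value disables color.
fn colors_enabled(no_color: Option<&OsStr>) -> bool {
    match no_color {
        Some(value) => value.is_empty(),
        None => true,
    }
}

/// Convenience wrapper that reads the real process environment.
fn colors_enabled_from_env() -> bool {
    colors_enabled(env::var_os("NO_COLOR").as_deref())
}
```

Keeping the decision in a function that takes the value as a parameter makes it easy to test without modifying the process environment.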
Sabreur uses a special tab-delimited barcode file format of the form:
barcode1 barcode1_file1.fq barcode1_file2.fq
barcode2 barcode2_file1.fq barcode2_file2.fq
...
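A file in this layout could be parsed roughly as follows. This is a sketch under assumptions, not sabreur's parser: the type `BarcodeEntry` and both function names are hypothetical, the third column is treated as optional (for single-end data), and lines with fewer than two fields are silently skipped.

```rust
/// One line of the barcode file: a barcode and its output file(s).
#[derive(Debug, PartialEq)]
struct BarcodeEntry {
    barcode: String,
    forward: String,
    /// Second output file; assumed to be absent for single-end data.
    reverse: Option<String>,
}

/// Parses one tab-delimited line; returns `None` if the barcode or the
/// first file name is missing.
fn parse_barcode_line(line: &str) -> Option<BarcodeEntry> {
    let mut fields = line.trim_end().split('\t');
    let barcode = fields.next().filter(|f| !f.is_empty())?;
    let forward = fields.next().filter(|f| !f.is_empty())?;
    let reverse = fields.next().filter(|f| !f.is_empty());
    Some(BarcodeEntry {
        barcode: barcode.to_string(),
        forward: forward.to_string(),
        reverse: reverse.map(String::from),
    })
}

/// Parses a whole barcode file, skipping blank or malformed lines.
fn parse_barcodes(content: &str) -> Vec<BarcodeEntry> {
    content.lines().filter_map(parse_barcode_line).collect()
}
```

A stricter parser would report malformed lines as errors instead of skipping them; skipping simply keeps the sketch short.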
Contributions are welcome under the project code of conduct.
Submit problems or requests to the Issue Tracker.
Licensed under the MIT license http://opensource.org/licenses/MIT. This project may not be copied, modified, or distributed except according to those terms.
Please note that the sabreur project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.