ncbitaxonomy

This is a Rust crate (i.e. library) for working with a local copy of the NCBI Taxonomy database. The database dump can be downloaded from the NCBI Taxonomy FTP site (as either taxdmp.zip or taxdump.tar.gz) and converted into a SQLite database with the to_sqlite subcommand of the taxonomy_util utility.

Documentation is available at crates.io.

taxonomyfilterrefseq

(new in 0.1.1)

A tool to filter an NCBI RefSeq FASTA file so that only sequences from taxa descending from a given ancestor taxon are retained.

```bash
$ taxonomyfilterrefseq --help
taxonomyfilterrefseq 1.0.0
Peter van Heusden pvh@sanbi.axc.za
Filter NCBI RefSeq FASTA files by taxonomic lineage

USAGE:
    taxonomyfilterrefseq [FLAGS] [OPTIONS] [OUTPUT_FASTA]

FLAGS:
        --nocurated      Don't accept curated RNAs and proteins (NM_, NR_ and NP_ accessions)
        --nopredicted    Don't accept computationally predicted RNAs and proteins (XM_, XR_ and XP_ accessions)
    -h, --help           Prints help information
    -V, --version        Prints version information

OPTIONS:
    -d, --db    URL for SQLite taxonomy database

ARGS:
    FASTA file with RefSeq sequences
    Name of ancestor to use as ancestor filter
    Output FASTA filename (or stdout if omitted)
```
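
The two filtering flags in the help text rely on RefSeq accession prefixes: NM_, NR_ and NP_ mark curated records, while XM_, XR_ and XP_ mark computationally predicted ones. The following is a minimal Rust sketch of that classification, not the tool's actual code. The names `RefSeqClass`, `classify_header` and `keep_record` are hypothetical, and it assumes that the accession is the first whitespace-delimited token of the FASTA header and that records with any other prefix are always kept.

```rust
/// Curation status implied by a RefSeq accession prefix.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum RefSeqClass {
    /// NM_, NR_ or NP_: curated RNA or protein record.
    Curated,
    /// XM_, XR_ or XP_: computationally predicted RNA or protein record.
    Predicted,
    /// Any other prefix (for example NC_), or no recognisable prefix.
    Other,
}

/// Classify a FASTA header line, with or without the leading '>'.
/// Assumption: the accession is the first whitespace-delimited token.
pub fn classify_header(header: &str) -> RefSeqClass {
    let accession = header
        .trim_start_matches('>')
        .split_whitespace()
        .next()
        .unwrap_or("");
    // Only the first three characters of the accession decide the class.
    match accession.get(..3) {
        Some("NM_") | Some("NR_") | Some("NP_") => RefSeqClass::Curated,
        Some("XM_") | Some("XR_") | Some("XP_") => RefSeqClass::Predicted,
        _ => RefSeqClass::Other,
    }
}

/// Hypothetical helper mirroring the two flags: true if the record is kept.
/// Assumption: records that are neither curated nor predicted are always kept.
pub fn keep_record(header: &str, no_curated: bool, no_predicted: bool) -> bool {
    match classify_header(header) {
        RefSeqClass::Curated => !no_curated,
        RefSeqClass::Predicted => !no_predicted,
        RefSeqClass::Other => true,
    }
}
```

Under these assumptions, `keep_record(">XM_1.1 example", false, true)` rejects the record, which corresponds to running the tool with `--nopredicted`.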

taxonomyfilterfastq

(new in version 0.2.0)

```bash
$ taxonomyfilterfastq --help
taxonomyfilterfastq 1.0.0
Peter van Heusden pvh@sanbi.axc.za
Filter FASTQ files whose reads have been classified by Centrifuge or Kraken2, only retaining reads in taxa descending from given ancestor

USAGE:
    taxonomyfilterfastq [FLAGS] [OPTIONS] ... --ancestortaxid --taxreport_filename <--centrifuge|--kraken2>

FLAGS:
    -d, --output_dir    Directory to deposited filtered output files in
    -C, --centrifuge    Filter using report from Centrifuge
    -h, --help          Prints help information
    -K, --kraken2       Filter using report from Kraken2
    -V, --version       Prints version information

OPTIONS:
    -A, --ancestortaxid         Name of ancestor to use as ancestor filter
    -d, --db                    URL for SQLite taxonomy database
    -F, --taxreport_filename    Output from Kraken2 (default) or Centrifuge

ARGS:
    ...    FASTA file with RefSeq sequences
```
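
Retaining only reads "in taxa descending from given ancestor" amounts to walking from each read's taxon up the tree and checking whether the ancestor is encountered. The sketch below illustrates that walk in Rust over an in-memory parent table. It is an illustration only and does not use the crate's API: the function name `descends_from` and the `HashMap` representation are assumptions. It relies on the NCBI convention that the root (taxonomy ID 1) is listed as its own parent.

```rust
use std::collections::HashMap;

/// Returns true if `taxid` equals `ancestor` or has `ancestor` somewhere on
/// its path to the root. `parents` maps each taxonomy ID to its parent's ID;
/// the root maps to itself, as taxonomy ID 1 does in NCBI's nodes.dmp.
pub fn descends_from(parents: &HashMap<u32, u32>, taxid: u32, ancestor: u32) -> bool {
    let mut current = taxid;
    // Bound the walk by the table size so that a malformed table containing
    // a cycle cannot loop forever.
    for _ in 0..=parents.len() {
        if current == ancestor {
            return true;
        }
        match parents.get(&current) {
            Some(&parent) if parent != current => current = parent,
            _ => return false, // reached the root or an unknown ID
        }
    }
    false
}
```

With a table in which 562 (Escherichia coli) has parent 2 (Bacteria), `descends_from(&parents, 562, 2)` is true, so a read assigned to 562 would be retained when filtering on ancestor 2.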

taxonomy_util

(new in 1.0.0)

Utilities for converting NCBI taxonomy database files into a SQLite database (the input format used by the other tools).

```bash
taxonomy_util 1.0.0
Peter van Heusden pvh@sanbi.axc.za
Utilities for working with the NCBI taxonomy database

USAGE:
    taxonomy_util [OPTIONS] [SUBCOMMAND]

FLAGS:
    -h, --help       Prints help information
    -V, --version    Prints version information

OPTIONS:
    -d, --db    URL for SQLite taxonomy database

SUBCOMMANDS:
    commonancestordistance    find the tree distance to te common ancestor between two taxa
    getid                     find taxonomy ID for name
    getlineage                get lineage for name
    getname                   find name for taxonomy ID
    help                      Prints this message or the help of the given subcommand(s)
    to_sqlite                 save taxonomy database loaded from files to SQLite database file
```
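
The help text does not define the "tree distance" reported by the commonancestordistance subcommand. One plausible reading is the number of edges from each taxon up to their lowest common ancestor. The Rust sketch below implements that reading over an in-memory parent table. It is an assumption-laden illustration, not the utility's implementation: the names `lineage` and `common_ancestor_distance` and the `HashMap` representation are hypothetical, and the root is assumed to be its own parent.

```rust
use std::collections::HashMap;

/// Path from `taxid` up to the root, inclusive, following `parents`.
fn lineage(parents: &HashMap<u32, u32>, taxid: u32) -> Vec<u32> {
    let mut path = vec![taxid];
    let mut current = taxid;
    while let Some(&parent) = parents.get(&current) {
        if parent == current || path.len() > parents.len() {
            break; // root reached, or guard against a cyclic table
        }
        path.push(parent);
        current = parent;
    }
    path
}

/// Returns (common ancestor, edges from `a`, edges from `b`), or None if the
/// two lineages never meet (for example when an ID is missing from the table).
pub fn common_ancestor_distance(
    parents: &HashMap<u32, u32>,
    a: u32,
    b: u32,
) -> Option<(u32, usize, usize)> {
    let path_a = lineage(parents, a);
    let path_b = lineage(parents, b);
    // The first node on a's path that also lies on b's path is the lowest
    // common ancestor; its index in each path is the edge count from that taxon.
    for (dist_a, node) in path_a.iter().enumerate() {
        if let Some(dist_b) = path_b.iter().position(|n| n == node) {
            return Some((*node, dist_a, dist_b));
        }
    }
    None
}
```

The quadratic search over the two paths keeps the sketch short. Lineages in the NCBI taxonomy are at most a few dozen nodes long, so this is unlikely to matter in practice.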