ncbitaxonomy

This is a Rust crate (i.e. library) for working with a local copy of the NCBI Taxonomy database. The database can be downloaded (either taxdump.zip or taxdump.tar.gz) from the NCBI Taxonomy FTP site and reformatted into a SQLite database using the taxonomy_util utility's to_sqlite subcommand.
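To give a feel for the input data, here is a minimal sketch of reading one record from the `nodes.dmp` file found inside the taxdump archive. It assumes the usual NCBI dump layout, where fields are separated by `\t|\t` and each line ends with `\t|`. The function name `parse_node_line` is hypothetical and is not part of this crate's API.

```rust
/// Parse one line of `nodes.dmp` into (tax_id, parent_tax_id, rank).
/// Hypothetical helper for illustration only, not part of the ncbitaxonomy API.
/// Assumes fields are separated by "\t|\t" and the line ends with "\t|".
fn parse_node_line(line: &str) -> Option<(u32, u32, String)> {
    // Drop the line ending and the trailing field terminator.
    let trimmed = line.trim_end_matches('\n').trim_end_matches("\t|");
    let mut fields = trimmed.split("\t|\t");
    // The first three columns are the taxon ID, its parent's ID and the rank.
    let tax_id: u32 = fields.next()?.trim().parse().ok()?;
    let parent_id: u32 = fields.next()?.trim().parse().ok()?;
    let rank = fields.next()?.trim().to_string();
    Some((tax_id, parent_id, rank))
}
```

Malformed lines yield `None`, so a caller can choose whether to skip or report them.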

Documentation is available at crates.io.

taxonomyfilterrefseq

(new in 0.1.1)

A tool to filter an NCBI RefSeq FASTA file so that only sequences from taxa descending from a given ancestor taxon are retained.

```bash
$ taxonomyfilterrefseq --help
taxonomyfilterrefseq 1.0.0
Peter van Heusden pvh@sanbi.axc.za
Filter NCBI RefSeq FASTA files by taxonomic lineage

USAGE:
    taxonomyfilterrefseq [FLAGS] [OPTIONS] [OUTPUT_FASTA]

FLAGS:
        --nocurated      Don't accept curated RNAs and proteins (NM_, NR_ and NP_ accessions)
        --nopredicted    Don't accept computationally predicted RNAs and proteins (XM_, XR_ and XP_ accessions)
    -h, --help           Prints help information
    -V, --version        Prints version information

OPTIONS:
    -d, --db    URL for SQLite taxonomy database

ARGS:
    FASTA file with RefSeq sequences
    Name of ancestor to use as ancestor filter
    Output FASTA filename (or stdout if omitted)
```
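The `--nocurated` and `--nopredicted` flags act on the RefSeq accession prefix. The sketch below shows one way such a check could work. It is an illustration, not the tool's actual code: the names `RefSeqKind`, `classify_accession` and `accept` are hypothetical, and accepting accessions with any other prefix is an assumption.

```rust
/// Broad RefSeq categories distinguished by the flags (hypothetical type).
#[derive(Debug, PartialEq)]
enum RefSeqKind {
    Curated,   // NM_, NR_, NP_
    Predicted, // XM_, XR_, XP_
    Other,     // any other prefix, e.g. genomic records
}

/// Classify an accession by its first three characters.
fn classify_accession(accession: &str) -> RefSeqKind {
    match accession.get(..3) {
        Some("NM_") | Some("NR_") | Some("NP_") => RefSeqKind::Curated,
        Some("XM_") | Some("XR_") | Some("XP_") => RefSeqKind::Predicted,
        _ => RefSeqKind::Other,
    }
}

/// Decide whether a record passes the two flags.
/// Assumption: accessions outside both groups are always accepted.
fn accept(accession: &str, no_curated: bool, no_predicted: bool) -> bool {
    match classify_accession(accession) {
        RefSeqKind::Curated => !no_curated,
        RefSeqKind::Predicted => !no_predicted,
        RefSeqKind::Other => true,
    }
}
```

With this sketch, `accept("XM_011522127.1", false, true)` is `false`, because the predicted transcript is rejected when the `no_predicted` argument is set.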

taxonomyfilterfastq

(new in version 0.2.0)

```bash
$ taxonomyfilterfastq --help
taxonomyfilterfastq 1.0.0
Peter van Heusden pvh@sanbi.axc.za
Filter FASTQ files whose reads have been classified by Centrifuge or Kraken2, only retaining reads in taxa descending from given ancestor

USAGE:
    taxonomyfilterfastq [FLAGS] [OPTIONS] ... --ancestortaxid --taxreport_filename <--centrifuge|--kraken2>

FLAGS:
    -d, --output_dir    Directory to deposit filtered output files in
    -C, --centrifuge    Filter using report from Centrifuge
    -h, --help          Prints help information
    -K, --kraken2       Filter using report from Kraken2
    -V, --version       Prints version information

OPTIONS:
    -A, --ancestortaxid         Name of ancestor to use as ancestor filter
    -d, --db                    URL for SQLite taxonomy database
    -F, --taxreport_filename    Output from Kraken2 (default) or Centrifuge

ARGS:
    ...    FASTA file with RefSeq sequences
```
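The core of this kind of filtering is a descendant test against the taxonomy tree. The following sketch shows that idea on an in-memory child-to-parent map. It assumes each read has already been paired with the taxonomy ID assigned by the classifier. The function names and the map representation are hypothetical and do not reflect how the tool or the crate store the taxonomy.

```rust
use std::collections::HashMap;

/// Return true if `taxid` is `ancestor` itself or lies below it in the tree.
/// `parents` maps each taxon ID to its parent ID; the root maps to itself.
fn is_descendant(parents: &HashMap<u32, u32>, taxid: u32, ancestor: u32) -> bool {
    let mut current = taxid;
    // A walk longer than the number of entries would mean the map has a cycle.
    for _ in 0..=parents.len() {
        if current == ancestor {
            return true;
        }
        match parents.get(&current) {
            // Step up to the parent unless we have reached the root.
            Some(&parent) if parent != current => current = parent,
            // Root reached or unknown ID: `ancestor` is not on this lineage.
            _ => return false,
        }
    }
    false
}

/// Keep the IDs of reads whose assigned taxon descends from `ancestor`.
fn retain_reads<'a>(
    classified: &[(&'a str, u32)],
    parents: &HashMap<u32, u32>,
    ancestor: u32,
) -> Vec<&'a str> {
    classified
        .iter()
        .filter(|(_, taxid)| is_descendant(parents, *taxid, ancestor))
        .map(|(read_id, _)| *read_id)
        .collect()
}
```

Note that this sketch treats a taxon as a descendant of itself, so reads assigned directly to the ancestor are kept. Whether the tool does the same is not stated here.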

taxonomy_util

(new in 1.0.0)

Utilities for converting NCBI Taxonomy database files into a SQLite database (the input format used by the other tools).

```bash
taxonomy_util 1.0.0
Peter van Heusden pvh@sanbi.axc.za
Utilities for working with the NCBI taxonomy database

USAGE:
    taxonomy_util [OPTIONS] [SUBCOMMAND]

FLAGS:
    -h, --help       Prints help information
    -V, --version    Prints version information

OPTIONS:
    -d, --db    URL for SQLite taxonomy database

SUBCOMMANDS:
    commonancestordistance    find the tree distance to the common ancestor between two taxa
    getid                     find taxonomy ID for name
    getlineage                get lineage for name [unimplemented]
    getname                   find name for taxonomy ID
    help                      Prints this message or the help of the given subcommand(s)
    tosqlite                  save taxonomy database loaded from files to SQLite database file
```
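As an illustration of what a common ancestor distance can mean, the sketch below walks both lineages towards the root and reports the first shared taxon together with the number of steps from each input taxon. This is one plausible reading of the subcommand's description, not its confirmed behaviour. The function names and the child-to-parent map are hypothetical.

```rust
use std::collections::HashMap;

/// Collect the path from `taxid` up to the root (inclusive).
/// `parents` maps each taxon ID to its parent ID; the root maps to itself.
fn lineage(parents: &HashMap<u32, u32>, taxid: u32) -> Vec<u32> {
    let mut path = vec![taxid];
    let mut current = taxid;
    while let Some(&parent) = parents.get(&current) {
        // Stop at the root, or if the path is too long to be a valid tree walk.
        if parent == current || path.len() > parents.len() {
            break;
        }
        path.push(parent);
        current = parent;
    }
    path
}

/// Find the lowest common ancestor of `a` and `b`.
/// Returns (ancestor ID, steps from `a`, steps from `b`), or None if the
/// two lineages never meet.
fn common_ancestor_distance(
    parents: &HashMap<u32, u32>,
    a: u32,
    b: u32,
) -> Option<(u32, usize, usize)> {
    let path_a = lineage(parents, a);
    let path_b = lineage(parents, b);
    // The first node on a's path that also occurs on b's path is the LCA.
    for (dist_a, node) in path_a.iter().enumerate() {
        if let Some(dist_b) = path_b.iter().position(|other| other == node) {
            return Some((*node, dist_a, dist_b));
        }
    }
    None
}
```

The linear search over `path_b` keeps the sketch short. Lineages in a taxonomy tree are shallow, so this is usually adequate, though a set lookup would scale better for very deep trees.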