Fastax is a command-line tool that makes phylogenetic trees and lineages from the NCBI Taxonomy database. It uses a local copy of the database, which makes it really fast.
By default, all results are pretty-printed. In addition, it can output trees as Newick and lineages as CSV.
It can also show information about taxa, such as their alternative scientific names or the genetic code they use.
Fastax is written in Rust, which makes it safe, fast and portable. The code is managed using Cargo, so all you should have to do is install Cargo (see the Cargo documentation) and run the following:
$ git clone https://github.com/Picani/fastax.git
$ cd fastax
$ cargo build --release
Et voilà! The executable file is target/release/fastax. Just move it somewhere on your PATH.
First, you need to get the local copy of the NCBI Taxonomy database.
$ fastax populate -ve plop@example.com
populate will download the latest database dumps, extract them, and load them in a local SQLite database. -v asks fastax to tell what it's doing. -e asks to connect to the NCBI with that email address. Note that giving your email is optional but preferred.
The database is located in a fastax folder inside your local data folder, which should be $HOME/.local/share.
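As a rough illustration of where that folder ends up, the sketch below resolves the path in Python. It assumes a Linux-style layout; honoring the XDG_DATA_HOME variable is an assumption based on the XDG Base Directory convention, not something stated here, and the function name fastax_data_dir is hypothetical.

```python
import os
from pathlib import Path


def fastax_data_dir(env=None) -> Path:
    """Guess the folder that holds the local Fastax database.

    Assumption: XDG_DATA_HOME, when set, replaces $HOME/.local/share.
    Other platforms may use a different data folder.
    """
    env = os.environ if env is None else env
    xdg = env.get("XDG_DATA_HOME")
    if xdg:
        base = Path(xdg)
    else:
        home = env.get("HOME")
        base = (Path(home) if home else Path.home()) / ".local" / "share"
    return base / "fastax"


if __name__ == "__main__":
    print(fastax_data_dir())
```

This can help when checking how much disk space the local copy uses or when removing it by hand.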
For each command, you need to query at least one node. The term used to get a node can be either its unique NCBI Taxonomy ID (the so-called taxid), its binomial scientific name, or its binomial scientific name with the two parts separated by an underscore (the character _). This last option is useful for scripting.
Note also that for some species, multiple binomial scientific names are in use. Fastax looks for each of them.
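For scripting, the three accepted forms of a term can be handled with a small helper. The following Python sketch is only an illustration of the forms described above, not Fastax's own lookup logic; the function names normalize_term and is_taxid are hypothetical.

```python
def is_taxid(term: str) -> bool:
    """True if the term looks like an NCBI Taxonomy ID (ASCII digits only)."""
    term = term.strip()
    return term.isascii() and term.isdigit()


def normalize_term(term: str) -> str:
    """Return a term that can be passed on a command line without quoting.

    Taxids are returned unchanged. In names, each run of whitespace is
    replaced by a single underscore ("Homo sapiens" -> "Homo_sapiens").
    """
    term = term.strip()
    if is_taxid(term):
        return term
    return "_".join(term.split())


if __name__ == "__main__":
    for raw in ["4932", "Homo sapiens", "Tyrannosaurus_rex"]:
        print(normalize_term(raw))
```

With such a helper, a script can build a fastax command from a list of names without worrying about shell quoting.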
show command
You can get general information about a node:
```
$ fastax show 4932
NCBI Taxonomy ID: 4932
Same as:
* Saccharomyces capensis
* Saccharomyces italicus
* Saccharomyces oviformis
* Saccharomyces uvarum var. melibiosus
Commonly named baker's yeast.
Also known as:
* S. cerevisiae
* brewer's yeast
Part of the Plants and Fungi.
Uses the Standard genetic code.
Its mitochondria use the Yeast Mitochondrial genetic code.
```
or:
```
$ fastax show "Homo sapiens"
NCBI Taxonomy ID: 9606
Commonly named human.
Also known as:
* man
First description:
* Homo sapiens Linnaeus, 1758
Part of the Primates.
Uses the Standard genetic code.
Its mitochondria use the Vertebrate Mitochondrial genetic code.
```
or also:
```
$ fastax show Tyrannosaurus_rex
NCBI Taxonomy ID: 436495
Part of the Vertebrates.
Uses the Standard genetic code.
Its mitochondria use the Vertebrate Mitochondrial genetic code.
```
lineage command
You can get the lineage of a node:
$ fastax lineage 4932
root
└┬─ no rank: cellular organisms (taxid: 131567)
 └┬─ superkingdom: Eukaryota (taxid: 2759)
  └┬─ no rank: Opisthokonta (taxid: 33154)
   └┬─ kingdom: Fungi (taxid: 4751)
    └┬─ subkingdom: Dikarya (taxid: 451864)
     └┬─ phylum: Ascomycota (taxid: 4890)
      └┬─ no rank: saccharomyceta (taxid: 716545)
       └┬─ subphylum: Saccharomycotina (taxid: 147537)
        └┬─ class: Saccharomycetes (taxid: 4891)
         └┬─ order: Saccharomycetales (taxid: 4892)
          └┬─ family: Saccharomycetaceae (taxid: 4893)
           └┬─ genus: Saccharomyces (taxid: 4930)
            └── species: Saccharomyces cerevisiae (taxid: 4932)
The same lineage in CSV:
$ fastax lineage Saccharomyces_cerevisiae
no rank:root:1,no rank:cellular organisms:131567,superkingdom:Eukaryota:2759,no rank:Opisthokonta:33154,kingdom:Fungi:4751,subkingdom:Dikarya:451864,phylum:Ascomycota:4890,no rank:saccharomyceta:716545,subphylum:Saccharomycotina:147537,class:Saccharomycetes:4891,order:Saccharomycetales:4892,family:Saccharomycetaceae:4893,genus:Saccharomyces:4930,species:Saccharomyces cerevisiae:4932
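Judging from the sample above, each CSV field has the form rank:name:taxid. A minimal Python sketch for reading such a line follows; the layout is inferred from this single example, and the names LineageNode and parse_lineage_csv are hypothetical. It would mis-split a scientific name that itself contains a comma.

```python
from typing import NamedTuple


class LineageNode(NamedTuple):
    rank: str
    name: str
    taxid: int


def parse_lineage_csv(line: str) -> list[LineageNode]:
    """Split one CSV lineage line into (rank, name, taxid) records."""
    nodes = []
    for field in line.strip().split(","):
        # The rank is everything before the first colon and the taxid is
        # everything after the last one; the name is what remains.
        rank, rest = field.split(":", 1)
        name, taxid = rest.rsplit(":", 1)
        nodes.append(LineageNode(rank, name, int(taxid)))
    return nodes


if __name__ == "__main__":
    sample = "no rank:root:1,superkingdom:Eukaryota:2759,kingdom:Fungi:4751"
    for node in parse_lineage_csv(sample):
        print(f"{node.rank:>12}  {node.name} ({node.taxid})")
```

Splitting on the first and last colon keeps a name intact even if it contains a colon of its own.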
tree command
You can get a phylogenetic tree:
$ fastax tree "Escherichia coli" 4932 Drosophila_melanogaster 9606 "Mus musculus"
─┬─ no rank: root
 └─┬─ no rank: cellular organisms
   ├─┬─ no rank: Opisthokonta
   │ ├─┬─ no rank: Bilateria
   │ │ ├─┬─ superorder: Euarchontoglires
   │ │ │ ├── species: Mus musculus
   │ │ │ └── species: Homo sapiens
   │ │ └── species: Drosophila melanogaster
   │ └── species: Saccharomyces cerevisiae
   └── species: Escherichia coli
The same tree in Newick:
$ fastax tree -n 562 4932 7227 9606 10090
(root,(cellular organisms,(Escherichia coli,Opisthokonta,(Saccharomyces cerevisiae,Bilateria,(Drosophila melanogaster,Euarchontoglires,(Homo sapiens,Mus musculus))))));
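Comparing this output with the pretty-printed tree suggests that each internal node's name is written just before the parenthesized group holding its children, whereas standard Newick puts the label after the closing parenthesis. Generic Newick readers may therefore treat those names as leaves. The Python sketch below parses the output under that inferred reading; it is an assumption drawn from the two examples here, not a documented format, and the function names are hypothetical.

```python
def parse_fastax_newick(text: str) -> dict:
    """Parse the output into nested {"name": ..., "children": [...]} dicts.

    Assumption: a parenthesized group that directly follows a label
    contains the children of that label.
    """
    text = text.strip().rstrip(";")
    pos = 0

    def parse_group() -> list:
        nonlocal pos
        pos += 1  # skip the opening "("
        nodes = []
        while True:
            if text[pos] == "(":
                if not nodes:
                    raise ValueError("group without a preceding label")
                nodes[-1]["children"] = parse_group()
            else:
                start = pos
                while text[pos] not in ",()":
                    pos += 1
                nodes.append({"name": text[start:pos].strip(), "children": []})
            if text[pos] == ",":
                pos += 1
            elif text[pos] == ")":
                pos += 1
                return nodes
            else:
                raise ValueError(f"unexpected character at offset {pos}")

    if not text.startswith("("):
        raise ValueError("expected the text to start with '('")
    roots = parse_group()
    if len(roots) != 1:
        raise ValueError("expected exactly one root node")
    return roots[0]


def leaf_names(node: dict) -> list:
    """Return the names of all nodes that have no children, left to right."""
    if not node["children"]:
        return [node["name"]]
    return [name for child in node["children"] for name in leaf_names(child)]


if __name__ == "__main__":
    tree = parse_fastax_newick("(Homininae,(Homo,(Homo sapiens),Pan,(Pan paniscus)));")
    print(leaf_names(tree))
```

The sketch does no error recovery: truncated input raises an IndexError, and other malformed input raises a ValueError.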
subtree command
You can get the phylogenetic tree of the children of a node:
$ fastax subtree Homininae
─┬─ subfamily: Homininae
 ├─┬─ genus: Homo
 │ ├── species: Homo heidelbergensis
 │ └─┬─ species: Homo sapiens
 │   ├── subspecies: Homo sapiens subsp. 'Denisova'
 │   └── subspecies: Homo sapiens neanderthalensis
 ├─┬─ genus: Pan
 │ ├─┬─ species: Pan troglodytes
 │ │ ├── subspecies: Pan troglodytes verus x troglodytes
 │ │ ├── subspecies: Pan troglodytes ellioti
 │ │ ├── subspecies: Pan troglodytes vellerosus
 │ │ ├── subspecies: Pan troglodytes verus
 │ │ ├── subspecies: Pan troglodytes troglodytes
 │ │ └── subspecies: Pan troglodytes schweinfurthii
 │ └── species: Pan paniscus
 └─┬─ genus: Gorilla
   ├─┬─ species: Gorilla beringei
   │ ├── subspecies: Gorilla beringei beringei
   │ └── subspecies: Gorilla beringei graueri
   └─┬─ species: Gorilla gorilla
     ├── subspecies: Gorilla gorilla diehli
     ├── subspecies: Gorilla gorilla uellensis
     └── subspecies: Gorilla gorilla gorilla
If you only want the species:
$ fastax subtree -s Homininae
─┬─ subfamily: Homininae
 ├─┬─ genus: Homo
 │ ├── species: Homo heidelbergensis
 │ └── species: Homo sapiens
 ├─┬─ genus: Pan
 │ ├── species: Pan troglodytes
 │ └── species: Pan paniscus
 └─┬─ genus: Gorilla
   ├── species: Gorilla beringei
   └── species: Gorilla gorilla
The same tree in Newick:
$ fastax subtree -sn Homininae
(Homininae,(Homo,(Homo sapiens,Homo heidelbergensis),Gorilla,(Gorilla beringei,Gorilla gorilla),Pan,(Pan paniscus,Pan troglodytes)));