uniprot.rs
Rust data structures and parser for the UniprotKB database(s).
The `uniprot::uniprot::parse` function can be used to obtain an iterator over
the entries of a UniprotKB database in XML format (either SwissProt or TrEMBL).
XML files for UniRef and UniParc can also be parsed, with
`uniprot::uniref::parse` and `uniprot::uniparc::parse`, respectively.
```rust
extern crate uniprot;

let f = std::fs::File::open("tests/uniprot.xml")
    .map(std::io::BufReader::new)
    .unwrap();

for r in uniprot::uniprot::parse(f) {
    let entry = r.unwrap();
    // ... process the Uniprot entry ...
}
```
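The loop above calls `unwrap` on every item, which panics on the first malformed entry. As a minimal sketch, errors can be propagated to the caller instead. The `collect_entries` helper below is hypothetical and not part of the `uniprot` API; it assumes only that the iterator yields `Result` items, as the `r.unwrap()` call suggests.

```rust
/// Collects every successfully parsed item, stopping at the first error.
///
/// Hypothetical helper, not part of the `uniprot` crate: `entries` stands in
/// for the iterator returned by a parser, and can be any iterator whose items
/// are `Result<T, E>`.
pub fn collect_entries<I, T, E>(entries: I) -> Result<Vec<T>, E>
where
    I: IntoIterator<Item = Result<T, E>>,
{
    let mut parsed = Vec::new();
    for result in entries {
        // `?` hands the error back to the caller instead of panicking.
        let entry = result?;
        // ... process the entry here, or keep it for later ...
        parsed.push(entry);
    }
    Ok(parsed)
}
```

Under that assumption, `collect_entries(uniprot::uniprot::parse(f))` would return either all entries or the first error encountered. Collecting everything into a `Vec` is only reasonable for small files; for a full database dump, the processing would stay inside the loop.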
Any `BufRead` implementor can be used as an input, so the database files can be
streamed directly from their online location with the help of an HTTP library
such as `reqwest`, or using the `ftp` library.
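Why any `BufRead` implementor works can be illustrated with a small sketch that uses only the standard library. The `count_entry_tags` helper below is hypothetical (it is not part of `uniprot`) and merely stands in for a parser: it takes its input as a generic `R: BufRead`, so a buffered file, an in-memory byte slice, or a buffered network response can all be passed in unchanged.

```rust
use std::io::{self, BufRead};

/// Counts the lines that contain an opening `<entry` tag.
///
/// Hypothetical stand-in for a real parser: the only requirement on the
/// input is the `BufRead` bound, so the source of the bytes does not matter.
pub fn count_entry_tags<R: BufRead>(reader: R) -> io::Result<usize> {
    let mut count = 0;
    for line in reader.lines() {
        // Reading a line can fail (I/O error, invalid UTF-8); propagate it.
        if line?.contains("<entry") {
            count += 1;
        }
    }
    Ok(count)
}
```

For example, `count_entry_tags(std::io::Cursor::new(bytes))` and `count_entry_tags(std::io::BufReader::new(file))` go through the same code path. The same holds when handing either reader to `uniprot::uniprot::parse`.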
The XML format is the same for the EBI REST API and for the UniProt API, so this library can also be used to read single entries or larger queries. For instance, you can search UniProt for a keyword and retrieve all the matching entries:
```rust
extern crate ureq;
extern crate libflate;
extern crate uniprot;

let query = "bacteriorhodopsin";
let query_url = format!("https://www.uniprot.org/uniprot/?query={}&format=xml&compress=yes", query);

// Request the gzip-compressed XML and decompress it on the fly.
let req = ureq::get(&query_url).set("Accept", "application/xml");
let reader = libflate::gzip::Decoder::new(req.call().unwrap().into_reader()).unwrap();

for r in uniprot::uniprot::parse(std::io::BufReader::new(reader)) {
    let entry = r.unwrap();
    // ... process the Uniprot entry ...
}
```
See the online documentation at docs.rs for more examples, and some details
about the different features available.
- `threading` (enabled by default): compiles the multithreaded parser that
  offers a 90% speed increase when processing XML files.

uniprot.rs is developed and maintained by:
- Martin Larralde
This project adheres to Semantic Versioning and provides a changelog in the Keep a Changelog format.
This library is provided under the open-source MIT license.