Bioutils

Simple Biological Utilities with Rust's [u8]

Please take a look at the align example to get a full practical walkthrough!

Bioutils provides simple biological utilities including:

  • Character sets that include punctuation, are subdivided, and are implemented as Rust [u8] byte slices rather than bitsets
  • Implementations centered around [u8], although character sets are also provided as [&str] and as hash sets of u8 and &str. Check back as more functionality gets added!
Quick Start

    Check out the align example for a full practical walkthrough, from downloading files to finding read positions!

    Examples for using checks:

        use bioutils::charsets::*;
        use bioutils::utils::*;
        use bioutils::utils::check::CheckU8;

        let dna = b"ACTG";
        let rna = b"ACUG";
        let homopolymer_n = b"NNNN";
        let homopolymer_a = b"AAAA";
        let gapna = b"AC-G";
        let nna = b"ACnG";
        let quality = b"@ABC";

        assert!(homopolymer_n.is_homopolymer());
        assert!(homopolymer_a.is_homopolymer_not_n());
        assert!(homopolymer_n.is_homopolymer_n());

        assert!(gapna.has_gap());
        assert!(nna.has_n());
        assert!(dna.is_iupac());
        assert!(rna.is_basic_rna());

        assert!(quality.is_phred33());
        assert!(quality.is_phred64());
        assert!(quality.is_solexa());
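To make the checks above more concrete, here is a minimal std-only sketch of what such predicates could look like on a byte slice. The free functions and the ASCII ranges used (Phred+33 as 33..=126, Phred+64 as 64..=126, Solexa as 59..=126) are assumptions for illustration and are not taken from the crate's source; the crate exposes its checks as trait methods and may define the sets differently.

```rust
/// Returns true if the slice is non-empty and every byte equals the first one.
/// (Hypothetical helper, not the crate's implementation.)
fn is_homopolymer(seq: &[u8]) -> bool {
    match seq.split_first() {
        Some((first, rest)) => rest.iter().all(|b| b == first),
        None => false,
    }
}

/// Returns true if the sequence contains a gap character ('-' or '.').
/// The choice of gap characters is an assumption.
fn has_gap(seq: &[u8]) -> bool {
    seq.iter().any(|&b| b == b'-' || b == b'.')
}

/// Returns true if the slice is non-empty and every byte lies in `lo..=hi`.
fn in_range(qual: &[u8], lo: u8, hi: u8) -> bool {
    !qual.is_empty() && qual.iter().all(|&b| (lo..=hi).contains(&b))
}

// Assumed ASCII ranges for the three quality encodings.
fn is_phred33(qual: &[u8]) -> bool { in_range(qual, 33, 126) }
fn is_phred64(qual: &[u8]) -> bool { in_range(qual, 64, 126) }
fn is_solexa(qual: &[u8]) -> bool { in_range(qual, 59, 126) }

fn main() {
    assert!(is_homopolymer(b"NNNN"));
    assert!(!is_homopolymer(b"ACTG"));
    assert!(has_gap(b"AC-G"));

    // '@' is ASCII 64, so b"@ABC" falls inside all three assumed ranges.
    let quality = b"@ABC";
    assert!(is_phred33(quality) && is_phred64(quality) && is_solexa(quality));

    // '#' is ASCII 35: inside the Phred+33 range but below the other two.
    assert!(is_phred33(b"#") && !is_phred64(b"#") && !is_solexa(b"#"));
}
```

Under these assumed ranges, a quality string can satisfy several encodings at once, which is why the same `quality` value passes all three assertions in the example above.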

    Modules

    charsets

    Numerous IUPAC character sets to use directly or to mix and match into your own

    files

    Download human and mouse Gencode references and FASTQ sample files

    references

    Currently includes human NCBI Gencode GRCh38. Automatically downloads the latest version of the user's choice.

    utils

    Functions for sequence checks, pseudorandom replacement of N or gaps, and creation of new pseudorandom sequences
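As a rough illustration of how byte-slice character sets and pseudorandom N replacement fit together, the sketch below combines two small sets and fills in N positions from one of them. Every name here (`BASIC_DNA`, `GAPS`, `XorShift`, `replace_n`, `random_seq`) is hypothetical, and the xorshift generator is a stand-in chosen only to keep the example dependency-free; the crate's own functions and random source may differ.

```rust
/// Character sets as plain byte slices (hypothetical constants).
const BASIC_DNA: &[u8] = b"ACGT";
const GAPS: &[u8] = b"-.";

/// Tiny xorshift64 generator, used here only to avoid external crates.
struct XorShift(u64);

impl XorShift {
    fn new(seed: u64) -> Self {
        // The state must be nonzero, so a zero seed is swapped for a fixed value.
        XorShift(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Picks one byte from a non-empty `set` (modulo bias is ignored in this sketch).
    fn pick(&mut self, set: &[u8]) -> u8 {
        set[(self.next_u64() % set.len() as u64) as usize]
    }
}

/// Replaces every 'N' or 'n' in place with a pseudorandom byte from `set`.
fn replace_n(seq: &mut [u8], set: &[u8], rng: &mut XorShift) {
    for b in seq.iter_mut() {
        if *b == b'N' || *b == b'n' {
            *b = rng.pick(set);
        }
    }
}

/// Builds a new pseudorandom sequence of `len` bytes drawn from `set`.
fn random_seq(len: usize, set: &[u8], rng: &mut XorShift) -> Vec<u8> {
    (0..len).map(|_| rng.pick(set)).collect()
}

fn main() {
    // Mix and match: concatenate two sets into a custom one.
    let dna_with_gaps: Vec<u8> = [BASIC_DNA, GAPS].concat();
    assert!(dna_with_gaps.contains(&b'-'));

    let mut rng = XorShift::new(42);
    let mut seq = b"ACNNGTnA".to_vec();
    replace_n(&mut seq, BASIC_DNA, &mut rng);
    // After replacement, only basic DNA bases remain.
    assert!(seq.iter().all(|b| BASIC_DNA.contains(b)));

    let fresh = random_seq(12, BASIC_DNA, &mut rng);
    assert_eq!(fresh.len(), 12);
    println!(
        "{} {}",
        String::from_utf8_lossy(&seq),
        String::from_utf8_lossy(&fresh)
    );
}
```

Keeping the sets as `&[u8]` means membership tests and concatenation need nothing beyond the standard slice methods, which is one plausible reason to prefer byte slices over a bitset for this kind of utility.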