Multi-seq-align

Stability: experimental

A crate to manipulate multiple sequence alignments in Rust.

Instead of storing aligned sequences as multiple strings, multi_seq_align stores bases or residues in Alignment using a list of characters, like a matrix. This allows easy access to specific rows and columns of the alignment.
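To illustrate why a matrix-like layout makes rows and columns equally easy to reach, here is a minimal sketch. `FlatAlignment`, `row` and `column` are hypothetical names made up for this example and are not this crate's API or its actual internal representation; the sketch also assumes that every input sequence already has the same length.

```rust
/// Hypothetical illustration type; NOT the crate's actual `Alignment`.
struct FlatAlignment {
    /// Residues stored row by row: sequence 0 first, then sequence 1, etc.
    residues: Vec<u8>,
    /// Number of sequences (rows).
    rows: usize,
    /// Number of positions (columns) in every sequence.
    length: usize,
}

impl FlatAlignment {
    /// Builds the flat storage.
    /// Assumption: all sequences have the same length (not checked here).
    fn new(sequences: &[&str]) -> Self {
        let length = sequences.first().map_or(0, |s| s.len());
        let residues = sequences.iter().flat_map(|s| s.bytes()).collect();
        FlatAlignment {
            residues,
            rows: sequences.len(),
            length,
        }
    }

    /// Returns the `n`-th sequence (a row), or `None` if `n` is out of range.
    fn row(&self, n: usize) -> Option<&[u8]> {
        if n >= self.rows {
            return None;
        }
        let start = n * self.length;
        Some(&self.residues[start..start + self.length])
    }

    /// Returns the `n`-th position (a column), or `None` if `n` is out of range.
    fn column(&self, n: usize) -> Option<Vec<u8>> {
        if n >= self.length {
            return None;
        }
        // Every `length`-th residue, starting at offset `n`, belongs to column `n`.
        Some(
            self.residues
                .iter()
                .skip(n)
                .step_by(self.length)
                .copied()
                .collect(),
        )
    }
}
```

With this layout a row is a contiguous slice and a column is a strided walk over the same buffer, so neither access needs to scan whole strings.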

Usage

```rust
use multi_seq_align::Alignment;

let mut kappa_casein_fragments_alignment = Alignment::with_sequences(
    &[
        b"PAPISKWQSMP".to_vec(),
        b"HAQIPQRQYLP".to_vec(),
        b"PAQILQWQVLS".to_vec(),
    ],
)?;

// Let's extract a column of this alignment
assert_eq!(
    kappa_casein_fragments_alignment.nth_position(6).unwrap(),
    [&b'W', &b'R', &b'W']
);

// But we also have the aligned sequence for the Platypus
// Let's add it to the original alignment
kappa_casein_fragments_alignment.add(
    b"EHQRP--YVLP".to_vec(),
)?;

// The new aligned sequence has a gap at position 6
assert_eq!(
    kappa_casein_fragments_alignment.nth_position(6).unwrap(),
    [&b'W', &b'R', &b'W', &b'-']
);

// We can also loop over each position of the alignment
for aas in kappa_casein_fragments_alignment.iter_positions() {
    println!("{:?}", aas);
    assert_eq!(aas.len(), 4); // 4 sequences
}
```

Here I instantiated an alignment using u8, but Alignment is generic and also works with other element types, such as numbers or custom or third-party structs.
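As a rough illustration of what working on generics can look like, the sketch below extracts a column from sequences of any element type. The `Nucleotide` enum and the `nth_column` function are hypothetical and exist only for this example; they are not part of this crate.

```rust
/// Hypothetical element type, used only for this illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Nucleotide {
    A,
    C,
    G,
    T,
    Gap,
}

/// Collects references to the `n`-th element of every sequence, whatever
/// the element type `T` is. Returns `None` as soon as one sequence is
/// too short to have an `n`-th element.
fn nth_column<T>(sequences: &[Vec<T>], n: usize) -> Option<Vec<&T>> {
    sequences.iter().map(|seq| seq.get(n)).collect()
}
```

The same function accepts `Vec<Vec<Nucleotide>>`, `Vec<Vec<u8>>` or `Vec<Vec<f64>>` without any change, because it never looks inside the elements.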

Features

Ideas

Optimisation

My goal is to reduce the footprint of this crate, and there is still some work to do to achieve it. The code will eventually be optimised to run faster and to use memory more efficiently.

Issues

Ensuring that all the sequences have the same length in a generic way is challenging and results in some not-so-nice code.
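For readers unfamiliar with the problem, one possible way to express such a check generically is sketched below. `LengthMismatch` and `check_same_length` are hypothetical names chosen for this example; the crate's own validation and error types may look quite different.

```rust
/// Hypothetical error type describing the first sequence whose length
/// differs from the length of the first sequence.
#[derive(Debug, PartialEq, Eq)]
struct LengthMismatch {
    /// Length of the first sequence, taken as the reference.
    expected: usize,
    /// Length of the offending sequence.
    found: usize,
    /// Index of the offending sequence in the input slice.
    index: usize,
}

/// Checks that every sequence has the same length, for any element type `T`.
/// Returns the common length, or 0 when there are no sequences at all.
fn check_same_length<T>(sequences: &[Vec<T>]) -> Result<usize, LengthMismatch> {
    let expected = sequences.first().map_or(0, Vec::len);
    match sequences.iter().position(|seq| seq.len() != expected) {
        Some(index) => Err(LengthMismatch {
            expected,
            found: sequences[index].len(),
            index,
        }),
        None => Ok(expected),
    }
}
```

Returning the index and both lengths in the error lets the caller report exactly which sequence broke the alignment.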

Ideas & bugs

Please create a new issue on the project repository.

License

Multi-seq-align is distributed under the terms of the Apache License (Version 2.0). See LICENSE for details.