`lt-fm-index` is a library for locating and counting nucleotide and amino acid sequence strings.
`LtFmIndex` has a precalculated count lookup table for k-mer occurrences.
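
As a rough illustration of what a precalculated k-mer count table buys, the sketch below counts every k-mer of a text once so that later queries become a single lookup. It uses only the standard library. `build_kmer_count_table` and `lookup_count` are hypothetical names chosen for this example, and the `HashMap` layout is an assumption made for readability, not the data structure `LtFmIndex` uses internally.

```rust
use std::collections::HashMap;

/// Count every k-mer (substring of length `k`) that occurs in `text`.
/// Hypothetical helper for illustration; not part of the lt-fm-index API.
fn build_kmer_count_table(text: &[u8], k: usize) -> HashMap<Vec<u8>, usize> {
    let mut table = HashMap::new();
    // `windows(0)` would panic, and a k-mer longer than the text cannot occur.
    if k == 0 || k > text.len() {
        return table;
    }
    // `windows(k)` yields every overlapping slice of length k.
    for kmer in text.windows(k) {
        *table.entry(kmer.to_vec()).or_insert(0) += 1;
    }
    table
}

/// Look up a precalculated count; k-mers that never occur have count 0.
fn lookup_count(table: &HashMap<Vec<u8>, usize>, kmer: &[u8]) -> usize {
    table.get(kmer).copied().unwrap_or(0)
}
```

Building the table costs one pass over the text; after that, `lookup_count(&table, b"TA")` answers a k-mer count without rescanning. The trade-off is memory, which grows with the number of distinct k-mers.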
- `LtFmIndex` is generated from `Text`.
- `LtFmIndex` has two functions for `Pattern`:
    - count: the number of times `Pattern` appears in `Text`.
    - locate: the start positions at which `Pattern` appears in `Text`.
- Supported text types:
    - `NucleotideOnly`: ACGT
    - `NucleotideWithNoise`: ACGT
    - `AminoacidOnly`: ACDEFGHIKLMNPQRSTVWY
    - `AminoacidWithNoise`: ACDEFGHIKLMNPQRSTVWY
- In `NucleotideOnly`, a pattern of ACGTXYZ matches ACGTTTT, because X, Y and Z are not in ACG (the nucleotides other than T) and are therefore treated as T. Likewise, an lt-fm-index generated from the text ACGTXYZ indexes that text as ACGTTTT.
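
The `NucleotideOnly` behavior described above can be pictured as a byte-wise mapping applied to both text and pattern. The following is a minimal sketch under that reading. `normalize_nucleotide_only` is a hypothetical helper written for this example, and the library may implement the mapping differently.

```rust
/// Map every byte outside {A, C, G} to `T`, mirroring the behavior
/// described for `NucleotideOnly`.
/// Hypothetical helper for illustration; not part of the lt-fm-index API.
fn normalize_nucleotide_only(seq: &[u8]) -> Vec<u8> {
    seq.iter()
        .map(|&b| match b {
            b'A' | b'C' | b'G' => b,
            // T itself and every unsupported character collapse to T.
            _ => b'T',
        })
        .collect()
}
```

With this mapping, `normalize_nucleotide_only(b"ACGTXYZ")` and `normalize_nucleotide_only(b"ACGTTTT")` yield the same bytes, which is why the two sequences match each other.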
Use `LtFmIndex` to count and locate a pattern:

```rust
use lt_fm_index::LtFmIndexBuilder;

// (1) Define builder for lt-fm-index
let builder = LtFmIndexBuilder::new()
    .use_nucleotide_with_noise()
    .set_lookup_table_kmer_size(4).unwrap()
    .set_suffix_array_sampling_ratio(2).unwrap();

// (2) Generate lt-fm-index with text
let text = b"CTCCGTACACCTGTTTCGTATCGGANNNN".to_vec();
let lt_fm_index = builder.build(text); // text is consumed

// (3) Match with pattern
let pattern = b"TA".to_vec();
//   - count
let count = lt_fm_index.count(&pattern);
assert_eq!(count, 2);
//   - locate
let locations = lt_fm_index.locate(&pattern);
assert_eq!(locations, vec![5, 18]);
```
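
To make the expected values in the example above concrete, here is a naive linear scan that returns the same kind of answers as `count` and `locate`. It is a reference sketch using only the standard library. `naive_locate` and `naive_count` are hypothetical names, and the scan is not how an FM-index searches; it only pins down what the results mean (0-based start positions, overlapping matches included).

```rust
/// Return every 0-based start index at which `pattern` occurs in `text`.
/// Hypothetical reference helper; not part of the lt-fm-index API.
fn naive_locate(text: &[u8], pattern: &[u8]) -> Vec<usize> {
    // `windows(0)` would panic, and a longer pattern cannot match.
    if pattern.is_empty() || pattern.len() > text.len() {
        return Vec::new();
    }
    text.windows(pattern.len())
        .enumerate()
        .filter(|(_, window)| *window == pattern)
        .map(|(start, _)| start)
        .collect()
}

/// Number of occurrences of `pattern` in `text`.
fn naive_count(text: &[u8], pattern: &[u8]) -> usize {
    naive_locate(text, pattern).len()
}
```

Running `naive_locate` on the example text with the pattern `TA` gives positions 5 and 18, matching the assertions in the example. A scan like this costs time proportional to the text length per query, which is the cost an index is meant to avoid.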
Save and load `LtFmIndex`:

```rust
use lt_fm_index::{LtFmIndex, LtFmIndexBuilder};

// (1) Generate lt-fm-index
let text = b"CTCCGTACACCTGTTTCGTATCGGA".to_vec();
let lt_fm_index_to_save = LtFmIndexBuilder::new().build(text);

// (2) Save lt-fm-index to buffer
let mut buffer = Vec::new();
lt_fm_index_to_save.save_to(&mut buffer).unwrap();

// (3) Load lt-fm-index from buffer
let lt_fm_index_loaded = LtFmIndex::load_from(&buffer[..]).unwrap();

// The loaded index is equal to the one that was saved
assert_eq!(lt_fm_index_to_save, lt_fm_index_loaded);
```
Repository: https://github.com/baku4/lt-fm-index

Reference: libdivsufsort