`lt-fm-index` is a library to (1) locate or (2) count the occurrences of a pattern in large texts of nucleotide and amino acid sequences.
`LtFmIndex` is an FM-index that uses a lookup table: the precalculated counts of k-mer occurrences.

`LtFmIndex` is built from a `Text` (`Vec<u8>`).
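To make the phrase "precalculated count of k-mer occurrences" concrete, here is a minimal sketch using only the standard library. The helper `kmer_counts` is hypothetical and is not part of the `lt-fm-index` API; the crate's actual table layout and its use inside the FM-index search may differ. The sketch only shows the idea of paying the counting cost once so that later queries become lookups.

```rust
use std::collections::HashMap;

/// Count how often each k-mer occurs in `text`.
/// Hypothetical helper for illustration; not the crate's API.
fn kmer_counts(text: &[u8], k: usize) -> HashMap<&[u8], usize> {
    let mut counts = HashMap::new();
    // `windows` panics on a zero size, so handle the degenerate cases first.
    if k == 0 || k > text.len() {
        return counts;
    }
    for kmer in text.windows(k) {
        *counts.entry(kmer).or_insert(0) += 1;
    }
    counts
}

fn main() {
    let text = b"ACGTACGT";
    let table = kmer_counts(text, 2);
    // Once the table exists, the count of any 2-mer is a single lookup.
    let ac = table.get(&b"AC"[..]).copied().unwrap_or(0);
    println!("AC occurs {} times", ac);
}
```

A larger `k` makes the table bigger but lets more of each pattern be resolved by the lookup alone, which is the trade-off behind a configurable k-mer size.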
`LtFmIndex` has two functions:

- `count`: counts the number of times the `Pattern` (`&[u8]`) appears in the `Text`.
- `locate`: locates the start indices at which the `Pattern` appears in the `Text`.

Four types of `Text` are supported:

- `NucleotideOnly`: consists of `{ACG*}`
- `NucleotideWithNoise`: consists of `{ACGT*}`
- `AminoacidOnly`: consists of `{ACDEFGHIKLMNPQRSTVW*}`
- `AminoacidWithNoise`: consists of `{ACDEFGHIKLMNPQRSTVWY*}`

The last character (`*`) of each type is treated as a wildcard that can be matched with any character.
For example, with `NucleotideOnly`, `LtFmIndex` stores the text `ACGTXYZ` as `ACG****`. With `NucleotideWithNoise`, `LtFmIndex` stores the same text (`ACGTXYZ`) as `ACGT***`.

The `fastbwt` feature can accelerate indexing, but it needs `cmake` to build `libdivsufsort` and cannot be built as WASM.
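Returning to the wildcard handling of the `Text` types, the mapping can be sketched with the standard library alone. This assumes that every character outside the type's alphabet collapses to the wildcard `*`; the function `normalize` and its `alphabet` parameter are hypothetical names for illustration and do not reflect how the crate encodes text internally.

```rust
/// Replace every byte outside `alphabet` with the wildcard `*`.
/// Hypothetical helper for illustration; not the crate's API.
fn normalize(text: &[u8], alphabet: &[u8]) -> Vec<u8> {
    text.iter()
        .map(|b| if alphabet.contains(b) { *b } else { b'*' })
        .collect()
}

fn main() {
    let text = b"ACGTXYZ";
    // Alphabet of a NucleotideOnly-like type (assumed here to be A, C, G).
    let only = normalize(text, b"ACG");
    // Alphabet of a NucleotideWithNoise-like type (assumed here to be A, C, G, T).
    let with_noise = normalize(text, b"ACGT");
    println!("{}", String::from_utf8_lossy(&only));
    println!("{}", String::from_utf8_lossy(&with_noise));
}
```

Under this assumption, choosing the smaller alphabet loses the distinction between `T` and any other unlisted character, so the text type should match the characters that patterns need to distinguish.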
Use `LtFmIndex` to count and locate a pattern.

```rust
use lt_fm_index::LtFmIndexBuilder;

// (1) Define builder for lt-fm-index
let builder = LtFmIndexBuilder::new()
    .text_type_is_inferred()
    .set_suffix_array_sampling_ratio(2).unwrap()
    .set_lookup_table_kmer_size(4).unwrap();

// (2) Generate lt-fm-index with text
let text = b"CTCCGTACACCTGTTTCGTATCGGANNNN".to_vec();
let lt_fm_index = builder.build(text).unwrap(); // text is consumed

// (3) Match with pattern
let pattern = b"TA".to_vec();
//   - count
let count = lt_fm_index.count(&pattern);
assert_eq!(count, 2);
//   - locate
let locations = lt_fm_index.locate(&pattern);
assert_eq!(locations, vec![5, 18]);
```
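As a cross-check of what `count` and `locate` are expected to return, the same query can be answered by a naive linear scan that does not depend on the crate. The functions `naive_locate` and `naive_count` are hypothetical reference helpers, not part of `lt-fm-index`; they ignore wildcards, which is fine here because the pattern `TA` contains none, and returning nothing for an empty pattern is a choice of this sketch.

```rust
/// Naive reference: start indices of every occurrence of `pattern` in `text`.
/// Hypothetical helper for illustration; not the crate's API.
fn naive_locate(text: &[u8], pattern: &[u8]) -> Vec<usize> {
    // An empty pattern is treated as "no match" in this sketch.
    if pattern.is_empty() || pattern.len() > text.len() {
        return Vec::new();
    }
    text.windows(pattern.len())
        .enumerate()
        .filter(|(_, window)| *window == pattern)
        .map(|(index, _)| index)
        .collect()
}

/// Naive reference: number of (possibly overlapping) occurrences.
fn naive_count(text: &[u8], pattern: &[u8]) -> usize {
    naive_locate(text, pattern).len()
}

fn main() {
    let text = b"CTCCGTACACCTGTTTCGTATCGGANNNN";
    let pattern = b"TA";
    // Same text and pattern as the example above.
    assert_eq!(naive_count(text, pattern), 2);
    assert_eq!(naive_locate(text, pattern), vec![5, 18]);
}
```

The scan costs time proportional to the text length on every query, whereas an index is built once and then answers each pattern without rescanning the text.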
Save and load `LtFmIndex`.

```rust
use lt_fm_index::{LtFmIndex, LtFmIndexBuilder};

// (1) Generate lt-fm-index
let text = b"CTCCGTACACCTGTTTCGTATCGGA".to_vec();
let lt_fm_index_to_save = LtFmIndexBuilder::new().build(text).unwrap();

// (2) Save lt-fm-index to buffer
let mut buffer = Vec::new();
lt_fm_index_to_save.save_to(&mut buffer).unwrap();

// (3) Load lt-fm-index from buffer
let lt_fm_index_loaded = LtFmIndex::load_from(&buffer[..]).unwrap();

// The loaded index is equal to the one that was saved
assert_eq!(lt_fm_index_to_save, lt_fm_index_loaded);
```
Repository: https://github.com/baku4/lt-fm-index

Reference: libdivsufsort