`lt-fm-index` is a library for locating and counting patterns in nucleotide sequence (ACGT) strings.
`lt-fm-index` uses a k-mer lookup table (LT stands for lookup table).
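
To make the lookup-table idea concrete, here is a minimal sketch of a k-mer count table over the ACGT alphabet. It is an illustration only, not the crate's internal layout: the names `encode`, `kmer_count_table`, and `lookup` are hypothetical, and the table simply stores how often each k-mer occurs so that a k-mer query becomes a single array access.

```rust
/// Map a nucleotide to a 2-bit code; any other byte yields `None`.
/// (Hypothetical helper, not part of the lt-fm-index API.)
fn encode(base: u8) -> Option<usize> {
    match base {
        b'A' => Some(0),
        b'C' => Some(1),
        b'G' => Some(2),
        b'T' => Some(3),
        _ => None,
    }
}

/// Interpret a k-mer as a base-4 number; `None` if it contains a non-ACGT byte.
fn kmer_index(kmer: &[u8]) -> Option<usize> {
    kmer.iter()
        .try_fold(0usize, |acc, &b| encode(b).map(|code| acc * 4 + code))
}

/// Build a table with 4^k slots; slot `i` holds the number of occurrences
/// of the k-mer whose base-4 encoding is `i`.
fn kmer_count_table(text: &[u8], k: usize) -> Vec<u32> {
    let mut table = vec![0u32; 4usize.pow(k as u32)];
    if k == 0 || text.len() < k {
        return table;
    }
    for window in text.windows(k) {
        // Windows containing a non-ACGT byte are skipped.
        if let Some(i) = kmer_index(window) {
            table[i] += 1;
        }
    }
    table
}

/// Look up the occurrence count of a k-mer.
/// Returns `None` if the k-mer length does not match the table or if it
/// contains a non-ACGT byte.
fn lookup(table: &[u32], kmer: &[u8]) -> Option<u32> {
    if 4usize.pow(kmer.len() as u32) != table.len() {
        return None;
    }
    kmer_index(kmer).and_then(|i| table.get(i).copied())
}
```

The table grows as 4^k, so the k-mer size trades memory for lookup speed; with `k = 8` the sketch above allocates 65,536 slots.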
`FmIndexOn` supports a text containing only nucleotide sequence (ACGT).

`FmIndexNn` supports a text containing non-nucleotide characters.

`FmIndexNn` treats all non-nucleotide characters as the same character.

CAVEAT! This crate is not stable. Functions can change without notice.
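
As an illustration of what treating every non-nucleotide character as the same character can mean, the sketch below collapses a byte alphabet to five symbols. The function names are hypothetical and the mapping is an assumption for illustration; the crate's actual encoding may differ.

```rust
/// Map A, C, G, T to symbols 0..=3 and every other byte to symbol 4.
/// (Assumed mapping for illustration, not the crate's internal encoding.)
fn symbol_index(byte: u8) -> usize {
    match byte {
        b'A' => 0,
        b'C' => 1,
        b'G' => 2,
        b'T' => 3,
        _ => 4,
    }
}

/// True if the text contains only A, C, G, and T, i.e. the four-symbol
/// alphabet is sufficient.
fn is_nucleotide_only(text: &[u8]) -> bool {
    text.iter().all(|&b| symbol_index(b) < 4)
}

/// Compare two sequences after collapsing non-nucleotide bytes.
fn equal_after_collapse(a: &[u8], b: &[u8]) -> bool {
    a.len() == b.len()
        && a.iter()
            .zip(b)
            .all(|(&x, &y)| symbol_index(x) == symbol_index(y))
}
```

Under such a mapping, `GANNN` and `GAXYZ` are indistinguishable, so a pattern containing `N` would also match positions holding any other non-nucleotide byte.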
Fm-index using a k-mer lookup table (KLT) with a specified k-mer size.
Uses the `libdivsufsort` library.

Use `FmIndex` to locate a pattern.

```rust
use lt_fm_index::FmIndexConfig;

// (1) Define configuration for fm-index
let fmi_config = FmIndexConfig::new()
    .set_kmer_lookup_table(8)
    .set_suffix_array_sampling_ratio(4)
    .contain_non_nucleotide(); // Default is `true`

// (2) Generate fm-index with text
let text = b"CTCCGTACACCTGTTTCGTATCGGANNN".to_vec();
let fm_index = fmi_config.generate_fmindex(text); // text is consumed

// (3) Match with pattern
let pattern = b"TA".to_vec();
//   - count
let count = fm_index.count(&pattern);
assert_eq!(count, 2);
//   - locate without k-mer lookup table
let locations = fm_index.locate_wo_klt(&pattern);
assert_eq!(locations, vec![5, 18]);
//   - locate with k-mer lookup table
let locations = fm_index.locate_w_klt(&pattern);
assert_eq!(locations, vec![5, 18]);
```
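
For readers who want to see how an FM-index answers these count and locate queries, here is a minimal, standard-library-only sketch of backward search. It is not the crate's implementation: the suffix array is built naively by sorting suffixes instead of using `libdivsufsort`, ranks are computed by scanning, nothing is sampled, and the function names are hypothetical. It also assumes the text contains no zero byte, since zero is used as the end sentinel.

```rust
/// Naive suffix array: sort all suffix start positions (tiny inputs only).
fn naive_suffix_array(text: &[u8]) -> Vec<usize> {
    let mut sa: Vec<usize> = (0..text.len()).collect();
    sa.sort_by(|&a, &b| text[a..].cmp(&text[b..]));
    sa
}

/// Number of occurrences of `symbol` in `bwt[..end]` (linear scan).
fn rank(bwt: &[u8], symbol: u8, end: usize) -> usize {
    bwt[..end].iter().filter(|&&x| x == symbol).count()
}

/// Return the sorted start positions of `pattern` in `text`,
/// found by FM-index backward search.
fn locate(text: &[u8], pattern: &[u8]) -> Vec<usize> {
    if pattern.is_empty() {
        return Vec::new();
    }
    // Append a sentinel (0) that sorts before every other byte.
    let mut t = text.to_vec();
    t.push(0);
    let n = t.len();
    let sa = naive_suffix_array(&t);
    // BWT: the byte cyclically preceding each sorted suffix.
    let bwt: Vec<u8> = sa.iter().map(|&i| t[(i + n - 1) % n]).collect();
    // c[b] = number of bytes in `t` that are strictly smaller than b.
    let mut counts = [0usize; 256];
    for &b in &t {
        counts[b as usize] += 1;
    }
    let mut c = [0usize; 256];
    let mut sum = 0;
    for b in 0..256 {
        c[b] = sum;
        sum += counts[b];
    }
    // Backward search: extend the match one symbol at a time from the
    // end of the pattern, narrowing the suffix array interval [lo, hi).
    let (mut lo, mut hi) = (0usize, n);
    for &b in pattern.iter().rev() {
        lo = c[b as usize] + rank(&bwt, b, lo);
        hi = c[b as usize] + rank(&bwt, b, hi);
        if lo >= hi {
            return Vec::new();
        }
    }
    // The interval width is the count; its entries are the locations.
    let mut positions = sa[lo..hi].to_vec();
    positions.sort_unstable();
    positions
}
```

Calling `locate(b"CTCCGTACACCTGTTTCGTATCGGANNN", b"TA")` on the text from the example above yields the same positions, 5 and 18. In this picture, a k-mer lookup table can stand in for the first k iterations of the search loop by precomputing the interval for every k-mer.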
Write and read `FmIndex`.

```rust
use lt_fm_index::{FmIndexConfig, FmIndex};

// (1) Generate `FmIndex`
let fmi_config = FmIndexConfig::new()
    .set_kmer_lookup_table(8)
    .set_suffix_array_sampling_ratio(4);
let text = b"CTCCGTACACCTGTTTCGTATCGGA".to_vec();
let fm_index_pre = fmi_config.generate_fmindex(text); // text is consumed

// (2) Write fm-index to buffer (or file path)
let mut buffer = Vec::new();
fm_index_pre.write_index_to(&mut buffer).unwrap();

// (3) Read fm-index from buffer (or file path)
let fm_index_pro = FmIndex::read_index_from(&buffer[..]).unwrap();

// The index read back must equal the one that was written
assert_eq!(fm_index_pre, fm_index_pro);
```
32bit integer
Repository: https://github.com/baku4/lt-fm-index

Reference: libdivsufsort