`LtFmIndex` is a Rust library for building and using an FM-index that contains a lookup table of the first k-mer of a pattern. This index can be used to (1) count the number of occurrences and (2) locate the positions of a pattern in an indexed text.
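
To make the two operations concrete, here is a minimal, std-only sketch of a plain FM-index with backward search. It is not this library's implementation: `ToyFmIndex` and its methods are hypothetical names, the suffix array is built by naive sorting, rank queries are linear scans, and the k-mer lookup table is omitted entirely. It also assumes the text contains no `0` byte, because `0` is used as the sentinel, and that the pattern is non-empty.

```rust
/// Toy FM-index over bytes (illustrative only, not the lt_fm_index internals).
struct ToyFmIndex {
    sa: Vec<usize>,  // suffix array of `text + sentinel`
    bwt: Vec<u8>,    // Burrows-Wheeler transform of `text + sentinel`
    c: [usize; 256], // c[b] = number of symbols strictly smaller than b
}

impl ToyFmIndex {
    /// Assumption: `text` contains no 0 byte (0 is the sentinel).
    fn build(text: &[u8]) -> Self {
        let mut t = text.to_vec();
        t.push(0);
        let n = t.len();
        // Naive suffix sorting; fine for a demo, far too slow for real data.
        let mut sa: Vec<usize> = (0..n).collect();
        sa.sort_by(|&a, &b| t[a..].cmp(&t[b..]));
        // BWT[i] is the symbol preceding the i-th smallest suffix.
        let bwt: Vec<u8> = sa.iter().map(|&i| t[(i + n - 1) % n]).collect();
        let mut counts = [0usize; 256];
        for &b in &t {
            counts[b as usize] += 1;
        }
        let mut c = [0usize; 256];
        let mut sum = 0;
        for b in 0..256 {
            c[b] = sum;
            sum += counts[b];
        }
        ToyFmIndex { sa, bwt, c }
    }

    /// Occurrences of `b` in `bwt[..i]` (linear scan instead of a rank table).
    fn rank(&self, b: u8, i: usize) -> usize {
        self.bwt[..i].iter().filter(|&&x| x == b).count()
    }

    /// Backward search: half-open suffix-array range of suffixes starting with `pattern`.
    fn range(&self, pattern: &[u8]) -> (usize, usize) {
        let (mut lo, mut hi) = (0, self.bwt.len());
        for &b in pattern.iter().rev() {
            lo = self.c[b as usize] + self.rank(b, lo);
            hi = self.c[b as usize] + self.rank(b, hi);
            if lo >= hi {
                return (0, 0); // no match
            }
        }
        (lo, hi)
    }

    /// (1) Count the occurrences of `pattern`.
    fn count(&self, pattern: &[u8]) -> usize {
        let (lo, hi) = self.range(pattern);
        hi - lo
    }

    /// (2) Locate the start positions of `pattern`, sorted ascending.
    fn locate(&self, pattern: &[u8]) -> Vec<usize> {
        let (lo, hi) = self.range(pattern);
        let mut positions = self.sa[lo..hi].to_vec();
        positions.sort();
        positions
    }
}
```

With this sketch, `ToyFmIndex::build(b"CTCCGTACACCTGTTTCGTATCGGAXXYYZZ")` gives `count(b"TA") == 2` and `locate(b"TA") == [5, 18]`. Counting only needs the size of the suffix-array range, while locating has to read positions out of the suffix array.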
To use this library, add `lt_fm_index` to your `Cargo.toml`:
```toml
[dependencies]
lt_fm_index = "0.7.0-alpha"
```
- About the `fastbwt` feature
    - This feature can accelerate indexing, but it requires `cmake` to build `libdivsufsort` and cannot be built for WASM.
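
Assuming `fastbwt` is exposed as an optional Cargo feature (as its description here suggests), it could be enabled with Cargo's standard feature syntax. This is a sketch of the dependency entry, not a line taken from the project's documentation:

```toml
[dependencies]
# Assumption: `fastbwt` is an opt-in feature; building it requires cmake.
lt_fm_index = { version = "0.7.0-alpha", features = ["fastbwt"] }
```

Leave the feature off when targeting WASM or when `cmake` is not available on the build machine.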
```rust
use lt_fm_index::LtFmIndex;
use lt_fm_index::blocks::Block2; // `Block2` can index 3 types of characters.

// (1) Define characters to use
let characters_by_index: &[&[u8]] = &[
    &[b'A', b'a'], // 'A' and 'a' are treated as the same
    &[b'C', b'c'], // 'C' and 'c' are treated as the same
    &[b'G', b'g'], // 'G' and 'g' are treated as the same
];
// Alternatively, you can use this simpler syntax:
let characters_by_index: &[&[u8]] = &[
    b"Aa", b"Cc", b"Gg"
];

// (2) Build index
let text = b"CTCCGTACACCTGTTTCGTATCGGAXXYYZZ".to_vec();
let lt_fm_index = LtFmIndex::

// (3) Match with pattern
let pattern = b"TA";
//   - count
let count = lt_fm_index.count(pattern);
assert_eq!(count, 2);
//   - locate
let mut locations = lt_fm_index.locate(pattern);
locations.sort(); // The locations may not be in order.
assert_eq!(locations, vec![5, 18]);
// All unindexed characters are treated as the same character.
// In the text, X, Y, and Z can match any other unindexed character.
let mut locations = lt_fm_index.locate(b"UNDEF");
locations.sort();
// Using b"XXXXX", b"YYYYY", or b"!@#$%" gives the same result.
assert_eq!(locations, vec![25, 26]);

// (4) Save and load
let mut buffer = Vec::new();
lt_fm_index.save_to(&mut buffer).unwrap();
let loaded = LtFmIndex::load_from(&buffer[..]).unwrap();
assert_eq!(lt_fm_index, loaded);
```
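
The matching rule for unindexed characters can be illustrated without the library. The std-only sketch below maps every byte to the index of its character group and sends all other bytes to one shared extra index, then scans naively. `encode` and `naive_locate` are hypothetical helpers written for this illustration. They mirror the behavior described in the example's comments, not the library's internals.

```rust
/// Map each byte to the index of its character group. Every byte that is in
/// no group shares one extra "unindexed" index, `groups.len()`.
fn encode(seq: &[u8], groups: &[&[u8]]) -> Vec<usize> {
    seq.iter()
        .map(|b| {
            groups
                .iter()
                .position(|g| g.contains(b))
                .unwrap_or(groups.len())
        })
        .collect()
}

/// Naive scan over the encoded sequences; returns start positions in ascending order.
fn naive_locate(text: &[u8], pattern: &[u8], groups: &[&[u8]]) -> Vec<usize> {
    let (t, p) = (encode(text, groups), encode(pattern, groups));
    // `windows(0)` panics, so handle the empty pattern explicitly.
    if p.is_empty() || p.len() > t.len() {
        return Vec::new();
    }
    t.windows(p.len())
        .enumerate()
        .filter(|(_, w)| *w == p.as_slice())
        .map(|(i, _)| i)
        .collect()
}
```

With the groups `b"Aa"`, `b"Cc"`, `b"Gg"` and the text from the example, `naive_locate(text, b"UNDEF", groups)` yields `[25, 26]`, and so do `b"XXXXX"` and `b"!@#$%"`, because all five bytes of each pattern fall into the shared unindexed class. Note that `T` is also unindexed under these groups, so `b"TA"` is matched as "any unindexed byte followed by `A`", which still gives `[5, 18]` for this text.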
https://github.com/baku4/lt-fm-index
libdivsufsort