A full-text search library, written in Rust, optimized for insertion speed, that provides full control over the scoring calculations.
This library started as a port of the Node library NDX.
Recipe (title) search with 50k documents.
https://quantleaf.github.io/probly-search-demo/
Three ways to do scoring: BM25, zero-to-one, or a custom scoring function that implements the `ScoreCalculator` trait.
Trie-based dynamic inverted index.
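To illustrate what a trie-based dynamic inverted index looks like in principle, here is a minimal standalone sketch. It is not the library's implementation: `TrieNode`, `TrieIndex`, and their methods are hypothetical names chosen for this example, and the tokenizer is a plain whitespace split. Each term is stored character by character in a trie, and the node where a term ends holds the postings (document key to term frequency), so prefix lookups and later insertions or removals need no rebuild.

```rust
use std::collections::{BTreeMap, HashMap};

/// One trie node: children keyed by character, plus the postings of the
/// term that ends at this node (document key -> term frequency).
#[derive(Default)]
struct TrieNode {
    children: BTreeMap<char, TrieNode>,
    postings: HashMap<usize, usize>,
}

/// A toy trie-based inverted index (hypothetical, for illustration only).
#[derive(Default)]
struct TrieIndex {
    root: TrieNode,
}

impl TrieIndex {
    /// Index every whitespace-separated term of `text` under `doc_key`.
    fn add_document(&mut self, doc_key: usize, text: &str) {
        for term in text.split_whitespace() {
            let mut node = &mut self.root;
            for ch in term.chars() {
                node = node.children.entry(ch).or_default();
            }
            *node.postings.entry(doc_key).or_insert(0) += 1;
        }
    }

    /// Keys of all documents containing a term that starts with `prefix`,
    /// sorted in ascending order.
    fn query_prefix(&self, prefix: &str) -> Vec<usize> {
        // Walk down to the node that represents the prefix.
        let mut node = &self.root;
        for ch in prefix.chars() {
            match node.children.get(&ch) {
                Some(child) => node = child,
                None => return Vec::new(),
            }
        }
        // Collect postings from that node and everything below it.
        let mut keys = Vec::new();
        let mut stack = vec![node];
        while let Some(n) = stack.pop() {
            keys.extend(n.postings.keys().copied());
            stack.extend(n.children.values());
        }
        keys.sort_unstable();
        keys.dedup();
        keys
    }

    /// Drop the postings of `doc_key` from every node.
    fn remove_document(&mut self, doc_key: usize) {
        let mut stack = vec![&mut self.root];
        while let Some(n) = stack.pop() {
            n.postings.remove(&doc_key);
            stack.extend(n.children.values_mut());
        }
    }
}
```

With documents `"abc dfg"` (key 0) and `"dfgh abcd"` (key 1) added, `query_prefix("abc")` returns both keys because `abcd` shares the prefix. This sketch leaves empty nodes in place after removal; a real index would also prune them.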
See Integration tests.
See recipe search demo project
Creating an index with documents that have 2 fields, querying it, and removing a document.

```rust
use std::collections::HashSet;
use probly_search::{
    index::Index,
    query::{
        score::default::{bm25, zero_to_one},
        QueryResult,
    },
};

// Document type used in this example (definition inferred from the usage below)
struct Doc {
    id: usize,
    title: String,
    description: String,
}

// A white space tokenizer
fn tokenizer(s: &str) -> Vec<&str> {
    s.split(' ').collect::<Vec<_>>()
}

// We have to provide extraction functions for the fields we want to index

// Title
fn title_extract(d: &Doc) -> Vec<&str> {
    vec![d.title.as_str()]
}

// Description
fn description_extract(d: &Doc) -> Vec<&str> {
    vec![d.description.as_str()]
}

fn main() {
    // Create index with 2 fields
    let mut index = Index::<usize>::new(2);

    // Create docs from a custom Doc struct
    let doc_1 = Doc {
        id: 0,
        title: "abc".to_string(),
        description: "dfg".to_string(),
    };

    let doc_2 = Doc {
        id: 1,
        title: "dfgh".to_string(),
        description: "abcd".to_string(),
    };

    // Add documents to index
    index.add_document(
        &[title_extract, description_extract],
        tokenizer,
        doc_1.id,
        &doc_1,
    );

    index.add_document(
        &[title_extract, description_extract],
        tokenizer,
        doc_2.id,
        &doc_2,
    );

    // Search, expect 2 results
    let mut result = index.query(
        &"abc",
        &mut bm25::new(),
        tokenizer,
        &[1., 1.],
    );
    assert_eq!(result.len(), 2);
    assert_eq!(
        result[0],
        QueryResult {
            key: 0,
            score: 0.6931471805599453
        }
    );
    assert_eq!(
        result[1],
        QueryResult {
            key: 1,
            score: 0.28104699650060755
        }
    );

    // Remove document from index
    index.remove_document(doc_1.id);

    // Vacuum to remove completely
    index.vacuum();

    // Search, expect 1 result
    result = index.query(
        &"abc",
        &mut bm25::new(),
        tokenizer,
        &[1., 1.],
    );
    assert_eq!(result.len(), 1);
    assert_eq!(
        result[0],
        QueryResult {
            key: 1,
            score: 0.1166450426074421
        }
    );
}
```
For more query examples, go through the source tests for the BM25 implementation and the zero-to-one implementation.
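For readers unfamiliar with BM25, the following standalone sketch shows the textbook Okapi BM25 score of a single query term in a single field, using the non-negative IDF variant popularized by Lucene. It is an assumption-laden illustration rather than the library's code: the function name `bm25_term_score` and its parameters are hypothetical, and the library's own calculation may differ in details such as field weighting or query expansion.

```rust
/// Textbook BM25 contribution of one term to one field (illustrative only).
///
/// * `tf`            - occurrences of the term in the field
/// * `doc_freq`      - number of documents containing the term
/// * `doc_count`     - total number of documents
/// * `field_len`     - number of tokens in this field
/// * `avg_field_len` - average number of tokens in this field across documents
/// * `k1`, `b`       - the usual BM25 free parameters (commonly 1.2 and 0.75)
fn bm25_term_score(
    tf: f64,
    doc_freq: f64,
    doc_count: f64,
    field_len: f64,
    avg_field_len: f64,
    k1: f64,
    b: f64,
) -> f64 {
    // Inverse document frequency: rarer terms weigh more, never negative.
    let idf = (1.0 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5)).ln();
    // Length normalization: longer-than-average fields are penalized.
    let norm = k1 * (1.0 - b + b * field_len / avg_field_len);
    // Term frequency saturates as `tf` grows.
    idf * tf * (k1 + 1.0) / (tf + norm)
}
```

In this formula a term that is absent (`tf = 0`) contributes nothing, repeated occurrences raise the score with diminishing returns, and a term found in fewer documents scores higher than a common one.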
Run all tests with

```sh
cargo test
```
Run all benchmarks with

```sh
cargo bench
```