A full-text search library, optimized for insertion speed, that provides full control over the scoring calculations.
This started as a port of the Node library NDX.
Demo: recipe (title) search over 50k documents.
https://quantleaf.github.io/probly-search-demo/
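Three ways to do scoring: the BM25 implementation, the zero-to-one implementation, or a fully custom scoring function written by implementing the `ScoreCalculator` trait.
Trie-based dynamic inverted index.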
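
To make the idea of a trie-based inverted index concrete, here is a minimal, self-contained sketch. It is not this library's implementation: `TrieNode`, `ToyIndex`, and all of its methods are hypothetical names invented for illustration, and matching by prefix is an assumption about how such an index might expand query terms.

```rust
use std::collections::{BTreeSet, HashMap};

/// One trie node; each edge is a single character of an indexed term.
/// (Hypothetical type, for illustration only.)
#[derive(Default)]
struct TrieNode {
    children: HashMap<char, TrieNode>,
    /// Keys of the documents containing the term that ends at this node.
    postings: BTreeSet<usize>,
}

/// A toy inverted index: terms live in a trie, posting lists on the nodes.
#[derive(Default)]
struct ToyIndex {
    root: TrieNode,
}

impl ToyIndex {
    /// Split `text` on whitespace and record `key` under every term.
    fn add_document(&mut self, key: usize, text: &str) {
        for term in text.split_whitespace() {
            let mut node = &mut self.root;
            for ch in term.chars() {
                node = node.children.entry(ch).or_default();
            }
            node.postings.insert(key);
        }
    }

    /// Keys of all documents that contain a term starting with `prefix`.
    fn prefix_search(&self, prefix: &str) -> BTreeSet<usize> {
        // Descend to the node that spells out the prefix, if it exists.
        let mut node = &self.root;
        for ch in prefix.chars() {
            match node.children.get(&ch) {
                Some(child) => node = child,
                None => return BTreeSet::new(),
            }
        }
        // Gather every posting list in the subtree below that node.
        let mut found = BTreeSet::new();
        let mut stack = vec![node];
        while let Some(n) = stack.pop() {
            found.extend(n.postings.iter().copied());
            stack.extend(n.children.values());
        }
        found
    }

    /// Drop `key` from every posting list (empty nodes are not pruned).
    fn remove_document(&mut self, key: usize) {
        let mut stack = vec![&mut self.root];
        while let Some(n) = stack.pop() {
            n.postings.remove(&key);
            stack.extend(n.children.values_mut());
        }
    }
}
```

With documents `0 -> "abc dfg"` and `1 -> "dfgh abcd"`, `prefix_search("abc")` in this sketch yields keys 0 and 1, and only key 1 once document 0 is removed. Because the index is a plain mutable tree, documents can be added and removed at any time without a rebuild, which is the sense in which such an index is dynamic.
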
See the integration tests.
See the recipe search demo project.
Creating an index with documents that have 2 fields, querying the documents, and removing a document.

```rust
use std::collections::HashSet;

use probly_search::{
    index::{
        add_document_to_index, create_index, remove_document_from_index, vacuum_index, Index,
    },
    query::{
        query,
        score::default::{bm25, zero_to_one},
        QueryResult,
    },
};

// Custom document type; this definition is an assumption inferred from how `Doc` is used below
struct Doc {
    id: usize,
    title: String,
    description: String,
}

// A white space tokenizer
fn tokenizer(s: &str) -> Vec<&str> {
    s.split(' ').collect::<Vec<_>>()
}

// We have to provide extraction functions for the fields we want to index

// Title
fn title_extract(d: &Doc) -> Option<&str> {
    Some(d.title.as_str())
}

// Description
fn description_extract(d: &Doc) -> Option<&str> {
    Some(d.description.as_str())
}

// A no-op filter
fn filter(s: &str) -> &str {
    s
}

fn main() {
    // Create index with 2 fields; `usize` as the key type is assumed from the document ids
    let mut index = create_index::<usize>(2);

    // Create docs from a custom Doc struct
    let doc_1 = Doc {
        id: 0,
        title: "abc".to_string(),
        description: "dfg".to_string(),
    };

    let doc_2 = Doc {
        id: 1,
        title: "dfgh".to_string(),
        description: "abcd".to_string(),
    };

    // Add documents to index
    add_document_to_index(
        &mut index,
        &[title_extract, description_extract],
        tokenizer,
        filter,
        doc_1.id,
        &doc_1,
    );

    add_document_to_index(
        &mut index,
        &[title_extract, description_extract],
        tokenizer,
        filter,
        doc_2.id,
        &doc_2,
    );

    // Search, expect 2 results
    let mut result = query(
        &mut index,
        &"abc",
        &mut bm25::new(),
        tokenizer,
        filter,
        &[1., 1.],
        None,
    );
    assert_eq!(result.len(), 2);
    assert_eq!(
        result[0],
        QueryResult {
            key: 0,
            score: 0.6931471805599453
        }
    );
    assert_eq!(
        result[1],
        QueryResult {
            key: 1,
            score: 0.28104699650060755
        }
    );

    // Remove documents from index
    let mut removed_docs = HashSet::new();
    remove_document_from_index(&mut index, &mut removed_docs, doc_1.id);

    // Vacuum to remove completely
    vacuum_index(&mut index, &mut removed_docs);

    // Search, expect 1 result
    result = query(
        &mut index,
        &"abc",
        &mut bm25::new(),
        tokenizer,
        filter,
        &[1., 1.],
        Some(&removed_docs),
    );
    assert_eq!(result.len(), 1);
    assert_eq!(
        result[0],
        QueryResult {
            key: 1,
            score: 0.1166450426074421
        }
    );
}
```
Go through the source tests for the BM25 implementation and the zero-to-one implementation for more query examples.
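
As background for reading the BM25 tests, the sketch below computes the textbook BM25 score of a single term in a single document. It is a generic illustration, not this library's code: `Bm25Params`, `idf`, and `bm25_term_score` are hypothetical names, the `k1 = 1.2` and `b = 0.75` values are only commonly cited defaults, and the library's own implementation may differ in details such as field weighting or term expansion.

```rust
/// Free parameters of BM25. (Hypothetical type, for illustration only.)
struct Bm25Params {
    /// Controls how quickly repeated occurrences of a term saturate.
    k1: f64,
    /// Controls how strongly scores are normalized by document length.
    b: f64,
}

/// Inverse document frequency in the always-positive form
/// `ln(1 + (N - n + 0.5) / (n + 0.5))`, where `N` is the number of
/// documents and `n` the number of documents containing the term.
fn idf(doc_count: usize, docs_with_term: usize) -> f64 {
    let n_total = doc_count as f64;
    let n_term = docs_with_term as f64;
    (1.0 + (n_total - n_term + 0.5) / (n_term + 0.5)).ln()
}

/// Textbook BM25 contribution of one term to one document's score.
fn bm25_term_score(
    params: &Bm25Params,
    term_freq: f64,
    doc_len: f64,
    avg_doc_len: f64,
    idf: f64,
) -> f64 {
    // Documents longer than average are penalized, shorter ones boosted.
    let length_norm = 1.0 - params.b + params.b * doc_len / avg_doc_len;
    idf * term_freq * (params.k1 + 1.0) / (term_freq + params.k1 * length_norm)
}
```

In this formulation, a term that occurs once in a document of average length, and in one of two documents overall, scores `ln 2`. A longer document with the same term frequency scores lower, and a term found in more documents gets a smaller `idf`.
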