probly-search

A full-text search library, written in Rust, optimized for insertion speed, that provides full control over the scoring calculations.

This started as a port of the Node library NDX.

Demo

Recipe (title) search with 50k documents.

https://quantleaf.github.io/probly-search-demo/

Features

Documentation

Adding, Removing and Searching documents

See Integration tests.

Use this library with WASM

See recipe search demo project

A basic example

Create an index for documents with 2 fields, query it, and then remove a document.

```rust
use std::borrow::Cow;

use probly_search::{
    index::Index,
    query::{
        score::default::{bm25, zero_to_one},
        QueryResult,
    },
};

// Custom document type used below (definition assumed; fields inferred from usage)
struct Doc {
    id: usize,
    title: String,
    description: String,
}

// A white space tokenizer
fn tokenizer(s: &str) -> Vec<Cow<'_, str>> {
    s.split(' ').map(Cow::from).collect::<Vec<_>>()
}

// We have to provide extraction functions for the fields we want to index

// Title
fn title_extract(d: &Doc) -> Vec<&str> {
    vec![d.title.as_str()]
}

// Description
fn description_extract(d: &Doc) -> Vec<&str> {
    vec![d.description.as_str()]
}

fn main() {
    // Create index with 2 fields
    let mut index = Index::<usize>::new(2);

    // Create docs from a custom Doc struct
    let doc_1 = Doc {
        id: 0,
        title: "abc".to_string(),
        description: "dfg".to_string(),
    };

    let doc_2 = Doc {
        id: 1,
        title: "dfgh".to_string(),
        description: "abcd".to_string(),
    };

    // Add documents to index
    index.add_document(
        &[title_extract, description_extract],
        tokenizer,
        doc_1.id,
        &doc_1,
    );

    index.add_document(
        &[title_extract, description_extract],
        tokenizer,
        doc_2.id,
        &doc_2,
    );

    // Search, expect 2 results
    let mut result = index.query(&"abc", &mut bm25::new(), tokenizer, &[1., 1.]);
    assert_eq!(result.len(), 2);
    assert_eq!(
        result[0],
        QueryResult {
            key: 0,
            score: 0.6931471805599453
        }
    );
    assert_eq!(
        result[1],
        QueryResult {
            key: 1,
            score: 0.28104699650060755
        }
    );

    // Remove a document from the index
    index.remove_document(doc_1.id);

    // Vacuum to remove it completely
    index.vacuum();

    // Search, expect 1 result
    result = index.query(&"abc", &mut bm25::new(), tokenizer, &[1., 1.]);
    assert_eq!(result.len(), 1);
    assert_eq!(
        result[0],
        QueryResult {
            key: 1,
            score: 0.1166450426074421
        }
    );
}
```
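The whitespace tokenizer in the example above splits only on single spaces, so `"Rice,"` and `"rice"` would be indexed as different terms. As a minimal sketch, assuming the same `fn(&str) -> Vec<Cow<str>>` shape is all the index requires, a tokenizer that lowercases and splits on punctuation could look like the following. The name `lowercase_tokenizer` is hypothetical and not part of the library.

```rust
use std::borrow::Cow;

// Hypothetical alternative to the whitespace tokenizer shown above.
// Splits on every non-alphanumeric character, drops empty tokens, and
// allocates a new string only when a token actually contains uppercase
// characters; tokens that are already lowercase stay borrowed.
fn lowercase_tokenizer(s: &str) -> Vec<Cow<'_, str>> {
    s.split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(|t| {
            if t.chars().any(char::is_uppercase) {
                Cow::Owned(t.to_lowercase())
            } else {
                Cow::Borrowed(t)
            }
        })
        .collect()
}
```

Use the same tokenizer for indexing and for querying; otherwise query terms may not match the indexed terms.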

See the source tests for the BM25 implementation and the zero-to-one implementation for more query examples.
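For readers new to BM25, the sketch below shows the textbook per-term formula (the variant with a non-negative IDF). It is not taken from the probly-search source and may differ from the library's scorer, for example in how partially matching terms are weighted. The function name and all parameter names are illustrative assumptions; `k1 = 1.2` and `b = 0.75` are commonly used defaults.

```rust
// Textbook BM25 contribution of a single query term to a single field.
// Illustrative only: names and signature are NOT part of probly-search.
fn bm25_term_score(
    term_freq: f64,      // occurrences of the term in the field
    field_len: f64,      // number of tokens in the field
    avg_field_len: f64,  // average field length across the index
    doc_count: f64,      // number of documents in the index
    docs_with_term: f64, // number of documents containing the term
    k1: f64,             // term-frequency saturation
    b: f64,              // strength of length normalization (0..=1)
) -> f64 {
    // Rare terms get a larger inverse document frequency.
    let idf = (1.0 + (doc_count - docs_with_term + 0.5) / (docs_with_term + 0.5)).ln();
    // Fields longer than average are penalized, shorter ones are boosted.
    let norm = k1 * (1.0 - b + b * field_len / avg_field_len);
    idf * term_freq * (k1 + 1.0) / (term_freq + norm)
}
```

With this formula, a term that occurs once in an average-length field and appears in one of two documents scores `ln 2`, and the score drops as the term becomes more common across documents.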

Testing

Run all tests with `cargo test`.

Benchmark

Run all benchmarks with `cargo bench`.

License

MIT