ChromaDB-rs

A Rust-based client library for the Chroma vector database.

⚙️ Running ChromaDB

ℹ Chroma can be run in-memory in Python (without Docker), but this feature is not yet available in other languages. To use this library, you need either a hosted or a local instance of ChromaDB running.

If you can run `docker-compose up -d --build`, you can run Chroma.

```shell
git clone https://github.com/chroma-core/chroma.git
cd chroma
# Run a ChromaDB instance at localhost:8000
docker-compose up -d --build
```

More information about deploying Chroma to production can be found here.

🚀 Installing the library

```shell
cargo add chromadb
```

The library crate can be found at crates.io.

📖 Documentation

The library reference can be found here.

🔍 Overview

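The library provides two modules for interacting with the ChromaDB server via API V1: `client`, for interfacing with the ChromaDB server, and `collection`, for interfacing with an associated ChromaDB collection.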

You can connect to ChromaDB by instantiating a ChromaClient

```rust
use chromadb::v1::ChromaClient;
use chromadb::v1::client::ChromaClientOptions;
use chromadb::v1::collection::{ChromaCollection, CollectionEntries, GetQuery, GetResult};
use serde_json::json;

// With default ChromaClientOptions.
// Defaults to http://localhost:8000
let client: ChromaClient = ChromaClient::new(Default::default());

// With custom ChromaClientOptions.
// Replace the placeholder with the URL of your Chroma server.
let client: ChromaClient = ChromaClient::new(ChromaClientOptions { url: "<CHROMADB_URL>".into() });
```
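When the server address differs between environments, it can help to resolve the URL in one place before building the client options. The sketch below uses only the standard library and falls back to the default address mentioned above. The `CHROMA_URL` environment variable and both function names are assumptions made for illustration; they are not part of this crate.

```rust
use std::env;

/// Default address, matching the default described above.
const DEFAULT_CHROMA_URL: &str = "http://localhost:8000";

/// Pick the server URL: an explicit, non-empty value wins, otherwise the default.
/// Trailing slashes are trimmed so the result has a consistent shape.
fn resolve_chroma_url(configured: Option<&str>) -> String {
    let raw = configured
        .map(str::trim)
        .filter(|s| !s.is_empty())
        .unwrap_or(DEFAULT_CHROMA_URL);
    raw.trim_end_matches('/').to_string()
}

/// Read the hypothetical `CHROMA_URL` environment variable (an assumption,
/// not something the crate reads itself) and fall back to the default.
fn chroma_url_from_env() -> String {
    resolve_chroma_url(env::var("CHROMA_URL").ok().as_deref())
}
```

The returned string could then be supplied as the `url` field of the custom options shown above.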

Now that a client is instantiated, we can interface with the ChromaDB server.

```rust
// Get or create a collection with the given name and no metadata.
let collection: ChromaCollection = client.get_or_create_collection("my_collection", None)?;

// Get the UUID of the collection.
let collection_uuid = collection.id();
println!("Collection UUID: {}", collection_uuid);
```

With a collection instance, we can perform queries on the database

```rust
// Upsert some embeddings with documents and no metadata.
let collection_entries = CollectionEntries {
    ids: vec!["demo-id-1".into(), "demo-id-2".into()],
    embeddings: Some(vec![vec![0.0_f32; 768], vec![0.0_f32; 768]]),
    metadatas: None,
    documents: Some(vec![
        "Some document about 9 octopus recipes".into(),
        "Some other document about DCEU Superman Vs CW Superman".into(),
    ]),
};

let result: bool = collection.upsert(collection_entries, None)?;

// Create a filter object to filter by document content.
let where_document = json!({ "$contains": "Superman" });

// Get embeddings from a collection with filters and limit set to 1.
// An empty IDs vec will return all embeddings.
let get_query = GetQuery {
    ids: vec![],
    where_metadata: None,
    limit: Some(1),
    offset: None,
    where_document: Some(where_document),
    include: Some(vec!["documents".into(), "embeddings".into()]),
};
let get_result: GetResult = collection.get(get_query)?;
println!("Get result: {:?}", get_result);
```

Find more information about the available filters and options in the `get()` documentation.
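To make the effect of the document filter and `limit` concrete, here is a local, standard-library-only sketch of the selection the query above asks for. It assumes `$contains` behaves as a case-sensitive substring match; the function `filter_contains` is hypothetical and does not reflect how the server evaluates filters.

```rust
/// Return up to `limit` (id, document) pairs whose document contains `needle`.
/// `None` for `limit` means "no limit".
fn filter_contains<'a>(
    ids: &[&'a str],
    documents: &[&'a str],
    needle: &str,
    limit: Option<usize>,
) -> Vec<(&'a str, &'a str)> {
    ids.iter()
        .copied()
        .zip(documents.iter().copied())
        // Assumption: substring match, case-sensitive.
        .filter(|(_, doc)| doc.contains(needle))
        .take(limit.unwrap_or(usize::MAX))
        .collect()
}
```

Applied to the two demo documents with the needle `"Superman"` and a limit of 1, only the `demo-id-2` entry would be selected.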

Performing a similarity search

```rust
use chromadb::v1::collection::{QueryOptions, QueryResult};

// Instantiate QueryOptions to perform a similarity search on the collection.
// Alternatively, an embedding_function can be provided with query_texts to perform the search.
let query = QueryOptions {
    query_texts: None,
    query_embeddings: Some(vec![vec![0.0_f32; 768], vec![0.0_f32; 768]]),
    where_metadata: None,
    where_document: None,
    n_results: Some(5),
    include: None,
};

let query_result: QueryResult = collection.query(query, None)?;
println!("Query result: {:?}", query_result);
```
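Conceptually, a similarity search ranks stored embeddings by their distance to each query embedding and keeps the closest `n_results`. The following is a minimal brute-force sketch of that idea, assuming a squared Euclidean distance. The server may use a different metric or an approximate index, and the function names here are hypothetical.

```rust
/// Squared Euclidean distance between two equal-length vectors.
fn squared_l2(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "embeddings must share a dimension");
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Indices and distances of the `n_results` stored embeddings closest to `query`,
/// ordered from nearest to farthest. Ties keep their original index order.
fn nearest(query: &[f32], stored: &[Vec<f32>], n_results: usize) -> Vec<(usize, f32)> {
    let mut scored: Vec<(usize, f32)> = stored
        .iter()
        .enumerate()
        .map(|(i, e)| (i, squared_l2(query, e)))
        .collect();
    // Stable sort by distance; `total_cmp` gives a total order over f32.
    scored.sort_by(|a, b| a.1.total_cmp(&b.1));
    scored.truncate(n_results);
    scored
}
```

Note that with the all-zero placeholder embeddings used in the examples above, every distance is zero, so the ranking only becomes meaningful with real embeddings.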

### Support for Embedding providers

This crate has built-in support for OpenAI and SBERT embeddings.

To use OpenAI embeddings, enable the `openai` feature in your `Cargo.toml`.

```rust
let collection: ChromaCollection = client.get_or_create_collection("openai_collection", None)?;

let collection_entries = CollectionEntries {
    ids: vec!["demo-id-1", "demo-id-2"],
    embeddings: None,
    metadatas: None,
    documents: Some(vec![
        "Some document about 9 octopus recipes",
        "Some other document about DCEU Superman Vs CW Superman",
    ]),
};

// Use OpenAI embeddings
let openai_embeddings = OpenAIEmbeddings::new(Default::default());

collection.upsert(collection_entries, Some(Box::new(openai_embeddings)))?;
```

To use SBERT embeddings, enable the `bert` feature in your `Cargo.toml`.

```rust
let collection_entries = CollectionEntries {
    ids: vec!["demo-id-1", "demo-id-2"],
    embeddings: None,
    metadatas: None,
    documents: Some(vec![
        "Some document about 9 octopus recipes",
        "Some other document about DCEU Superman Vs CW Superman",
    ]),
};

// Use SBERT embeddings
let sbert_embeddings = SentenceEmbeddingsBuilder::remote(
    SentenceEmbeddingsModelType::AllMiniLmL6V2
).create_model()?;

collection.upsert(collection_entries, Some(Box::new(sbert_embeddings)))?;
```
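Both provider examples pass entries whose `embeddings` field is `None` together with a boxed embedding provider. The sketch below illustrates that pattern in isolation, using only the standard library. The `Embedder` trait, the toy `ByteHistogram` provider, and `resolve_embeddings` are all hypothetical stand-ins invented for this illustration; they are not the crate's actual types, and the toy vectors carry no semantic meaning.

```rust
/// Hypothetical stand-in for an embedding provider.
trait Embedder {
    fn embed(&self, docs: &[&str]) -> Vec<Vec<f32>>;
}

/// Toy provider: counts each document's bytes into `dim` buckets.
/// Assumes `dim > 0`.
struct ByteHistogram {
    dim: usize,
}

impl Embedder for ByteHistogram {
    fn embed(&self, docs: &[&str]) -> Vec<Vec<f32>> {
        docs.iter()
            .map(|doc| {
                let mut v = vec![0.0_f32; self.dim];
                for b in doc.bytes() {
                    v[b as usize % self.dim] += 1.0;
                }
                v
            })
            .collect()
    }
}

/// Use caller-supplied embeddings when present; otherwise ask the provider.
/// Fails when neither is available.
fn resolve_embeddings(
    embeddings: Option<Vec<Vec<f32>>>,
    documents: &[&str],
    provider: Option<Box<dyn Embedder>>,
) -> Result<Vec<Vec<f32>>, String> {
    match (embeddings, provider) {
        (Some(e), _) => Ok(e),
        (None, Some(p)) => Ok(p.embed(documents)),
        (None, None) => {
            Err("either embeddings or an embedding provider is required".to_string())
        }
    }
}
```

Taking the provider as an optional boxed trait object keeps the call site the same whether embeddings are precomputed or generated on demand, which mirrors the shape of the `upsert` calls above.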

Sponsors

OpenSauced provides insights into open source projects by using data science in git commits.

⚖️ LICENSE

MIT © 2023