Disco Rust

:fire: Recommendations for Rust using collaborative filtering

Installation

Add this line to your application’s Cargo.toml under [dependencies]:

```toml
discorec = "0.1"
```

Getting Started

Prep your data in the format user_id, item_id, value

```rust
use discorec::{Dataset, Recommender};

let mut data = Dataset::new();
data.push("user_a", "item_a", 5.0);
data.push("user_a", "item_b", 3.5);
data.push("user_b", "item_a", 4.0);
```

IDs can be integers, strings, or any other hashable data type

```rust
data.push(1, "item_a".to_string(), 5.0);
```

If users rate items directly, this is known as explicit feedback. Fit the recommender with:

```rust
let recommender = Recommender::fit_explicit(&data);
```

If users don’t rate items directly (for instance, they’re purchasing items or reading posts), this is known as implicit feedback. Use 1.0 or a value like number of purchases or page views for the dataset, and fit the recommender with:

```rust
let recommender = Recommender::fit_implicit(&data);
```
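Implicit feedback often starts as a raw event log rather than one value per user and item. A minimal sketch, using only the standard library, of collapsing such a log into counts; `count_events` is a hypothetical helper for illustration and not part of discorec:

```rust
use std::collections::HashMap;

/// Collapses raw (user, item) events into one count per pair.
/// The count can then serve as the implicit feedback value.
/// Hypothetical helper, not part of discorec.
fn count_events(events: &[(&str, &str)]) -> HashMap<(String, String), f32> {
    let mut counts = HashMap::new();
    for (user, item) in events {
        // Start each unseen pair at 0.0, then add one per event
        *counts
            .entry((user.to_string(), item.to_string()))
            .or_insert(0.0) += 1.0;
    }
    counts
}
```

Each resulting entry can then be added to the dataset with `data.push(user_id, item_id, count)` before fitting.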

Get user-based recommendations - “users like you also liked”

```rust
recommender.user_recs(&user_id, 5);
```

Get item-based recommendations - “users who liked this item also liked”

```rust
recommender.item_recs(&item_id, 5);
```

Get predicted ratings for a specific user and item

```rust
recommender.predict(&user_id, &item_id);
```

Get similar users

```rust
recommender.similar_users(&user_id, 5);
```

Example

Download the MovieLens 100K dataset.

Add these lines to your application’s Cargo.toml under [dependencies]:

```toml
csv = "1"
serde = { version = "1", features = ["derive"] }
```

And use:

```rust
use csv::ReaderBuilder;
use discorec::{Dataset, RecommenderBuilder};
use serde::Deserialize;
use std::fs::File;

#[derive(Debug, Deserialize)]
struct Row {
    user_id: i32,
    item_id: i32,
    rating: f32,
    time: i32,
}

fn main() {
    let mut train_set = Dataset::new();
    let mut valid_set = Dataset::new();

    // u.data is tab-separated and has no header row
    let file = File::open("u.data").unwrap();
    let mut rdr = ReaderBuilder::new()
        .has_headers(false)
        .delimiter(b'\t')
        .from_reader(file);
    for (i, record) in rdr.records().enumerate() {
        let row: Row = record.unwrap().deserialize(None).unwrap();
        // The first 80,000 ratings train the model; the rest validate it
        let dataset = if i < 80000 { &mut train_set } else { &mut valid_set };
        dataset.push(row.user_id, row.item_id, row.rating);
    }

    let recommender = RecommenderBuilder::new()
        .factors(20)
        .fit_explicit(&train_set);
    println!("RMSE: {:?}", recommender.rmse(&valid_set));
}
```

Storing Recommendations

Save recommendations to your database. Alternatively, you can store only the factors and use a library like pgvector-rust.
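If only the factors are stored, producing recommendations becomes a nearest-neighbor lookup over those vectors. The sketch below illustrates that idea with the standard library alone. It is not the pgvector-rust API: `cosine_similarity` and `nearest` are hypothetical helpers, and cosine similarity is an assumed choice of metric.

```rust
/// Cosine similarity between two factor vectors.
/// Returns 0.0 if either vector has zero magnitude.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Returns the ids of the `count` stored vectors most similar to `query`,
/// most similar first. Hypothetical helper for illustration.
fn nearest<'a>(
    query: &[f32],
    stored: &'a [(&'a str, Vec<f32>)],
    count: usize,
) -> Vec<&'a str> {
    let mut scored: Vec<(&str, f32)> = stored
        .iter()
        .map(|(id, factors)| (*id, cosine_similarity(query, factors)))
        .collect();
    // Sort by similarity, highest first
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.into_iter().take(count).map(|(id, _)| id).collect()
}
```

A vector database performs the same ranking with an index instead of a full scan, which matters once the number of stored vectors grows.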

Algorithms

Disco uses high-performance matrix factorization.
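For intuition, matrix factorization represents each user and each item as a short vector of factors and scores a pair from their dot product. The sketch below shows that generic idea only. `dot` and `predict_score` are hypothetical helpers, the `baseline` term is an assumption, and none of this is confirmed to match Disco's internal formula.

```rust
/// Dot product of two equal-length factor vectors.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Generic matrix-factorization score: a baseline plus the dot product
/// of the user and item factors.
/// `baseline` is a hypothetical parameter (for example, a global mean rating).
fn predict_score(user_factors: &[f32], item_factors: &[f32], baseline: f32) -> f32 {
    baseline + dot(user_factors, item_factors)
}
```

Training adjusts the factors so that these scores approximate the known values in the dataset. More factors can capture more structure at the cost of longer training and a higher risk of overfitting.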

Specify the number of factors and iterations

```rust
RecommenderBuilder::new()
    .factors(8)
    .iterations(20)
    .fit_explicit(&train_set);
```

Progress

Pass a callback to show progress

```rust
RecommenderBuilder::new()
    .callback(|info| println!("{:?}", info))
    .fit_explicit(&train_set);
```

Note: train_loss and valid_loss are not available for implicit feedback

Validation

Pass a validation set with explicit feedback

```rust
RecommenderBuilder::new()
    .callback(|info| println!("{:?}", info))
    .fit_eval_explicit(&train_set, &valid_set);
```

The loss function is RMSE
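RMSE (root mean squared error) is the square root of the average squared difference between predicted and actual ratings, so lower is better. A minimal standalone sketch of the calculation; this `rmse` function is a hypothetical illustration, separate from the library's own `rmse` method:

```rust
/// Root mean squared error between predicted and actual ratings.
/// Returns None when the slices are empty or differ in length.
fn rmse(predicted: &[f32], actual: &[f32]) -> Option<f32> {
    if predicted.is_empty() || predicted.len() != actual.len() {
        return None;
    }
    // Sum of squared differences across all pairs
    let sum_sq: f32 = predicted
        .iter()
        .zip(actual)
        .map(|(p, a)| (p - a).powi(2))
        .sum();
    Some((sum_sq / predicted.len() as f32).sqrt())
}
```

Because the errors are squared before averaging, a few large misses raise RMSE more than many small ones.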

Reference

Get ids

```rust
recommender.user_ids();
recommender.item_ids();
```

Get the global mean

```rust
recommender.global_mean();
```

Get factors

```rust
recommender.user_factors(&user_id);
recommender.item_factors(&item_id);
```

History

View the changelog

Contributing

Everyone is encouraged to help improve this project.

To get started with development:

```sh
git clone https://github.com/ankane/disco-rust.git
cd disco-rust
cargo test
```