Serenade: Low-Latency Session-Based Recommendations

This repository contains the official code for session-based recommender system Serenade, which employs VMIS-kNN. It learns users' preferences by capturing the short-term and sequential patterns from the evolution of user behaviors and predicts interesting next items with low latency with support for millions of distinct items. VMIS-kNN is an index-based variant of a state-of-the-art nearest neighbor algorithm to session-based recommendation, which scales to use cases with hundreds of millions of clicks to search through.

Quick guide: getting started with Serenade.

Table of contents

  1. Downloads
  2. Find the best hyperparameter values
  3. Configure Serenade to use the hyperparameter values
  4. Start the Serenade service
  5. Retrieve recommendations using python
  6. Evaluate the testset
  7. Using your own train- and testset

Downloads

Serenade can be downloaded here. Binary executables are available for Windows, Linux and MacOS. Download the toy example project which contains toy datasets and a preconfigured configuration file.

Extract both downloaded files in the same directoy. You now have the following files: serving tpe_hyperparameter_optm evaluator train.txt test.txt valid.txt example.toml

Find the best hyperparameter values

The next step is finding the hyperparameters using the train and test-datasets. Serenade uses Tree-Structured Parzen Estimator (TPE) for finding the hyperparameters. TPE achieves low validation errors compared to Exhaustive Grid Search (Bergstra et al). The section [hyperparam] in the example.toml contains the ranges of hyperparameter values that will be explored.

The results will be printed out in the terminal, for example:

```

=== HYPER PARAMETER OPTIMIZATION RESULTS ====

MRR@20 for validation data: 0.3197 MRR@20 for test data: 0.3401 enabled businesslogic for evaluation:false best hyperparameter values: nmostrecentsessions:1502 neighborhoodsizek:288 idfweighting:2 lastitemsinsession:4 HPO done and also in the output file defined in the config file, for example: out_path = "results.csv" ```

Configure Serenade to use the hyperparameter values

We now update the [model] values in configuration file example.toml to use the hyperparameter values and set the trainingdatapath with the location of the train.txt. This is the content of the example configuration file with the new [model] paramer values. ``` config_type = "toml"

[server] host = "0.0.0.0" port = 8080 num_workers = 4

[log] level = "info" # not implemented

[data] trainingdatapath="train.txt"

[model] mmostrecentsessions = 1502 neighborhoodsizek = 288 maxitemsinsession = 4 numitemstorecommend = 21 idfweighting = 1

[logic] enablebusinesslogic = "false"

[hyperparam] trainingdatapath = "train.txt" testdatapath = "test.txt" validationdatapath = "valid.txt" numiterations = 15 saverecords = true outpath = "results.csv" enablebusinesslogic = false nmostrecentsessionsrange = [100, 2500] neighborhoodsizekrange = [50, 1500] lastitemsinsessionrange = [1, 20] idfweightingrange = [0, 5] ```

Start the Serenade service

Start the serving binary for your platform with the location of the configuration file as argument bash ./serving example.toml

You can open your webbrowser and goto http://localhost:8080/ you should see an internal page of Serenade.

Serenade exposes Prometheus metrics out-of-the-box for monitoring.

Retrieve recommendations using python

python import requests from requests.exceptions import HTTPError try: myurl = 'http://localhost:8080/v1/recommend' params = dict( session_id='144', user_consent='true', item_id='13598', ) response = requests.get(url=myurl, params=params) response.raise_for_status() # access json content jsonResponse = response.json() print(jsonResponse) except HTTPError as http_err: print(f'HTTP error occurred: {http_err}') except Exception as err: print(f'Other error occurred: {err}') [2835,10,12068,4313,3097,8028,3545,7812,17519,1164,17935,1277,13335,8655,14664,14556,6868,13509,9248,2498,11724] The returned json object is a list with recommended items.

Evaluate the testset

The evaluator application can be used to evaluate a test dataset. It reports on several metrics. * The evaluation can be started using: bash ./evaluator example.toml

```

=== START EVALUATING TEST FILE ====

Mrr@20,Ndcg@20,HitRate@20,Popularity@20,Precision@20,Coverage@20,Recall@20,F1score@20 0.3277,0.3553,0.6402,0.0499,0.0680,0.2765,0.4456,0.1180 Qty test evaluations: 931 Prediction latency p90 (microseconds): 66 p95 (microseconds): 66 p99.5 (microseconds): 66 ```

Using your own train- and testset

A train- and testset must be created from historical user-item click data, outside of Serenade. Each row in the training- or test set should contain an historical user-item interaction event with the following fields: * SessionId the ID of the session. Format: 64 bit Integer * ItemId the ID of the interacted item. Format: 64 bit Integer * Time the time when the user-item interaction occurred. In epoch seconds: 32 bit Floating point.

The last 24 hours in the historical data can be used as test-set while the rest of the sessions can be used as the training-set and written as plain text using a '\t' as field separator. This is an example of a training data CSV file train.txt: SessionId ItemId Time 10036 14957 1592337718.0 10036 14713 1592337765.0 10036 2625 1592338184.0 10037 7267 1591979344.0 10037 13892 1591979380.0 10037 7267 1591979504.0 10037 3595 1591979784.0 10038 6424 1591008704.0

Citation

Serenade - Low-Latency Session-Based Recommendation in e-Commerce at Scale

@article{Kersbergen2022SerenadeScale,
    title = {{Serenade - Low-Latency Session-Based Recommendation in e-Commerce at Scale}},
    year = {2022},
    journal = {SIGMOD},
    author = {Kersbergen, Barrie and Sprangers, Olivier and Schelter, Sebastian}
}

License

This project is licensed under the terms of the Apache 2.0 license.