Border is a reinforcement learning library written in Rust. It is currently under development.
To run the examples, install Python (>=3.7) and Gym. Gym is the only built-in environment; the library itself is designed to work with any kind of environment.
Random policy: the following command runs a random controller (policy) for 5 episodes in CartPole-v0:
```bash
$ cargo run --example random_cartpole
```
It renders the episodes and generates a CSV file in examples/model containing the sequences of observation and reward values in the episodes. Each row holds the episode index, the step index, the reward, and the components of the observation:
```bash
$ head -n3 examples/model/random_cartpole_eval.csv
0,0,1.0,-0.012616985477507114,0.19292789697647095,0.04204097390174866,-0.2809212803840637
0,1,1.0,-0.008758427575230598,-0.0027677505277097225,0.036422546952962875,0.024719225242733955
0,2,1.0,-0.008813782595098019,-0.1983925849199295,0.036916933953762054,0.3286677300930023
```
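As a rough illustration of what a random controller does, the sketch below samples one of CartPole's two discrete actions uniformly at random. The `Policy` trait, the `CartPoleObs` type, and the method names are hypothetical simplifications for this sketch, not Border's actual API.

```rust
use rand::Rng; // rand = "0.8"

/// Hypothetical, simplified policy interface (Border's actual trait differs).
trait Policy<Obs, Act> {
    fn sample(&mut self, obs: &Obs) -> Act;
}

/// CartPole observation: [cart position, cart velocity, pole angle, pole angular velocity].
struct CartPoleObs([f64; 4]);

/// A policy that ignores the observation and picks an action at random.
struct RandomPolicy;

impl Policy<CartPoleObs, i64> for RandomPolicy {
    fn sample(&mut self, _obs: &CartPoleObs) -> i64 {
        // CartPole-v0 has two discrete actions: 0 (push left) and 1 (push right).
        rand::thread_rng().gen_range(0..2)
    }
}

fn main() {
    let mut pi = RandomPolicy;
    let obs = CartPoleObs([0.0, 0.0, 0.0, 0.0]);
    println!("sampled action: {}", pi.sample(&obs));
}
```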
DQN agent: the following command trains a DQN agent:
```bash
$ RUST_LOG=info cargo run --example dqn_cartpole
```
After training, the trained agent runs for 5 episodes. In the code, the parameters of the trained Q-network (and the target network) are saved in examples/model/dqn_cartpole and then loaded again, demonstrating how to save and load trained models.
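Independent of Border's own API, the general save/load pattern for network parameters can be sketched with the tch crate (Rust bindings for PyTorch). The architecture and the file name `q.pt` below are made up for illustration; they are not what the example actually uses.

```rust
use tch::{nn, nn::Module, Device, Tensor};

fn main() -> Result<(), tch::TchError> {
    // Build a small Q-network; the architecture here is illustrative only.
    let vs = nn::VarStore::new(Device::Cpu);
    let root = vs.root();
    let q_net = nn::seq()
        .add(nn::linear(&root / "l1", 4, 64, Default::default()))
        .add_fn(|xs| xs.relu())
        .add(nn::linear(&root / "l2", 64, 2, Default::default()));

    // Save all parameters registered in the VarStore ("q.pt" is a made-up name).
    vs.save("q.pt")?;

    // Restore them into a fresh VarStore holding the same network structure.
    let mut vs2 = nn::VarStore::new(Device::Cpu);
    let root2 = vs2.root();
    let _q_net2 = nn::seq()
        .add(nn::linear(&root2 / "l1", 4, 64, Default::default()))
        .add_fn(|xs| xs.relu())
        .add(nn::linear(&root2 / "l2", 64, 2, Default::default()));
    vs2.load("q.pt")?;

    // Sanity check: run a dummy observation through the network.
    let obs = Tensor::zeros(&[1, 4], (tch::Kind::Float, Device::Cpu));
    q_net.forward(&obs).print();
    Ok(())
}
```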
SAC agent: the following command trains a SAC agent on Pendulum-v0, which takes continuous actions:
```bash
$ RUST_LOG=info cargo run --example sac_pendulum
```
The code defines an action filter that doubles the torque in the environment.
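Conceptually, such a filter is just a mapping applied to each action before it reaches the environment. The standalone sketch below illustrates the idea; the `ActFilter` trait and `PendulumAct` type are hypothetical, not Border's actual API.

```rust
/// Hypothetical, simplified action filter interface (Border's actual trait differs).
trait ActFilter<A> {
    fn filt(&mut self, act: A) -> A;
}

/// Pendulum-v0 actions are 1-dimensional continuous torques.
struct PendulumAct(f32);

/// Doubles the torque before it is sent to the environment.
struct DoubleTorqueFilter;

impl ActFilter<PendulumAct> for DoubleTorqueFilter {
    fn filt(&mut self, act: PendulumAct) -> PendulumAct {
        PendulumAct(2.0 * act.0)
    }
}

fn main() {
    let mut filter = DoubleTorqueFilter;
    let filtered = filter.filt(PendulumAct(0.5));
    println!("torque sent to the environment: {}", filtered.0);
}
```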
Pong: the following command trains a DQN agent on PongNoFrameskip-v4:
```bash
$ PYTHONPATH=$REPO/examples RUST_LOG=info cargo run --example dqn_pong_vecenv
```
This demonstrates how to use vectorized environments, in which 4 environments run synchronously (see the code). Training took about 11 hours for 2M steps on a g3s.xlarge instance on EC2. The hyperparameter values, tuned specifically for Pong rather than for all Atari games, are adapted from the book Deep Reinforcement Learning Hands-On. The learning curve is shown below.
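Conceptually, synchronous vectorization means stepping a batch of environments in lockstep with a batch of actions and resetting any environment whose episode has terminated. The self-contained sketch below illustrates this with a toy environment; the `Env` trait and all names are hypothetical, not Border's API.

```rust
/// Hypothetical minimal environment interface (Border's actual traits differ).
trait Env {
    fn reset(&mut self) -> f32;                       // returns an observation
    fn step(&mut self, act: f32) -> (f32, f32, bool); // (obs, reward, done)
}

/// A toy environment used only to make the sketch runnable.
struct Counter { t: u32 }

impl Env for Counter {
    fn reset(&mut self) -> f32 { self.t = 0; 0.0 }
    fn step(&mut self, act: f32) -> (f32, f32, bool) {
        self.t += 1;
        (self.t as f32 + act, 1.0, self.t >= 5)
    }
}

/// Steps all environments synchronously with a batch of actions,
/// resetting any environment whose episode has terminated.
fn step_all<E: Env>(envs: &mut [E], acts: &[f32]) -> Vec<(f32, f32, bool)> {
    envs.iter_mut()
        .zip(acts)
        .map(|(env, &a)| {
            let (obs, reward, done) = env.step(a);
            let obs = if done { env.reset() } else { obs };
            (obs, reward, done)
        })
        .collect()
}

fn main() {
    let mut envs = vec![Counter { t: 0 }, Counter { t: 0 }, Counter { t: 0 }, Counter { t: 0 }];
    for env in envs.iter_mut() { env.reset(); }
    // One synchronous step over the batch of 4 environments.
    let transitions = step_all(&mut envs, &[0.0, 0.0, 0.0, 0.0]);
    println!("{:?}", transitions);
}
```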
After training, you can watch how the trained agent plays:
```bash
$ PYTHONPATH=$REPO/examples cargo run --example dqn_pong_eval
```
Border is primarily distributed under the terms of both the MIT license and the Apache License (Version 2.0).