Build Reinforcement Learning Gym environments with the Bevy engine to train AI agents that can learn from screen pixels or a defined observation state.
| bevy version | bevy_rl version |
| ------------ | :-------------: |
| 0.7          |      0.0.5      |
| 0.8          |      0.8.4      |
| 0.9          |      0.9.5      |
- `Deref` ergonomics
- `AIGymState` could not be initialized outside of the crate

The environment needs to have multiple states in which different systems are executed. Typically you will need to implement `InGame`, `Control`, and `Reset` states.
```rust
// Bevy states must implement these traits
#[derive(Debug, Clone, Eq, PartialEq, Hash)]
enum AppState {
    InGame,  // where all the game logic is executed
    Control, // a paused state in which bevy_rl waits for agent actions
    Reset,   // a request to reset environment state
}
```
The observation space needs to be `Serializable` because it is exported via the REST API.
```rust
// Action space
pub struct Actions { ... }

// Observation space
pub struct State { ... }
```
Width and height should be at least 256, otherwise wgpu will panic.
```rust
let gym_settings = AIGymSettings {
    width: 256,         // set if you need visual observations as state
    height: 256,
    num_agents: 1,
    no_graphics: false, // you can disable rendering to buffer
};

app
    .insert_resource(gym_settings.clone())
    .insert_resource(Arc::new(Mutex::new(AIGymState::...)));
```
```rust
pub(crate) fn spawn_cameras(
    mut commands: Commands,
    ai_gym_settings: Res<AIGymSettings>,
    ai_gym_state: Res<AIGymState<Actions, State>>,
) {
    let mut ai_gym_state = ai_gym_state.lock().unwrap();

    // Spawn one camera per agent, each rendering into its own image
    for i in 0..ai_gym_settings.num_agents {
        let render_image_handle = ai_gym_state.render_image_handles[i as usize].clone();
        let render_target = RenderTarget::Image(render_image_handle);
        let camera_bundle = Camera3dBundle {
            camera: Camera {
                target: render_target,
                priority: -1,
                ..default()
            },
            ..default()
        };
        commands.spawn(camera_bundle);
    }
}
```
`DelayedControlTimer` should pause environment execution to allow agents to take actions.

```rust
struct DelayedControlTimer(Timer);
```
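To make the pacing concrete, here is a minimal, standard-library-only model of a repeating timer that is ticked once per frame. It is not Bevy's `Timer`: `RepeatingTimer` and `count_pauses` are hypothetical names used only for illustration, and the sketch assumes the environment hands control to the agents each time the period elapses.

```rust
use std::time::Duration;

/// Hypothetical stand-in for a repeating timer; not Bevy's `Timer`.
struct RepeatingTimer {
    period: Duration,
    elapsed: Duration,
}

impl RepeatingTimer {
    fn new(period: Duration) -> Self {
        assert!(!period.is_zero(), "period must be non-zero");
        Self { period, elapsed: Duration::ZERO }
    }

    /// Advance the timer by one frame's delta time.
    /// Returns true if the period elapsed during this tick.
    fn tick(&mut self, delta: Duration) -> bool {
        self.elapsed += delta;
        let finished = self.elapsed >= self.period;
        // Keep the remainder so the timer repeats without drifting.
        while self.elapsed >= self.period {
            self.elapsed -= self.period;
        }
        finished
    }
}

/// Count how many control pauses fire over `frames` frames of `frame_ms` each.
fn count_pauses(frames: u32, frame_ms: u64, period_ms: u64) -> u32 {
    let mut timer = RepeatingTimer::new(Duration::from_millis(period_ms));
    let mut pauses = 0;
    for _ in 0..frames {
        if timer.tick(Duration::from_millis(frame_ms)) {
            // This is where the environment would switch to the Control state.
            pauses += 1;
        }
    }
    pauses
}
```

For example, `count_pauses(60, 20, 100)` simulates 60 frames of 20 ms each (1.2 s) against a 100 ms period and yields 12 pauses, which corresponds to a 10 Hz control rate.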
Define systems that implement environment logic.
```rust
app.add_startup_system(spawn_cameras);

app.add_system_set(
    SystemSet::on_update(AppState::InGame)
        .with_system(control_switch),
);

app.insert_resource(DelayedControlTimer(Timer::from_seconds(0.1, true))); // 10 Hz
app.add_system_set(
    SystemSet::on_update(AppState::Control)
        // Game Systems
        .with_system(process_control_request) // System that parses user command
        .with_system(process_reset_request),  // System that performs environment state reset
);

app.add_system_set(
    SystemSet::on_enter(AppState::Reset)
        .with_system(reset_environment) // System that resets the environment to its initial state
);
```
`control_switch` should pause the game world and poll `bevy_rl` for agent actions.

```rust
pub(crate) fn control_switch(
    mut app_state: ResMut<State<AppState>>,
    ...
) {
    ... {
        // Set current state to control to disable simulation systems
        app_state.overwrite_push(AppState::Control).unwrap();

        // Pause time
        physics_engine.pause();

        {
            // ai_gym_state is behind an Arc<Mutex<...>>, so we need to lock it
            let mut ai_gym_state = ai_gym_state.lock().unwrap();

            // This tells bevy_rl that the environment is ready to receive actions
            let results = (0..ai_gym_settings.num_agents).map(|_| true).collect();
            ai_gym_state.send_step_result(results);

            // Collect data to build the environment state
            // and send it to bevy_rl so it can be consumed through the REST API
            let env_state = State {
                ...
            };
            ai_gym_state.set_env_state(env_state);
        }
    }
}
```
`process_reset_request` handles an environment reset request.

```rust
pub(crate) fn process_reset_request(
    mut app_state: ResMut<State<AppState>>,
    ...
) {
    // Acknowledge the reset request and switch to the Reset state
    ai_gym_state.receive_reset_request();
    app_state.set(AppState::Reset).unwrap();
}
```
`process_control_request` parses agent actions and issues commands to agents in the environment.

```rust
pub(crate) fn process_control_request(
    ai_gym_state: ResMut<...>,
    ...
) {
    // Skip this system if the user hasn't sent a request this frame
    if !ai_gym_state.is_next_action() {
        return;
    }

    let unparsed_actions = ai_gym_state.receive_action_strings();
    for i in 0..unparsed_actions.len() {
        if let Some(unparsed_action) = unparsed_actions[i].clone() {
            // Parse the action and pass it to the game logic
            let action: Vec<...> = serde_json::from_str(&unparsed_action).unwrap();
        }
    }

    // Resume the simulation and return to the previous state
    physics_engine.resume();
    app_state.pop().unwrap();
}
```
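The handoff between `InGame`, `Control`, and `Reset` relies on a stack of states: `control_switch` pushes `Control` on top of `InGame`, and `process_control_request` pops it again. The sketch below models only that ordering with the standard library. `StateStack` and `step_cycle` are hypothetical names, and transitions apply immediately here, whereas Bevy queues them, so treat it as an illustration rather than Bevy's implementation.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AppState {
    InGame,
    Control,
    Reset,
}

/// Hypothetical, simplified state stack; it models ordering only.
struct StateStack {
    stack: Vec<AppState>,
}

impl StateStack {
    fn new(initial: AppState) -> Self {
        Self { stack: vec![initial] }
    }

    fn current(&self) -> AppState {
        *self.stack.last().expect("stack is never empty")
    }

    /// Push a state on top; the previous state stays underneath, paused.
    fn push(&mut self, state: AppState) -> Result<(), String> {
        if self.current() == state {
            return Err(format!("already in {:?}", state));
        }
        self.stack.push(state);
        Ok(())
    }

    /// Pop the top state and resume the one underneath.
    fn pop(&mut self) -> Result<(), String> {
        if self.stack.len() == 1 {
            return Err("cannot pop the last state".to_string());
        }
        self.stack.pop();
        Ok(())
    }

    /// Replace the top state without growing the stack.
    fn set(&mut self, state: AppState) -> Result<(), String> {
        if self.current() == state {
            return Err(format!("already in {:?}", state));
        }
        *self.stack.last_mut().expect("stack is never empty") = state;
        Ok(())
    }
}

/// One agent step: InGame pauses into Control, then resumes.
fn step_cycle(states: &mut StateStack) -> Vec<AppState> {
    let mut trace = vec![states.current()];
    states.push(AppState::Control).unwrap(); // what control_switch does
    trace.push(states.current());
    states.pop().unwrap(); // what process_control_request does
    trace.push(states.current());
    trace
}
```

In this model a reset issued from `Control` replaces the top of the stack through `set`, so `InGame` remains underneath and becomes active again once the top state is popped.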
| Method                                             | Description                                |
| -------------------------------------------------- | ------------------------------------------ |
| `send_step_result(results: Vec<bool>)`             | Send when agent interactions are complete  |
| `send_reset_result(result: bool)`                  | Send when the reset request is complete    |
| `receive_action_strings() -> Vec<Option<String>>`  | Receive agent actions                      |
| `receive_reset_request()`                          | Receive the reset request                  |
| `is_next_action() -> bool`                         | Whether agent actions are supplied         |
| `is_reset_request() -> bool`                       | Whether a reset request was sent           |
| `set_reward(agent_index: usize, score: f32)`       | Set reward for an agent                    |
| `set_terminated(agent_index: usize, result: bool)` | Set termination status for an agent        |
| `reset()`                                          | Reset bevy_rl state                        |
| `set_env_state(state: State)`                      | Set current environment state              |

| Method            | Verb     | URL                                             |
| ----------------- | -------- | ----------------------------------------------- |
| Camera Pixels     | **GET**  | `http://localhost:7878/visual_observations`     |
| State             | **GET**  | `http://localhost:7878/state`                   |
| Reset Environment | **POST** | `http://localhost:7878/reset`                   |
| Step              | **GET**  | `http://localhost:7878/step` `payload=ACTION`   |
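A client needs the verb and URL for each endpoint listed above. The following is a minimal, standard-library-only sketch of building them. The `Endpoint` enum and `percent_encode` helper are hypothetical, and the sketch assumes that `payload` is passed as a URL query parameter and that the action is a JSON string, neither of which is confirmed here.

```rust
/// Hypothetical description of the four endpoints from the table.
enum Endpoint<'a> {
    VisualObservations,
    State,
    Reset,
    Step { action_json: &'a str },
}

/// Percent-encode every byte that is not an RFC 3986 unreserved character.
fn percent_encode(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for byte in input.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'_' | b'.' | b'~' => {
                out.push(byte as char)
            }
            _ => out.push_str(&format!("%{:02X}", byte)),
        }
    }
    out
}

impl Endpoint<'_> {
    /// HTTP verb, as listed in the table.
    fn verb(&self) -> &'static str {
        match self {
            Endpoint::Reset => "POST",
            _ => "GET",
        }
    }

    /// Full URL for a given base such as "http://localhost:7878".
    fn url(&self, base: &str) -> String {
        let base = base.trim_end_matches('/');
        match self {
            Endpoint::VisualObservations => format!("{}/visual_observations", base),
            Endpoint::State => format!("{}/state", base),
            Endpoint::Reset => format!("{}/reset", base),
            // Assumption: the action travels in a `payload` query parameter.
            Endpoint::Step { action_json } => {
                format!("{}/step?payload={}", base, percent_encode(action_json))
            }
        }
    }
}
```

Keeping the verb and URL construction in one place makes it easy to adjust if the server expects the payload in a different form.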