🏋️‍♀️ bevy_rl

🏗️ Build 🤔 Reinforcement Learning 🏋🏿‍♂️ Gym environments with 🕊 Bevy engine to train 👾 AI agents that 💡 learn from 📺 screen pixels.

Compatibility

| bevy version | bevy_rl version |
| ------------ | :-------------: |
| 0.7          |      0.0.5      |
| 0.8          |      0.8.4      |
| 0.9          |      0.9.1      |

📝Features

- A set of APIs for implementing the OpenAI Gym interface
- A REST API for controlling agents
- Rendering to an in-memory (RAM) buffer

📋 Changelog

0.8.4
- Added object representation of observation space

👩‍💻 Usage

1. Define App States

```rust
#[derive(Debug, Clone, Eq, PartialEq, Hash)]
enum AppState {
    InGame,  // Where all the game logic is executed
    Control, // A paused state in which bevy_rl waits for agent actions
    Reset,   // A request to reset the environment state
}
```

2. Define Action Space and Observation Space

An action space is the set of actions that an agent can take. An observation space is the set of observations that an agent can receive. The action space can be discrete or continuous. Observations must be serializable to JSON with the serde_json crate.

```rust
use bitflags::bitflags;
use serde::Serialize;

// Action space
bitflags! {
    #[derive(Default)]
    pub struct PlayerActionFlags: u32 {
        const FORWARD = 1 << 0;
        const BACKWARD = 1 << 1;
        const LEFT = 1 << 2;
        const RIGHT = 1 << 3;
    }
}

// Observation space (GameMap and Actor are types defined by the game)
#[derive(Default, Serialize, Clone)]
pub struct EnvironmentState {
    pub map: GameMap,
    pub actors: Vec<Actor>,
}
```
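To illustrate how a discrete, flag-based action space composes, here is a minimal sketch that uses plain `u32` constants in place of the `bitflags!`-generated type. The helpers `contains` and `movement` are hypothetical and exist only for this example; they are not part of bevy_rl.

```rust
// Plain-integer stand-ins for the PlayerActionFlags constants
// (an assumption for illustration; the real type comes from `bitflags!`).
const FORWARD: u32 = 1 << 0;
const BACKWARD: u32 = 1 << 1;
const LEFT: u32 = 1 << 2;
const RIGHT: u32 = 1 << 3;

/// Returns true when `actions` has every bit of `flag` set.
fn contains(actions: u32, flag: u32) -> bool {
    (actions & flag) == flag
}

/// Converts a combined action into a (forward, sideways) movement pair.
/// Opposite directions cancel each other out.
fn movement(actions: u32) -> (i32, i32) {
    let forward = contains(actions, FORWARD) as i32 - contains(actions, BACKWARD) as i32;
    let sideways = contains(actions, RIGHT) as i32 - contains(actions, LEFT) as i32;
    (forward, sideways)
}
```

Because each action occupies its own bit, an agent can request several actions at once with a bitwise OR, for example `movement(FORWARD | LEFT)`.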

3. Enable AI Gym Plugin

Width and height should be at least 256; otherwise wgpu will panic.

```rust
let gym_settings = AIGymSettings {
    width: 256,
    height: 256,
    num_agents: 16,
};

app
    .insert_resource(gym_settings.clone())
    .insert_resource(Arc::new(Mutex::new(AIGymState::<
        PlayerActionFlags,
        EnvironmentState,
    >::new(gym_settings.clone()))))
    .add_plugin(AIGymPlugin::<PlayerActionFlags, EnvironmentState>::default());
```

4. Implement Environment Logic

`DelayedControlTimer` periodically pauses environment execution so that agents can take actions.

```rust
struct DelayedControlTimer(Timer);
```

Define systems that implement environment logic.

```rust
app.add_system_set(
    SystemSet::on_update(AppState::InGame)
        .with_system(turnbased_control_system_switch),
);

app.insert_resource(DelayedControlTimer(Timer::from_seconds(0.1, true))); // 10 Hz
app.add_system_set(
    SystemSet::on_update(AppState::Control)
        // Game Systems
        .with_system(turnbased_text_control_system) // System that parses user command
        .with_system(execute_reset_request), // System that performs environment state reset
);
```

`turnbased_control_system_switch` pauses the game world and switches to the `Control` state, in which bevy_rl is polled for agent actions.

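```rust
fn turnbased_control_system_switch(
    mut app_state: ResMut<State<AppState>>,
    time: Res<Time>,
    mut timer: ResMut<DelayedControlTimer>,
    ai_gym_state: ResMut<Arc<Mutex<AIGymState<PlayerActionFlags, EnvironmentState>>>>,
    ai_gym_settings: Res<AIGymSettings>,
    mut physics_time: ResMut<PhysicsTime>,
) {
    if timer.0.tick(time.delta()).just_finished() {
        // Enter the paused Control state and stop the physics simulation
        app_state.push(AppState::Control).unwrap();
        physics_time.pause();

        // Report one step result per agent
        let ai_gym_state = ai_gym_state.lock().unwrap();
        let results = (0..ai_gym_settings.num_agents).map(|_| true).collect();
        ai_gym_state.send_step_result(results);
    }
}
```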
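The pause-and-resume flow can be modeled without Bevy as a small state stack. The sketch below is illustrative only: `StateStack` and `tick_and_maybe_pause` are hypothetical names, and the two-variant `AppState` is a trimmed copy of the enum from step 1. It shows the idea of pushing `Control` when the timer interval elapses and popping it to return to `InGame`.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AppState {
    InGame,
    Control,
}

/// Hypothetical stand-in for a stack of application states.
struct StateStack {
    stack: Vec<AppState>,
}

impl StateStack {
    fn new(initial: AppState) -> Self {
        Self { stack: vec![initial] }
    }

    /// The state on top of the stack is the active one.
    fn current(&self) -> AppState {
        *self.stack.last().expect("the stack always holds at least one state")
    }

    fn push(&mut self, state: AppState) {
        self.stack.push(state);
    }

    /// Pops the active state; the initial state is never removed.
    fn pop(&mut self) -> Option<AppState> {
        if self.stack.len() > 1 {
            self.stack.pop()
        } else {
            None
        }
    }
}

/// Accumulates `delta` seconds and pushes `Control` once `interval` has elapsed.
/// Returns true when the switch happened on this call.
fn tick_and_maybe_pause(
    states: &mut StateStack,
    elapsed: &mut f32,
    delta: f32,
    interval: f32,
) -> bool {
    *elapsed += delta;
    if states.current() == AppState::InGame && *elapsed >= interval {
        *elapsed -= interval;
        states.push(AppState::Control);
        return true;
    }
    false
}
```

Using a stack rather than a plain assignment means the paused state can simply be popped once agent actions have been applied, and the game continues from `InGame` where it left off.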

`execute_reset_request` handles environment reset requests. `turnbased_text_control_system` in this example parses agent actions and issues commands to the agents in the environment via `control_agents`.

```rust
pub(crate) fn execute_reset_request(
    mut app_state: ResMut<State<AppState>>,
    ai_gym_state: ResMut<Arc<Mutex<AIGymState<PlayerActionFlags, EnvironmentState>>>>,
) {
    let ai_gym_state = ai_gym_state.lock().unwrap();
    if !ai_gym_state.is_reset_request() {
        return;
    }

    ai_gym_state.receive_reset_request();
    app_state.set(AppState::Reset).unwrap();
}

pub(crate) fn turnbased_control_system_switch(
    mut app_state: ResMut<State<AppState>>,
    time: Res<Time>,
    mut timer: ResMut<DelayedControlTimer>,
    ai_gym_state: ResMut<Arc<Mutex<AIGymState<PlayerActionFlags, EnvironmentState>>>>,
    ai_gym_settings: Res<AIGymSettings>,
    mut physics_time: ResMut<PhysicsTime>,
) {
    if timer.0.tick(time.delta()).just_finished() {
        app_state.overwrite_push(AppState::Control).unwrap();
        physics_time.pause();

        let ai_gym_state = ai_gym_state.lock().unwrap();
        let results = (0..ai_gym_settings.num_agents).map(|_| true).collect();
        ai_gym_state.send_step_result(results);
    }
}

pub(crate) fn turnbased_text_control_system(
    agent_movement_q: Query<(&mut heron::prelude::Velocity, &mut Transform, &Actor)>,
    collision_events: EventReader<CollisionEvent>,
    event_gun_shot: EventWriter<EventGunShot>,
    ai_gym_state: ResMut<Arc<Mutex<AIGymState<PlayerActionFlags, EnvironmentState>>>>,
    ai_gym_settings: Res<AIGymSettings>,
    mut app_state: ResMut<State<AppState>>,
    mut physics_time: ResMut<PhysicsTime>,
) {
    let mut ai_gym_state = ai_gym_state.lock().unwrap();

    // Skip this frame if no agent actions have been received yet
    if !ai_gym_state.is_next_action() {
        return;
    }

    let unparsed_actions = ai_gym_state.receive_action_strings();
    let mut actions: Vec<Option<PlayerActionFlags>> =
        (0..ai_gym_settings.num_agents).map(|_| None).collect();

    for i in 0..unparsed_actions.len() {
        let unparsed_action = unparsed_actions[i].clone();
        ai_gym_state.set_reward(i, 0.0);

        if unparsed_action.is_none() {
            actions[i] = None;
            continue;
        }

        // Map the action string onto the action space; unknown strings are ignored
        let action = match unparsed_action.unwrap().as_str() {
            "FORWARD" => Some(PlayerActionFlags::FORWARD),
            "BACKWARD" => Some(PlayerActionFlags::BACKWARD),
            "LEFT" => Some(PlayerActionFlags::LEFT),
            "RIGHT" => Some(PlayerActionFlags::RIGHT),
            _ => None,
        };

        actions[i] = action;
    }

    // Send environment state to AI Gym (populate it with the real game state here)
    ai_gym_state.set_env_state(EnvironmentState::default());

    // Resume the simulation and apply the parsed actions
    physics_time.resume();
    control_agents(actions, agent_movement_q, collision_events, event_gun_shot);

    // Leave the Control state and return to InGame
    app_state.pop().unwrap();
}
```

💻 AIGymState API

| Method                                               | Description                                  |
| ---------------------------------------------------- | -------------------------------------------- |
| `send_step_result(results: Vec<bool>)`               | Send once agent interactions are complete    |
| `send_reset_result(result: bool)`                    | Send when a reset request is complete        |
| `receive_action_strings() -> Vec<Option<String>>`    | Receive the pending agent actions            |
| `receive_reset_request()`                            | Receive the pending reset request            |
| `is_next_action() -> bool`                           | Whether agent actions have been supplied     |
| `is_reset_request() -> bool`                         | Whether a reset request was sent             |
| `set_reward(agent_index: usize, score: f32)`         | Set the reward for an agent                  |
| `set_terminated(agent_index: usize, result: bool)`   | Set the termination status for an agent      |
| `reset()`                                            | Reset bevy_rl state                          |
| `set_env_state(state: B)`                            | Set the current environment state            |
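To show how these calls fit together during one control step, here is a sketch built around `MockGymState`, a hypothetical stand-in that only borrows a few method names from the table. It does not reproduce the internals of `AIGymState`; the reward rule is a toy assumption, and the calls are collapsed into a single function for brevity, whereas the usage section splits them across two systems.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for `AIGymState` (not the real bevy_rl type).
#[derive(Default)]
struct MockGymState {
    pending_actions: Option<Vec<Option<String>>>,
    rewards: Vec<f32>,
    step_results: Option<Vec<bool>>,
}

impl MockGymState {
    fn new(num_agents: usize) -> Self {
        Self {
            rewards: vec![0.0; num_agents],
            ..Default::default()
        }
    }

    fn is_next_action(&self) -> bool {
        self.pending_actions.is_some()
    }

    /// Takes the pending actions, leaving none behind.
    fn receive_action_strings(&mut self) -> Vec<Option<String>> {
        self.pending_actions.take().unwrap_or_default()
    }

    fn set_reward(&mut self, agent_index: usize, score: f32) {
        self.rewards[agent_index] = score;
    }

    fn send_step_result(&mut self, results: Vec<bool>) {
        self.step_results = Some(results);
    }
}

/// Runs one control step. Returns false when no actions were waiting.
fn run_control_step(state: &Arc<Mutex<MockGymState>>) -> bool {
    let mut gym = state.lock().unwrap();
    if !gym.is_next_action() {
        return false;
    }

    let actions = gym.receive_action_strings();
    for (i, action) in actions.iter().enumerate() {
        // Toy reward rule (assumption): moving forward earns 1.0
        let reward = if action.as_deref() == Some("FORWARD") { 1.0 } else { 0.0 };
        gym.set_reward(i, reward);
    }

    // One result flag per agent
    gym.send_step_result(vec![true; actions.len()]);
    true
}
```

The `Arc<Mutex<...>>` wrapper mirrors how the gym state is shared in the usage section, so the same lock-then-call pattern applies.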

🌐 REST API

| Method            | Verb | URL                                         |
| ----------------- | ---- | ------------------------------------------- |
| Camera Pixels     | GET  | http://localhost:7878/visual_observations   |
| State             | GET  | http://localhost:7878/state                 |
| Reset Environment | POST | http://localhost:7878/reset                 |
| Step              | GET  | http://localhost:7878/step payload=ACTION   |
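A client can talk to these endpoints with nothing but the standard library. The sketch below is a rough illustration: `build_request`, `step_path`, and `send` are hypothetical helpers, and it assumes that `payload` is passed as a URL query parameter, which the table does not spell out.

```rust
use std::io::{Read, Write};
use std::net::TcpStream;

/// Builds a minimal HTTP/1.1 request without a body.
fn build_request(verb: &str, path: &str, host: &str) -> String {
    let mut request = format!("{verb} {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n");
    if verb == "POST" {
        // An empty POST should still declare its body length
        request.push_str("Content-Length: 0\r\n");
    }
    request.push_str("\r\n");
    request
}

/// Path for the step endpoint.
/// ASSUMPTION: the action is sent as a `payload` query parameter.
fn step_path(action: &str) -> String {
    format!("/step?payload={action}")
}

/// Sends a request to `host` (for example "localhost:7878") and returns the raw
/// response bytes, headers included. Bytes are used because the camera endpoint
/// may return binary data.
fn send(host: &str, request: &str) -> std::io::Result<Vec<u8>> {
    let mut stream = TcpStream::connect(host)?;
    stream.write_all(request.as_bytes())?;
    let mut response = Vec::new();
    stream.read_to_end(&mut response)?;
    Ok(response)
}
```

With the environment running, a step could then be requested with `send("localhost:7878", &build_request("GET", &step_path("FORWARD"), "localhost:7878"))`.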

✍️ Examples

bevy_rl_shooter: example FPS project