Crates.io | forger |
lib.rs | forger |
version | 0.1.4 |
source | src |
created_at | 2023-11-13 00:47:52.411997 |
updated_at | 2023-11-17 07:02:45.142815 |
description | Forger is a library for reinforcement learning with Rust |
repository | https://github.com/Axect/Forger |
id | 1033156 |
size | 350,858 |
Forger is a Reinforcement Learning (RL) library in Rust, offering a robust and efficient framework for implementing RL algorithms. It features a modular design with components for agents, environments, policies, and utilities, facilitating easy experimentation and development of RL models.
Policy (policy):
Agent (agent):
  VEveryVisitMC and Q-Learning - Every Visit Monte Carlo (QEveryVisitMC).
Environment (env):
  Env trait to define RL environments.
  LineWorld, a simple linear world environment for experimentation.
Prelude (prelude):
  env, agent, and policy modules for convenient access.
In your project directory, run the following command:
cargo add forger
use forger::prelude::*;
use forger::env::lineworld::{LineWorld, LineWorldAction};

pub type S = usize;              // State
pub type A = LineWorldAction;    // Action
pub type P = EGreedyPolicy<A>;   // Policy
pub type E = LineWorld;          // Environment

fn main() {
    let env = LineWorld::new(
        5,      // number of states
        1,      // initial state
        4,      // goal state
        vec![0] // terminal states
    );
    let mut agent = QEveryVisitMC::<S, A, P, E>::new(0.9); // Q-learning (Every Visit MC, gamma = 0.9)
    let mut policy = EGreedyPolicy::new(0.5, 0.95);        // Epsilon Greedy Policy (epsilon = 0.5, decay = 0.95)

    // Run 200 episodes.
    for _ in 0..200 {
        let mut episode = vec![];
        let mut state = env.get_init_state();

        // Roll out one episode until the environment returns no next state.
        loop {
            let action = agent.select_action(&state, &mut policy, &env);
            let (next_state, reward) = env.transition(&state, &action);
            episode.push((state, action.unwrap(), reward));
            match next_state {
                Some(s) => state = s,
                None => break,
            }
        }

        // Learn from the finished episode, then reduce exploration.
        agent.update(&episode);
        policy.decay_epsilon();
    }
}
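The `agent.update(&episode)` call above hands a finished episode of (state, action, reward) tuples to the agent. As a rough illustration of what an every-visit Monte Carlo update does with such an episode, here is a minimal, self-contained sketch using only the standard library. The `QTable` struct, the `Action` enum, and the `Step` alias are hypothetical names for this sketch; they are not Forger's types, and Forger's own implementation may differ.

```rust
use std::collections::HashMap;

/// Hypothetical action type for this sketch (not Forger's `LineWorldAction`).
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
enum Action {
    Left,
    Right,
}

/// One episode step: (state, action, reward).
type Step = (usize, Action, f64);

/// Hypothetical tabular Q store; an assumption made for illustration.
#[derive(Default)]
struct QTable {
    /// Running mean of observed returns per (state, action).
    q: HashMap<(usize, Action), f64>,
    /// Number of returns averaged so far per (state, action).
    n: HashMap<(usize, Action), u32>,
}

impl QTable {
    /// Every-visit Monte Carlo update with discount factor `gamma`.
    fn update(&mut self, episode: &[Step], gamma: f64) {
        let mut g = 0.0; // discounted return, accumulated from the episode's end
        for &(s, a, r) in episode.iter().rev() {
            g = r + gamma * g;
            let n = self.n.entry((s, a)).or_insert(0);
            *n += 1;
            let q = self.q.entry((s, a)).or_insert(0.0);
            // Incremental mean: every visit to (s, a) contributes its return.
            *q += (g - *q) / f64::from(*n);
        }
    }
}

fn main() {
    let mut table = QTable::default();
    // (1, Right) occurs twice, so its estimate is the mean of both returns.
    let episode: [Step; 4] = [
        (1, Action::Right, 0.0),
        (2, Action::Left, 0.0),
        (1, Action::Right, 0.0),
        (2, Action::Right, 1.0),
    ];
    table.update(&episode, 0.9);
    for (key, value) in &table.q {
        println!("{:?} -> {:.4}", key, value);
    }
}
```

Iterating over the episode in reverse lets the return be built up in a single pass. Because this is the every-visit variant, a state-action pair that appears several times in one episode contributes one return per occurrence.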
Monte Carlo with Epsilon Decay in LineWorld:
  A Q-Learning - Every Visit Monte Carlo (QEveryVisitMC) agent with an Epsilon Greedy Policy (with decay) in the LineWorld environment.
TD0 with Epsilon Decay in GridWorld:
  A TD0 agent with an Epsilon Greedy Policy (with decay) in the GridWorld environment.
Contributions to Forger are welcome! If you'd like to contribute, please fork the repository and use a feature branch. Pull requests are warmly welcome.
Forger is licensed under the MIT License or the Apache 2.0 License.