| Crates.io | forger |
| lib.rs | forger |
| version | 0.1.4 |
| created_at | 2023-11-13 00:47:52.411997+00 |
| updated_at | 2023-11-17 07:02:45.142815+00 |
| description | Forger is a library for reinforcement learning with Rust |
| homepage | |
| repository | https://github.com/Axect/Forger |
| max_upload_size | |
| id | 1033156 |
| size | 350,858 |
Forger is a reinforcement learning (RL) library for Rust that provides a framework for implementing RL algorithms. Its modular design separates agents, environments, policies, and utilities into their own components, which makes it easy to experiment with and develop RL models.
Policy (policy):
- EGreedyPolicy, an epsilon-greedy policy with epsilon decay.
Agent (agent):
- VEveryVisitMC and Q-Learning - Every Visit Monte Carlo (QEveryVisitMC) agents.
Environment (env):
- An Env trait for defining RL environments.
- LineWorld, a simple linear world environment for experimentation.
Prelude (prelude):
- Re-exports items from the env, agent, and policy modules for convenient access.

In your project directory, run the following command:
cargo add forger

use forger::prelude::*;
use forger::env::lineworld::{LineWorld, LineWorldAction};

pub type S = usize;             // State
pub type A = LineWorldAction;   // Action
pub type P = EGreedyPolicy<A>;  // Policy
pub type E = LineWorld;         // Environment

fn main() {
    let env = LineWorld::new(
        5,      // number of states
        1,      // initial state
        4,      // goal state
        vec![0] // terminal states
    );

    let mut agent = QEveryVisitMC::<S, A, P, E>::new(0.9); // Q-learning (Everyvisit MC, gamma = 0.9)
    let mut policy = EGreedyPolicy::new(0.5, 0.95); // Epsilon Greedy Policy (epsilon = 0.5, decay = 0.95)

    for _ in 0 .. 200 {
        let mut episode = vec![];
        let mut state = env.get_init_state();

        loop {
            let action = agent.select_action(&state, &mut policy, &env);
            let (next_state, reward) = env.transition(&state, &action);
            episode.push((state, action.unwrap(), reward));
            match next_state {
                Some(s) => state = s,
                None => break,
            }
        }

        agent.update(&episode);
        policy.decay_epsilon();
    }
}
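
The agent.update(&episode) call above hides the learning step. As a rough illustration of what an every-visit Monte Carlo update for action values involves, the sketch below walks an episode backwards, accumulates the discounted return, and averages it for every visit of a (state, action) pair. It is not Forger's implementation: QTable, its fields and methods, and the plain usize actions are assumptions made for this example. Only the (state, action, reward) episode layout and gamma = 0.9 come from the usage code.

```rust
use std::collections::HashMap;

/// One step of an episode: (state, action, reward), mirroring the tuples
/// pushed in the usage example. Actions are plain `usize` here (assumption).
type Step = (usize, usize, f64);

/// Hypothetical every-visit Monte Carlo Q-table; not Forger's own type.
struct QTable {
    gamma: f64,
    // (state, action) -> (running mean of returns, number of visits)
    entries: HashMap<(usize, usize), (f64, u32)>,
}

impl QTable {
    fn new(gamma: f64) -> Self {
        Self { gamma, entries: HashMap::new() }
    }

    /// Walk the episode backwards, accumulating the discounted return `g`,
    /// and fold `g` into the running mean at every visit of (state, action).
    fn update(&mut self, episode: &[Step]) {
        let mut g = 0.0;
        for &(state, action, reward) in episode.iter().rev() {
            g = reward + self.gamma * g;
            let (mean, count) = self.entries.entry((state, action)).or_insert((0.0, 0));
            *count += 1;
            *mean += (g - *mean) / f64::from(*count);
        }
    }

    /// Current estimate for a pair; unseen pairs default to 0.0.
    fn value(&self, state: usize, action: usize) -> f64 {
        self.entries.get(&(state, action)).map_or(0.0, |&(mean, _)| mean)
    }
}

fn main() {
    let mut q = QTable::new(0.9);
    // Three steps with action 1; only the last step is rewarded.
    let episode = [(1, 1, 0.0), (2, 1, 0.0), (3, 1, 1.0)];
    q.update(&episode);
    println!("Q(3, 1) = {:.2}", q.value(3, 1));
    println!("Q(1, 1) = {:.2}", q.value(1, 1));
}
```

With gamma = 0.9 and a single reward of 1.0 on the last step, the return seen from two steps earlier is 0.9 * 0.9 = 0.81, which is the value the running mean holds for that pair after one episode. Storing a visit count next to the mean keeps the update incremental, so no list of past returns has to be kept.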

Monte Carlo with Epsilon Decay in LineWorld:
- Uses the Q-Learning - Every Visit Monte Carlo (QEveryVisitMC) agent with an Epsilon Greedy Policy (with decay) in the LineWorld environment.
TD0 with Epsilon Decay in GridWorld:
- Uses the TD0 agent with an Epsilon Greedy Policy (with decay) in the GridWorld environment.

Contributions to Forger are welcome! If you'd like to contribute, please fork the repository and open a pull request from a feature branch.
Forger is licensed under the MIT License or the Apache 2.0 License.