# Rust Monte Carlo Tree Search (MCTS) with Arena Allocator

![License](https://img.shields.io/badge/license-MIT-blue.svg) ![Rust](https://img.shields.io/badge/rust-1.56%2B-orange.svg)

A Rust implementation of the Monte Carlo Tree Search (MCTS) algorithm using an arena allocator for efficient memory management. This project features a Tic-Tac-Toe game to showcase the MCTS algorithm in action.

## Table of Contents

- [Introduction](#introduction)
- [Features](#features)
- [Getting Started](#getting-started)
  - [Prerequisites](#prerequisites)
  - [Installation](#installation)
  - [Running the Tic-Tac-Toe Example](#running-the-tic-tac-toe-example)
- [Implementation Details](#implementation-details)
  - [Arena Allocator](#arena-allocator)
  - [MCTS Algorithm](#mcts-algorithm)
  - [State Trait](#state-trait)

## Introduction

Monte Carlo Tree Search (MCTS) is a search algorithm used for decision-making processes. This project provides a Rust implementation of MCTS that manages memory efficiently with an arena allocator. By storing all nodes in a central arena, we avoid the overhead of reference counting and interior mutability, resulting in a more performant and idiomatic Rust codebase. The included Tic-Tac-Toe game serves as a practical example of how to use the MCTS library.

## Features

- **Efficient Memory Management**: Uses an arena allocator to store nodes, reducing allocation overhead.
- **Generic State Management**: Defines a `State` trait so that MCTS can work with any game or decision process.

## Getting Started

### Prerequisites

- **Rust**: Ensure you have Rust and Cargo installed. You can install Rust using [rustup](https://rustup.rs/).

### Installation

1. **Clone the repository**:

   ```bash
   git clone https://github.com/PaytonWebber/mcts-rs.git
   cd mcts-rs
   ```

### Running the Tic-Tac-Toe Example

To run the Tic-Tac-Toe game, in which the MCTS algorithm plays against itself from a random starting move, use the following command:

```bash
cargo run --example tic_tac_toe
```

## Implementation Details

### Arena Allocator

An arena allocator is used to manage memory for the nodes in the MCTS tree. Nodes are stored in a vector, and their relationships are represented by indices rather than pointers or references. This approach avoids the need for reference counting (`Rc`) and interior mutability (`RefCell`), leading to cleaner and more efficient code.

**Benefits:**

- **Performance**: Reduced allocation overhead and improved cache locality.
- **Simplicity**: Simplifies ownership and borrowing by sidestepping complex lifetime issues.
- **Safety**: Leverages Rust's safety guarantees without resorting to `unsafe` code.

### MCTS Algorithm

The MCTS algorithm consists of four main steps:

1. **Selection**: Starting from the root node, select child nodes based on the Upper Confidence Bound (UCB) until a leaf node is reached.
2. **Expansion**: If the leaf node is not a terminal state, expand it by adding all possible child nodes.
3. **Simulation**: Run a simulation from the expanded node to a terminal state by making random moves.
4. **Backpropagation**: Update the nodes along the path with the simulation result.

**Key Components:**

- **Node Struct** (`node.rs`): Represents a node in the search tree.
- **Arena Struct** (`arena.rs`): Stores all nodes and manages parent-child relationships between them.
- **MCTS Implementation** (`mod.rs`): Contains the logic for selection, expansion, simulation, and backpropagation.

### State Trait

The `State` trait abstracts the game logic, allowing the MCTS algorithm to work with any game or decision process that implements it.
Here's the trait definition:

```rust
pub trait State {
    /// Checks whether the specified player has won the game.
    fn player_has_won(&self, player: usize) -> bool;

    /// Determines whether the current state is terminal (no further moves possible).
    fn is_terminal(&self) -> bool;

    /// Returns a vector of legal actions available from the current state.
    fn get_legal_actions(&self) -> Vec<(usize, usize)>;

    /// Returns the index of the player whose turn it is to play.
    fn to_play(&self) -> usize;

    /// Returns a new state resulting from applying the given action to the current state.
    fn step(&self, action: (usize, usize)) -> Self;

    /// Calculates and returns the reward for the specified player in the current state.
    fn reward(&self, player: usize) -> f32;

    /// Renders or prints the current state (useful for debugging or display purposes).
    fn render(&self);
}
```

By implementing this trait for your game or decision process, you can integrate it with the MCTS algorithm provided in this library. The `tic_tac_toe.rs` file offers an example implementation of the `State` trait for Tic-Tac-Toe.
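To give a feel for what such an implementation involves, here is a minimal, self-contained sketch of the `State` trait for Tic-Tac-Toe. The trait is reproduced inline so the snippet compiles on its own; in practice you would import it from this crate, and the actual `tic_tac_toe.rs` example may differ in its details (the 1.0 / 0.5 / 0.0 reward convention below is an illustrative assumption):

```rust
// The trait from above, repeated here so this sketch is self-contained.
pub trait State {
    fn player_has_won(&self, player: usize) -> bool;
    fn is_terminal(&self) -> bool;
    fn get_legal_actions(&self) -> Vec<(usize, usize)>;
    fn to_play(&self) -> usize;
    fn step(&self, action: (usize, usize)) -> Self;
    fn reward(&self, player: usize) -> f32;
    fn render(&self);
}

#[derive(Clone)]
struct TicTacToe {
    board: [[Option<usize>; 3]; 3], // None = empty, Some(p) = player p's mark
    current: usize,                 // player to move (0 or 1)
}

impl TicTacToe {
    fn new() -> Self {
        TicTacToe { board: [[None; 3]; 3], current: 0 }
    }
}

impl State for TicTacToe {
    fn player_has_won(&self, player: usize) -> bool {
        let b = &self.board;
        let p = Some(player);
        // Any full row, column, or diagonal of the player's marks wins.
        (0..3).any(|i| (0..3).all(|j| b[i][j] == p))
            || (0..3).any(|j| (0..3).all(|i| b[i][j] == p))
            || (0..3).all(|i| b[i][i] == p)
            || (0..3).all(|i| b[i][2 - i] == p)
    }

    fn is_terminal(&self) -> bool {
        self.player_has_won(0) || self.player_has_won(1) || self.get_legal_actions().is_empty()
    }

    fn get_legal_actions(&self) -> Vec<(usize, usize)> {
        let mut actions = Vec::new();
        for i in 0..3 {
            for j in 0..3 {
                if self.board[i][j].is_none() {
                    actions.push((i, j));
                }
            }
        }
        actions
    }

    fn to_play(&self) -> usize {
        self.current
    }

    fn step(&self, action: (usize, usize)) -> Self {
        // States are immutable: applying an action yields a new state.
        let mut next = self.clone();
        next.board[action.0][action.1] = Some(self.current);
        next.current = 1 - self.current;
        next
    }

    fn reward(&self, player: usize) -> f32 {
        // Illustrative convention: 1.0 for a win, 0.0 for a loss, 0.5 otherwise.
        if self.player_has_won(player) {
            1.0
        } else if self.player_has_won(1 - player) {
            0.0
        } else {
            0.5
        }
    }

    fn render(&self) {
        for row in &self.board {
            let line: Vec<&str> = row
                .iter()
                .map(|c| match c {
                    Some(0) => "X",
                    Some(1) => "O",
                    _ => ".",
                })
                .collect();
            println!("{}", line.join(" "));
        }
    }
}
```

Note that `step` returns a fresh state rather than mutating in place, matching the trait's `&self` receiver; this keeps simulation rollouts independent of the nodes already stored in the arena.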