| Crates.io | relayrl_framework |
| lib.rs | relayrl_framework |
| version | 0.5.0-alpha |
| created_at | 2025-07-01 17:22:51.738288+00 |
| updated_at | 2026-01-10 21:42:46.718351+00 |
| description | A distributed, system-oriented multi-agent reinforcement learning framework. |
| homepage | |
| repository | |
| max_upload_size | |
| id | 1733433 |
| size | 1,007,118 |
Core Library for Deep Multi-Agent Reinforcement Learning
Version: 0.5.0-alpha
Status: Under active development; expect broken functionality and breaking changes!
Tested Platform Support: macOS (Silicon), Linux (Ubuntu), Windows 10 (x86_64)
v0.5.0 is a complete rewrite of v0.4.5's client implementation: the relayrl_framework crate now provides a multi-actor native client runtime for deep reinforcement learning experiments. The training server (and the new inference server) are still under development and remain unavailable in this release.
Because transport and database functionality are not yet implemented, the client can only write data to an Arrow file on your local device.
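Until those layers land, any standard Arrow reader can inspect what the client wrote. Below is a minimal sketch using the arrow crate, assuming the output is an ordinary Arrow IPC file; the file name here is hypothetical, and the actual schema depends on the client's writer:

use std::fs::File;
use arrow::ipc::reader::FileReader;

fn inspect_trajectories() -> Result<(), Box<dyn std::error::Error>> {
    // Hypothetical path; point this at the file produced by your client run.
    let file = File::open("trajectories.arrow")?;
    let reader = FileReader::try_new(file, None)?;
    // Print the schema once, then the row count of each record batch.
    println!("schema: {:?}", reader.schema());
    for batch in reader {
        println!("rows: {}", batch?.num_rows());
    }
    Ok(())
}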
As of now, the only way to perform inference is to provide your own TorchScript or ONNX model formatted to the framework's standardized ModelModule interface. Once the training server and its algorithms are implemented, the client will be able to acquire a ModelModule from the training server's algorithm runtime, just as in v0.4.5.
All feature flags other than `client` are even less stable, if not entirely unimplemented, and are unsuitable for RL experiments. Use at your own risk!
Key Features:
- A multi-actor native client runtime for deep reinforcement learning experiments
- Generic Burn Tensor backends: NdArray for CPU exclusively, and Tch for CPU/CUDA/MPS

Current Limitations:
- No transport or database layers yet; trajectory data is written only to a local Arrow file
- Inference requires a user-supplied TorchScript or ONNX model conforming to the ModelModule interface

Major Changes:
- Replaced the Tch crate dependency with Burn, enabling generic Tensor interfacing with the framework (currently supports Burn's Tch and NdArray Tensor backends, as well as TorchScript and ONNX model inference)
- ZMQ transport implementation
- Dedicated types crate (relayrl_types)
- Dedicated algorithms crate (relayrl_algorithms), which remains unimplemented for now
- Dedicated Python bindings crate (relayrl_python), which remains unimplemented for now

Example Usage:

// the following instructions assume that the `client` feature flag is the only one enabled;
// parameters for start()/restart()/AgentBuilder will change if `transport_layer` or `database_layer` is enabled.
use relayrl_framework::prelude::network::{RelayRLAgent, AgentBuilder, RelayRLAgentActors};
use relayrl_framework::prelude::types::{ModelModule, DeviceType};
use burn_ndarray::NdArray;
use burn_tensor::{Tensor, Float};
use std::path::PathBuf;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
// 1. Build and Start
const OBS_RANK: usize = 2;
const ACT_RANK: usize = 2;
let model_path = PathBuf::from("dummy_model");
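    // The builder is generic over the Burn backend (NdArray here), the
    // observation/action tensor ranks declared above, and their dtypes.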
let (mut agent, params) = AgentBuilder::<NdArray, OBS_RANK, ACT_RANK, Float, Float>::builder()
.actor_count(4)
.default_model(ModelModule::<NdArray>::load_from_path(model_path))
.build().await?;
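    // build() hands back the agent alongside the parameters resolved by the
    // builder, which start() consumes below.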
agent.start(
params.actor_count,
params.router_scale,
params.default_device,
params.default_model,
params.config_path
).await?;
// 2. Interact (using Burn Tensors)
let reward: f32 = 1.0;
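    // Observations are Burn tensors of rank OBS_RANK; the [1, 4] shape is a
    // placeholder for whatever input your ModelModule expects.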
let obs = Tensor::<NdArray, OBS_RANK, Float>::zeros([1, 4], &Default::default());
let ids = agent.get_actor_ids()?;
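    // Request an action from every listed actor with the same observation and
    // reward, then query each actor's current model version.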
let acts = agent.request_action(ids.clone(), obs, None, reward).await?;
let versions = agent.get_model_version(ids.clone()).await?;
// 3. Actor Runtime Management
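    // Actors can be added individually or in bulk on a chosen device;
    // DeviceType::Mps targets Apple Silicon GPUs.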
agent.new_actor(DeviceType::Cpu, None).await?;
let new_actor_count: u32 = 10;
agent.new_actors(new_actor_count, DeviceType::Mps, None).await?;
let ids = agent.get_actor_ids()?;
if ids.len() >= 2 {
agent.set_actor_id(ids[0], uuid::Uuid::new_v4()).await?;
agent.remove_actor(ids[1]).await?;
}
// 4. Agent Management and Shutdown
let last_reward: Option<f32> = Some(3.0);
let ids = agent.get_actor_ids()?;
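    // flag_last_action marks the actors' current trajectories as finished,
    // with an optional final reward attached.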
agent.flag_last_action(ids.clone(), last_reward).await?;
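    // scale_throughput adjusts routing throughput by a signed delta (cf.
    // router_scale above): scale up by two, then back down.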
agent.scale_throughput(2).await?;
agent.scale_throughput(-2).await?;
agent.shutdown().await?;
Ok(())
}
View this guide for agent usage :)
Roadmap:
- ZMQ transport interface completion
- PostgreSQL and SQLite database interface completion
- relayrl_algorithms crate integration to enable deep RL algorithmic training and Client ModelModule acquisition
- relayrl_algorithms crate creation and publication for training workflows
- relayrl_types updates to minimize serialization overhead and to reduce tensor copies towards zero-copy (preferably)
- relayrl_cli for ease-of-use and language-agnostic execution via a deployable gRPC pipeline for external CLI process interfacing

Contributions are welcome! Please open issues or pull requests for bug reports, feature requests, or improvements. I'll be glad to work with you!