Crates.io | ractor-supervisor |
lib.rs | ractor-supervisor |
version | 0.1.9 |
created_at | 2025-01-20 17:37:20.966572+00 |
updated_at | 2025-02-07 18:17:08.554916+00 |
description | Supervisor module for ractor framework. |
homepage | https://github.com/simke9445/ractor-supervisor |
repository | https://github.com/simke9445/ractor-supervisor |
id | 1524071 |
size | 129,272 |
An OTP-style supervisor for the ractor framework, helping you build supervision trees in a straightforward, Rust-centric way.
Inspired by the Elixir/Erlang supervision concept, ractor-supervisor provides a robust mechanism for overseeing one or more child actors and automatically restarting them under configurable policies. If too many restarts happen in a brief time window (a "meltdown"), the supervisor itself shuts down abnormally, preventing errant restart loops.
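This crate provides three types of supervisors, each designed for specific use cases:
- Static supervisor (Supervisor)
- Dynamic supervisor (DynamicSupervisor), which supports a max_children limit
- Task supervisor (TaskSupervisor)
The strategy defines what happens when a child fails. With OneForOne, for example, only the failed child is restarted.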
Strategies apply to all failure scenarios, including failures while a child is being spawned (in pre_start/post_start).
Example: If spawning a child fails during pre_start, it counts as a restart and triggers the strategy logic.
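To make the strategy choice concrete, here is a minimal sketch of how a supervisor might decide which children to restart after a failure. The Strategy enum and the children_to_restart function are hypothetical stand-ins written for this illustration, not ractor-supervisor's API, and OneForAll is assumed to follow the usual OTP meaning of restarting every child.

```rust
/// Hypothetical stand-in for a restart strategy; not the crate's own type.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Strategy {
    OneForOne,
    OneForAll,
}

/// Returns the indices of the children to restart when child `failed` fails.
fn children_to_restart(strategy: Strategy, child_count: usize, failed: usize) -> Vec<usize> {
    match strategy {
        // Only the failed child is restarted.
        Strategy::OneForOne => vec![failed],
        // Every child is restarted, in start order (assumed OTP semantics).
        Strategy::OneForAll => (0..child_count).collect(),
    }
}

fn main() {
    let children = ["http", "websocket", "metrics"];
    for strategy in [Strategy::OneForOne, Strategy::OneForAll] {
        // Pretend the child at index 1 ("websocket") just failed.
        let restarted: Vec<&str> = children_to_restart(strategy, children.len(), 1)
            .into_iter()
            .map(|i| children[i])
            .collect();
        println!("{:?}: restarting {:?}", strategy, restarted);
    }
}
```

In this sketch a spawn failure would be handled the same way as a runtime failure: the failed index is passed in and the strategy decides the restart set.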
- max_restarts and max_window: The restart limit and the "time window" for meltdown counting, the latter expressed as a Duration. If more than max_restarts restarts occur within max_window, the supervisor shuts down abnormally (meltdown).
- reset_after: If the supervisor sees no failures for the specified duration, it clears its meltdown log and effectively resets the meltdown counters.
- reset_after (per child): If a specific child remains up for the given duration, its own failure count is reset to zero on the next failure.
- backoff_fn: An optional function to delay a child's restart. For instance, you might implement exponential backoff to prevent immediate thrashing restarts.
Actor Names: Both supervisors and their child actors must have names set. These names serve as identifiers, for example in meltdown logs and debugging.
Proper Spawning: When spawning supervisors or child actors, always use:
- Supervisor::spawn_linked or Supervisor::spawn for static supervisors
- DynamicSupervisor::spawn_linked or DynamicSupervisor::spawn for dynamic supervisors
Do not call Actor::spawn_linked directly.
Supervisors can manage other supervisors as children, forming a hierarchical or tree structure. This way, different subsystems can each have their own meltdown thresholds or strategies. A meltdown in one subtree doesn't necessarily mean the entire application must go down, unless the top-level supervisor itself melts down.
For example:
Root Supervisor (Static, OneForOne)
├── API Supervisor (Static, OneForAll)
│   ├── HTTP Server
│   └── WebSocket Server
├── Worker Supervisor (Dynamic)
│   └── [Dynamic Worker Pool]
└── Task Supervisor
    └── [Background Jobs]
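Each supervisor in such a tree keeps its own meltdown bookkeeping. As a rough illustration of the max_restarts / max_window / reset_after rules described earlier, the following std-only sketch tracks restart timestamps in a sliding window. MeltdownTracker and record_restart are hypothetical names chosen for this example; the crate's internal implementation may differ.

```rust
use std::time::{Duration, Instant};

/// Hypothetical meltdown bookkeeping. Field names mirror the options in this
/// document, but this is not the crate's implementation.
struct MeltdownTracker {
    max_restarts: usize,
    max_window: Duration,
    reset_after: Option<Duration>,
    restarts: Vec<Instant>, // timestamps of recent restarts, oldest first
}

impl MeltdownTracker {
    fn new(max_restarts: usize, max_window: Duration, reset_after: Option<Duration>) -> Self {
        Self { max_restarts, max_window, reset_after, restarts: Vec::new() }
    }

    /// Records a restart at `now` and returns true if it triggers a meltdown.
    fn record_restart(&mut self, now: Instant) -> bool {
        // Quiet for at least `reset_after` since the last restart: clear the log.
        if let (Some(quiet), Some(last)) = (self.reset_after, self.restarts.last().copied()) {
            if now.duration_since(last) >= quiet {
                self.restarts.clear();
            }
        }
        // Keep only the restarts that still fall inside the window.
        let window = self.max_window;
        self.restarts.retain(|t| now.duration_since(*t) < window);
        self.restarts.push(now);
        // Meltdown once more than `max_restarts` restarts sit in the window.
        self.restarts.len() > self.max_restarts
    }
}

fn main() {
    let start = Instant::now();
    let mut tracker =
        MeltdownTracker::new(2, Duration::from_secs(10), Some(Duration::from_secs(30)));
    // Three restarts one second apart: the third exceeds max_restarts = 2.
    for i in 0..3u64 {
        let melted = tracker.record_restart(start + Duration::from_secs(i));
        println!("restart {} -> meltdown: {}", i + 1, melted);
    }
}
```

Passing the timestamp in explicitly, rather than calling Instant::now() inside the tracker, keeps the logic easy to test with simulated time.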
Here's a complete example using a static supervisor:
use std::sync::Arc;
use ractor::Actor;
use ractor_supervisor::*;
use ractor::concurrency::Duration;
use tokio::time::Instant;
use futures_util::FutureExt;
// A minimal child actor that simply does some work in `handle`.
struct MyWorker;

#[ractor::async_trait]
impl Actor for MyWorker {
    type Msg = ();
    type State = ();
    type Arguments = ();

    // Called before the actor fully starts. We can set up the actor's internal state here.
    async fn pre_start(
        &self,
        _myself: ractor::ActorRef<Self::Msg>,
        _args: Self::Arguments,
    ) -> Result<Self::State, ractor::ActorProcessingErr> {
        Ok(())
    }

    // The main message handler. This is where you implement your actor's behavior.
    async fn handle(
        &self,
        _myself: ractor::ActorRef<Self::Msg>,
        _msg: Self::Msg,
        _state: &mut Self::State,
    ) -> Result<(), ractor::ActorProcessingErr> {
        // do some work...
        Ok(())
    }
}
// A function to spawn the child actor. This will be used in ChildSpec::spawn_fn.
async fn spawn_my_worker(
    supervisor_cell: ractor::ActorCell,
    child_id: String,
) -> Result<ractor::ActorCell, ractor::SpawnErr> {
    // We name the child actor using `child_spec.id`; supervised children need a name.
    let (child_ref, _join) = Supervisor::spawn_linked(
        Some(child_id),  // actor name
        MyWorker,        // actor instance
        (),              // arguments
        supervisor_cell, // link to the supervisor
    ).await?;
    Ok(child_ref.get_cell())
}
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // A child-level backoff function that implements exponential backoff from the second failure on.
    // Return Some(delay) to make the supervisor wait before restarting this child.
    let my_backoff: ChildBackoffFn = Arc::new(
        |_child_id: &str, restart_count: usize, _last_fail: Instant, _child_reset_after: Option<u64>| {
            // On the first failure, restart immediately (None).
            // From the second failure on, double the delay each time (exponential).
            if restart_count <= 1 {
                None
            } else {
                Some(Duration::from_secs(1 << restart_count))
            }
        }
    );
    // This specification describes exactly how to manage our single child actor.
    let child_spec = ChildSpec {
        id: "myworker".into(),        // Unique identifier for meltdown logs and debugging.
        restart: Restart::Transient,  // Only restart if the child fails abnormally.
        spawn_fn: SpawnFn::new(|cell, id| spawn_my_worker(cell, id).boxed()),
        backoff_fn: Some(my_backoff), // Apply our custom exponential backoff on restarts.
        // If the child remains up for 60s, its individual failure counter resets on the next failure.
        reset_after: Some(Duration::from_secs(60)),
    };
    // Supervisor-level meltdown configuration. If more than 5 restarts occur within a 10s window, meltdown is triggered.
    // Also, if we stay quiet for 30s (no restarts), the meltdown log resets.
    let options = SupervisorOptions {
        strategy: SupervisorStrategy::OneForOne,    // If one child fails, only that child is restarted.
        max_restarts: 5,                            // Permit up to 5 restarts in the meltdown window.
        max_window: Duration::from_secs(10),        // The meltdown window.
        reset_after: Some(Duration::from_secs(30)), // If no failures for 30s, meltdown log is cleared.
    };
    // Group all child specs and meltdown options together:
    let args = SupervisorArguments {
        child_specs: vec![child_spec], // We only have one child in this example
        options,
    };
    // Spawn the supervisor with our arguments.
    let (sup_ref, sup_handle) = Supervisor::spawn(
        "root".into(), // name for the supervisor
        args,
    ).await?;
    // Stop the supervisor and wait for it to exit.
    let _ = sup_ref.kill();
    let _ = sup_handle.await;
    Ok(())
}
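To see what delays the backoff closure in the example produces, here is a standalone, std-only sketch of the same rule: no delay for the first failure, then 2^restart_count seconds. The function name backoff_delay and the MAX_SHIFT cap are assumptions added for illustration; the example above has no cap, and the cap is included here only to keep the shift from growing without bound.

```rust
use std::time::Duration;

/// Mirrors the closure in the example above. The cap is an added assumption.
fn backoff_delay(restart_count: usize) -> Option<Duration> {
    const MAX_SHIFT: usize = 6; // hypothetical cap: never wait more than 64 seconds
    if restart_count <= 1 {
        // First failure: restart immediately.
        None
    } else {
        // Later failures: the delay doubles with each restart, up to the cap.
        Some(Duration::from_secs(1u64 << restart_count.min(MAX_SHIFT)))
    }
}

fn main() {
    for count in 0..=8 {
        match backoff_delay(count) {
            None => println!("restart #{}: immediately", count),
            Some(delay) => println!("restart #{}: after {}s", count, delay.as_secs()),
        }
    }
}
```

With the settings from the example (at most 5 restarts per 10s window), a meltdown would normally end the loop long before large delays are reached, so the cap mostly matters for supervisors with generous restart limits.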
For more examples, see the test files in the repository.