pure-stage

Crates.io	pure-stage
lib.rs	pure-stage
version	0.1.0
created_at	2025-06-15 07:31:43.83726+00
updated_at	2025-06-15 07:31:43.83726+00
description	A library for building and running simulations of distributed systems.
homepage	https://github.com/pragma-org/amaru
repository	https://github.com/pragma-org/amaru
id	1713041
size	191,827

Roland Kuhn (rkuhn)

documentation

https://docs.rs/pure-stage

README

Pure Stage: Actor Model for deterministic scheduling

Design goals:

no side effects, only explicit effects (which does include internal state changes)
easy to use, including the ability to hold references across effects
executable on a concurrent thread pool or in a deterministic simulator
fully back-pressured
ability to read from inputs in a biased order (which is necessary to avoid deadlocks with back-pressure)
wiring code should be nicely readable

Design elements

We will need to model state machines, processing network nodes, and wiring between those nodes.

State Machines

Writing state machines in Rust can be done explicitly using an enum that implements a trait for computing state × input → state × output. One advantage of this approach is that infrastructure like a (virtual) clock can be passed into the transition function as well. The main disadvantage is that additional states are needed for modelling (biased) input operations, complicating the state machine code.
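A minimal sketch of the explicit approach, assuming hypothetical names throughout (StateMachine, Clock, Door, Command, and Event are illustrations, not part of the pure-stage API). The transition function consumes the state and an input and returns the next state plus an output, and the clock is passed in so that a simulator could supply virtual time:

```rust
/// Hypothetical clock abstraction; a simulator could supply virtual time.
trait Clock {
    fn now_millis(&self) -> u64;
}

/// Hypothetical trait for state × input → state × output.
trait StateMachine: Sized {
    type Input;
    type Output;
    fn transition(self, input: Self::Input, clock: &dyn Clock) -> (Self, Self::Output);
}

/// A clock that always reports the same instant (illustration only).
struct FixedClock(u64);

impl Clock for FixedClock {
    fn now_millis(&self) -> u64 {
        self.0
    }
}

#[derive(Debug, PartialEq)]
enum Door {
    Closed,
    Open { since: u64 },
}

#[derive(Debug, PartialEq)]
enum Command {
    Open,
    Close,
}

#[derive(Debug, PartialEq)]
enum Event {
    Opened,
    Closed { open_for: u64 },
    Ignored,
}

impl StateMachine for Door {
    type Input = Command;
    type Output = Event;

    fn transition(self, input: Command, clock: &dyn Clock) -> (Self, Event) {
        match (self, input) {
            (Door::Closed, Command::Open) => {
                (Door::Open { since: clock.now_millis() }, Event::Opened)
            }
            (Door::Open { since }, Command::Close) => {
                let open_for = clock.now_millis().saturating_sub(since);
                (Door::Closed, Event::Closed { open_for })
            }
            // Any other combination leaves the state unchanged.
            (state, _) => (state, Event::Ignored),
        }
    }
}

/// Opens the door at t = 10 and closes it at t = 25.
fn demo() -> (Door, Event) {
    let (door, _) = Door::Closed.transition(Command::Open, &FixedClock(10));
    door.transition(Command::Close, &FixedClock(25))
}
```

Because the state is a plain enum, it can be logged or compared after every transition.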

Another syntactic means to this end is an async fn, which gets translated by the compiler into a state machine like the one above, albeit with a limited transition function: the Future::poll() method. When using this approach, effects like input and output need to be offered as Futures; however, the programmer may also await other Futures which perform untracked effects. Such abuse can only be rejected using runtime checks because the type of the Future awaited at an .await point cannot be constrained at compile time.
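The following sketch illustrates the async fn approach using only the standard library. All names (Mailbox, Recv, stage, drive) are hypothetical and not taken from pure-stage. The input effect is offered as a Future, and the stage is driven by calling Future::poll() by hand with a waker that does nothing:

```rust
use std::cell::RefCell;
use std::collections::VecDeque;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical single-threaded mailbox shared between driver and stage.
type Mailbox = Rc<RefCell<VecDeque<u32>>>;

/// Hypothetical input effect: resolves once a message is available.
struct Recv(Mailbox);

impl Future for Recv {
    type Output = u32;

    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<u32> {
        match self.0.borrow_mut().pop_front() {
            Some(msg) => Poll::Ready(msg),
            None => Poll::Pending,
        }
    }
}

/// The compiler turns this function into an opaque state machine with one
/// suspension point per `.await`.
async fn stage(mailbox: Mailbox) -> u32 {
    let first = Recv(mailbox.clone()).await;
    let second = Recv(mailbox).await;
    first + second
}

/// A waker that does nothing; the driver below polls explicitly instead.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls the stage three times, feeding one input between polls.
fn drive() -> Vec<Poll<u32>> {
    let mailbox: Mailbox = Rc::new(RefCell::new(VecDeque::new()));
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(stage(mailbox.clone()));

    let mut results = Vec::new();
    results.push(fut.as_mut().poll(&mut cx)); // no input yet
    mailbox.borrow_mut().push_back(1);
    results.push(fut.as_mut().poll(&mut cx)); // first input consumed
    mailbox.borrow_mut().push_back(2);
    results.push(fut.as_mut().poll(&mut cx)); // second input consumed
    results
}
```

Between the polls the driver only ever sees Poll::Pending; the value of first is stored inside the compiler-generated Future and cannot be inspected from outside.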

The downside of the second approach is that the internal state of a stage would be inaccessible because it is wrapped in the opaque Future generated by the compiler. One way to fix this is to compromise: offer all effects apart from receiving input as Futures and thus have the programmer write an explicit state machine that only needs to switch for inputs from upstream, not results from other effects. Since logging state progression is a rather important debugging tool, we are going for this variant.
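A rough sketch of this compromise, under the assumption of a hypothetical Effects handle and a toy block_on helper (neither is the pure-stage API): the state is an explicit value that the runner can log after every input, while the output effect inside the transition is awaited as a Future.

```rust
use std::cell::RefCell;
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical effect handle; every effect except input is a Future.
struct Effects {
    sent: RefCell<Vec<u32>>,
}

impl Effects {
    /// Hypothetical output effect; here it only records the value.
    async fn send(&self, value: u32) {
        self.sent.borrow_mut().push(value);
    }
}

/// Explicit state, visible to the runner between inputs.
#[derive(Debug, Clone, PartialEq)]
enum Counter {
    Idle,
    Counting { total: u32 },
}

/// One transition per input; effects inside are awaited.
async fn transition(state: Counter, msg: u32, eff: &Effects) -> Counter {
    let total = match state {
        Counter::Idle => msg,
        Counter::Counting { total } => total + msg,
    };
    eff.send(total).await;
    Counter::Counting { total }
}

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Toy executor: polls until ready. It only suits futures that complete
/// without external wake-ups, as in this sketch.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

/// Feeds the inputs one by one and logs the state after each of them.
fn run(inputs: &[u32]) -> (Vec<String>, Vec<u32>) {
    let eff = Effects { sent: RefCell::new(Vec::new()) };
    let mut state = Counter::Idle;
    let mut log = vec![format!("{:?}", state)];
    for &msg in inputs {
        state = block_on(transition(state, msg, &eff));
        log.push(format!("{:?}", state));
    }
    (log, eff.sent.into_inner())
}
```

The runner only switches on inputs; what happens between two inputs is written as straight-line async code.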

Processing Nodes

The main purpose of an API for declaring processing nodes is to obtain type-level information on the connectivity expected by this node, i.e. the input message type, the typed outputs, and the internally handled state (for inspection from the outside during tests). Whether the state machine inside is implemented explicitly or using async fn doesn't matter at this level.
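As an illustration only, a node declaration could carry this information in its type parameters. The Node type and its methods below are assumptions made for this sketch, not the crate's actual API:

```rust
/// Hypothetical node declaration: `Msg` is the input message type, `Out` the
/// output message type, and `St` the internally handled state.
struct Node<Msg, Out, St> {
    name: &'static str,
    state: St,
    transition: fn(St, Msg) -> (St, Option<Out>),
}

impl<Msg, Out, St: Clone> Node<Msg, Out, St> {
    fn new(
        name: &'static str,
        initial: St,
        transition: fn(St, Msg) -> (St, Option<Out>),
    ) -> Self {
        Node { name, state: initial, transition }
    }

    /// Lets tests inspect the state from the outside.
    fn state(&self) -> &St {
        &self.state
    }

    /// Runs one transition and returns the output, if any.
    fn deliver(&mut self, msg: Msg) -> Option<Out> {
        let (next, out) = (self.transition)(self.state.clone(), msg);
        self.state = next;
        out
    }
}

/// Example transition: keeps a running total and reports it as text.
fn running_total(total: u64, msg: u32) -> (u64, Option<String>) {
    let next = total + u64::from(msg);
    (next, Some(format!("total={}", next)))
}

/// Declares a node and feeds it two messages.
fn demo() -> (u64, Vec<Option<String>>) {
    let mut node: Node<u32, String, u64> = Node::new("sum", 0, running_total);
    let outputs = vec![node.deliver(3), node.deliver(4)];
    (*node.state(), outputs)
}
```

The type Node<u32, String, u64> tells the wiring code everything it needs to know; how the transition is implemented inside stays hidden.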

Wiring

While Rust can in principle model type-state to track what has already been wired, this is inconvenient in practice because it requires shadowing and thus restricts the code a programmer can write. Therefore, a compromise would be to establish that all processing nodes are declared first, followed by the declaration of the wiring. The wiring function ensures that message types do match, but it won't prevent connecting the same output to multiple inputs or multiple outputs to the same input; in fact, this freedom is quite desirable.
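The declare-then-wire pattern can be sketched as follows. Network, Input, Output, stage, and wire are hypothetical names chosen for this illustration. The wire function accepts only an output and an input with the same message type, but it places no limit on how often a port is used:

```rust
use std::marker::PhantomData;

/// Hypothetical typed handle for the input side of a stage.
struct Input<M> {
    stage: usize,
    _msg: PhantomData<fn(M)>,
}

/// Hypothetical typed handle for the output side of a stage.
struct Output<M> {
    stage: usize,
    _msg: PhantomData<fn() -> M>,
}

#[derive(Default)]
struct Network {
    stages: Vec<&'static str>,
    edges: Vec<(usize, usize)>,
}

impl Network {
    /// Declares a stage and returns its typed ports.
    fn stage<In, Out>(&mut self, name: &'static str) -> (Input<In>, Output<Out>) {
        let id = self.stages.len();
        self.stages.push(name);
        (
            Input { stage: id, _msg: PhantomData },
            Output { stage: id, _msg: PhantomData },
        )
    }

    /// Connects an output to an input; both must carry the same type `M`.
    fn wire<M>(&mut self, from: &Output<M>, to: &Input<M>) {
        self.edges.push((from.stage, to.stage));
    }
}

fn build() -> Vec<(usize, usize)> {
    let mut net = Network::default();

    // Step 1: declare all processing nodes.
    let (_source_in, source_out) = net.stage::<(), u32>("source");
    let (a_in, a_out) = net.stage::<u32, String>("worker-a");
    let (b_in, b_out) = net.stage::<u32, String>("worker-b");
    let (sink_in, _sink_out) = net.stage::<String, ()>("sink");

    // Step 2: declare the wiring.
    net.wire(&source_out, &a_in);
    net.wire(&source_out, &b_in); // one output feeding two inputs
    net.wire(&a_out, &sink_in);
    net.wire(&b_out, &sink_in); // two outputs feeding one input
    // net.wire(&source_out, &sink_in); // rejected by the compiler: u32 vs String

    net.edges
}
```

Since the ports are borrowed rather than consumed, no shadowing is needed and the same port can appear in several wire calls.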

Future Work

add Call effect to allow internal asynchronous collaboration before accepting the next input (to keep back pressure intact in this case)
pass effect factory object into the transition function instead of requiring it in closures or states
factor out effect handler so that it can then also be used to run things in the Tokio StageGraph implementation
keep a ring buffer of effects and (infrequent) state snapshots to enable later replay of system behaviour in a deterministic fashion
add virtual time tracking (i.e. implement Clock and Wait effects)
add deterministic shuffling of effect scheduling in the Simulation StageGraph implementation
output an execution log for debugging that notes all effects and state changes
add stage error handling (currently a stage just stops working)

Notes on TraceBuffer and Serialization

Storing all effects and responses requires serialization (because trait objects cannot implement Clone), which in turn requires some gymnastics due to the incompatibility of serde's Serialize and Deserialize traits with trait objects. While typetag can solve the serialization issue (via erased_serde::Serialize), deserialization will always require the full and concrete target type to be known plus matching deserialization code. The solution typetag provides therefore cannot support generic types, which closes the door on any kind of type parameter occurring within messages, states, and effects in the system. This is clearly a restriction that weighs too heavily, thus a different solution is required.

The current design shifts all deserialization to places where the concrete target type can be named, so that the compiler can instantiate a Deserialize implementation there. Unfortunately, this requires some trace entries to be deserialized multiple times: the generic parts deserialize the information needed by the simulation machinery, and the application-specific data type (for messages or effect responses) is then deserialized from the generic cbor4ii::core::Value on the application side of the airlock. Let us give thanks to the universe for schemaless deserialization in this context.
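The idea can be illustrated without any external crates. In the sketch below, the Value enum is a simplified stand-in for cbor4ii::core::Value and the FromValue trait is a stand-in for serde's Deserialize; both are assumptions made for illustration, as are TraceEntry and Batch. The generic side reads only what it needs, and the typed decoding happens where the concrete type (including its type parameter) is named:

```rust
/// Simplified stand-in for a schemaless value such as cbor4ii::core::Value.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Integer(i64),
    Text(String),
    Array(Vec<Value>),
}

/// Stand-in for Deserialize: decoding into a statically known type.
trait FromValue: Sized {
    fn from_value(value: &Value) -> Option<Self>;
}

impl FromValue for i64 {
    fn from_value(value: &Value) -> Option<Self> {
        match value {
            Value::Integer(i) => Some(*i),
            _ => None,
        }
    }
}

impl FromValue for String {
    fn from_value(value: &Value) -> Option<Self> {
        match value {
            Value::Text(s) => Some(s.clone()),
            _ => None,
        }
    }
}

/// Hypothetical application message with a type parameter.
#[derive(Debug, PartialEq)]
struct Batch<T> {
    items: Vec<T>,
}

impl<T: FromValue> FromValue for Batch<T> {
    fn from_value(value: &Value) -> Option<Self> {
        match value {
            Value::Array(values) => {
                let items: Option<Vec<T>> = values.iter().map(T::from_value).collect();
                items.map(|items| Batch { items })
            }
            _ => None,
        }
    }
}

/// Hypothetical trace entry: the payload stays schemaless on the generic side.
struct TraceEntry {
    stage: String,
    payload: Value,
}

fn replay() -> (String, Option<Batch<i64>>) {
    let entry = TraceEntry {
        stage: "sum".to_string(),
        payload: Value::Array(vec![Value::Integer(1), Value::Integer(2)]),
    };
    // Generic side: only needs to know which stage the entry belongs to.
    let stage = entry.stage.clone();
    // Application side: names the concrete type, so the compiler can
    // instantiate the decoding code for Batch<i64>.
    let message = Batch::<i64>::from_value(&entry.payload);
    (stage, message)
}
```

Decoding the same payload as Batch<String> returns None here, which shows that the choice of target type is made entirely at the call site.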

Commit count: 1448