staircase

Crates.io	staircase
lib.rs	staircase
version	0.0.6
created_at	2025-06-20 08:11:26.507906+00
updated_at	2025-07-21 10:41:06.111689+00
description	Kubernetes Step-based Operator
homepage	https://gitlab.com/xMAC94x/staircase
repository	https://gitlab.com/xMAC94x/staircase
id	1719278
size	92,710

Marcel Märtens (xMAC94x)

documentation

https://docs.rs/staircase

README

staircase - Kubernetes Step-based Operator

Because Kubernetes is eventually consistent, your controllers need to be idempotent. It is very easy to get this wrong and end up on a code path that is not covered and needs manual cleanup, which is something to avoid in production environments.

A pattern that helps here is the step-based controller, described below. This crate lets you implement such a step-based controller and integrates with kube and k8s_openapi.

Step-based Controller

A controller's task is to match the current state with a desired state. If they differ, it should make adjustments until they match again. Those changes are made either within the Kubernetes API (e.g. starting/stopping Deployments/Jobs) or within external services (e.g. calling REST APIs). Those state changes can fail, be reverted, or be bit-flipped by cosmic rays, and they often need to be synced between multiple services.

A good way to reduce complexity is to only ever make one change at a time. Within your control loop, when you see that the states DO NOT match, you decide which change is the most important and make it. After this change, you requeue your resource and continue. In the next iteration you are hopefully left with n-1 differences.

We call one iteration of your control loop a Run. Each Run can be split into multiple Steps, which are executed sequentially.
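The one-change-per-Run idea can be sketched without any Kubernetes dependency. The following is a minimal illustration only and does not use the staircase API: `State`, `Change`, `next_change`, `run` and `reconcile` are hypothetical names, and the "cluster" is just an in-memory struct.

```rust
/// Hypothetical stand-in for the observed and desired state of a resource.
#[derive(Clone, Debug, PartialEq)]
struct State {
    replicas: u32,
    image: String,
    paused: bool,
}

/// One single adjustment; a Run applies at most one of these.
#[derive(Debug, PartialEq)]
enum Change {
    SetPaused(bool),
    SetImage(String),
    SetReplicas(u32),
}

/// Decide the most important difference, or None if the states match.
/// The order of the checks encodes the priority (an assumption of this sketch).
fn next_change(current: &State, desired: &State) -> Option<Change> {
    if current.paused != desired.paused {
        return Some(Change::SetPaused(desired.paused));
    }
    if current.image != desired.image {
        return Some(Change::SetImage(desired.image.clone()));
    }
    if current.replicas != desired.replicas {
        return Some(Change::SetReplicas(desired.replicas));
    }
    None
}

/// One Run: apply at most one change. Returns true if a requeue is needed.
fn run(current: &mut State, desired: &State) -> bool {
    match next_change(current, desired) {
        Some(Change::SetPaused(p)) => current.paused = p,
        Some(Change::SetImage(i)) => current.image = i,
        Some(Change::SetReplicas(r)) => current.replicas = r,
        None => return false,
    }
    true
}

/// Requeue until the states match; returns how many Runs made a change.
fn reconcile(current: &mut State, desired: &State) -> usize {
    let mut runs = 0;
    while run(current, desired) {
        runs += 1;
    }
    runs
}

fn main() {
    let desired = State { replicas: 3, image: "app:2".into(), paused: false };
    let mut current = State { replicas: 1, image: "app:1".into(), paused: false };
    let runs = reconcile(&mut current, &desired);
    println!("converged after {runs} modifying runs");
    assert_eq!(current, desired);
}
```

Each call to `run` removes at most one difference, so a crash between two Runs leaves the state in a shape that the next Run can still handle.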

Installation

Add the following dependency to your Cargo.toml:

[dependencies]
staircase = "0.0.6"
## staircase depends on kube 1.0.0, which itself depends on k8s-openapi 0.25. Select a Kubernetes version via a feature of `k8s-openapi`.
kube = { version = "1.0.0", features = ["runtime", "derive"] }
k8s-openapi = { version = "0.25", features = ["v1_33"] }
serde_json = { version = "1" }

Usage

See examples/simple.rs for a detailed example.

// derive some custom CR as usual
#[derive(CustomResource, Deserialize, Serialize, Clone, Debug, JsonSchema)]
#[kube(group = "example", version = "v1", kind = "Foo", status = "FooStatus", namespaced)]
pub struct FooSpec { /* ... */ }

// implement some steps that do the actual work
use std::future::Future; // needed for the return type of `run` on editions before 2024
use staircase::{Step, RunContext, StepResult, StepOutcome};
pub struct StepA {}

impl Step for StepA {
    type Error = ();
    type InOutData = ();
    type Resource = Foo;

    fn run<'a, 'b: 'a>(&'a self, context: &'b RunContext<Self::Resource>, data: Self::InOutData) -> Box<dyn Future<Output = StepResult<Self::Resource, Self::InOutData, Self::Error>> + Send + 'a> {
        Box::new(async move {
            // Do work here
            Ok(StepOutcome::NoModification { data })
        })
    }
}

// specify order of steps within a reconciler
use staircase::{Step, Reconciler};
struct CustomReconciler {}

impl Reconciler<(), ()> for CustomReconciler {
    type Resource = Foo;

    async fn evaluate_steps<'a>(&self, resource: &Self::Resource) -> (
        (),
        impl IntoIterator<Item = Box<dyn Step<Resource = Self::Resource, InOutData = (), Error = ()>>> + 'a,
    ) {
        let stepa: Box<dyn Step<Resource = _, InOutData = _, Error = _>> = Box::new(StepA {});
        ((), vec![stepa])
    }
}

Rules of a step-based approach

Only ever change one thing per Run.
- Exception: It is allowed to change the resource's OWN status after a modification; however, this change must be allowed to fail, and nothing within the control loop should depend on it.
- Exception: Additional changes are ONLY allowed if they are allowed to fail. Be extra careful with the code when applying this exception.
After a Step has made a modification, stop the current Run and start over.
Each Step verifies a single fix and applies it when necessary.
When any operation fails at any time (or the operator gets restarted), the next Run must continue exactly as it would have without the failure/restart.
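These rules can be illustrated with a small, dependency-free sketch. It is not staircase's implementation: `Cluster`, `Outcome`, `EnsureExists` and `run_once` are hypothetical names chosen for this example, and the cluster is modelled as a set of object names.

```rust
use std::collections::BTreeSet;

/// Hypothetical in-memory "cluster": the set of object names that exist.
type Cluster = BTreeSet<String>;

/// Result of one step in this sketch (not staircase's `StepOutcome`).
#[derive(Debug, PartialEq)]
enum Outcome {
    /// The step found nothing to fix.
    NoModification,
    /// The step changed something; the Run must stop here.
    Modified,
}

/// Each step verifies one thing and fixes it only when necessary.
struct EnsureExists(&'static str);

impl EnsureExists {
    fn run(&self, cluster: &mut Cluster) -> Result<Outcome, String> {
        if cluster.contains(self.0) {
            return Ok(Outcome::NoModification);
        }
        cluster.insert(self.0.to_string());
        Ok(Outcome::Modified)
    }
}

/// One Run: execute the steps in order and stop after the first modification.
/// Returns the index of the step that modified something, if any.
fn run_once(steps: &[EnsureExists], cluster: &mut Cluster) -> Result<Option<usize>, String> {
    for (i, step) in steps.iter().enumerate() {
        if step.run(cluster)? == Outcome::Modified {
            return Ok(Some(i));
        }
    }
    Ok(None)
}

fn main() -> Result<(), String> {
    let steps = [
        EnsureExists("configmap"),
        EnsureExists("deployment"),
        EnsureExists("service"),
    ];
    // Start from a state as if an earlier Run had crashed after its first step.
    let mut cluster = Cluster::from(["configmap".to_string()]);
    while let Some(i) = run_once(&steps, &mut cluster)? {
        println!("step {i} made a change, requeueing");
    }
    assert_eq!(cluster.len(), 3);
    Ok(())
}
```

Because every step re-checks the observed state before acting, a Run that starts after a failure or restart behaves the same as one that starts fresh.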

What to do when I need to depend on the status?

Above we stated that a status change made alongside another modification must be allowed to fail. If your logic in a following Step depends on the status, you MUST extract this status change into its own Step.
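As a rough sketch of that extraction, assume a hypothetical `Resource` with an observed `job_exists` flag, its own `status_job_created` status field, and a later action `finalized` that relies on the status. None of these names come from staircase.

```rust
/// Hypothetical resource. `job_exists` stands for observed cluster state,
/// `status_job_created` for a field in the resource's own status.
#[derive(Debug, Default, PartialEq)]
struct Resource {
    job_exists: bool,
    status_job_created: bool,
    finalized: bool,
}

/// One Run; returns a label for the single change made, or None when done.
fn run_once(r: &mut Resource) -> Option<&'static str> {
    // Step 1: create the job. No status write is relied upon here.
    if !r.job_exists {
        r.job_exists = true;
        return Some("create job");
    }
    // Step 2: a dedicated step that brings the status in line with reality.
    // If this write fails, the next Run simply retries it.
    if !r.status_job_created {
        r.status_job_created = true;
        return Some("update status");
    }
    // Step 3: may rely on the status, because step 2 has ensured it.
    if !r.finalized {
        r.finalized = true;
        return Some("finalize");
    }
    None
}

fn main() {
    let mut r = Resource::default();
    while let Some(change) = run_once(&mut r) {
        println!("{change}");
    }
}
```

The status write is its own modification here, so it is retried like any other change instead of being a best-effort side effect.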

How do I craft steps when I need to do 2 things at once?

Sometimes you NEED to do 2 things at once, e.g. order an item from an external REST API and annotate a Custom Resource to record that the order was placed.

Ideally, the external service has an order_exist(id) endpoint. If it is possible to check for existing orders, you can create 2 steps like:

If !order_exist(cr.id) then place order
If order_exist(cr.id) then update CR
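The two steps above could look roughly like the following sketch. The external service is modelled in memory, and `OrderService`, `place_order`, `Cr` and `order_annotated` are hypothetical names. The extra "not yet annotated" check in the second step is an assumption added here so that the Run eventually reports no change.

```rust
use std::collections::HashSet;

/// Hypothetical external order service, modelled in memory.
#[derive(Default)]
struct OrderService {
    orders: HashSet<u32>,
}

impl OrderService {
    fn order_exist(&self, id: u32) -> bool {
        self.orders.contains(&id)
    }

    fn place_order(&mut self, id: u32) {
        self.orders.insert(id);
    }
}

/// Hypothetical custom resource with a flag standing in for the annotation.
struct Cr {
    id: u32,
    order_annotated: bool,
}

/// One Run with two steps; returns a label for the single change made.
fn run_once(svc: &mut OrderService, cr: &mut Cr) -> Option<&'static str> {
    // Step 1: if the order does not exist yet, place it and stop the Run.
    if !svc.order_exist(cr.id) {
        svc.place_order(cr.id);
        return Some("placed order");
    }
    // Step 2: the order exists; record that on the CR if not done yet.
    if !cr.order_annotated {
        cr.order_annotated = true;
        return Some("annotated CR");
    }
    None
}

fn main() {
    let mut svc = OrderService::default();
    let mut cr = Cr { id: 7, order_annotated: false };
    while let Some(change) = run_once(&mut svc, &mut cr) {
        println!("{change}");
    }
    // The order was placed exactly once, even across several Runs.
    assert_eq!(svc.orders.len(), 1);
}
```

If the operator crashes right after placing the order, the next Run sees that the order exists and goes straight to updating the CR, so no duplicate order is placed.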

Features

metrics - measure runtimes and results via OpenTelemetry
trace - get scopes and ids for each execution via tracing
util - utility functions that make integrating staircase with kube easier
_k8s_openapi_latest - technical feature used to enable the latest version feature of k8s_openapi. Especially useful in CI and wherever compilation would otherwise fail (like on docs.rs). Note: the latest version might change over time; for direct use it is better to pin a specific version.

Contributing

Pull Requests welcome! Start by checking open issues or feature requests in our GitLab repo.

License

MIT License or APACHE-2.0 License
