Deterministic workflow engine built on top of the WASM Component Model
Project status / Disclaimer
This is a pre-release.
This repo contains backend code for local development and testing.
The software offers no backward compatibility guarantees for the CLI or the database format.
Please exercise caution if attempting to use it for production.
Supported platforms
Linux x64
Core principles
Schema first, using the WASM Component Model's WIT language as the interface between workflows and activities.
Backend developer's delight
Single process for running the executor, workflows and activities, with an escape hatch for external activities (planned).
Automatic retries on errors and timeouts; workflow executions continue after a server crash.
Observability (planned) - parameters and results together with function hierarchy must be preserved.
Composability - nesting workflows, calling activities written in any supported language
Replay and fork existing workflows (planned). Fix problems and continue.
Concepts and features
Activities must be idempotent (retriable), so that they can be stopped and retried at any moment. This contract must be fulfilled by the activity itself.
WASI activities are executed in a WASM sandbox
Able to contact HTTP servers using the WASI 0.2 HTTP client.
Able to read/write to the filesystem (planned).
Max execution duration support, after which the execution is suspended into intermittent timeout.
Retries on errors - on WASM traps (panics), or when returning an Error result.
Retries on timeouts with exponential backoff.
Execution result is persisted.
Performance option to keep the parent workflow execution hot or unload and replay the event history.
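The retry behaviour described above can be illustrated with a minimal Python sketch. It is not Obelisk's implementation: `RetryPolicy`, its fields, the default values and `run_with_retries` are hypothetical names chosen for illustration, and the doubling factor is an assumption about how the exponential backoff might be parameterised.

```python
import time
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional


@dataclass(frozen=True)
class RetryPolicy:
    # All fields and defaults are illustrative assumptions, not Obelisk settings.
    max_retries: int = 5
    base_delay: timedelta = timedelta(milliseconds=100)
    factor: int = 2

    def delay_before_retry(self, attempt: int) -> Optional[timedelta]:
        """Wait before retry number `attempt` (1-based); None once the budget is spent."""
        if attempt < 1:
            raise ValueError("attempt is 1-based")
        if attempt > self.max_retries:
            return None
        # The delay grows geometrically: base, base*factor, base*factor^2, ...
        return self.base_delay * self.factor ** (attempt - 1)


def run_with_retries(activity, policy, sleep=time.sleep):
    """Call `activity` until it succeeds or the retry budget is exhausted."""
    attempt = 0
    while True:
        try:
            return activity()
        except Exception:
            attempt += 1
            delay = policy.delay_before_retry(attempt)
            if delay is None:
                raise  # out of retries: surface the last error
            sleep(delay.total_seconds())
```

Because the activity may run several times, this scheme is only safe when the activity is idempotent, which is the contract stated at the top of this list.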
Deterministic workflows
Are replayable: Execution is persisted at every state change, so that it can be replayed after an interrupt or an error.
Running in a WASM sandbox, isolated from the environment
Automatically retried on failures like database errors, timeouts or even traps (panics).
Able to spawn child workflows or activities, either blocking until result arrives or awaiting the result asynchronously.
Workflows can be replayed with added log messages and other changes that do not alter the determinism of the execution (planned).
Join sets allow for structured concurrency, either blocking until child executions are done, or cancelling those that were not awaited (planned).
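To make the replay idea concrete, here is a small Python sketch of a workflow whose child results are recorded in an event history and reused after an interruption. It only models the general technique; `ReplayContext`, `Interrupted`, `sum_of_squares` and the in-memory `history` list are hypothetical and do not reflect Obelisk's actual data model or API.

```python
class Interrupted(Exception):
    """Stands in for a server crash or other interruption."""


class ReplayContext:
    """Routes every child call through a persisted event history."""

    def __init__(self, history, run_activity):
        self.history = history            # list of (name, args, result); assumed durable
        self.run_activity = run_activity  # performs the real, side-effecting work
        self.cursor = 0                   # position in the history during replay

    def call(self, name, *args):
        if self.cursor < len(self.history):
            # Replaying: reuse the recorded result instead of re-running the activity.
            recorded_name, recorded_args, result = self.history[self.cursor]
            if (recorded_name, recorded_args) != (name, args):
                raise RuntimeError("history mismatch: workflow is not deterministic")
        else:
            # New ground: run the activity and persist the outcome before continuing.
            result = self.run_activity(name, *args)
            self.history.append((name, args, result))
        self.cursor += 1
        return result


def sum_of_squares(ctx, n):
    """A deterministic workflow: same inputs and history always yield the same calls."""
    total = 0
    for i in range(n):
        total += ctx.call("square", i)
    return total
```

Running `sum_of_squares` a second time with the same `history` skips every call that was already recorded and only executes the remaining ones. The mismatch check shows why workflow code must stay deterministic: a replay that issues different calls than the recorded ones cannot be reconciled with the history.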
WASI webhooks
Mounted as a URL path, serving HTTP traffic.
Able to spawn child workflows or activities.
Work stealing executor
Periodically locks a batch of currently pending executions and starts or continues their execution.
Cleans up old hanging executions with expired locks. Executions that have the budget will be retried (planned).
Concurrency control - limit on the number of workers that can run simultaneously.
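The executor loop described above might look roughly like the following Python sketch. It is an in-memory model under stated assumptions: the `Execution` record, the status strings, and the `tick` function are hypothetical, and returning an expired lock to the pending state is one possible cleanup policy, not necessarily the one Obelisk uses.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Execution:
    id: str
    status: str = "pending"                 # "pending" | "locked" | "finished" (assumed states)
    lock_expires_at: Optional[float] = None


def tick(executions, now, max_workers, lock_duration):
    """One executor pass: reclaim expired locks, then lock a batch of pending work."""
    # Cleanup: a lock that outlived its expiry is treated as a hung execution.
    for e in executions:
        if e.status == "locked" and e.lock_expires_at <= now:
            e.status = "pending"
            e.lock_expires_at = None

    # Concurrency control: never hold more locks than there are workers.
    running = sum(1 for e in executions if e.status == "locked")
    free_slots = max(0, max_workers - running)

    # Lock the next batch so that no other executor picks up the same executions.
    batch = [e for e in executions if e.status == "pending"][:free_slots]
    for e in batch:
        e.status = "locked"
        e.lock_expires_at = now + lock_duration
    return batch
```

A real executor would perform the locking step atomically in the database, since several executors may compete for the same pending executions.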
nix --extra-experimental-features nix-command --extra-experimental-features flakes run github:obeli-sk/obelisk
Usage
Generating a sample configuration file
obelisk server generate-config
Starting the server
obelisk server run
Getting the list of loaded functions
obelisk client component list
Submitting a function to execute (either workflow or activity)
# Call fibonacci(10) activity from the workflow 500 times in series.
obelisk client execution submit testing:fibo-workflow/workflow.fiboa '[10, 500]' --follow
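When submitting from a script, the same command line can be assembled programmatically. The Python sketch below only rebuilds the command shown above; `submit_command` is a hypothetical helper, and treating the parameters as a JSON array in positional order is an assumption inferred from the `'[10, 500]'` example.

```python
import json
import shlex


def submit_command(function_name, params, follow=True):
    """Build the argv for `obelisk client execution submit` (helper name is hypothetical)."""
    # Assumption: parameters are passed as one JSON array argument, in positional order.
    argv = ["obelisk", "client", "execution", "submit", function_name, json.dumps(params)]
    if follow:
        argv.append("--follow")
    return argv


def as_shell(argv):
    """Render argv as a single shell-quoted string, e.g. for logging."""
    return shlex.join(argv)
```

The resulting list can be passed to `subprocess.run(argv, check=True)` on a machine where the `obelisk` binary is installed and the server is running.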
Milestones
Milestone 1: Release the binary - done
Getting the obelisk application up and running as a Linux binary
Scheduling of workflows and wasm activities, retries on timeouts and failures
Persistence using sqlite
Launching child workflows/activities concurrently using join sets
Basic CLI for wasm component configuration and scheduling
GitHub release, Docker image, publish to crates.io, support cargo-binstall
Milestone 2: Allow remote interaction via CLI - done
Move component and general configuration into a TOML file
Pull components from an OCI registry
Publish the obelisk image to the Docker Hub (ubuntu, alpine)
obelisk client component push
gRPC API for execution management
Track the topmost parent
Params typecheck on creation, introspection of types of all functions in the system
Logging and tracing configuration, sending events to an OTLP collector
Milestone 3: Webhooks, Verify, Structured concurrency, Web UI - started
HTTP webhook triggers able to start new executions (workflows and activities), able to wait for result before sending the response.
Forward stdout and stderr (configurable) of activities and webhooks
Support for distributed tracing, logging from components collected by OTLP
Mapping from any execution result (e.g. traps, timeouts, err variants) to other execution results via -await-next
Server verification - downloads components, checks the TOML configuration and matches component imports with exports.
Structured concurrency for join sets - blocking parent until all child executions are finished
HTML based UI for showing executions, event history and relations
Print each component's imports and exports in the WIT format
Heterogeneous join sets, allowing one join set to combine multiple function signatures and delays
Expose filesystem with directory mapping for activities, webhooks
Expose network configuration for activities, webhooks
Keepalives for activities, extending the lock until completion
Examples with C#, Go, JS, Python
Future ideas
Interactive CLI for execution management
External activities gRPC API
OpenAPI activity generator
Spawning processes from WASM activities, reading their outputs
Backpressure: Limits on pending queues, or an eviction strategy, slow down on LimitReached
External executors support - starting executions solely based on WIT exports. External executors must share write access to the sqlite database.
Labels restricting workflows/activities to executors
This project has a roadmap and features are added and tested in a certain order.
If you would like to contribute a feature, please discuss the feature on GitHub.
In order for us to accept patches and other contributions, you need to adopt our Contributor License Agreement (the "CLA"). The current version of the CLA can be found here.