| Crates.io | job-orchestrator |
| lib.rs | job-orchestrator |
| version | 1.4.0 |
| created_at | 2025-11-04 13:09:01.7312+00 |
| updated_at | 2026-01-19 19:04:58.620151+00 |
| description | Asynchronous job orchestrator for managing and routing payloads between services and computing resources with quota tracking |
| homepage | |
| repository | https://github.com/rvhonorato/job-orchestrator |
| max_upload_size | |
| id | 1916241 |
| size | 385,284 |
An asynchronous job orchestration system for managing and distributing computational workloads across heterogeneous computing resources with intelligent quota-based load balancing.
job-orchestrator is a central component of WeNMR, a worldwide e-Infrastructure for structural biology operated by the BonvinLab at Utrecht University. It serves as a reactive middleware layer that connects web applications to diverse computing resources, enabling efficient job distribution for scientific computing workflows.
```mermaid
flowchart TB
    subgraph Tasks["Background Tasks"]
        Sender["Sender<br>500ms"]
        Getter["Getter<br>500ms"]
        Cleaner["Cleaner<br>60s"]
    end
    subgraph Server["Orchestrator Server"]
        API["REST API<br>upload/download"]
        DB[("SQLite<br>Persistent")]
        FS[/"Filesystem<br>Job Storage"/]
        Tasks
        Queue["Queue Manager<br>Quota Enforcement"]
    end
    subgraph Client["Client Service"]
        ClientAPI["REST API<br>submit/retrieve/load"]
        ClientDB[("SQLite<br>In-Memory")]
        ClientFS[/"Working Dir"/]
        Runner["Runner Task<br>500ms"]
        Executor["Bash Executor<br>run.sh"]
    end
    User(["User/Web App"]) -- POST /upload --> API
    User -- GET /download/:id --> API
    API --> DB & FS
    DB --> Queue
    Queue --> Sender
    Sender -- POST /submit --> ClientAPI
    Getter -- GET /retrieve/:id --> ClientAPI
    Getter --> FS
    Cleaner --> DB & FS
    ClientAPI --> ClientDB
    ClientDB --> Runner
    Runner --> Executor
    Executor --> ClientFS
```
```mermaid
sequenceDiagram
    participant User
    participant Server
    participant Client
    participant Executor
    User->>Server: POST /upload (files, user_id, service)
    Server->>Server: Store job (status: Queued)
    Server-->>User: Job ID
    Note over Server: Sender task (500ms interval)
    Server->>Server: Update status: Processing
    Server->>Client: POST /submit (job files)
    Client->>Client: Store payload (status: Prepared)
    Client-->>Server: Payload ID
    Server->>Server: Update status: Submitted
    Note over Client: Runner task (500ms interval)
    Client->>Executor: Execute run.sh
    Executor->>Executor: Process files
    Executor-->>Client: Exit code
    Client->>Client: Update status: Completed
    Note over Server: Getter task (500ms interval)
    Server->>Client: GET /retrieve/:id
    Client-->>Server: ZIP results
    Server->>Server: Store results, status: Completed
    User->>Server: GET /download/:id
    Server-->>User: results.zip
    Note over Server: Cleaner task (60s interval)
    Server->>Server: Remove jobs older than MAX_AGE
```
```mermaid
stateDiagram-v2
    [*] --> Queued: Job submitted
    Queued --> Processing: Sender picks up job
    Processing --> Submitted: Sent to client
    Processing --> Failed: Client unreachable
    Submitted --> Completed: Execution successful
    Submitted --> Unknown: Retrieval or execution failed
    Unknown --> Completed: Retry successful
    Completed --> Cleaned: After MAX_AGE
    Failed --> Cleaned: After MAX_AGE
    Unknown --> Cleaned: After MAX_AGE
    Cleaned --> [*]
```
The orchestrator will support automatic scaling of client instances based on workload, creating and terminating cloud instances dynamically to handle varying job demands efficiently.
```mermaid
---
config:
  layout: dagre
---
flowchart TB
    subgraph Server["Orchestrator Server"]
        API["REST API"]
        Queue["Queue Manager"]
        AutoScaler["Auto-Scaler"]
        ServicePool["Service Pool"]
    end
    subgraph Cloud["Cloud Provider"]
        CloudAPI["Cloud API"]
    end
    subgraph Clients["Client Instances"]
        Dynamic["Dynamic Clients<br>Auto-created"]
        Static["Static Client"]
    end
    User(["User/Web App"]) -- Submits/Retrieves --> API
    API --> Queue
    Queue -- Distribute jobs --> Clients
    ServicePool <-- Monitors --> Queue
    AutoScaler <-- Register/Trigger --> ServicePool
    AutoScaler -- Scale Up/Down --> CloudAPI
    CloudAPI -- Create/Terminate --> Clients
```
Both server and client modes are provided by the same binary, configured via command-line arguments.
```bash
docker compose up --build
```
This starts both the orchestrator server (port 5000) and an example client (port 9000).
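A quick way to confirm the server came up is to probe its Swagger UI path (documented further below); using it as a liveness check is a convenience assumption, not an official health endpoint:

```bash
# Expect HTTP 200 once the orchestrator server is ready.
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:5000/swagger-ui/
```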
```bash
curl -X POST http://localhost:5000/upload \
  -F "file=@example/run.sh" \
  -F "file=@example/2oob.pdb" \
  -F "user_id=1" \
  -F "service=example" | jq
```
Response:
```json
{
  "id": 1,
  "user_id": 1,
  "service": "example",
  "status": "Queued",
  "loc": "/opt/data/978e5a14-dc94-46ab-9507-fe0a94d688b8",
  "dest_id": ""
}
```
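In scripts, the returned job ID can be captured directly with jq; a minimal sketch using the same example files and the id field from the response above:

```bash
# Submit a job and keep only the "id" field for later polling/download.
JOB_ID=$(curl -s -X POST http://localhost:5000/upload \
  -F "file=@example/run.sh" \
  -F "file=@example/2oob.pdb" \
  -F "user_id=1" \
  -F "service=example" | jq -r '.id')
echo "Job ID: $JOB_ID"
```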
Use HTTP HEAD to check status without downloading:
```bash
curl -I http://localhost:5000/download/1
```
Status Codes:

- 200 - Job completed, ready to download
- 202 - Job queued or running
- 204 - Job failed or cleaned up
- 404 - Job not found
- 500 - Internal server error

Once the job completes (status 200):
```bash
curl -o results.zip http://localhost:5000/download/1
```
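A small polling loop ties the two steps together: probe the HEAD status until it flips from 202 to 200, then download. This is a sketch built only on the endpoints and status codes documented above; the job ID is the one from the earlier example:

```bash
#!/bin/bash
# Wait for a job to finish, then fetch its results.
# Relies on the documented HEAD semantics: 202 = queued/running,
# 200 = completed, anything else = failed, cleaned, or missing.
JOB_ID=1
while true; do
  code=$(curl -s -o /dev/null -I -w "%{http_code}" "http://localhost:5000/download/$JOB_ID")
  case "$code" in
    200) curl -s -o results.zip "http://localhost:5000/download/$JOB_ID"
         echo "Downloaded results.zip"; break ;;
    202) sleep 2 ;;   # still queued or running
    *)   echo "Job unavailable (HTTP $code)"; exit 1 ;;
  esac
done
```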
Quotas are keyed on the user_id and service parameters supplied to the /upload endpoint: the orchestrator enforces per-user, per-service limits to ensure fair resource allocation. Configuration example:
```bash
SERVICE_EXAMPLE_RUNS_PER_USER=5  # Max 5 concurrent jobs per user for "example" service
```
This prevents any single user from monopolizing computing resources.
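How the quota value reaches the server depends on your deployment; assuming it is read from the server's environment (as the variable form above suggests), one way to set it for the Docker setup is:

```bash
# Set the quota in the environment before starting the stack.
# NOTE: docker compose forwards host variables only if the compose file
# maps them into the server container; treat this as a sketch of the
# mechanism, not a guaranteed configuration path.
export SERVICE_EXAMPLE_RUNS_PER_USER=5
docker compose up --build
```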
Submit multiple jobs to observe quota-based throttling:
```bash
for i in {1..250}; do
  cat <<EOF > run.sh
#!/bin/bash
sleep \$((RANDOM % 36 + 25))
echo 'Computation complete!' > output.txt
EOF
  curl -s -X POST http://localhost:5000/upload \
    -F "file=@run.sh" \
    -F "user_id=1" \
    -F "service=example" > /dev/null
  echo "Submitted job $i"
done
```
Monitor the orchestration in real time:
```bash
docker compose logs server --follow
```
You'll see jobs dispatched gradually according to the configured quota limits.
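Progress can also be measured from the outside by probing each job's HEAD status. A sketch that tallies the 250 jobs submitted above, assuming their IDs are 1..250 (the id field auto-increments in the example response):

```bash
# Tally jobs by HTTP status: 200 = completed, 202 = queued/running.
completed=0; pending=0
for id in $(seq 1 250); do
  code=$(curl -s -o /dev/null -I -w "%{http_code}" "http://localhost:5000/download/$id")
  case "$code" in
    200) completed=$((completed + 1)) ;;
    202) pending=$((pending + 1)) ;;
  esac
done
echo "completed=$completed pending=$pending"
```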
job-orchestrator is designed for scenarios requiring quota-aware distribution of computational workloads across heterogeneous computing resources.
Current State: Production-ready with server/client architecture
Planned Features: automatic scaling of client instances based on workload (see above)
Interactive API documentation is available at http://localhost:5000/swagger-ui/ when the server is running.

Contributions, bug reports, and feature requests are welcome via GitHub issues.
MIT License - see LICENSE for details.
For questions, collaborations, or if you think this project could benefit your use case, please reach out via the GitHub repository.