runctl

Crates.iorunctl
lib.rsrunctl
version0.1.1
created_at2026-01-01 15:53:06.276494+00
updated_at2026-01-01 17:44:13.723735+00
descriptionML training orchestration CLI for AWS EC2, RunPod, and local environments
homepagehttps://github.com/arclabs561/runctl
repositoryhttps://github.com/arclabs561/runctl
max_upload_size
id2016363
size2,764,270
Henry Wallace (arclabs561)

documentation

README

runctl

ML training orchestration CLI for local, RunPod, and AWS EC2 with unified checkpoint management.

Prerequisites

  • Rust 1.70+
  • AWS credentials configured (aws configure or IAM role)
  • SSM agent enabled on EC2 (default on Amazon Linux 2)

Installation

cargo install --path .
# or
cargo build --release

Quick Start

# Initialize config
runctl init

# Create instance (use t3.micro for testing, ~$0.01/hr)
INSTANCE_ID=$(runctl aws create --spot --instance-type t3.micro --wait --output instance-id)

# Train (waits until complete)
runctl aws train $INSTANCE_ID training/train_mnist.py --sync-code --wait

# Or use workflow command
runctl workflow train training/train_mnist.py --instance-type t3.micro --spot

Examples: ./examples/complete_workflow.sh (examples/README.md)

Commands

AWS EC2

runctl aws create [--instance-type TYPE] [--spot] [--wait] [--output FORMAT]
runctl aws train <instance-id> <script> [--sync-code] [--wait] [--data-s3 PATH] [--output-s3 PATH]
runctl aws monitor <instance-id> [--follow]
runctl aws processes <instance-id> [--watch]
runctl aws start|stop|terminate <instance-id>
runctl aws status|wait <instance-id>

Local

runctl local <script> [args...]

Uses uv for Python scripts if available, otherwise falls back to python3.

RunPod

runctl runpod create [--gpu TYPE] [--disk GB]
runctl runpod train <pod-id> <script>
runctl runpod monitor <pod-id> [--follow]
runctl runpod download <pod-id> <remote> <local>

Resources

runctl resources list [--platform aws|runpod|local] [--detailed]
runctl resources summary
runctl resources insights
runctl resources cleanup [--dry-run] [--force]

S3

runctl s3 upload <local> <s3://bucket/key> [--recursive]
runctl s3 download <s3://bucket/key> <local> [--recursive]
runctl s3 sync <source> <dest> [--direction up|down]
runctl s3 list <s3://bucket/prefix> [--recursive]
runctl s3 cleanup <s3://bucket/prefix> --keep-last-n <N> [--dry-run]

Monitoring & Checkpoints

runctl monitor --log <file> [--follow]
runctl monitor --checkpoint <dir> [--follow]
runctl checkpoint list <dir>
runctl checkpoint info <path>
runctl checkpoint resume <path> <script>
runctl top

Workflow

runctl workflow train <script> [--instance-type TYPE] [--spot]

Docker

runctl docker build [--push] [--repository NAME]
runctl docker train <image> <script>

Configuration

Create .runctl.toml or use runctl init:

[aws]
region = "us-east-1"
default_instance_type = "t3.medium"
use_spot = true
s3_bucket = "your-bucket"

[runpod]
api_key = "your-key"  # or from ~/.cursor/mcp.json
default_gpu = "NVIDIA GeForce RTX 4080 SUPER"

[checkpoint]
dir = "checkpoints"
save_interval = 5

Development

just test    # Run tests
just lint    # Lint
just fmt     # Format
just dev     # Dev build

License

MIT OR Apache-2.0

Commit count: 0

cargo fmt