gflow

Crate: gflow (crates.io)
Version: 0.3.1
Created: 2025-02-10
Updated: 2025-08-04
Description: A lightweight, single-node job scheduler written in Rust.
Repository: https://github.com/AndPuQing/gflow.git
Size: 223,721
Owner: PuQing (AndPuQing)

README

gflow - A lightweight, single-node job scheduler


gflow is a lightweight, single-node job scheduler written in Rust, inspired by Slurm. It is designed for efficiently managing and scheduling tasks, especially on machines with GPU resources.

Snapshot

[gflow demo screenshot]

Core Features

  • Daemon-based Scheduling: A persistent daemon (gflowd) manages the job queue and resource allocation.
  • Rich Job Submission: Supports dependencies, priorities, and job arrays via the gbatch command.
  • Service and Job Control: Provides clear commands to manage the scheduler daemon (gctl), query the job queue (gqueue), and signal jobs (gsignal).
  • tmux Integration: Uses tmux for robust, background task execution and session management.
  • Simple Command-Line Interface: Offers a user-friendly and powerful set of command-line tools.

Component Overview

The gflow suite consists of several command-line tools:

  • gflowd: The scheduler daemon that runs in the background, managing jobs and resources.
  • gctl: Controls the gflowd daemon (start, stop, status).
  • gbatch: Submits jobs to the scheduler, similar to Slurm's sbatch.
  • gqueue: Lists and filters jobs in the queue, similar to Slurm's squeue.
  • gsignal: Sends signals (e.g., finish, fail) to running jobs.

Installation

Install via cargo (Recommended)

cargo install gflow

This will install all the necessary binaries (gflowd, gctl, gbatch, gqueue, gsignal).
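After installation, you can confirm that all five binaries landed on your PATH. The helper below is a small convenience sketch, not a gflow command:

```shell
# Convenience sketch: report which of the five gflow binaries are on PATH.
check_gflow_bins() {
  for bin in gflowd gctl gbatch gqueue gsignal; do
    if command -v "$bin" >/dev/null 2>&1; then
      echo "$bin: ok"
    else
      echo "$bin: missing"
    fi
  done
}

check_gflow_bins
```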

Build Manually

  1. Clone the repository:

    git clone https://github.com/AndPuQing/gflow.git
    cd gflow
    
  2. Build the project:

    cargo build --release
    

    The executables will be available in the target/release/ directory.

Quick Start

  1. Start the scheduler daemon:

    gctl start
    

    You can check its status with gctl status.

  2. Submit a job: Create a script my_job.sh:

    #!/bin/bash
    echo "Starting job on GPU: $CUDA_VISIBLE_DEVICES"
    sleep 30
    echo "Job finished."
    

    Submit it using gbatch:

    gbatch --gpus 1 ./my_job.sh
    
  3. Check the job queue:

    gqueue
    

    You can also watch the queue update live: watch gqueue.

  4. Stop the scheduler:

    gctl stop
    

Usage Guide

Submitting Jobs with gbatch

gbatch provides flexible options for job submission.

  • Submit a command directly:

    gbatch --gpus 1 --command "python train.py --epochs 10"
    
  • Set a job name and priority:

    gbatch --gpus 1 --name "training-run-1" --priority 10 ./my_job.sh
    
  • Create a job that depends on another:

    # First job
    gbatch --gpus 1 --name "job1" ./job1.sh
    # Get job ID from gqueue, e.g., 123
    
    # Second job depends on the first
    gbatch --gpus 1 --name "job2" --depends-on 123 ./job2.sh
    

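The two-step pattern above can be wrapped in a small helper. This is a convenience sketch, not part of gflow itself; it assumes that `gqueue --names <name> --format "ID"` prints the matching job's numeric ID on its last line, so verify the exact output shape on your install:

```shell
# Sketch: submit two jobs, making the second depend on the first.
# Assumption: `gqueue --names <name> --format "ID"` prints the job's
# numeric ID on the last line of output.
submit_chain() {
  local first="$1" second="$2" first_script="$3" second_script="$4"

  gbatch --gpus 1 --name "$first" "$first_script"

  # Look up the ID that the scheduler assigned to the first job.
  local id
  id=$(gqueue --names "$first" --format "ID" | tail -n 1)

  gbatch --gpus 1 --name "$second" --depends-on "$id" "$second_script"
}
```

Usage: `submit_chain job1 job2 ./job1.sh ./job2.sh`.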
Querying Jobs with gqueue

gqueue allows you to filter and format the job list.

  • Filter by job state:

    gqueue --states Running,Queued
    
  • Filter by job ID or name:

    gqueue --jobs 123,124
    gqueue --names "training-run-1"
    
  • Customize output format:

    gqueue --format "ID,Name,State,GPUs"
    

Configuration

gflowd reads its configuration from a TOML file, located by default at ~/.config/gflow/gflowd.toml.
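A configuration file might look like the following. Note that the key names here are purely illustrative, not taken from gflow's documentation; consult the repository for the actual schema supported by your installed version:

```toml
# Hypothetical sketch — key names are illustrative only.
[daemon]
log_level = "info"

[resources]
gpus = 2
```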

Contributing

If you find a bug or have a feature request, please open an Issue. Contributions via Pull Requests are welcome.

License

gflow is licensed under the MIT License. See LICENSE for more details.
