ml-cellar

Crates.io: ml-cellar
lib.rs: ml-cellar
version: 0.2.0
created_at: 2025-12-02 00:37:03.528622+00
updated_at: 2026-01-24 12:17:29.737546+00
description: CLI of ML model registry for minimum MLOps
repository: https://github.com/scepter914/ml-cellar-rs/
id: 1960921
size: 2,041,006
owner: Satoshi Tanaka (scepter914)

documentation: https://docs.rs/ml-cellar/

README

ml-cellar-rs

ml-cellar logo

ml-cellar provides a CLI for a model registry, storing ML models like a wine cellar and enabling minimal MLOps. By using Git LFS, ml-cellar offers essential MLOps functions including an Artifact Store, a Model Registry, and Serving.

If MLOps at BigTech companies can be compared to a large-scale winery, ml-cellar functions like a wine cellar in a small brewery. While AI projects should ideally adopt software like MLflow for MLOps, many projects and organizations cannot afford the development resources that BigTech companies have. As a result, existing MLOps software has primarily targeted companies that can allocate significant development costs to MLOps (like BigTech). ml-cellar makes "compromises" on MLOps by focusing only on the essential functions, enabling easy adoption of minimal MLOps.

Related resource

GitHub Git LFS usage & billing notice

[!WARNING] This repository uses Git LFS (Large File Storage) to manage large files. Please note that GitHub Git LFS has storage and bandwidth limits. If the free quota included in your GitHub plan is exceeded, you may incur additional charges, or LFS uploads and downloads may be restricted depending on your billing settings.

Any costs incurred by exceeding Git LFS quotas are the user’s responsibility. Before cloning or pushing large files, check your GitHub plan and the repository’s current Git LFS usage.

Contribution

If you want to contribute to this project, please see the documents.

Get started

If you want to see an example of using ml-cellar as minimal MLOps, please see ml-cellar-example as a reference.

0. Word definitions

  • Artifact

Files produced by training/evaluation and stored in the registry (e.g., checkpoints, configs, logs, evaluation results).

  • Git LFS (Large File Storage)

A Git extension for tracking large files (e.g., model checkpoints) efficiently by storing pointers in Git and file contents in LFS storage.

  • Model Registry Repository (The word in ml-cellar)

The Git repository (often on GitHub) used as the “cellar”: it stores and serves ML model artifacts and manages model versions, analogous to a wine cellar.

  • Rack (The word in ml-cellar)

A logical shelf for a single model family/algorithm (e.g., vit-l, llm-model). Each rack has metadata files (e.g., README.md, config.toml, template.md) and multiple "ML-bin" version subdirectories. It corresponds to a “rack” in a wine cellar.

  • ML-bin (The word in ml-cellar)

A specific release of a model stored under a rack (e.g., vit-l/1.1/). An ML-bin holds artifacts such as checkpoints, configs, and logs. It corresponds to a “vintage” in the wine analogy.

1. Install

  • Install Git LFS.
curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash
sudo apt-get update
sudo apt-get install git-lfs
git lfs install
  • Install Rust.
curl https://sh.rustup.rs -sSf | sh
  • Install ml-cellar.
cargo install ml-cellar
How to build on a local device (if you need)

If you want to build on your local device, run the commands below.

git clone https://github.com/scepter914/ml-cellar-rs
cd ml-cellar-rs
cargo install --path .
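
To verify the setup, check the installed tool versions. The --help flag on ml-cellar is an assumption here (the usual convention for Rust CLIs); it should print the available subcommands.

git lfs version
cargo --version
ml-cellar --help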

2. Setup repository

Let's open the cellar for AI models.

  • Start ml-cellar.
mkdir {your_ml_registry}
cd {your_ml_registry}
ml-cellar init
  • Make a remote repository on GitHub and add it as a remote.
git remote add origin {remote_repository}
  • Edit .gitattributes if you want to add more patterns for Git LFS.
# --- Log ---
*.log   filter=lfs diff=lfs merge=lfs -text

# --- Your data ---
*.db   filter=lfs diff=lfs merge=lfs -text
  • Make the initial commit.
git add -A
git commit -m "Initial commit"
git push -u origin main
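
After editing .gitattributes, you can confirm which patterns Git LFS will track before committing; the file name below is only an example.

git lfs track
git check-attr filter -- checkpoint.pth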

3. (Optional) Setup with AWS S3

[!WARNING] Support for a Custom Transfer Agent is not implemented in ml-cellar yet. I’d appreciate it if you could wait for the implementation.

If you need to handle many large files in your model registry, consider using a Custom Transfer Agent to replace the storage backend with AWS S3 or a similar service.
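
For reference, Git LFS itself wires in a custom transfer agent through git config. The sketch below only shows those Git LFS settings; the agent name and binary are hypothetical, and ml-cellar does not integrate with this yet.

# Hypothetical S3-backed agent binary; not shipped by ml-cellar.
git config lfs.customtransfer.s3-agent.path /usr/local/bin/lfs-s3-agent
git config lfs.customtransfer.s3-agent.args "--bucket my-model-bucket"
# Use the standalone agent instead of the default LFS server transfer.
git config lfs.standalonetransferagent s3-agent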

4. Construct a rack to start the ML cellar

Constructing a rack in ml-cellar is like building racks in a wine cellar. If ML models are like wine, then racks are the shelves where you store the wine. Just as wine is labeled with its vintage year, we attach versions to AI models; each versioned model is called an "ML-bin" in ml-cellar.

The rack consists of the following directory structure:

- model_registry_repository/
  - vit-l/
    - 0.1/
      - config.yaml
      - checkpoint.pth
      - result.json
      - log.log
    - 0.2/
      - ...
    - 1.0/
      - ...
    - 1.1/
      - ...
  - vit-m/
    - ...

In this directory structure, we refer to an ML model as, for example, vit-l 1.1 for "Vision Transformer Large, version 1.1". It is like a vintage wine being called "Dom Pérignon Vintage 2010".

  • Make rack.
ml-cellar rack {path}
  • If you run ml-cellar rack {my_rack}, you will get the directory structure below.
    • At first, it is easier to understand if you name the rack after the algorithm.
    • Instead of the ml-cellar rack {my_rack} command, you can also create a rack manually by making the directory and config.toml yourself.
- {your_ml_registry}/
  - {my_rack}/
    - README.md
    - config.toml
    - template.md
  • Edit config.toml.
    • Add the files that are required for every ML-bin in this rack.
    • If you want to allow any kind of file to be saved, set "*" in optional_files.
[artifact]
required_files = ["config.yaml", "logs/"]
optional_files = ["*"]
  • Edit README.md.
    • When developing ML models as a team, or providing ML models to various projects, a cellar isn’t opened for just one person.
    • Just as with wine, where you introduce its flavor and the foods it pairs well with, let’s write a description of the ML model for users.
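
For example, assuming a rack named after the vit-l algorithm from the directory layout above, creating the rack and checking its generated metadata files could look like this:

cd {your_ml_registry}
ml-cellar rack vit-l
ls vit-l
# README.md  config.toml  template.md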

5. (Optional) Set up for MLOps

    1. Set up GitHub repository.

When using GitHub Git LFS, you can use GitHub's functions as they are. Some of the more useful features include pull requests, code owners, GitHub Actions with ml-cellar check, and GitHub issues. If you want to know more about this, please see the tips for the GitHub repository.
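
As a minimal sketch of "GitHub Actions with ml-cellar check": the workflow below is hypothetical, assumes Rust and Git LFS are available on the GitHub-hosted Ubuntu runner (they are preinstalled there), and hard-codes the path to check, which you would replace with your own rack/version.

mkdir -p .github/workflows
cat > .github/workflows/ml-cellar-check.yml <<'EOF'
name: ml-cellar check
on: pull_request
jobs:
  check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          lfs: true
      - run: cargo install ml-cellar
      # Replace with the rack/version you want to validate.
      - run: ml-cellar check {my_rack}/0.1/
EOF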

    2. Start fine-tuning for projects from a base model.

If you use a fixed dataset and focus on algorithm development, as in a Kaggle competition or research, a simple ML cellar with a single rack is fine for sharing various experimental results. However, if you are an engineer and you have many projects you want to fine-tune for, I recommend using a "project-based model registry". Please see the document for the project-based model registry.

    3. Set up template.md and result.json for MLOps.

Manually writing evaluation results is costly in practice. Even if you try to organize them in a spreadsheet, it’s hard to keep the spreadsheet entries correctly linked to the exact ML-bin (model version) they came from.

To address this, the training-side code can store the information you want to include in the README as JSON. ml-cellar then supports a semi-automated workflow: it generates the evaluation results from a template file and fills in the values automatically. If you create a template and run ml-cellar docs {my_rack}/{version}, for example ml-cellar docs test_registry/test_docs/base/0.1/, the documentation is generated automatically.

If you want to set up template.md and result.json for MLOps, please see the document for template.md.
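
As an illustration only: the keys in result.json are whatever your template.md refers to, so the file name, keys, and values below are hypothetical. The training code drops the JSON next to the other artifacts, and ml-cellar docs fills the template from it.

# Hypothetical result.json; the real schema is defined by your template.md.
cat > {my_rack}/0.1/result.json <<'EOF'
{
  "mAP": 0.412,
  "epochs": 30
}
EOF
ml-cellar docs {my_rack}/0.1/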

6. Commit ML model

Let's actually stock AI models on the racks.

  • Make a branch and switch to it.
    • Create a feature branch for the new ML-bin version.
git switch -c feat/my_rack/0.1
  • Make the ML-bin directory and put the files you want to commit under {your_ml_registry}/{my_rack}/0.1/.
- {your_ml_registry}/
  - {my_rack}/
    - README.md
    - config.toml
    - template.md
    - 0.1/
      - config.yaml
      - result.json
      - log.log
      - best_score.pth
  • Check the files you want to commit.
    • This checks whether the required_files defined in the rack's config.toml exist and whether any files not allowed by optional_files have been added.
ml-cellar check {path}
  • In this example, run the following commands.
cd {your_ml_registry}
ml-cellar check {my_rack}/0.1/
  • When the check passes, commit the files with git as usual.
git add {path}
git commit -m "{commit message}"
  • Push the files (you can also push using the native git command in the directory).
ml-cellar push origin HEAD
ml-cellar pull
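
Before or after pushing, you can confirm which of the committed files actually go through Git LFS; both commands below are standard Git LFS commands.

git lfs ls-files
git lfs status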

7. Use ml-cellar for model serving

A wine cellar is a storage facility, but it's also a facility for serving.

  • Clone repository.
    • At this point, there are only Git LFS references, and the actual files do not exist.
ml-cellar clone {your_ml_registry}
  • Download all files.
    • This command downloads the actual file contents from Git LFS storage.
ml-cellar download-all
  • Download a specific directory or file.
ml-cellar download {path to a directory or a file}
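
For example, right after cloning, a tracked checkpoint is typically present only as a small Git LFS pointer file; the hash and size in the output below are made up for illustration. Downloading the ML-bin then replaces the pointer with the real contents.

cat {my_rack}/0.1/best_score.pth
# version https://git-lfs.github.com/spec/v1
# oid sha256:5891b5b522d5df086d0ff0b110fbd9d21bb4fc7163af34d08286a2e846f6be03
# size 104857600
ml-cellar download {my_rack}/0.1/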
