cadi

Crates.iocadi
lib.rscadi
version2.0.0
created_at2026-01-12 03:17:16.284849+00
updated_at2026-01-12 06:25:59.079876+00
descriptionCADI CLI - Content-Addressed Development Interface
homepagehttps://conflictingtheories.github.io/cadi
repositoryhttps://github.com/ConflictingTheories/cadi
max_upload_size
id2036938
size239,943
Kyle Derby MacInnis (ConflictingTheories)

documentation

https://docs.rs/cadi

README

cadi

CADI CLI - Content-Addressed Development Interface

A universal build and distribution system treating all artifacts as content-addressed chunks with support for multiple representations (source → WASM IR → binaries → containers), registry federation, provenance tracking, and LLM optimization.

Installation

cargo install cadi

Quick Start

Scrape a Project

cadi scrape ./my-project --strategy semantic --output ./chunks

Publish to Registry

cadi publish \
  --registry https://registry.example.com \
  --auth-token YOUR_TOKEN \
  --namespace myorg/project

Build Artifacts

cadi build ./cadi.yaml

Query Registry

cadi query --registry URL --namespace myorg

Commands

scrape

Convert source code and files into CADI chunks.

cadi scrape <INPUT> [OPTIONS]

Options:
  --output DIR                    Output directory (default: ./cadi-chunks)
  --strategy <STRATEGY>           by-file|semantic|fixed-size|hierarchical|by-line-count
  --max-chunk-size BYTES          Maximum chunk size (default: 50MB)
  --include-overlap               Include context between chunks
  --hierarchy                     Create hierarchical relationships
  --extract-api                   Extract API surfaces
  --detect-licenses               Detect licenses
  --publish                       Publish after scraping
  --namespace NAMESPACE           Registry namespace
  --format FORMAT                 table|json

publish

Publish chunks to a registry.

cadi publish [OPTIONS]

Options:
  --registry URL                  Registry URL
  --auth-token TOKEN              Authentication token
  --namespace NAMESPACE           Namespace/project
  --batch-size N                  Chunks per request
  --dry-run                       Show what would be published

build

Execute build recipes.

cadi build <RECIPE> [OPTIONS]

Options:
  --output DIR                    Output directory
  --config FILE                   Configuration file

query

Query registry for chunks.

cadi query [OPTIONS]

Options:
  --registry URL                  Registry URL
  --namespace NAMESPACE           Filter by namespace
  --search QUERY                  Search terms
  --limit N                       Result limit

Configuration

Global Config

Create ~/.cadi/config.yaml:

default_registry: https://registry.example.com
default_namespace: myorg
auth:
  token: your-token-here
scraper:
  strategy: semantic
  max_chunk_size: 52428800
  extract_api_surface: true

Project Config

Create ./cadi.yaml in your project:

name: my-project
version: 1.0.0
namespace: myorg/my-project

scraping:
  strategy: semantic
  include_overlap: true
  languages:
    - rust
    - python

publishing:
  registry: https://registry.example.com
  namespace: myorg/my-project

Examples

Scrape and Publish

# Scrape with semantic chunking
cadi scrape ./src --strategy semantic --output ./chunks

# Publish to registry
cadi publish --registry https://registry.example.com \
  --auth-token $CADI_TOKEN \
  --namespace myorg/project

Batch Processing

for dir in ./projects/*; do
  echo "Processing $(basename $dir)..."
  cadi scrape "$dir" --strategy semantic
done

Integration with Pipelines

# GitHub Actions
cadi scrape . --strategy semantic --publish \
  --registry $REGISTRY_URL \
  --auth-token $REGISTRY_TOKEN \
  --namespace $GITHUB_REPOSITORY

Environment Variables

CADI_REGISTRY_URL          # Default registry
CADI_AUTH_TOKEN            # Authentication token
CADI_NAMESPACE             # Default namespace
CADI_CHUNKING_STRATEGY     # Default strategy
CADI_MAX_CHUNK_SIZE        # Default chunk size
CADI_EXTRACT_API           # Extract API surfaces
CADI_DETECT_LICENSES       # Detect licenses

Output Formats

Table Format (default)

ID                              Content Type    Size     Chunks
abc123def456                    application/rs  2,048    3
xyz789                          text/markdown   512      1

JSON Format

cadi scrape ./project --format json

Outputs structured JSON with full metadata:

{
  "chunks": [
    {
      "id": "abc123def456",
      "content_type": "application/rs",
      "size": 2048,
      "metadata": {
        "title": "hello.rs",
        "description": "Main binary",
        "keywords": ["rust", "cli"]
      }
    }
  ],
  "statistics": {
    "total_chunks": 10,
    "total_bytes": 102400,
    "duration_ms": 1200
  }
}

Troubleshooting

Authentication Failed

export CADI_AUTH_TOKEN="your-token"
cadi query --registry https://registry.example.com

Rate Limiting

Configure in ~/.cadi/config.yaml:

scraper:
  rate_limit_per_sec: 20
  request_timeout_secs: 30

Large Projects

For large projects, use hierarchical chunking:

cadi scrape ./large-project --strategy hierarchical

Integration with Other Tools

Docker

FROM rust:latest
RUN cargo install cadi
WORKDIR /project
COPY . .
RUN cadi scrape . --strategy semantic

CI/CD

# GitLab CI
build_chunks:
  image: rust:latest
  script:
    - cargo install cadi
    - cadi scrape . --strategy semantic --publish

System Requirements

  • Rust 1.70+
  • 4GB RAM (2GB minimum)
  • 100MB disk space
  • Network access for registry operations

Performance

Typical performance on modern hardware:

  • Scraping: 50-100 MB/sec
  • Publishing: Limited by network bandwidth
  • Querying: < 100ms per chunk

Documentation

  • User Guide: See SCRAPER-GUIDE.md for detailed documentation
  • API Docs: cargo doc --open (library crates)
  • Examples: See examples/ directory
  • Quick Start: SCRAPER-QUICKSTART.md

License

MIT License

Contributing

Contributions welcome! Please check repository for guidelines.

Support

Commit count: 66

cargo fmt