frozen-duckdb

Crates.io: frozen-duckdb
lib.rs: frozen-duckdb
Version: 0.1.0
Created: 2025-10-07 22:35:53.196512+00
Updated: 2025-10-07 22:35:53.196512+00
Description: Pre-compiled DuckDB binary for fast Rust builds - Drop-in replacement for duckdb-rs
Homepage: https://github.com/seanchatmangpt/frozen-duckdb
Repository: https://github.com/seanchatmangpt/frozen-duckdb
Documentation: https://github.com/seanchatmangpt/frozen-duckdb
ID: 1872900
Size: 13,630,324
Sean Chatman (seanchatmangpt)

README

🦆 Frozen DuckDB Binary

Pre-compiled DuckDB binary that never needs compilation - Fast builds forever!

This project provides architecture-specific DuckDB binaries that remove the slow compilation step from Rust projects using duckdb-rs. Instead of compiling DuckDB from source every time, builds link against pre-compiled official binaries, which cuts build times from minutes to seconds.

🚀 Performance

Build Type       Before        After          Improvement
First Build      1-2 minutes   7-10 seconds   85% faster
Incremental      30 seconds    0.11 seconds   99% faster
Release Builds   1-2 minutes   0.11 seconds   99% faster
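
The "Improvement" column reads as the relative reduction in build time, that is 1 - after/before. A minimal sketch of that arithmetic follows; `percent_faster` is a hypothetical helper for illustration and is not part of frozen-duckdb.

```rust
/// Percentage reduction in build time when going from `before_secs` to `after_secs`.
/// Hypothetical helper for illustration; not part of frozen-duckdb.
fn percent_faster(before_secs: f64, after_secs: f64) -> f64 {
    (1.0 - after_secs / before_secs) * 100.0
}
```

Under this definition, a 60 second first build dropping to 9 seconds is an 85% reduction, and an incremental build going from 30 seconds to 0.11 seconds is a reduction of about 99.6%.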

📦 What's Included

  • Architecture-specific DuckDB v1.4.0 binaries:
    • libduckdb_x86_64.dylib (55MB) - Intel Macs
    • libduckdb_arm64.dylib (50MB) - Apple Silicon Macs
  • C/C++ headers for development
  • Smart environment setup with automatic architecture detection
  • Build integration with Rust projects
  • Arrow compatibility patch for version conflicts
  • CLI tool for dataset management and TPC-H benchmark data generation

🛠️ Installation

Option 1: Download Pre-built Binary

# Clone this repository
git clone https://github.com/seanchatmangpt/frozen-duckdb.git
cd frozen-duckdb

# Set up environment (automatically detects your architecture)
source prebuilt/setup_env.sh

# Use in your Rust project
cargo build -p your-duckdb-crate

Option 2: Build from Source

# Clone this repository
git clone https://github.com/seanchatmangpt/frozen-duckdb.git
cd frozen-duckdb

# Build the frozen binary
./scripts/build_frozen_duckdb.sh

# Set up environment
source prebuilt/setup_env.sh

🔧 Integration with Your Rust Project

1. Update your Cargo.toml

[dependencies]
# Use prebuilt DuckDB - no compilation needed!
duckdb = { version = "1.4.0", default-features = false }

2. Add build script (build.rs)

use std::env;
use std::path::{Path, PathBuf};

fn main() {
    // Re-run this script whenever either variable changes, including when it
    // is set for the first time
    println!("cargo:rerun-if-env-changed=DUCKDB_LIB_DIR");
    println!("cargo:rerun-if-env-changed=DUCKDB_INCLUDE_DIR");

    // Link against the prebuilt DuckDB when DUCKDB_LIB_DIR is set
    if let Ok(lib_dir) = env::var("DUCKDB_LIB_DIR") {
        let lib_dir = Path::new(&lib_dir);
        // Headers default to <lib_dir>/include unless DUCKDB_INCLUDE_DIR is set
        let include_dir = env::var("DUCKDB_INCLUDE_DIR")
            .map(PathBuf::from)
            .unwrap_or_else(|_| lib_dir.join("include"));

        // Tell the linker where to find libduckdb and link it dynamically
        println!("cargo:rustc-link-search=native={}", lib_dir.display());
        println!("cargo:rustc-link-lib=dylib=duckdb");
        // Expose the header directory as build metadata
        println!("cargo:include={}", include_dir.display());
    } else {
        // Fall back to bundled if no prebuilt library is specified
        println!("cargo:warning=No DUCKDB_LIB_DIR specified, using bundled DuckDB");
    }
}
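
The build script's decision logic can be pulled into pure functions so that it can be checked without running Cargo. The sketch below is one way to do that, not the project's own code; `resolve_include_dir` and `link_directives` are hypothetical names introduced here, and the directives are the same three strings the script prints.

```rust
use std::path::{Path, PathBuf};

/// Resolve the header directory: an explicit value wins, otherwise `<lib_dir>/include`.
/// Hypothetical helper mirroring the fallback in the build script.
fn resolve_include_dir(lib_dir: &Path, include_dir: Option<&str>) -> PathBuf {
    match include_dir {
        Some(dir) => PathBuf::from(dir),
        None => lib_dir.join("include"),
    }
}

/// Build the Cargo directives printed when a prebuilt library is used.
fn link_directives(lib_dir: &Path, include_dir: Option<&str>) -> Vec<String> {
    vec![
        format!("cargo:rustc-link-search=native={}", lib_dir.display()),
        "cargo:rustc-link-lib=dylib=duckdb".to_string(),
        format!(
            "cargo:include={}",
            resolve_include_dir(lib_dir, include_dir).display()
        ),
    ]
}
```

In a build script, `main` would read `DUCKDB_LIB_DIR` and `DUCKDB_INCLUDE_DIR` from the environment, call `link_directives`, and print each returned line.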

3. Set up environment before building

# In your project directory
source /path/to/frozen-duckdb/prebuilt/setup_env.sh
cargo build

🎯 Usage Examples

Basic Usage

# Set up the frozen DuckDB environment (auto-detects architecture)
source prebuilt/setup_env.sh

# Build your project (now fast!)
cargo build -p your-duckdb-crate

# Run tests
cargo test -p your-duckdb-crate

CLI Dataset Management

The frozen-duckdb CLI provides utilities for managing test datasets:

# Generate TPC-H benchmark data (industry standard)
cargo run -- download --dataset tpch --output-dir datasets

# Generate Chinook sample data
cargo run -- download --dataset chinook --output-dir datasets

# Convert between formats
cargo run -- convert --input data.csv --output data.parquet --input-format csv --output-format parquet

# Show system information
cargo run -- info

TPC-H Dataset Features:

  • 8 realistic tables: customer, lineitem, nation, orders, part, partsupp, region, supplier
  • Scalable data: From tiny (SF 0.01) to massive datasets (SF 1000+)
  • Industry standard: Used for database benchmarking worldwide
  • Fast generation: SF 0.01 generates ~86k rows in <1 second
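
The "~86k rows" figure for SF 0.01 is consistent with the TPC-H base cardinalities, where six tables grow linearly with the scale factor and `nation` and `region` stay fixed. The sketch below estimates the total under that assumption; `approx_total_rows` is a hypothetical helper, the `lineitem` count is rounded to six million rows per scale-factor unit, and the rows actually produced by the CLI may differ slightly.

```rust
/// Approximate TPC-H row counts at scale factor 1 for the tables that scale.
/// `lineitem` is rounded; the real count is close to, not exactly, six million.
const SCALED_TABLES: [(&str, u64); 6] = [
    ("supplier", 10_000),
    ("part", 200_000),
    ("partsupp", 800_000),
    ("customer", 150_000),
    ("orders", 1_500_000),
    ("lineitem", 6_000_000),
];

/// `nation` (25 rows) and `region` (5 rows) do not grow with the scale factor.
const FIXED_ROWS: u64 = 25 + 5;

/// Estimate the total number of rows across all eight tables.
/// Hypothetical helper for illustration; not part of the frozen-duckdb CLI.
fn approx_total_rows(scale_factor: f64) -> u64 {
    let scaled: u64 = SCALED_TABLES
        .iter()
        .map(|(_, base)| (*base as f64 * scale_factor).round() as u64)
        .sum();
    scaled + FIXED_ROWS
}
```

With these numbers, `approx_total_rows(0.01)` gives 86,630 rows, which matches the order of magnitude quoted above.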

Architecture Detection

The setup script automatically detects your architecture:

source prebuilt/setup_env.sh
# 🍎 Detected Apple Silicon (arm64), using 50MB binary
# 🖥️  Detected x86_64 architecture, using 55MB binary
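
The same selection can be sketched in Rust, for example inside a build script. How `setup_env.sh` performs its detection is not shown here, so this is an analogue under stated assumptions rather than the script's logic; `prebuilt_lib_name` and `host_lib_name` are hypothetical names. Note that Rust reports Apple Silicon as `aarch64`, while the prebuilt file uses `arm64`.

```rust
/// Map an architecture name to the prebuilt library file name.
/// Accepts both Rust's "aarch64" and the "arm64" spelling used by the files.
fn prebuilt_lib_name(arch: &str) -> Option<&'static str> {
    match arch {
        "x86_64" => Some("libduckdb_x86_64.dylib"),
        "aarch64" | "arm64" => Some("libduckdb_arm64.dylib"),
        // No prebuilt binary is listed for other architectures
        _ => None,
    }
}

/// Library name for the architecture this code was compiled for, if one is shipped.
fn host_lib_name() -> Option<&'static str> {
    prebuilt_lib_name(std::env::consts::ARCH)
}
```

Returning `None` for unknown architectures lets the caller decide whether to fail or fall back to another way of obtaining DuckDB.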

CI/CD Integration

# GitHub Actions example
- name: Set up frozen DuckDB
  run: |
    git clone https://github.com/seanchatmangpt/frozen-duckdb.git
    source frozen-duckdb/prebuilt/setup_env.sh
    echo "DUCKDB_LIB_DIR=$DUCKDB_LIB_DIR" >> "$GITHUB_ENV"
    echo "DUCKDB_INCLUDE_DIR=$DUCKDB_INCLUDE_DIR" >> "$GITHUB_ENV"

- name: Build with frozen DuckDB
  run: cargo build --release

🔍 Troubleshooting

Library Not Found

If you get "library not found" errors:

# Check environment variables
echo $DUCKDB_LIB_DIR
echo $DUCKDB_INCLUDE_DIR

# Verify library exists
ls -la "$DUCKDB_LIB_DIR"/libduckdb*

# Re-source the environment
source prebuilt/setup_env.sh
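
The same checks can be run from Rust, for instance in a test or a small diagnostic binary. This is a minimal sketch using only the standard library; `find_duckdb_libs` and `diagnose` are hypothetical helpers and are not part of frozen-duckdb.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// List files in `dir` whose names start with "libduckdb", sorted by path.
fn find_duckdb_libs(dir: &Path) -> io::Result<Vec<PathBuf>> {
    let mut libs = Vec::new();
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        let is_match = path
            .file_name()
            .and_then(|name| name.to_str())
            .map_or(false, |name| name.starts_with("libduckdb"));
        if is_match {
            libs.push(path);
        }
    }
    libs.sort();
    Ok(libs)
}

/// Describe the state of the `DUCKDB_LIB_DIR` setup in one status line.
fn diagnose(lib_dir: Option<&str>) -> String {
    match lib_dir {
        None => "DUCKDB_LIB_DIR is not set; re-source prebuilt/setup_env.sh".to_string(),
        Some(dir) => match find_duckdb_libs(Path::new(dir)) {
            Ok(libs) if !libs.is_empty() => {
                format!("found {} DuckDB library file(s) in {}", libs.len(), dir)
            }
            Ok(_) => format!("no libduckdb* files in {}", dir),
            Err(err) => format!("cannot read {}: {}", dir, err),
        },
    }
}
```

A caller would typically pass the live environment value, as in `diagnose(std::env::var("DUCKDB_LIB_DIR").ok().as_deref())`, and print the returned line.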

Architecture Issues

If you need to force a specific architecture:

# Force x86_64 (Intel Mac)
ARCH=x86_64 source prebuilt/setup_env.sh

# Force arm64 (Apple Silicon)
ARCH=arm64 source prebuilt/setup_env.sh

Arrow Compatibility Issues

If you encounter Arrow version conflicts:

# Apply the Arrow patch
./scripts/apply_arrow_patch.sh

🏗️ Architecture

frozen-duckdb/
├── prebuilt/                 # Pre-compiled binaries
│   ├── libduckdb_x86_64.dylib # Intel Mac binary (55MB)
│   ├── libduckdb_arm64.dylib  # Apple Silicon binary (50MB)
│   ├── duckdb.h              # C header
│   ├── duckdb.hpp            # C++ header
│   └── setup_env.sh          # Smart environment setup
├── scripts/                  # Build and setup scripts
│   ├── build_frozen_duckdb.sh
│   ├── download_duckdb_binaries.sh
│   └── apply_arrow_patch.sh
├── examples/                 # Usage examples
└── README.md                # This file

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests if applicable
  5. Submit a pull request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

📊 Benchmarks

Before Frozen DuckDB

cargo build -p my-duckdb-crate
   Compiling libduckdb-sys v1.4.0
   Compiling duckdb v1.4.0
   Compiling my-duckdb-crate v0.1.0
    Finished dev profile [unoptimized + debuginfo] target(s) in 2m 15s

After Frozen DuckDB

source prebuilt/setup_env.sh
cargo build -p my-duckdb-crate
   Compiling my-duckdb-crate v0.1.0
    Finished dev profile [unoptimized + debuginfo] target(s) in 0.11s

Result: 99% faster builds! 🚀

💡 Architecture-Specific Benefits

  • Smaller downloads: Users only get the binary for their architecture
  • Faster setup: No need to download unused architecture code
  • Better performance: Native architecture optimization
  • Reduced storage: 50-55MB per user instead of 105MB universal binary