| Crates.io | ddb-core |
| lib.rs | ddb-core |
| version | 2.1.0 |
| created_at | 2025-10-15 11:58:45.451922+00 |
| updated_at | 2025-10-15 17:19:59.733135+00 |
| description | DDB v2: A high-performance SQL interface for flat files written in Rust |
| homepage | https://github.com/watkinslabs/ddb/ |
| repository | https://github.com/watkinslabs/ddb/ |
| max_upload_size | |
| id | 1884236 |
| size | 1,800,633 |
A high-performance, secure SQL interface for CSV and delimited files.
Query flat files with full SQL power - no database server required. Built in Rust for maximum performance and safety.
SELECT u.name, COUNT(o.id) as orders, SUM(o.total) as revenue
FROM users.csv u
INNER JOIN orders.csv o ON u.id = o.user_id
GROUP BY u.name
HAVING revenue > 10000
ORDER BY revenue DESC
This is DDB v2, a complete rewrite in Rust. It provides better performance, memory safety, and security while maintaining the core functionality of querying delimited files with SQL.
Perfect for:
Example scenarios:
# Analyze web server logs
ddb --query "SELECT ip, COUNT(*) as count FROM access.log GROUP BY ip ORDER BY count DESC LIMIT 10"
# Join sales data from multiple CSV exports
ddb --query "
SELECT r.region, SUM(s.amount) as total_sales
FROM sales.csv s
JOIN regions.csv r ON s.region_id = r.id
GROUP BY r.region
"
# Update customer records in place
ddb --query "UPDATE customers.csv SET status = 'premium' WHERE total_purchases > 10000"
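Conceptually, an in-place UPDATE on a delimited file is a read-modify-rewrite over every row. The sketch below is a hypothetical Python illustration of that pattern only, not DDB's Rust implementation (which also takes a file lock via fs2 before writing):

```python
import csv
import io

# Hypothetical sketch: read all rows, apply the SET clause where the
# WHERE predicate matches, then write the whole file back out.
def update_csv(text, set_field, set_value, predicate):
    rows = list(csv.DictReader(io.StringIO(text)))
    for row in rows:
        if predicate(row):
            row[set_field] = set_value
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=rows[0].keys())
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()

data = "name,total_purchases,status\nAlice,15000,basic\nBob,200,basic\n"
updated = update_csv(data, "status", "premium",
                     lambda r: int(r["total_purchases"]) > 10000)
```

Only Alice's row crosses the threshold, so only her status field is rewritten.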
✅ Fully Implemented (v0.1.0):
SQL Query Support:
Data Modification Operations:
101+ SQL Functions:
System Variables:
@@VARIABLE syntax: @@VERSION, @@DB_NAME, @@DB_TYPE, etc.
Output Formats:
Performance:
MCP Server:
--mcp flag
DDL Operations:
❌ Not Yet Implemented:
# 1. Build DDB
cargo build --release
# 2. Create configuration directory
mkdir -p ~/.ddb/schemas
# 3. Create main config file
cat > ~/.ddb/config.yaml <<EOF
default_database: main
schema_dir: ~/.ddb/schemas
default_delimiter: ','
data_starts_on: 0
default_output_format: table
EOF
# 4. Create a table schema using SQL (example: users.sql)
cat > ~/.ddb/schemas/users.sql <<EOF
CREATE TABLE IF NOT EXISTS users (
id INTEGER NOT NULL,
name STRING NOT NULL,
email STRING
)
FILE '/path/to/your/users.csv'
DELIMITER ','
DATA_STARTS_ON 1
COMMENT_CHAR '#';
EOF
# 5. Query your data!
./target/release/ddb --query "SELECT * FROM users WHERE id = 1"
# 6. Use with JOIN and aggregation
./target/release/ddb --query "
SELECT u.name, COUNT(o.id) as order_count
FROM users u
INNER JOIN orders o ON u.id = o.user_id
GROUP BY u.name
HAVING order_count > 5
"
# Development build
cargo build
# Release build (optimized)
cargo build --release
# Run tests
cargo test
# Run benchmarks
cargo bench
# Execute a SQL query
./target/release/ddb --query "SELECT * FROM users WHERE id = 123"
# With debug output
./target/release/ddb --query "SELECT * FROM test" --debug
# Specify config directory
./target/release/ddb --query "SELECT * FROM test" --config ./config
# Show version
./target/release/ddb version
SELECT with JOIN:
SELECT u.name, o.order_id, o.total
FROM users u
INNER JOIN orders o ON u.id = o.user_id
WHERE o.total > 100
ORDER BY o.total DESC
GROUP BY with HAVING:
SELECT category, COUNT(*) as total, AVG(price) as avg_price
FROM products
GROUP BY category
HAVING COUNT(*) > 5 AND AVG(price) > 50
ORDER BY total DESC
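A GROUP BY/HAVING query like the one above reduces to three steps: bucket rows by key, aggregate each bucket, then filter whole buckets. A minimal sketch of those semantics (hypothetical in-memory rows standing in for products, with smaller thresholds to keep the example tiny):

```python
from collections import defaultdict

# Hypothetical rows standing in for products.csv
products = [
    {"category": "books", "price": 12.0},
    {"category": "books", "price": 30.0},
    {"category": "tools", "price": 99.0},
]

# GROUP BY category
groups = defaultdict(list)
for row in products:
    groups[row["category"]].append(row["price"])

# Aggregate each group, then apply HAVING COUNT(*) > 1 AND AVG(price) > 15
result = [
    {"category": cat, "total": len(p), "avg_price": sum(p) / len(p)}
    for cat, p in groups.items()
    if len(p) > 1 and sum(p) / len(p) > 15
]

# ORDER BY total DESC
result.sort(key=lambda r: r["total"], reverse=True)
```

The key point HAVING illustrates: the filter runs against aggregated groups, not individual rows, which is why it can reference COUNT(*) and AVG(price).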
INSERT data:
INSERT INTO users (id, name, email, age)
VALUES (101, 'John Doe', 'john@example.com', 30)
UPDATE records:
UPDATE users
SET age = 31, email = 'newemail@example.com'
WHERE name = 'John Doe'
DELETE records:
DELETE FROM users WHERE age < 18
UPSERT (insert or update):
UPSERT INTO users (id, name, email, age)
VALUES (101, 'John Doe', 'updated@example.com', 32)
ON id
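UPSERT ... ON id merges on the named key column: a row whose key already exists is updated in place, otherwise the new row is appended. A hypothetical sketch of that merge behavior (illustrative only, not DDB internals):

```python
# Update the row whose key matches; otherwise append a new row.
def upsert(rows, new_row, key):
    for row in rows:
        if row[key] == new_row[key]:
            row.update(new_row)  # merge new values into the existing row
            return rows
    rows.append(new_row)
    return rows

users = [{"id": 101, "name": "John Doe", "email": "john@example.com", "age": 30}]
upsert(users, {"id": 101, "email": "updated@example.com", "age": 32}, key="id")
upsert(users, {"id": 102, "name": "Jane Roe"}, key="id")
```

After the two calls, user 101 keeps its name but carries the updated email and age, and user 102 is appended as a new row.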
Advanced aggregation:
SELECT
department,
COUNT(*) as employee_count,
AVG(salary) as avg_salary,
MIN(salary) as min_salary,
MAX(salary) as max_salary,
STDDEV(salary) as salary_stddev
FROM employees
GROUP BY department
HAVING AVG(salary) > 50000
ORDER BY avg_salary DESC
DDB includes an MCP (Model Context Protocol) server for seamless integration with AI assistants like Claude.
# Build (MCP included by default)
cargo build --release
# Run the MCP server
./target/release/ddb --mcp
The MCP server provides:
- execute_query - Full CRUD support (SELECT with JOIN/GROUP BY/HAVING, INSERT, UPDATE, DELETE, UPSERT)
- list_tables - List all configured tables
- describe_table - Get table schema and metadata

See MCP_SERVER.md for complete documentation.
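MCP servers speak JSON-RPC 2.0 over stdio, and tools are invoked with the spec's tools/call method. A sketch of the request a client would write to the server's stdin to run execute_query (the query string is purely illustrative):

```python
import json

# JSON-RPC 2.0 request invoking the execute_query tool, shaped as the
# MCP spec's tools/call method. A client sends this line to the server's
# stdin after the usual MCP initialize handshake.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "execute_query",
        "arguments": {"query": "SELECT * FROM users WHERE id = 1"},
    },
}
wire = json.dumps(request)
```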
DDB uses configuration files located in ~/.ddb/ (or a custom directory via --config).
Main configuration (~/.ddb/config.yaml):
default_database: main
schema_dir: ~/.ddb/schemas
default_delimiter: ','
data_starts_on: 0 # Line where data starts (0 = after header)
comment_char: '#' # Optional: lines starting with this are ignored
default_output_format: table
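Together, data_starts_on and comment_char decide which raw lines count as records. A minimal sketch of that line filtering (hypothetical helper, one plausible reading of these options; DDB's actual reader is the Rust file_io module):

```python
# Hypothetical sketch: skip lines before the data start index, then drop
# comment lines. Here a header on line 0 with data_starts_on = 1 means
# records begin on line 1.
def data_lines(lines, data_starts_on=0, comment_char="#"):
    return [
        line for i, line in enumerate(lines)
        if i >= data_starts_on and not line.startswith(comment_char)
    ]

raw = ["id,name", "# generated export", "1,Alice", "2,Bob"]
records = data_lines(raw, data_starts_on=1)
```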
Table schemas (~/.ddb/schemas/users.sql):
Tables are now defined using SQL CREATE TABLE statements:
CREATE TABLE IF NOT EXISTS users (
id INTEGER NOT NULL,
name STRING NOT NULL,
email STRING,
age INTEGER,
created_at DATETIME NOT NULL
)
FILE '/path/to/users.csv'
DELIMITER ','
DATA_STARTS_ON 1
COMMENT_CHAR '#'
QUOTE_CHAR '"';
CREATE TABLE Options:
- IF NOT EXISTS - Don't error if table already exists
- FILE 'path' - Path to data file
- DELIMITER 'char' - Field delimiter (default: ,)
- DATA_STARTS_ON n - Line number where data starts
- COMMENT_CHAR 'char' - Lines starting with this are ignored
- QUOTE_CHAR 'char' - Character used to quote fields

Data Types:
- Integer - 64-bit signed integers
- Float - 64-bit floating point
- String - UTF-8 text
- Boolean - true/false
- Date - Date only (YYYY-MM-DD)
- DateTime - Date and time
- Time - Time only (HH:MM:SS)

~/.ddb/
├── config.yaml          # Main configuration
└── schemas/
    ├── users.sql        # Table definition (CREATE TABLE statement)
    ├── orders.sql       # Table definition (CREATE TABLE statement)
    ├── products.sql     # Table definition (CREATE TABLE statement)
    └── ...
src/
├── lib.rs               # Main library entry point
├── error.rs             # Error types and Result alias
├── lexer/               # Tokenization
│   ├── mod.rs
│   ├── types.rs         # Token types
│   └── tokenizer.rs     # SQL tokenizer
├── parser/              # SQL parsing
│   ├── mod.rs
│   └── ast.rs           # Abstract syntax tree
├── config/              # Configuration management
│   └── mod.rs           # Database, Table, Column structs
├── file_io/             # File operations
│   ├── locking.rs       # File locking for concurrency
│   └── reader.rs        # Streaming line reader
├── engine/              # Query execution
├── methods/             # SQL operations (SELECT, INSERT, etc.)
├── functions/           # SQL functions
├── output/              # Output formatters
├── mcp/                 # Model Context Protocol server
│   ├── mod.rs
│   ├── server.rs        # MCP server implementation (JSON-RPC over stdio)
│   ├── tools.rs         # MCP tools (3 tools)
│   ├── resources.rs     # MCP resources (2 types)
│   └── prompts.rs       # MCP prompts (2 prompts)
└── bin/
    └── ddb.rs           # CLI binary (includes --mcp mode)
Dependencies:
- nom - Fast parser combinators for SQL tokenization
- serde - Serialization/deserialization
- chrono - Date/time functions
- fs2 - File locking for concurrent access
- memmap2 - Memory-mapped file I/O for large files
- rayon - Data parallelism for aggregations
- clap - CLI argument parsing
- thiserror/anyhow - Error handling
- regex - Pattern matching for LIKE operations
- uuid - UUID generation

Run benchmarks to validate performance:
# Run all benchmarks
cargo bench
# View detailed results
open target/criterion/report/index.html
Performance Highlights:
See BENCHMARKS.md for comprehensive benchmark results with detailed performance graphs and human-readable time conversions.
Contributions welcome! Key areas for future development:
Creative Commons Attribution-Noncommercial-Share Alike (CC-BY-NC-SA-4.0)