| Crates.io | mbox2db |
| lib.rs | mbox2db |
| version | 0.1.1 |
| created_at | 2025-11-03 23:57:24.630186+00 |
| updated_at | 2025-11-04 13:42:13.020969+00 |
| description | A fast, simple tool to convert large mbox email archives into optimized SQLite databases |
| homepage | https://github.com/ehamiter/mbox2db |
| repository | https://github.com/ehamiter/mbox2db |
| id | 1915462 |
| size | 50,982 |
A fast, simple Rust-based tool to convert large mbox email archives into optimized SQLite databases. Built for handling gigabyte-sized Gmail exports with maximum performance.
cargo install mbox2db
# Convert mbox to SQLite (excludes Spam/Trash by default)
mbox2db all-mail.mbox
# Output: 2025-11-04-emails.db (in current directory)
-- Count all emails
SELECT COUNT(*) FROM emails;
-- Get most recent emails
SELECT subject, from_addr, date_parsed
FROM emails
ORDER BY date_parsed DESC
LIMIT 10;
-- Search subject lines
SELECT subject, date_parsed, from_addr
FROM emails
WHERE subject LIKE '%keyword%'
ORDER BY date_parsed DESC;
-- Count emails by year
SELECT strftime('%Y', date_parsed) as year, COUNT(*)
FROM emails
WHERE date_parsed IS NOT NULL
GROUP BY year
ORDER BY year;
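The same queries can be run from a script. The sketch below uses Python's standard `sqlite3` module to run the per-year count; the helper name `emails_per_year` and the reduced in-memory table are illustrative assumptions so the example runs without a real export.

```python
import sqlite3

def emails_per_year(conn):
    """Return (year, count) pairs for rows that have a parsed date."""
    sql = """
        SELECT strftime('%Y', date_parsed) AS year, COUNT(*)
        FROM emails
        WHERE date_parsed IS NOT NULL
        GROUP BY year
        ORDER BY year
    """
    return conn.execute(sql).fetchall()

# Demo on an in-memory stand-in with only the columns this query needs.
# For a real run, connect to the generated file instead,
# e.g. sqlite3.connect("2025-11-04-emails.db").
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emails (id INTEGER PRIMARY KEY, subject TEXT, date_parsed TEXT)"
)
conn.executemany(
    "INSERT INTO emails (subject, date_parsed) VALUES (?, ?)",
    [
        ("a", "2024-03-01 09:00:00"),
        ("b", "2025-01-15 12:30:00"),
        ("c", "2025-06-09 08:11:52"),
        ("d", None),  # unparseable date, skipped by the WHERE clause
    ],
)
print(emails_per_year(conn))  # [('2024', 1), ('2025', 2)]
```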
mbox2db [OPTIONS] <INPUT>
Arguments:
<INPUT> Input mbox file path
Options:
-o, --output <OUTPUT> Custom output database path
-d, --destructive Overwrite existing database instead of auto-incrementing
--include-spam Include emails marked as Spam
--include-trash Include emails marked as Trash
--include-spam-and-trash Include both Spam and Trash emails
-h, --help Print help
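To preview how many messages the Spam/Trash flags would affect before converting, the labels in an mbox can be counted. This is a rough sketch, not the tool's own logic: it assumes the filter keys on the `X-Gmail-Labels` header that Gmail exports attach to each message, which this README does not confirm. The helper name `count_labels` and the sample messages are made up for illustration.

```python
import mailbox
import os
import tempfile
from collections import Counter

def count_labels(mbox_path):
    """Count messages per label listed in the X-Gmail-Labels header."""
    counts = Counter()
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for msg in box:
            raw = str(msg.get("X-Gmail-Labels", ""))
            for label in raw.split(","):
                label = label.strip()
                if label:
                    counts[label] += 1
    finally:
        box.close()
    return counts

# Two hypothetical messages written to a temporary mbox for the demo.
sample = "\n".join([
    "From a@example.com Mon Nov  3 10:00:00 2025",
    "From: a@example.com",
    "Subject: hello",
    "X-Gmail-Labels: Inbox,Important",
    "",
    "body one",
    "",
    "From b@example.com Mon Nov  3 11:00:00 2025",
    "From: b@example.com",
    "Subject: win a prize",
    "X-Gmail-Labels: Spam",
    "",
    "body two",
    "",
])
with tempfile.NamedTemporaryFile("w", suffix=".mbox", delete=False) as f:
    f.write(sample)
    path = f.name

counts = count_labels(path)
os.remove(path)
print(counts["Inbox"], counts["Spam"], counts["Trash"])  # 1 1 0
```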
Gmail exports arrive as a .zip or .tgz archive; the tool takes the extracted .mbox file.
Output files are dated (e.g. 2025-11-03-emails.db) and auto-increment to avoid overwriting.
# Build release binary
cargo build --release
# Binary will be at ./target/release/mbox2db
# Filters out Spam/Trash, creates dated output file
mbox2db all-mail.mbox
# Output: 2025-11-04-emails.db
# Running again on the same day creates incremented file
mbox2db all-mail.mbox
# Output: 2025-11-04-emails-0001.db
# Include spam emails only
mbox2db all-mail.mbox --include-spam
# Include trash emails only
mbox2db all-mail.mbox --include-trash
# Include both spam and trash
mbox2db all-mail.mbox --include-spam-and-trash
# Specify custom output location
mbox2db all-mail.mbox -o ~/Documents/my-emails.db
# Overwrite existing file (destructive mode)
mbox2db all-mail.mbox -d -o emails.db
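The dated, auto-incrementing file names in the examples above can be modeled as follows. This is a sketch inferred from the two sample names (`2025-11-04-emails.db`, then `2025-11-04-emails-0001.db`); the function name `next_output_path` is hypothetical and the tool's actual implementation may differ.

```python
import datetime
import os
import tempfile

def next_output_path(directory, day):
    """Return the first unused dated path, following the naming shown above."""
    base = f"{day.isoformat()}-emails"
    candidate = os.path.join(directory, base + ".db")
    n = 0
    while os.path.exists(candidate):
        n += 1
        # Assumed pattern: a four-digit, zero-padded suffix.
        candidate = os.path.join(directory, f"{base}-{n:04d}.db")
    return candidate

with tempfile.TemporaryDirectory() as tmp:
    day = datetime.date(2025, 11, 4)
    first = next_output_path(tmp, day)
    open(first, "w").close()  # simulate the file left by a first run
    second = next_output_path(tmp, day)

print(os.path.basename(first))   # 2025-11-04-emails.db
print(os.path.basename(second))  # 2025-11-04-emails-0001.db
```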
CREATE TABLE emails (
id INTEGER PRIMARY KEY AUTOINCREMENT,
from_addr TEXT,
to_addr TEXT,
cc TEXT,
bcc TEXT,
subject TEXT,
date TEXT, -- Original email date header
date_parsed TEXT, -- Parsed datetime in SQLite format (YYYY-MM-DD HH:MM:SS)
message_id TEXT,
in_reply_to TEXT,
refs TEXT, -- "references" header
content_type TEXT,
body_plain TEXT,
body_html TEXT
);
-- Indexes for fast queries
CREATE INDEX idx_from ON emails(from_addr);
CREATE INDEX idx_date ON emails(date);
CREATE INDEX idx_date_parsed ON emails(date_parsed);
CREATE INDEX idx_subject ON emails(subject);
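To check whether a query actually benefits from these indexes, SQLite's `EXPLAIN QUERY PLAN` can be inspected. The sketch below recreates a trimmed copy of the schema in memory (only some columns and indexes, as an assumption for brevity) and compares an exact match on `from_addr` with a leading-wildcard `LIKE`, which generally cannot seek into an index. The helper `query_plan` is illustrative, and the plan text varies between SQLite versions.

```python
import sqlite3

# Trimmed copy of the schema above; the real table has more columns.
SCHEMA = """
CREATE TABLE emails (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    from_addr TEXT,
    subject TEXT,
    date TEXT,
    date_parsed TEXT
);
CREATE INDEX idx_from ON emails(from_addr);
CREATE INDEX idx_date_parsed ON emails(date_parsed);
"""

def query_plan(conn, sql, params=()):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

exact = query_plan(
    conn, "SELECT id FROM emails WHERE from_addr = ?", ("user@example.com",)
)
fuzzy = query_plan(
    conn, "SELECT id FROM emails WHERE from_addr LIKE ?", ("%user@example.com%",)
)
print(exact)  # expected to mention a SEARCH using idx_from
print(fuzzy)  # typically a full SCAN, since the pattern starts with %
```

When the exact address is known, an equality match is the form most likely to use `idx_from`.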
-- Get emails from 2025
SELECT * FROM emails
WHERE date_parsed LIKE '2025%'
ORDER BY date_parsed DESC;
-- Get emails from date range (half-open, so all of Dec 31 is included)
SELECT subject, date_parsed, from_addr
FROM emails
WHERE date_parsed >= '2020-01-01'
AND date_parsed < '2021-01-01'
ORDER BY date_parsed DESC;
-- Count emails from specific sender
SELECT COUNT(*) FROM emails WHERE from_addr LIKE '%user@example.com%';
-- Search email body
SELECT subject, from_addr, date_parsed
FROM emails
WHERE body_plain LIKE '%search term%'
OR body_html LIKE '%search term%'
ORDER BY date_parsed DESC;
-- Find direct replies to a message via in_reply_to
SELECT * FROM emails
WHERE in_reply_to = '<some-message-id>'
ORDER BY date_parsed;
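Matching on `in_reply_to` returns only the direct replies to one message. To collect a whole conversation, a recursive common table expression can follow the reply chain from a root `message_id`. The sketch below assumes `in_reply_to` stores exactly one message ID in the same form as `message_id`; the sample rows and the name `THREAD_SQL` are made up for illustration.

```python
import sqlite3

# UNION (rather than UNION ALL) drops duplicate rows, so a malformed
# reply cycle cannot make the recursion run forever.
THREAD_SQL = """
WITH RECURSIVE thread(message_id, subject, date_parsed) AS (
    SELECT message_id, subject, date_parsed
    FROM emails
    WHERE message_id = ?
    UNION
    SELECT e.message_id, e.subject, e.date_parsed
    FROM emails AS e
    JOIN thread AS t ON e.in_reply_to = t.message_id
)
SELECT message_id, subject, date_parsed FROM thread ORDER BY date_parsed
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emails (id INTEGER PRIMARY KEY, message_id TEXT, "
    "in_reply_to TEXT, subject TEXT, date_parsed TEXT)"
)
conn.executemany(
    "INSERT INTO emails (message_id, in_reply_to, subject, date_parsed) "
    "VALUES (?, ?, ?, ?)",
    [
        ("<a@example.com>", None, "Plan", "2025-01-01 09:00:00"),
        ("<b@example.com>", "<a@example.com>", "Re: Plan", "2025-01-01 10:00:00"),
        ("<c@example.com>", "<b@example.com>", "Re: Plan", "2025-01-02 08:00:00"),
        ("<d@example.com>", None, "Unrelated", "2025-01-01 11:00:00"),
    ],
)

rows = conn.execute(THREAD_SQL, ("<a@example.com>",)).fetchall()
for message_id, subject, date_parsed in rows:
    print(date_parsed, message_id, subject)
```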
Optimized SQLite Settings:
Handles Large Files: Tested with multi-GB mbox files containing 80,000+ emails
Date Parsing: Handles malformed dates including:
--0400
9:47:11
Jun 09
Eastern Daylight Time, GMT-0700
7/19/2005 8:11:52 AM
License: MIT
Author: Eric Hamiter