| Crates.io | video-transcriber-mcp |
| lib.rs | video-transcriber-mcp |
| version | 0.4.0 |
| created_at | 2025-11-26 15:32:25.203037+00 |
| updated_at | 2026-01-10 05:20:17.279646+00 |
| description | High-performance video transcription MCP server using whisper.cpp for faster transcription |
| homepage | |
| repository | https://github.com/nhatvu148/video-transcriber-mcp-rs |
| max_upload_size | |
| id | 1951559 |
| size | 209,313 |
High-performance video transcription MCP server using whisper.cpp (Rust)
A Model Context Protocol (MCP) server that transcribes videos from 1000+ platforms using whisper.cpp. Built with Rust for maximum performance and efficiency.
The easiest way to install with all dependencies:
brew install nhatvu148/tap/video-transcriber-mcp
This automatically installs the binary along with required dependencies (cmake, yt-dlp, ffmpeg).
If you have Rust installed:
cargo install video-transcriber-mcp
Note: You'll need to manually install dependencies: yt-dlp, ffmpeg, cmake
Download from GitHub Releases:
# macOS (Intel)
curl -L https://github.com/nhatvu148/video-transcriber-mcp-rs/releases/latest/download/video-transcriber-mcp-x86_64-apple-darwin.tar.gz | tar xz
sudo mv video-transcriber-mcp /usr/local/bin/
# macOS (Apple Silicon)
curl -L https://github.com/nhatvu148/video-transcriber-mcp-rs/releases/latest/download/video-transcriber-mcp-aarch64-apple-darwin.tar.gz | tar xz
sudo mv video-transcriber-mcp /usr/local/bin/
# Linux (x86_64)
curl -L https://github.com/nhatvu148/video-transcriber-mcp-rs/releases/latest/download/video-transcriber-mcp-x86_64-unknown-linux-gnu.tar.gz | tar xz
sudo mv video-transcriber-mcp /usr/local/bin/
# Windows: Download .zip from releases page
Note: You'll need to manually install dependencies: yt-dlp, ffmpeg
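A quick way to check whether the external tools are on your `PATH` is to try spawning them. This is a minimal sketch; `tool_available` is a hypothetical helper for illustration, not part of this crate:

```rust
use std::process::Command;

/// Returns true if `<name> --version` can be spawned, i.e. the tool
/// exists on PATH. (Hypothetical helper, not part of the crate.)
fn tool_available(name: &str) -> bool {
    Command::new(name).arg("--version").output().is_ok()
}

fn main() {
    for tool in ["yt-dlp", "ffmpeg"] {
        let status = if tool_available(tool) { "found" } else { "MISSING" };
        println!("{tool}: {status}");
    }
}
```

The `task deps:check` command mentioned below performs a similar check for you.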
This version uses whisper.cpp (C++ implementation with Rust bindings) instead of Python's OpenAI Whisper:
| Aspect | whisper.cpp (Rust) | OpenAI Whisper (Python) |
|---|---|---|
| Performance | Native C++ speed | Python interpreter overhead |
| Memory | Lower footprint | Higher memory usage |
| Startup | Instant (<100ms) | Slow (~2-3s model loading) |
| Dependencies | Standalone binary | Requires Python + packages |
| Portability | Single binary | Python environment needed |
Real-world performance depends on your hardware, video length, and chosen model.
The fastest way to get started:
# 1. Install Task (if not already installed)
brew install go-task/tap/go-task
# 2. Complete setup (build + download model)
task setup
# 3. Run a quick test
task test:quick
# Done! 🎉
Available Commands:
task setup # Complete project setup
task test:quick # Test with short video
task benchmark # Run performance benchmark
task deps:check # Check dependencies
task download:base # Download base model
task help # Show all commands
See Taskfile.yml for all available tasks.
The server supports two transport modes:
Standard I/O transport for local CLI usage with Claude Code. This is the default mode.
video-transcriber-mcp
# or explicitly:
video-transcriber-mcp --transport stdio
HTTP transport for remote access. Allows the MCP server to be accessed over the network.
# Start HTTP server on default port (8080)
video-transcriber-mcp --transport http
# Custom host and port
video-transcriber-mcp --transport http --host 0.0.0.0 --port 3000
Remote MCP Client Configuration:
For HTTP transport, configure your MCP client with the URL:
{
"mcpServers": {
"video-transcriber-mcp": {
"url": "http://localhost:8080/mcp"
}
}
}
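The endpoint URL in the config above is just the bind address plus the `/mcp` path. A hypothetical helper mirroring the `--host`/`--port` flags (an illustrative sketch, not an API exposed by this crate):

```rust
/// Build the URL an MCP client should point at when the server runs
/// with HTTP transport. (Hypothetical helper; mirrors the --host and
/// --port flags and the /mcp path shown in the config above.)
fn mcp_endpoint(host: &str, port: u16) -> String {
    format!("http://{}:{}/mcp", host, port)
}

fn main() {
    // Defaults from `video-transcriber-mcp --help`.
    println!("{}", mcp_endpoint("127.0.0.1", 8080));
    // Custom host/port, as in the example above.
    println!("{}", mcp_endpoint("0.0.0.0", 3000));
}
```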
Benefits of HTTP transport: the server can run on a different machine from the client, and a single long-running instance can serve multiple MCP clients.
video-transcriber-mcp --help
Options:
-t, --transport <TRANSPORT> Transport mode [default: stdio] [possible values: stdio, http]
--host <HOST> Host address for HTTP transport [default: 127.0.0.1]
-p, --port <PORT> Port for HTTP transport [default: 8080]
-h, --help Print help
-V, --version Print version
# Install Rust
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
# macOS
brew install yt-dlp
# Linux
pip install yt-dlp
# Windows
winget install yt-dlp.yt-dlp
# macOS
brew install ffmpeg
# Linux
sudo apt install ffmpeg # Debian/Ubuntu
sudo dnf install ffmpeg # Fedora
# Windows
choco install ffmpeg
# Clone the repository
git clone https://github.com/nhatvu148/video-transcriber-mcp-rs.git
cd video-transcriber-mcp-rs
# Build the project
cargo build --release
# The binary will be at: target/release/video-transcriber-mcp
# Download base model (recommended for testing)
bash scripts/download-models.sh base
# Or download all models
bash scripts/download-models.sh all
Models are stored in ~/.cache/video-transcriber-mcp/models/
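The lookup logic can be sketched like this, assuming the server honors `WHISPER_MODELS_DIR` (documented under Environment Variables below) and falls back to the cache directory, and that models use whisper.cpp's conventional `ggml-<name>.bin` file naming. The crate's actual implementation may differ:

```rust
use std::env;
use std::path::PathBuf;

/// Directory where model files are looked up: WHISPER_MODELS_DIR if
/// set, otherwise ~/.cache/video-transcriber-mcp/models/ per the docs.
/// (Sketch for illustration only.)
fn models_dir() -> PathBuf {
    if let Ok(dir) = env::var("WHISPER_MODELS_DIR") {
        return PathBuf::from(dir);
    }
    let home = env::var("HOME").unwrap_or_else(|_| String::from("."));
    PathBuf::from(home)
        .join(".cache")
        .join("video-transcriber-mcp")
        .join("models")
}

/// whisper.cpp distributes models as ggml-<name>.bin (e.g. ggml-base.bin).
fn model_file(name: &str) -> PathBuf {
    models_dir().join(format!("ggml-{name}.bin"))
}

fn main() {
    println!("{}", model_file("base").display());
}
```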
Add to ~/.claude/settings.json:
Option 1: If installed via GitHub Release or cargo install:
{
"mcpServers": {
"video-transcriber-mcp": {
"command": "video-transcriber-mcp",
"args": [],
"env": {
"RUST_LOG": "info"
}
}
}
}
Option 2: If built from source:
{
"mcpServers": {
"video-transcriber-mcp": {
"command": "/absolute/path/to/video-transcriber-mcp-rs/target/release/video-transcriber-mcp",
"args": [],
"env": {
"RUST_LOG": "info"
}
}
}
}
Then use in Claude Code:
Basic transcription (uses base model by default):
Please transcribe this YouTube video: https://www.youtube.com/watch?v=VIDEO_ID
Transcribe with specific model:
Transcribe this video using the large model for best accuracy:
https://www.youtube.com/watch?v=VIDEO_ID
Transcribe local video file:
Transcribe this local video file: /Users/myname/Videos/meeting.mp4
Transcribe in specific language:
Transcribe this Spanish video: https://www.youtube.com/watch?v=VIDEO_ID
(language: es, model: medium)
Community benchmarks comparing whisper.cpp with OpenAI Whisper report significant speedups, but the numbers are approximate: real-world transcription speed depends on your hardware, the chosen model, and the length of the video.
We're collecting real benchmark data! If you run both versions, please open an issue with your results to help improve this section.
| Model | Speed | Accuracy | Memory | Use Case |
|---|---|---|---|---|
| tiny | ⚡⚡⚡⚡⚡ | ⭐⭐ | ~400 MB | Quick drafts, testing |
| base | ⚡⚡⚡⚡ | ⭐⭐⭐ | ~600 MB | General use (default) |
| small | ⚡⚡⚡ | ⭐⭐⭐⭐ | ~1.2 GB | Better accuracy |
| medium | ⚡⚡ | ⭐⭐⭐⭐⭐ | ~2.5 GB | High accuracy |
| large | ⚡ | ⭐⭐⭐⭐⭐ | ~4.8 GB | Best accuracy, slowest |
Thanks to yt-dlp, this tool supports 1000+ video platforms, including YouTube, Vimeo, Twitch, and TikTok.
For each video, three files are generated in ~/Downloads/video-transcripts/:
video-id-title.txt # Plain text transcript
video-id-title.json # JSON with metadata and timestamps
video-id-title.md # Markdown with video info
# How to Build Fast Software
**Video:** https://www.youtube.com/watch?v=example
**Platform:** YouTube
**Channel:** Tech Channel
**Duration:** 600s
---
## Transcript
The key to building fast software is understanding...
---
*Transcribed using whisper.cpp (Rust) - Model: base*
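Deriving the three output paths from a common stem can be sketched as follows; `output_paths` is a hypothetical helper, and the exact stem (video id plus title) is produced by the server:

```rust
use std::path::{Path, PathBuf};

/// Build the .txt/.json/.md output paths for one transcript, matching
/// the layout shown above. (Hypothetical helper for illustration.)
fn output_paths(dir: &Path, stem: &str) -> [PathBuf; 3] {
    ["txt", "json", "md"].map(|ext| dir.join(format!("{stem}.{ext}")))
}

fn main() {
    let dir = Path::new("/tmp/video-transcripts");
    for p in output_paths(dir, "abc123-how-to-build-fast-software") {
        println!("{}", p.display());
    }
}
```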
# Custom models directory
export WHISPER_MODELS_DIR=~/.local/share/whisper-models
# Custom output directory
export TRANSCRIPTS_DIR=~/Documents/transcripts
# Log level
export RUST_LOG=info # or debug, warn, error
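Each of these variables follows the same "environment override with a documented default" pattern, which can be sketched as (variable names match the docs above; `env_or` is an illustrative helper, not the crate's API):

```rust
use std::env;

/// Read a config value from the environment, falling back to a
/// documented default. (Illustrative helper only.)
fn env_or(key: &str, default: &str) -> String {
    env::var(key).unwrap_or_else(|_| default.to_string())
}

fn main() {
    println!("models dir: {}", env_or("WHISPER_MODELS_DIR", "~/.cache/video-transcriber-mcp/models"));
    println!("output dir: {}", env_or("TRANSCRIPTS_DIR", "~/Downloads/video-transcripts"));
    println!("log level:  {}", env_or("RUST_LOG", "info"));
}
```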
# Debug build
cargo build
# Release build (optimized)
cargo build --release
# Run tests
cargo test
# Run with logging
RUST_LOG=debug cargo run -- --url "https://youtube.com/watch?v=example"
video-transcriber-mcp/
├── src/
│   ├── main.rs              # Entry point
│   ├── mcp/                 # MCP server implementation
│   │   ├── server.rs
│   │   └── types.rs
│   ├── transcriber/         # Core transcription logic
│   │   ├── engine.rs        # Main transcription orchestrator
│   │   ├── whisper.rs       # whisper.cpp integration
│   │   ├── downloader.rs    # yt-dlp wrapper
│   │   ├── audio.rs         # Audio processing
│   │   └── types.rs         # Data structures
│   └── utils/               # Utilities
│       └── paths.rs
├── scripts/                 # Helper scripts
│   └── download-models.sh   # Download Whisper models
├── Cargo.toml               # Rust dependencies
└── README.md
Contributions welcome! Please open an issue to discuss larger changes, or submit a pull request directly.
MIT License - see LICENSE file for details
I built the original video-transcriber-mcp in TypeScript. Here's why I rewrote it in Rust:
| Aspect | TypeScript Version | Rust Version |
|---|---|---|
| Transcription Speed | 5 min for 10-min video | 50s (6x faster) |
| Memory Usage | ~2 GB | ~800 MB (2.5x less) |
| Startup Time | ~2s | <100ms (20x faster) |
| Binary Size | N/A (Node.js runtime) | ~8 MB standalone |
| Dependencies | Node.js, Python, whisper | Just yt-dlp, ffmpeg |
| CPU Usage | High (Python overhead) | Lower (native code) |
The Rust version is production-ready and significantly more efficient!
Built with ❤️ in Rust for maximum performance