pdfx

Crates.io	pdfx
lib.rs	pdfx
version	0.2.0
created_at	2025-08-27 00:03:37.661419+00
updated_at	2025-08-30 07:20:47.592399+00
description	A lightning-fast terminal-native PDF indexing and search toolkit
homepage	https://github.com/ionnss/pdfx
repository	https://github.com/ionnss/pdfx
id	1812039
size	15,132,101

(ionnss)

pdfx

A lightning-fast terminal-native PDF indexing and search toolkit

Features

Fast PDF Indexing: SQLite-powered database with metadata extraction
Lightning Search: Instant filename-based search across indexed PDFs
List & Browse: View all indexed PDFs with detailed information
Export Data: Export your PDF library to JSON, CSV, Markdown, YAML, and HTML
Cross-Platform: Native support for Linux, macOS, and Windows
Clean UI: Beautiful progress bars and organized output
Zero Dependencies: No external system requirements
Smart Cleanup: Complete data removal with pdfx cleanup
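
As an illustration of what filename-based, case-insensitive search means in practice, here is a minimal in-memory sketch. It is not pdfx's implementation (pdfx queries its SQLite index); the function names `name_matches` and `search_names` are hypothetical and exist only for this example.

```rust
/// Returns true when `query` occurs anywhere in `file_name`, ignoring case.
/// Hypothetical helper: pdfx itself runs its search against SQLite.
fn name_matches(file_name: &str, query: &str) -> bool {
    file_name.to_lowercase().contains(&query.to_lowercase())
}

/// Keeps only the file names that contain `query` (case-insensitive).
fn search_names<'a>(file_names: &[&'a str], query: &str) -> Vec<&'a str> {
    file_names
        .iter()
        .copied()
        .filter(|name| name_matches(name, query))
        .collect()
}
```

With this sketch, the query "rust programming" would match "The Rust Programming Language.pdf" because both sides are lowercased before comparison.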

Installation

From Source

# Clone the repository
git clone https://github.com/ionnss/pdfx.git
cd pdfx

# Build and install
cargo install --path .

From GitHub

cargo install --git https://github.com/ionnss/pdfx

Usage

Basic Commands

# Initialize PDF index
pdfx init                    # Index current directory
pdfx init ~/Documents        # Index specific directory
pdfx init ~                  # Index entire home directory

# Search indexed PDFs
pdfx search "machine learning"   # Search for keyword in filenames

# List all indexed PDFs
pdfx list                    # Show all PDFs with details

# Export your PDF library
pdfx export                  # Export all formats to Downloads folder
pdfx export --format json    # Export only JSON format
pdfx export --format csv,yaml # Export multiple formats

# Clean up
pdfx cleanup                 # Remove all indexed data
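
The `--format csv,yaml` form takes a comma-separated list. A rough sketch of how such a value could be parsed is shown below. The `ExportFormat` enum and `parse_formats` function are hypothetical, and the accepted spellings for Markdown (`md`, `markdown`) are an assumption, since the README does not state them.

```rust
/// Export formats listed in this README. The enum itself is hypothetical.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ExportFormat {
    Json,
    Csv,
    Markdown,
    Yaml,
    Html,
}

/// Parses a comma-separated value such as "csv,yaml" into a list of formats.
/// Duplicates are dropped; unknown names produce an error message.
fn parse_formats(arg: &str) -> Result<Vec<ExportFormat>, String> {
    let mut formats = Vec::new();
    for raw in arg.split(',') {
        let format = match raw.trim().to_lowercase().as_str() {
            "json" => ExportFormat::Json,
            "csv" => ExportFormat::Csv,
            // Assumed spellings; the README does not document them.
            "md" | "markdown" => ExportFormat::Markdown,
            "yaml" => ExportFormat::Yaml,
            "html" => ExportFormat::Html,
            other => return Err(format!("unknown export format: {other:?}")),
        };
        if !formats.contains(&format) {
            formats.push(format);
        }
    }
    Ok(formats)
}
```

Rejecting unknown names up front, rather than silently skipping them, keeps a typo like `--format jsn` from producing an empty export.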

Workflow Example

# 1. First time setup - index your PDFs
pdfx init ~/Documents
# ✅ Scan complete! 170 PDFs found | 2500 files processed | 25 directories skipped
# Indexed 170 PDFs in /Users/user/Library/Application Support/pdfx/db.sqlite

# 2. Browse your PDF library
pdfx list
# 📋 All Indexed PDFs
# 📊 Total: 170 PDFs
# 📄 1. The Rust Programming Language.pdf
#     Size: 14.37 MB
#     Path: /Users/user/Documents/books/rust.pdf
#     Modified: 2025-01-15 10:30:00

# 3. Search your indexed PDFs instantly
pdfx search "rust programming"

# 4. Export your library for sharing or backup
pdfx export
# Exporting 170 PDFs to /Users/user/Downloads/pdfx_exports
#   ✅ Generated pdfs.json
#   ✅ Generated pdfs.csv
#   ✅ Generated pdfs.md
#   ✅ Generated pdfs.yaml
#   ✅ Generated pdfs.html
# 🎉 Export complete!

# 5. When you're done (optional cleanup)
pdfx cleanup
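
The `Size: 14.37 MB` line in the list output above is a byte count rendered in a human-readable unit. A minimal sketch of that kind of formatting follows. It assumes binary units (1 MB = 1024 × 1024 bytes) and two decimal places; the README does not say which convention pdfx uses, and `format_size` is a hypothetical name.

```rust
/// Formats a byte count with two decimals, e.g. "14.37 MB".
/// Assumes 1024-based units; pdfx may use a different convention.
fn format_size(bytes: u64) -> String {
    const UNITS: [&str; 5] = ["B", "KB", "MB", "GB", "TB"];
    let mut value = bytes as f64;
    let mut unit = 0;
    // Divide down until the value fits the current unit or units run out.
    while value >= 1024.0 && unit < UNITS.len() - 1 {
        value /= 1024.0;
        unit += 1;
    }
    if unit == 0 {
        // Plain bytes are shown without decimals.
        format!("{bytes} B")
    } else {
        format!("{value:.2} {}", UNITS[unit])
    }
}
```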

Export Formats

pdfx supports multiple export formats for your PDF library:

Available Formats

JSON: Machine-readable format with full metadata
CSV: Spreadsheet-compatible format for data analysis
Markdown: Human-readable format with tables
YAML: Structured format for configuration files
HTML: Web-ready format for sharing online
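
To make the CSV format concrete, here is a small sketch of turning indexed entries into spreadsheet-compatible rows. The `PdfEntry` struct, its field names, and the column headers are assumptions modeled on the fields shown by `pdfx list` (name, path, size, modified); the actual export schema may differ.

```rust
/// Hypothetical record mirroring the fields printed by `pdfx list`.
struct PdfEntry {
    name: String,
    path: String,
    size_bytes: u64,
    modified: String,
}

/// Quotes a field when it contains a comma, quote, or line break,
/// doubling any embedded quotes (the usual CSV convention).
fn csv_field(value: &str) -> String {
    if value.contains(|c: char| matches!(c, ',' | '"' | '\n' | '\r')) {
        format!("\"{}\"", value.replace('"', "\"\""))
    } else {
        value.to_string()
    }
}

/// Renders a header line followed by one line per entry.
fn to_csv(entries: &[PdfEntry]) -> String {
    let mut out = String::from("name,path,size_bytes,modified\n");
    for entry in entries {
        out.push_str(&format!(
            "{},{},{},{}\n",
            csv_field(&entry.name),
            csv_field(&entry.path),
            entry.size_bytes,
            csv_field(&entry.modified)
        ));
    }
    out
}
```

Quoting matters here because PDF filenames often contain commas, which would otherwise shift columns when the file is opened in a spreadsheet.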

Export Examples

# Export all formats to Downloads folder
pdfx export

# Export specific formats
pdfx export --format json
pdfx export --format csv,yaml
pdfx export --format html

Export Location

Default: ~/Downloads/pdfx_exports/ (or the equivalent Downloads folder on your OS)
Files: pdfs.json, pdfs.csv, pdfs.md, pdfs.yaml, pdfs.html
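
The layout above (a `pdfx_exports` folder holding one `pdfs.<ext>` file per format) can be sketched as a simple path computation. The function name `export_file_paths` is hypothetical, and the Downloads directory is passed in as a parameter because this README does not describe how pdfx resolves it on each OS.

```rust
use std::path::{Path, PathBuf};

/// Builds the expected output paths, one per extension, under
/// `<downloads_dir>/pdfx_exports/`. Hypothetical helper for illustration.
fn export_file_paths(downloads_dir: &Path, extensions: &[&str]) -> Vec<PathBuf> {
    let export_dir = downloads_dir.join("pdfx_exports");
    extensions
        .iter()
        .map(|ext| export_dir.join(format!("pdfs.{ext}")))
        .collect()
}
```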

Database & Storage

Where Your Data Lives

# macOS
~/Library/Application Support/pdfx/db.sqlite

# Linux
~/.local/share/pdfx/db.sqlite

# Windows
%APPDATA%\pdfx\db.sqlite
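
The three locations above follow one pattern: a per-OS data directory, then `pdfx/db.sqlite`. A minimal sketch of that mapping is below. The `Platform` enum and `db_path` function are hypothetical; `base` is assumed to be the home directory on macOS and Linux and the value of `%APPDATA%` on Windows. The sketch ignores overrides such as `XDG_DATA_HOME`, which pdfx may or may not honor.

```rust
use std::path::{Path, PathBuf};

/// Target platforms named in this README. Hypothetical type.
#[derive(Clone, Copy)]
enum Platform {
    MacOs,
    Linux,
    Windows,
}

/// Returns the database path for a platform.
/// `base` is the home directory (macOS, Linux) or %APPDATA% (Windows).
fn db_path(platform: Platform, base: &Path) -> PathBuf {
    let data_dir = match platform {
        Platform::MacOs => base.join("Library").join("Application Support"),
        Platform::Linux => base.join(".local").join("share"),
        // On Windows, %APPDATA% is already the data directory.
        Platform::Windows => base.to_path_buf(),
    };
    data_dir.join("pdfx").join("db.sqlite")
}
```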

Privacy & Security

Local Storage Only: No cloud, no tracking, no data sharing
SQLite Database: Industry-standard, portable format
Complete Cleanup: pdfx cleanup removes all traces

Requirements

Rust: 1.70 or later
Operating System: Linux, macOS, or Windows
Terminal: Any modern terminal with Unicode support

Development

Setup

git clone https://github.com/ionnss/pdfx.git
cd pdfx
cargo build
cargo run -- --help

Project Structure

src/
├── cli/          # Command-line interface
├── database/     # SQLite database operations
├── indexer/      # PDF file discovery and indexing
├── helpers/      # Utility functions
└── types.rs      # Core data structures

Contributing

We welcome contributions! Here's how you can help:

Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

Troubleshooting

Common Issues

Q: "Permission denied" errors during scanning

# This is normal on macOS/Linux - system directories are protected
# pdfx will skip these and continue scanning accessible directories
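
The skip-and-continue behavior described above can be sketched as follows. This is an illustration only: pdfx lists walkdir among its dependencies, while this sketch swaps in plain `std::fs` so it stays self-contained. `ScanReport` and `scan` are hypothetical names, and the counters loosely mirror the "PDFs found / files processed / directories skipped" summary that `pdfx init` prints.

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// Hypothetical summary of a scan.
#[derive(Default)]
struct ScanReport {
    pdfs: Vec<PathBuf>,
    files_seen: usize,
    dirs_skipped: usize,
}

/// Recursively scans `dir`. A directory that cannot be read (for example
/// because of "Permission denied") is counted as skipped, not treated as fatal.
fn scan(dir: &Path, report: &mut ScanReport) {
    let entries = match fs::read_dir(dir) {
        Ok(entries) => entries,
        Err(_) => {
            report.dirs_skipped += 1;
            return;
        }
    };
    // Entries that fail to load individually are ignored by `flatten`.
    for entry in entries.flatten() {
        let path = entry.path();
        // `file_type` does not follow symlinks, so link cycles are not entered.
        match entry.file_type() {
            Ok(kind) if kind.is_dir() => scan(&path, report),
            Ok(kind) if kind.is_file() => {
                report.files_seen += 1;
                let is_pdf = path
                    .extension()
                    .and_then(|ext| ext.to_str())
                    .map_or(false, |ext| ext.eq_ignore_ascii_case("pdf"));
                if is_pdf {
                    report.pdfs.push(path);
                }
            }
            _ => {}
        }
    }
}
```

Counting skipped directories instead of aborting is what lets a scan of a whole home directory finish even when some system folders are protected.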

Q: Database seems corrupted or giving errors

pdfx cleanup    # Remove database and start fresh
pdfx init       # Rebuild index

Q: Where is my data stored?

# View database location after running pdfx init
# Path is shown in success message
# Use `pdfx cleanup` to remove all data

Roadmap

Current Status (v0.2.0)

✅ PDF Indexing: SQLite-based PDF database with metadata
✅ Filename Search: Fast, case-insensitive filename search
✅ List Command: Display all indexed PDFs with detailed information
✅ Export Data: Export to JSON, CSV, Markdown, YAML, and HTML formats
✅ Cross-Platform: Works on Linux, macOS, and Windows
✅ Clean UI: Progress bars and organized output

Planned Features

📅 Recent Command: Show recently modified PDFs
🔍 Advanced Search: Filter by size, date, path
📊 Statistics: Show indexing statistics and storage usage
🏷️ Tagging System: Categorize and tag PDFs for better organization

See FUTURE.md for detailed roadmap and feature plans.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

Built with:

Rust - Systems programming language
rusqlite - SQLite database operations
clap - Command-line argument parsing
indicatif - Progress bars and spinners
walkdir - Recursive directory traversal
chrono - Date and time handling

Commit count: 69
