| Field | Value |
|---|---|
| Crates.io | check_hashes |
| lib.rs | check_hashes |
| version | 1.0.0 |
| created_at | 2025-04-28 09:42:51.94467+00 |
| updated_at | 2025-04-28 09:42:51.94467+00 |
| description | A fast and parallelized tool to detect duplicate files in a directory based on hashes. |
| homepage | https://github.com/jmg/check_hashes |
| repository | https://github.com/jmg/check_hashes |
| max_upload_size | |
| id | 1651994 |
| size | 32,303 |
A fast and efficient tool to detect duplicate files in a directory based on file content.
Install Rust (if you don't already have it):

```bash
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
```
Clone and build the project:

```bash
git clone https://github.com/jmg/check_hashes.git
cd check_hashes
cargo build --release
```
Run it with Cargo:

```bash
cargo run -- --path /path/to/your/directory
```

Or using the compiled release binary:

```bash
./target/release/check_hashes --path /path/to/your/directory
```
For example, to scan your Downloads folder:

```bash
cargo run -- --path ./Downloads
```
Sample output:

```text
Scanning files...
Found 5321 files. Computing partial hashes...
Grouping files by partial hash...
421 candidate files after partial hashing. Computing full hashes...
Grouping by full hash...

❌ Duplicates found:

Group 1 (2 files) - Hash: d2f1d7e91c8b...
  /path/to/file1.jpg
  /path/to/file1_copy.jpg

Group 2 (3 files) - Hash: a34e1b1fe98d...
  /path/to/doc1.pdf
  /path/to/backup/doc1.pdf
  /path/to/archive/old/doc1.pdf

Found 2 duplicate groups.
Summary: Scanned 5321 files in 1m 12s.
```
| Argument | Description | Example |
|---|---|---|
| `--path` or `-p` | Directory to scan recursively | `--path ./Documents` |
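For reference, here is a minimal sketch of how such a CLI could be defined with clap's derive API (assuming clap 4 with the `derive` feature enabled); the `Args` struct and its fields are illustrative, not the crate's actual source:

```rust
use std::path::PathBuf;

use clap::Parser;

/// Detect duplicate files in a directory based on hashes.
#[derive(Parser)]
struct Args {
    /// Directory to scan recursively
    #[arg(short, long)]
    path: PathBuf,
}

fn main() {
    let args = Args::parse();
    println!("Scanning {} ...", args.path.display());
}
```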
As the sample output shows, files are first grouped by a cheap partial hash of their initial bytes; full-content hashes are computed only for files whose partial hashes collide. This two-step approach makes the tool very fast even for very large folders.
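Below is a minimal sketch of that two-step pipeline using the `walkdir`, `rayon`, and `blake3` crates listed in the dependencies; the constant `PARTIAL_LEN`, the function names, and the overall structure are illustrative assumptions, not the crate's actual implementation:

```rust
use std::collections::HashMap;
use std::fs::File;
use std::io::Read;
use std::path::PathBuf;

use rayon::prelude::*;
use walkdir::WalkDir;

/// Bytes hashed in the cheap first pass (illustrative value).
const PARTIAL_LEN: u64 = 4096;

/// Hash only the first PARTIAL_LEN bytes of a file.
fn partial_hash(path: &PathBuf) -> std::io::Result<blake3::Hash> {
    let mut buf = Vec::with_capacity(PARTIAL_LEN as usize);
    File::open(path)?.take(PARTIAL_LEN).read_to_end(&mut buf)?;
    Ok(blake3::hash(&buf))
}

/// Hash the entire file contents in streaming fashion.
fn full_hash(path: &PathBuf) -> std::io::Result<blake3::Hash> {
    let mut hasher = blake3::Hasher::new();
    let mut file = File::open(path)?;
    let mut buf = [0u8; 64 * 1024];
    loop {
        let n = file.read(&mut buf)?;
        if n == 0 {
            break;
        }
        hasher.update(&buf[..n]);
    }
    Ok(hasher.finalize())
}

/// Group paths by the hex hash produced by `f`, keeping only groups
/// with more than one member (the only ones that can hold duplicates).
fn group_by_hash(
    paths: Vec<PathBuf>,
    f: fn(&PathBuf) -> std::io::Result<blake3::Hash>,
) -> HashMap<String, Vec<PathBuf>> {
    let hashed: Vec<(String, PathBuf)> = paths
        .into_par_iter() // hash files in parallel with rayon
        .filter_map(|p| f(&p).ok().map(|h| (h.to_hex().to_string(), p)))
        .collect();
    let mut groups: HashMap<String, Vec<PathBuf>> = HashMap::new();
    for (h, p) in hashed {
        groups.entry(h).or_default().push(p);
    }
    groups.retain(|_, g| g.len() > 1);
    groups
}

fn main() {
    // Step 0: collect every regular file under the target directory.
    let files: Vec<PathBuf> = WalkDir::new(".")
        .into_iter()
        .filter_map(Result::ok)
        .filter(|e| e.file_type().is_file())
        .map(|e| e.into_path())
        .collect();

    // Step 1: cheap partial hashes narrow the field to candidates.
    let candidates: Vec<PathBuf> = group_by_hash(files, partial_hash)
        .into_values()
        .flatten()
        .collect();

    // Step 2: full-content hashes confirm the real duplicates.
    for (hash, group) in group_by_hash(candidates, full_hash) {
        println!("Group ({} files) - Hash: {}", group.len(), hash);
        for path in &group {
            println!("  {}", path.display());
        }
    }
}
```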
This project uses:

- `blake3` for fast cryptographic hashing.
- `clap` for argument parsing.
- `rayon` for parallel processing.
- `indicatif` for progress bars.
- `colored` for colored terminal output.
- `walkdir` for recursive file walking.
- `memmap2` for memory-mapping large files (sketched below).

All dependencies are installed automatically when you run `cargo build`.
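To illustrate the `memmap2` item above, here is one way a large file can be hashed through a read-only memory map instead of buffered reads; the function name and the choice of when to use it are assumptions for the sketch, not the crate's actual code:

```rust
use std::fs::File;
use std::path::Path;

use memmap2::Mmap;

/// Hash a file's contents through a read-only memory map,
/// avoiding an extra copy through a userspace buffer.
fn hash_mmapped(path: &Path) -> std::io::Result<blake3::Hash> {
    let file = File::open(path)?;
    // SAFETY: the map is read-only; the usual mmap caveat applies if
    // another process truncates or rewrites the file while it is mapped.
    let map = unsafe { Mmap::map(&file)? };
    Ok(blake3::hash(&map))
}
```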
This project is licensed under the MIT License. See LICENSE for more information.