Crates.io | dedup-advanced |
lib.rs | dedup-advanced |
version | 1.2.0 |
source | src |
created_at | 2021-07-07 06:43:26.621154 |
updated_at | 2021-08-27 16:57:46.938716 |
description | Fast and accurate deduplication tool to be used with blockhash |
homepage | https://github.com/installgentoo/dedup_advanced |
repository | https://github.com/installgentoo/dedup_advanced |
id | 419805 |
size | 55,140 |
Finds duplicate images from a blockhash list.
Very useful and much faster than the alternatives I could find: about 40 times faster than geeqie on truly large filesets. Compile this, then download and compile the C blockhash from http://blockhash.io.
Run blockhash on your files with:
find . -regextype posix-egrep -regex ".*\.(png|jpe?g)$" -type f -printf '"%p"\n' | xargs -n1 -IF -P8 blockhash F >> hashes
Then run this program on the hashes list and redirect its output into a file; you'll get a list of duplicates separated by newlines. You can then script your way around the list in bash, for example by loading all found files into something like the dedup functionality in geeqie and deleting the copies manually.
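For intuition about what comparing blockhash values involves, here is a minimal Rust sketch. It is not taken from this crate, and the naive pairwise loop says nothing about how dedup-advanced achieves its speed. It assumes each line of the hashes file has the form "<hex hash> <whitespace> <path>", and that two images count as near-duplicates when their hashes differ in at most max_dist bits. The names hamming_hex, parse_line, near_duplicates and max_dist are hypothetical and chosen for illustration only.

```rust
/// Hamming distance (number of differing bits) between two hex-encoded hashes.
/// Returns None if the lengths differ or a character is not a hex digit.
fn hamming_hex(a: &str, b: &str) -> Option<u32> {
    if a.len() != b.len() {
        return None;
    }
    let mut dist = 0;
    for (x, y) in a.chars().zip(b.chars()) {
        // Each hex digit carries 4 bits; XOR marks the bits that differ.
        let dx = x.to_digit(16)?;
        let dy = y.to_digit(16)?;
        dist += (dx ^ dy).count_ones();
    }
    Some(dist)
}

/// Splits one line into (hash, path).
/// Assumed line format: hash, whitespace, path.
fn parse_line(line: &str) -> Option<(&str, &str)> {
    let (hash, path) = line.split_once(char::is_whitespace)?;
    Some((hash, path.trim_start()))
}

/// Naive O(n^2) comparison: returns every pair of paths whose hashes
/// differ in at most `max_dist` bits. Unparseable lines are skipped.
fn near_duplicates<'a>(lines: &[&'a str], max_dist: u32) -> Vec<(&'a str, &'a str)> {
    let entries: Vec<(&'a str, &'a str)> =
        lines.iter().copied().filter_map(parse_line).collect();
    let mut pairs = Vec::new();
    for i in 0..entries.len() {
        for j in (i + 1)..entries.len() {
            if let Some(d) = hamming_hex(entries[i].0, entries[j].0) {
                if d <= max_dist {
                    pairs.push((entries[i].1, entries[j].1));
                }
            }
        }
    }
    pairs
}
```

Calling near_duplicates on the lines of the hashes file with a small max_dist would report pairs of visually similar images under these assumptions. The quadratic loop is only practical for small lists.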