# rmd [![Build Status](https://travis-ci.com/FilippoRanza/rmd.svg?branch=master)](https://travis-ci.com/FilippoRanza/rmd) ![crates.io](https://img.shields.io/crates/v/rmd.svg) [![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT) [![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg?style=round-square)](http://makeapullrequest.com) An improved rm implementation able to remove duplicate files ## Description ```rmd``` is an ```rm``` reimplementation made in pure Rust. It is able to remove files and directories as usual. ```rmd``` is also able to: - Recursively remove duplicate file. Duplicates are found by comparing *SHA256* file hash. - Recursively remove file by size - Recusively remove file by last access time. ## Installation This tool can be easly installed from sources: ```bash cargo install rmd ``` ### Compile from source It is also possible to directly clone the repository and compile ```rmd``` from there. In this case it is recommended to run all tests before compile ```rmd``` for production. A convenient way to do that is using make ```bash make build ``` This will run all cargo tests (both unit and integration) and cli tests before compile rmd for production. ## Usage It works in an almost compatible way with the standard rm. To get a full help run: ```bash rmd --help ``` ### Standard Features (Standard Mode) But the most common scenarios includes: - remove files, for example: ```bash rmd FILE_A FILE_B ``` - remove a directory, for example: ```bash rmd -rf DIR_A ``` - enable verbose mode: ```bash rmd -v FILE_A ``` simple verbose just output the name of removed files and directory. For a more verbose: ```bash rmd -vv FILE_A ``` ```rmd``` shows also statistics about removed files, specifing if the removed file is a directory or a regular file, in the latter case ```rmd``` also shows the size of the removed file. - enable *log* mode: ```bash rmd -l FILE_A ``` output the same information as *-v* to syslog. For more log: ```bash rmd -ll FILE_A ``` output the same information as *-vv* to syslog. ### Additional Features (Automatic Mode) #### Remove Duplicates - remove duplicates files in the current directory, and all sub directories: ```bash rmd -d ``` or remove duplicates files in a specified directory: ```bash rmd -d /PATH/TO/DIRECTORY ``` #### Remove by Last Access This functionality allows to remove file **older** or **newer** then a given *time-specification*. Remove File *older* then *time-spec* ```bash rmd --older [directory...] ``` Remove File *newer* then *time-spec* ```bash rmd --newer [directory...] ``` ```rmd``` checks if the last access is **before** (so the file is **older**) or **after** (so the file is **newer**) then the time described by the *time-specification*. *time-specification* describes a relative amount of time (in **seconds**) in the past from the moment when the program is run. *time-specification* format ``` [N+T]+ ``` Where: - **N** is a number (1-9) - **T** is a time descriptor - **+** means one or more Time Descriptor Table | Short Format | Long Format| Meaning | Value | |--------------|-------------|---------|---------------| | s | second | second | 1 second | | m | minute | minute | 60 seconds | | h | hour | hour | 60 minutes | | d | day | day | 24 hours | | w | week | week | 7 days | | M | month | month | 30 days | | y | year | year | 365 days | ##### Examples ```bash rmd --older 2y4M5d ``` will remove in the current directory, and recursively in all sub directories, file with a last access time equal or before *2 year, 4 month and 5 days* in the past from the time when the program is run. ```bash rmd --newer '4h+30m' ``` will remove in the current directory, and recursively in all sub directories, file with a last access time equal or after *4 hour and 30 minutes* in the past from the time when the program is run. ```bash rmd --older '1M 15d' /home/user/temp-store ``` will remove in **/home/user/temp-store** and recursively in all sub directories, file with a last access time equal or before *1 month and 15 days* in the past from the time when the program is run. ```bash rmd --newer 30s /home/user/wrong-downloads ``` will remove in **/home/user/wrong-downloads** and recursively in all sub directories, file with a last access time equal or after *30 seconds* in the past from the time when the program is run. #### Remove by Size This functionality allows to remove file **smaller** or **larger** then a given *size-specification*. Remove File *smaller* then *size-spec* ```bash rmd --smaller [directory...] ``` Remove File *larger* then *size-spec* ```bash rmd --larger [directory...] ``` ```rmd``` checks if the file size, in bytes. If **larger** mode is used ```rmd``` checks, for each file in the specified directory, and recursivelly in all sub directories, if the size is **larger or equal** to the size decribed in *size-spec* and if so ```rmd``` remove the file. Of course if **smaller** mode is used ```rmd``` checks for file **smaller or equal** to the size in *size-spec*. *size-specification* format ``` [N+S]+ ``` Where: - **N** is a number (1-9) - **T** is a size descriptor - **+** means one or more Deciamal Size Descriptor Table | Short Format | Long Format| Meaning | Value | |--------------|-------------|---------|---------------| | b | | byte | 1 byte | |kb |kilo |kilobyte |1000 byte | |mb |mega |megabyte |1000 kilobyte | |gb |giga |gigabyte |1000 megabyte | |tb |tera |terabyte |1000 gigabyte | |pb |peta |petabyte |1000 terabyte | Binary Size Descriptor Table | Short Format | Long Format| Meaning | Value | |--------------|-------------|---------|---------------| | b | | byte | 1 byte | |kib |kibi |kibibyte |1024 byte | |mib |mebi |mebibyte |1024 kibibyte | |gib |gibi |gibibyte |1024 mebibyte | |tib |tebi |tebibyte |1024 gibibyte | |pib |pebi |pebibyte |1024 tebibyte | Decimal and Binary size descriptor **can** be use together ##### Examples ```bash rmd --smaller '2kb,56mib' ``` will remove in the current directory, and recursively in all sub directories, file with a size smaller or equal to *56 Mebibytes and 2 Kilobytes*. ```bash rmd --larger 4gb30mb ``` will remove in the current directory, and recursively in all sub directories, file with a size larger or equal to *4 Gigabytes and 30 Megabytes*. ```bash rmd --larger '1 mebi 15 kibi' /home/user/temp-store ``` will remove in **/home/user/temp-store** and recursively in all sub directories, file with a size larger or equal to *1 Mebibytes and 15 Kibibytes*. ```bash rmd --smaller 30kb /home/user/useless-files ``` will remove in **/home/user/useless-files** and recursively in all sub directories, file with a size smaller or equal to *30 Kilobytes*. #### Skip Files Sometimes you may need to skip some files or directories from been removed, for example you may want to preserve any *.bak* file or to completely ignore directories like *.git*. In these cases ```rmd``` provides two useful options: - ```--ignore-extensions``` allows to specify a list of extensions that will be ignored by ```rmd``` ```bash rmd --ignore-extensions bak --duplicates ``` will remove any duplicate file in the current directory and recursively in all sub directories ignoring any file with *.bak* extension. So if to equal file "file.rs.bak" and "copy-file.rs.bak" will be preserved. Also the original "file.bak" (if it is unique) will be preserved because *.bak* file are completely ignored. ```bash rmd --ignore-extensions bak pdf mp3 --larger 40kb project ``` will remove all file larger or equal to *40 Kilobytes* in the *project* directory, and recursively in all sub directories, but files with *.bak*, *.pdf* and *.mp3* extensions. So, for example, *project/docs.pdf* a 4 Mb file will not be removed. - ```--ignore-directories``` allows to specify a list of directory names (just the last component in the path string) that will be ignored by ```rmd``` ```bash rmd --clean --ignore-directories xmas_photos --older 1y documents ``` will remove any file older than one year in *documents* directory and recursively in all sub directories, ignoring any directory named *xmas_photo*. If *xmas_photo* is empty it will not be removed. ```rmd``` simply will never open any directory named *xmas_photo* in the directory tree rooted in *documents*. ```bash rmd --clean --ignore-directories important_files .git --duplicates /home/user ``` will remove any duplicate file in the user home, and recursively in all sub directories, ignoring any directory named *.git* or *important_files*. It is allowed to use ```--ignore-directories``` and ```--ignore-extensions``` together. It is also possible to simply ignore hidden files and directories. ```--ignore-unix-hidden``` allows to automatically ignore any file and directory whose name starts with '.' (unix style hidden files). ```rmd``` working with ```--ignore-unix-hidden``` set skips hidden files and does not open hidden directories, so any non hidden file inside an hidden directory is left untouched. An Exmple: ```bash rmd -d --ignore-unix-hidden important_project ``` will deduplicate *important_project* but hidden files or directories (such as .git) are ignored. ### Note - When working in *interactive* mode and a remove file is a directory ```rmd``` during an automatic removal prompts for each file that need to be deleted, during a standard removal prompts just once - *newer*, *older*, *duplicates*, *smaller*, *larger* are mutually exclusive. - Specification String, in both time and size remove, can contain any number of non alphanumeric characters between a number and a descriptor or between a descriptor and a number, those characters are simply treated as separators. The important thing are to **NOT** put separators into numbers or into descriptors and to properly quote the specification string so it will be treated as a unique argument. - ```-c```/```--clean``` flag deletes directories left empty after an automatic file removal, (i.e. ```rmd``` run with *newer*, *older*, *duplicates*, *smaller*, *larger*). This operation is done from the bottom of the directory tree, so directories that contains only directoies (recursively) without any file are considered empty. So, although not technically empty those directoies will be removed. Pay attention ;-). Clean flag does not take any effect where uses in standard mode. - Verbose flag can be set in standard or in automatic mode. - Log flag can be set in standard or in automatic mode. - Log and Verbose mode can work together. - Output generated by *log* and *verbose* is the same, it just changes where this output is sent. *log* send its output to *syslog*, *verbose* to stdout. - ```--ignore-extensions```, ```--ignore-directories``` and ```--ignore-unix-hidden``` can be used only with an automatic remover - If ```--ignore-unix-hidden``` and ```--clean``` are used together empty hidden directories are safe from removal as non empty hidden directories. ### Advice It is very likely that you will end up using ```--ignore-extenions``` and/or ```--ignore-directories``` with the same arguments over and over. In this scenario a good idea could be add an alias to your shell configuration file like ```bash alias rmdd='rmd --ignore-extensions bak --ignore-directories .git .hg' ``` or a shell function like ```bash function rmd() { rmd --ignore-extensions bak --ignore-directories .git .hg -- "$@" } ```