Crates.io | dirinventory |
lib.rs | dirinventory |
version | 0.7.0 |
source | src |
created_at | 2021-12-27 15:39:42.386975 |
updated_at | 2022-01-22 21:55:56.205227 |
description | Very fast multithreaded directory traversal |
homepage | |
repository | https://github.com/cehteh/dirinventory.git |
max_upload_size | |
id | 503760 |
size | 2,863,848 |
This library implements machinery for extremely fast multithreaded directory traversal. Using multiple threads leverages the kernel's ability to schedule I/O requests in an optimal way.
The user creates a 'Gatherer' object, which spawns threads that listen on a PriorityQueue. Sending a directory to this queue lets one thread pick it up and traverse the directory. Each element found is then sent to a custom function or closure, which decides how to process it.
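The worker model described above can be sketched with the standard library alone. This is not the dirinventory API: `walk_parallel` and `State` are hypothetical names, and a plain LIFO stack stands in for the priority queue to keep the example short. It only illustrates threads waiting on a shared queue, traversing a directory, and handing every entry to a user closure.

```rust
use std::fs;
use std::path::PathBuf;
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// Shared state: directories waiting to be traversed, plus the number of
/// directories that are either queued or currently being traversed.
struct State {
    queue: Vec<PathBuf>,
    outstanding: usize,
}

/// Hypothetical helper (not part of dirinventory): walks `root` with
/// `num_threads` workers and calls `handler` for every entry found.
pub fn walk_parallel<F>(root: PathBuf, num_threads: usize, handler: F)
where
    F: Fn(&fs::DirEntry) + Send + Sync + 'static,
{
    let shared = Arc::new((
        Mutex::new(State { queue: vec![root], outstanding: 1 }),
        Condvar::new(),
    ));
    let handler = Arc::new(handler);

    let workers: Vec<_> = (0..num_threads.max(1))
        .map(|_| {
            let shared = Arc::clone(&shared);
            let handler = Arc::clone(&handler);
            thread::spawn(move || {
                let (lock, cvar) = &*shared;
                loop {
                    // Wait for a directory, or stop once no work is left anywhere.
                    let dir = {
                        let mut state = lock.lock().unwrap();
                        loop {
                            if let Some(dir) = state.queue.pop() {
                                break dir;
                            }
                            if state.outstanding == 0 {
                                return;
                            }
                            state = cvar.wait(state).unwrap();
                        }
                    };

                    // Traverse outside the lock so other workers can proceed.
                    let mut subdirs = Vec::new();
                    if let Ok(entries) = fs::read_dir(&dir) {
                        for entry in entries.flatten() {
                            handler(&entry);
                            // file_type() does not follow symlinks.
                            if entry.file_type().map(|t| t.is_dir()).unwrap_or(false) {
                                subdirs.push(entry.path());
                            }
                        }
                    }

                    // Queue the subdirectories and mark this directory as done.
                    let mut state = lock.lock().unwrap();
                    state.outstanding += subdirs.len();
                    state.queue.extend(subdirs);
                    state.outstanding -= 1;
                    cvar.notify_all();
                }
            })
        })
        .collect();

    for worker in workers {
        worker.join().unwrap();
    }
}
```

A call such as `walk_parallel(PathBuf::from("."), 16, |entry| println!("{}", entry.path().display()))` would print every entry below the current directory. Unreadable directories are silently skipped in this sketch; a real implementation would report such errors to the caller.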
A priority queue is chosen for the input to ensure that directories are processed in an order that conserves file handles: depth first, in ascending inode order.
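One way to express that ordering is a custom `Ord` implementation on the queue entries. The sketch below is an illustration under assumptions, not the crate's actual types: `DirJob`, its `depth` and `inode` fields, and `processing_order` are hypothetical names, and the standard `BinaryHeap` stands in for the library's PriorityQueue.

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;

/// Hypothetical queue entry; the fields are assumptions for illustration.
#[derive(Debug, PartialEq, Eq)]
pub struct DirJob {
    pub depth: u32,
    pub inode: u64,
}

impl Ord for DirJob {
    fn cmp(&self, other: &Self) -> Ordering {
        // BinaryHeap is a max-heap: the "greatest" job is popped first.
        // Deeper directories rank higher (depth first); within one depth
        // the smaller inode ranks higher, hence the reversed comparison.
        self.depth
            .cmp(&other.depth)
            .then_with(|| other.inode.cmp(&self.inode))
    }
}

impl PartialOrd for DirJob {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

/// Drains a set of jobs in the order a worker would receive them.
pub fn processing_order(jobs: Vec<DirJob>) -> Vec<DirJob> {
    let mut heap = BinaryHeap::from(jobs);
    let mut order = Vec::with_capacity(heap.len());
    while let Some(job) = heap.pop() {
        order.push(job);
    }
    order
}
```

With jobs at (depth, inode) of (1, 40), (2, 90), (2, 15) and (0, 2), this yields (2, 15), (2, 90), (1, 40), (0, 2). On Unix, the inode number of a directory entry is available through `std::os::unix::fs::DirEntryExt::ino`.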
Storing the full pathnames of millions of files would need a considerable amount of memory. To reduce this demand, an ObjectPath implementation encodes each path by its file name and a reference to its parent directory. Furthermore, all names are interned, so identical names are stored in memory only once.
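The idea of parent-linked paths with interned names can be sketched as follows. This is a minimal illustration, not the crate's ObjectPath: `Interner` and `PathNode` are hypothetical types, and the sketch assumes reference counting with `Arc` for both the shared names and the parent links.

```rust
use std::collections::HashSet;
use std::ffi::OsStr;
use std::path::PathBuf;
use std::sync::Arc;

/// Hypothetical interner: hands out one shared allocation per distinct name.
#[derive(Default)]
pub struct Interner {
    names: HashSet<Arc<OsStr>>,
}

impl Interner {
    pub fn intern(&mut self, name: &OsStr) -> Arc<OsStr> {
        if let Some(existing) = self.names.get(name) {
            return Arc::clone(existing);
        }
        let shared: Arc<OsStr> = Arc::from(name);
        self.names.insert(Arc::clone(&shared));
        shared
    }
}

/// Sketch of a path node: one interned name plus a link to the parent.
pub struct PathNode {
    parent: Option<Arc<PathNode>>,
    name: Arc<OsStr>,
}

impl PathNode {
    pub fn new(parent: Option<Arc<PathNode>>, name: Arc<OsStr>) -> Arc<PathNode> {
        Arc::new(PathNode { parent, name })
    }

    /// Rebuilds the full path only when it is actually needed.
    pub fn to_path_buf(&self) -> PathBuf {
        // Collect the names from this node up to the root ...
        let mut components: Vec<&OsStr> = Vec::new();
        let mut node = Some(self);
        while let Some(n) = node {
            components.push(&*n.name);
            node = n.parent.as_deref();
        }
        // ... then join them in root-to-leaf order.
        components.into_iter().rev().collect()
    }
}
```

Each node stores two pointers instead of a full path string, and a name such as `README` that occurs in thousands of directories is allocated once. The cost is that the full path has to be reassembled by walking the parent links.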
See the 'BENCH.md' file for some tests. The GNU find utility was chosen as the baseline. In the most extreme case this code performs directory traversal 20 times faster. With slow spinning disks and moderate settings (16 threads) it is 1.6 times faster.