ffcnt

Crates.io: ffcnt
lib.rs: ffcnt
version: 0.3.2
source: src
created_at: 2017-01-28 21:07:58.795946
updated_at: 2021-03-18 22:46:22.820695
description: Fast file counting on HDDs
repository: https://github.com/the8472/ffcnt
id: 8270
size: 54,476
(the8472)

ffcnt

Fast file counting and listing for spinning rust, in rust.

ffcnt's purpose is to provide faster alternatives to some common filesystem operations, acting as a frontend for the platter-walk crate.

  • ffcnt --type f replaces find -type f | wc -l

  • ffcnt --type f --ls --leaf-order content replaces find -type f and returns files in optimized order for reading

  • ffcnt -s replaces du -s --apparent-size
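For reference, the baseline that the first replacement targets can be sketched in a few lines of Rust. This is a naive traversal in whatever order the directory listing returns entries, roughly what find -type f | wc -l does. It is not how ffcnt or platter-walk work internally, and the function name count_regular_files is made up for this illustration.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Counts regular files below `root` without following symlinks.
/// Hypothetical helper: a plain directory-order walk, i.e. the slow
/// baseline on HDDs, not ffcnt's optimized traversal.
fn count_regular_files(root: &Path) -> io::Result<u64> {
    let mut count = 0;
    // Explicit stack instead of recursion, so deep trees cannot overflow.
    let mut pending = vec![root.to_path_buf()];
    while let Some(dir) = pending.pop() {
        for entry in fs::read_dir(&dir)? {
            let entry = entry?;
            // DirEntry::file_type does not follow symlinks.
            let file_type = entry.file_type()?;
            if file_type.is_dir() {
                pending.push(entry.path());
            } else if file_type.is_file() {
                count += 1;
            }
        }
    }
    Ok(count)
}
```

On a cold cache, a walk like this reads directories in the order it happens to encounter them, which is the access pattern ffcnt tries to avoid on spinning disks.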

Requirements

  • Linux

  • A filesystem that supports the fiemap ioctl on directories.
    Currently ext4 is known to provide that. If you know of others, please report them!
    Incompatible filesystems will work but gain no speedup over find.

You can test filesystem support with the filefrag tool.

## supported

$ filefrag /tmp/
/tmp/: 3 extents found


## unsupported

$ filefrag /mnt/test/
/mnt/test/: FIBMAP unsupported
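The same check can be scripted. The sketch below shells out to filefrag and classifies its output using only the two message shapes shown above. The type and function names are hypothetical, the singular form "extent found" is an assumption, and other filefrag versions may word their messages differently, so treat anything unrecognized as unknown.

```rust
use std::io;
use std::process::Command;

/// Result of probing a directory with filefrag (hypothetical type).
#[derive(Debug, PartialEq)]
enum FiemapSupport {
    Supported,
    Unsupported,
    Unknown,
}

/// Classifies filefrag output based on the sample messages above.
fn classify_filefrag_output(output: &str) -> FiemapSupport {
    if output.contains("unsupported") {
        FiemapSupport::Unsupported
    } else if output.contains("extents found") || output.contains("extent found") {
        FiemapSupport::Supported
    } else {
        FiemapSupport::Unknown
    }
}

/// Runs `filefrag <dir>` and classifies stdout and stderr together,
/// since the error message may land on either stream.
fn probe_directory(dir: &str) -> io::Result<FiemapSupport> {
    let out = Command::new("filefrag").arg(dir).output()?;
    let mut text = String::from_utf8_lossy(&out.stdout).into_owned();
    text.push_str(&String::from_utf8_lossy(&out.stderr));
    Ok(classify_filefrag_output(&text))
}
```

A call such as probe_directory("/tmp/") requires filefrag to be installed and returns an error if the binary cannot be started.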

Binary

You can find prebuilt x86_64-linux-glibc binaries without debug information under releases. For troubleshooting and other environments you'll have to build your own.

Build

  • clone repo
  • install liblzo2 and libz (build-time dependencies)
  • install rust and cargo
  • cargo build --release

Usage

fast file counting 0.3.0

USAGE:
    ffcnt [FLAGS] [OPTIONS] [dirs]...

FLAGS:
    -h, --help        Prints help information
        --ls          list files
        --prefetch    attempt to prefetch directory indices from underlying mount device. requires read permission on device
    -s                sum apparent length of matched files. Only counts hardlinked files once. Does not follow symlinks. Implies --leaf-order inode.
    -V, --version     Prints version information

OPTIONS:
        --leaf-order <ord>    optimize order for listing/stat/reads [values: inode, content, dentry]
        --type <type>         filter type [values: f, l, d, s, b, c, p]

ARGS:
    <dirs>...    directories to traverse [default: cwd]
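To illustrate the semantics described for -s (apparent length, hardlinked files counted once, symlinks not followed), here is a minimal Rust sketch. It is an assumption-laden illustration rather than ffcnt's actual code: it only sums regular files, walks in plain directory order, and the name apparent_size is invented for the example.

```rust
use std::collections::HashSet;
use std::fs;
use std::io;
use std::os::unix::fs::MetadataExt;
use std::path::Path;

/// Sums the apparent length of regular files below `root`.
/// Files with several hard links are counted once, keyed by
/// (device, inode). Symlinks are neither followed nor counted.
fn apparent_size(root: &Path) -> io::Result<u64> {
    let mut seen: HashSet<(u64, u64)> = HashSet::new();
    let mut total = 0u64;
    let mut pending = vec![root.to_path_buf()];
    while let Some(dir) = pending.pop() {
        for entry in fs::read_dir(&dir)? {
            let entry = entry?;
            // symlink_metadata is lstat: it describes the link itself.
            let meta = fs::symlink_metadata(entry.path())?;
            if meta.is_dir() {
                pending.push(entry.path());
            } else if meta.is_file() {
                // Only multiply-linked files need the dedup set.
                if meta.nlink() == 1 || seen.insert((meta.dev(), meta.ino())) {
                    total += meta.len();
                }
            }
        }
    }
    Ok(total)
}
```

Because every file needs a stat call here, the order in which inodes are visited matters on HDDs, which is presumably why -s implies --leaf-order inode.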

Unscientific Benchmark

Idle system:

$ echo 3 > /proc/sys/vm/drop_caches ; time find /tmp/foo/ -type f | wc -l
826536

real	0m52.289s
user	0m0.680s
sys	0m4.361s

$ echo 3 > /proc/sys/vm/drop_caches ; time ffcnt /tmp/foo/ --type f
files: 826536

real	0m17.072s
user	0m1.230s
sys	0m2.190s

$ echo 3 > /proc/sys/vm/drop_caches ; time sudo ffcnt /tmp/foo/ --prefetch --type f
files: 826536

real	0m13.311s
user	0m2.029s
sys	0m1.440s

Busy system with mixed read/write workload. Differences in file counts arose due to writes happening in the meantime:

# echo 3 > /proc/sys/vm/drop_caches ; time ffcnt . 
4411262

real	10m36.288s
user	0m3.656s
sys	0m7.588s

# echo 3 > /proc/sys/vm/drop_caches ; time find . -type f | wc -l
4412101

real	45m54.955s
user	0m3.212s
sys	0m12.044s

Both tests were performed on HDDs with a directory structure of at least 2 nesting levels and a branching factor of 256.
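The wall-clock ("real") times above translate into speedups of roughly 3.1x (idle, ffcnt vs find), 3.9x (idle, ffcnt --prefetch vs find) and 4.3x (busy system). The small Rust sketch below only redoes that arithmetic from the numbers printed above. The helper names are made up, and the results inherit all the caveats of an unscientific benchmark.

```rust
/// Converts a `time`-style "XmY.YYYs" reading into seconds.
fn to_seconds(minutes: f64, seconds: f64) -> f64 {
    minutes * 60.0 + seconds
}

/// Speedup of a candidate over a baseline, both in seconds.
fn speedup(baseline_secs: f64, candidate_secs: f64) -> f64 {
    baseline_secs / candidate_secs
}

/// Speedups for the three comparisons, using the "real" times above:
/// (idle plain, idle with --prefetch, busy system).
fn benchmark_speedups() -> (f64, f64, f64) {
    let find_idle = to_seconds(0.0, 52.289);
    let ffcnt_idle = to_seconds(0.0, 17.072);
    let ffcnt_prefetch = to_seconds(0.0, 13.311);
    let find_busy = to_seconds(45.0, 54.955);
    let ffcnt_busy = to_seconds(10.0, 36.288);
    (
        speedup(find_idle, ffcnt_idle),
        speedup(find_idle, ffcnt_prefetch),
        speedup(find_busy, ffcnt_busy),
    )
}
```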

Ideas

  • 1 thread per block device in tree
Commit count: 25
