# wdict

Create dictionaries by scraping webpages or crawling local files.

Similar tools (some features inspired by them):
- [CeWL](https://github.com/digininja/CeWL)
- [CeWLeR](https://github.com/roys/cewler)

## Take it for a spin

```bash
# build with nix and run the result
nix build .#
./result/bin/wdict --help

# just run it directly
nix run .# -- --help

# run it without cloning
nix run github:pyqlsa/wdict -- --help

# install from crates.io
# (nixOS users may need to do this within a dev shell)
cargo install wdict

# using a dev shell
nix develop .#
cargo build
./target/debug/wdict --help

# ...or a release version
cargo build --release
./target/release/wdict --help
```

## Usage

```bash
Create dictionaries by scraping webpages or crawling local files.

Usage: wdict [OPTIONS] <--url <URL>|--theme <THEME>|--path <PATH>|--resume|--resume-strict>

Options:
  -u, --url <URL>
          URL to start crawling from

      --theme <THEME>
          Pre-canned theme URLs to start crawling from (for fun)

          Possible values:
          - star-wars:   Star Wars themed URL
          - tolkien:     Tolkien themed URL
          - witcher:     Witcher themed URL
          - pokemon:     Pokemon themed URL
          - bebop:       Cowboy Bebop themed URL
          - greek:       Greek Mythology themed URL
          - greco-roman: Greek and Roman Mythology themed URL
          - lovecraft:   H.P. Lovecraft themed URL

  -p, --path <PATH>
          Local file path to start crawling from

      --resume
          Resume crawling from a previous run; state file must exist; existence of
          dictionary is optional; parameters from state are ignored, instead favoring
          arguments provided on the command line

      --resume-strict
          Resume crawling from a previous run; state file must exist; existence of
          dictionary is optional; 'strict' enforces that all arguments from the state
          file are observed

  -d, --depth <DEPTH>
          Limit the depth of crawling URLs

          [default: 1]

  -m, --min-word-length <MIN_WORD_LENGTH>
          Only save words greater than or equal to this value

          [default: 3]

  -x, --max-word-length <MAX_WORD_LENGTH>
          Only save words less than or equal to this value

          [default: 18446744073709551615]

  -j, --include-js
          Include javascript from
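As a quick sketch of how the flags above combine (the target URL here is a placeholder, not from the project's docs), a typical crawl and a resumed run might look like:

```bash
# crawl two links deep from a starting URL, keeping only words of 4-12 characters
wdict --url https://example.com --depth 2 --min-word-length 4 --max-word-length 12

# or start from one of the pre-canned theme URLs instead of an explicit --url
wdict --theme tolkien --depth 1

# pick up where a previous run left off, enforcing the arguments saved in the state file
wdict --resume-strict
```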