| Crates.io | minisearchtk |
| lib.rs | minisearchtk |
| version | 0.2.0 |
| created_at | 2026-01-13 21:30:34.809596+00 |
| updated_at | 2026-01-13 21:30:34.809596+00 |
| description | Small toolkit for crawling and searching web pages |
| homepage | |
| repository | https://github.com/konfou/minisearch-rs |
| max_upload_size | |
| id | 2041402 |
| size | 4,545,734 |
A small Rust toolkit for crawling and searching web pages.
Key binaries:
- mycrawler: fetch pages to {hash}.html, append metadata to crawl.db, persist the crawl queue (including in-flight entries) for recovery, honor nofollow/noindex flags, optionally respect robots.txt / crawl-delay and per-host throttling, and let you set a User-Agent.
- minisearch: load the saved corpus, build a BM25 index with optional on-disk caches (BM25 stats and token cache for incremental rebuild), then query via a REPL (/search, /df, /tf) with snippets and highlights.
- searchsrv: serve the same search over HTTP with a minimal HTML UI, reusing the caches when available.
See docs/ for architecture details and future improvement ideas.
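To give a rough idea of what the BM25 index and the /search, /df and /tf queries involve, here is a minimal in-memory sketch. It is not taken from the crate: the `Bm25Index` type, the `tokenize` helper, and the parameter values `k1 = 1.2` and `b = 0.75` are assumptions chosen for illustration, and the sketch leaves out the on-disk caches, snippets, and highlights.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory BM25 index (illustrative only, not the crate's types).
pub struct Bm25Index {
    /// term -> (document id -> term frequency in that document)
    postings: HashMap<String, HashMap<usize, u32>>,
    /// number of tokens in each document, indexed by document id
    doc_len: Vec<u32>,
    /// average document length in tokens
    avg_len: f64,
    /// BM25 free parameters (common default values, assumed here)
    k1: f64,
    b: f64,
}

/// Naive tokenizer: split on non-alphanumeric characters and lowercase.
fn tokenize(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(|t| t.to_lowercase())
        .collect()
}

impl Bm25Index {
    /// Build the index from a slice of document texts.
    pub fn build(docs: &[&str]) -> Self {
        let mut postings: HashMap<String, HashMap<usize, u32>> = HashMap::new();
        let mut doc_len = Vec::with_capacity(docs.len());
        for (id, doc) in docs.iter().enumerate() {
            let tokens = tokenize(doc);
            doc_len.push(tokens.len() as u32);
            for tok in tokens {
                *postings.entry(tok).or_default().entry(id).or_insert(0) += 1;
            }
        }
        let total: u64 = doc_len.iter().map(|&n| u64::from(n)).sum();
        let avg_len = if doc_len.is_empty() {
            0.0
        } else {
            total as f64 / doc_len.len() as f64
        };
        Self { postings, doc_len, avg_len, k1: 1.2, b: 0.75 }
    }

    /// Document frequency: number of documents containing `term`.
    pub fn df(&self, term: &str) -> usize {
        self.postings.get(&term.to_lowercase()).map_or(0, |p| p.len())
    }

    /// Term frequency of `term` in document `doc`.
    pub fn tf(&self, term: &str, doc: usize) -> u32 {
        self.postings
            .get(&term.to_lowercase())
            .and_then(|p| p.get(&doc).copied())
            .unwrap_or(0)
    }

    /// Score every document matching at least one query term, best first.
    pub fn search(&self, query: &str) -> Vec<(usize, f64)> {
        let n = self.doc_len.len() as f64;
        let mut scores: HashMap<usize, f64> = HashMap::new();
        for term in tokenize(query) {
            let Some(posting) = self.postings.get(&term) else { continue };
            let df = posting.len() as f64;
            // Non-negative IDF variant: ln(1 + (N - df + 0.5) / (df + 0.5)).
            let idf = ((n - df + 0.5) / (df + 0.5) + 1.0).ln();
            for (&doc, &tf) in posting {
                let tf = f64::from(tf);
                // Length normalization relative to the average document length.
                let norm = 1.0 - self.b + self.b * f64::from(self.doc_len[doc]) / self.avg_len;
                *scores.entry(doc).or_insert(0.0) +=
                    idf * tf * (self.k1 + 1.0) / (tf + self.k1 * norm);
            }
        }
        let mut ranked: Vec<(usize, f64)> = scores.into_iter().collect();
        // Highest score first; ties broken by document id for a stable order.
        ranked.sort_by(|x, y| y.1.total_cmp(&x.1).then(x.0.cmp(&y.0)));
        ranked
    }
}
```

In this sketch, `Bm25Index::build(&["the quick brown fox", "the lazy dog"])` followed by `search("quick")` returns a list of `(document id, score)` pairs sorted by descending score. The `postings` map and `doc_len` vector are the kind of statistics that could be cached on disk to avoid re-tokenizing unchanged pages, though how the crate actually lays out its caches is not described here.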
This project is my own implementation of a semester assignment shared by a friend.