| Crates.io | jmdict-fast |
| lib.rs | jmdict-fast |
| version | 0.1.1 |
| created_at | 2025-10-24 11:40:42.655365+00 |
| updated_at | 2025-10-24 11:40:42.655365+00 |
| description | Blazing-fast Japanese dictionary engine with FST-based indexing |
| homepage | https://github.com/theGlenn/jmdict-fst |
| repository | https://github.com/theGlenn/jmdict-fst |
| max_upload_size | |
| id | 1898326 |
| size | 118,693 |
Blazing-fast Japanese dictionary engine
Note: This crate uses bunpo for Japanese conjugation handling. Both crates are part of the same monorepo but are published separately to crates.io.
| Metric | Value |
|---|---|
| Index Size | ~888KB (FSTs) |
| Data Size | 16MB binary blob |
| Entries | 22,569 |
| Unique Keys | 24,342 |
| Lookup Speed | O(log n), instant |
| Memory Usage | Memory-mapped, zero allocations |
cargo build
This creates:
- OUT_DIR/kanji.fst - Kanji lookup index
- OUT_DIR/kana.fst - Kana lookup index
- OUT_DIR/romaji.fst - Romaji lookup index
- OUT_DIR/entries.bin - Binary blob with all entries

use jmdict_fast::Dict;
fn main() -> anyhow::Result<()> {
    let dict = Dict::load_default()?;
    let results = dict.lookup_partial("こんに");
    for entry in &results {
        println!("Found: {:?}", entry.kanji);
        println!("Reading: {:?}", entry.kana);
        println!("Meanings: {:?}", entry.sense[0].gloss);
    }
    Ok(())
}

use jmdict_fast::Dict;
fn main() -> anyhow::Result<()> {
    let dict = Dict::load_default()?;
    let results = dict.lookup_exact("こんにちは");
    for entry in &results {
        println!("Found: {:?}", entry.kanji);
        println!("Reading: {:?}", entry.kana);
        println!("Meanings: {:?}", entry.sense[0].gloss);
    }
    Ok(())
}
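
The two examples above differ only in the lookup method. As a rough illustration of that difference, here is a std-only sketch that assumes partial lookup means prefix matching (the README does not spell out the semantics of `lookup_partial`). The sorted `BTreeSet`, the sample keys, and the helper names `exact` and `by_prefix` are all hypothetical stand-ins and are not part of jmdict-fast.

```rust
use std::collections::BTreeSet;
use std::ops::Bound;

/// Exact match: the term must itself be a key.
fn exact<'a>(keys: &'a BTreeSet<String>, term: &str) -> Option<&'a str> {
    keys.get(term).map(String::as_str)
}

/// Prefix match (assumed meaning of "partial"): every key starting with `term`.
/// Keys are sorted, so all matches sit in one contiguous run that begins at
/// the first key greater than or equal to `term`.
fn by_prefix<'a>(keys: &'a BTreeSet<String>, term: &str) -> Vec<&'a str> {
    keys.range::<str, _>((Bound::Included(term), Bound::Unbounded))
        .take_while(|k| k.starts_with(term))
        .map(String::as_str)
        .collect()
}

/// Hypothetical sample keys, for illustration only.
fn sample_keys() -> BTreeSet<String> {
    ["こんにちは", "こんばんは", "こんにゃく", "ねこ"]
        .iter()
        .map(|s| s.to_string())
        .collect()
}

fn main() {
    let keys = sample_keys();
    println!("exact(こんに): {:?}", exact(&keys, "こんに"));
    println!("exact(こんにちは): {:?}", exact(&keys, "こんにちは"));
    println!("by_prefix(こんに): {:?}", by_prefix(&keys, "こんに"));
}
```

An incomplete term finds nothing under exact matching but returns every key that extends it under prefix matching.
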
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   kanji.fst     │    │    kana.fst     │    │   romaji.fst    │
│    (243KB)      │    │    (257KB)      │    │    (388KB)      │
│                 │    │                 │    │                 │
│ 漢字 → Entry ID │    │ かな → Entry ID │    │ romaji → Entry ID│
└─────────────────┘    └─────────────────┘    └─────────────────┘
         │                       │                       │
         └───────────────────────┼───────────────────────┘
                                 │
                       ┌─────────────────┐
                       │   entries.bin   │
                       │     (16MB)      │
                       │                 │
                       │  Offset Table   │
                       │ + JSON Entries  │
                       └─────────────────┘
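
The two-stage layout in the diagram (index maps a key to an entry ID, the offset table maps the ID to bytes in the blob) can be sketched with the standard library alone. This is only an illustration under stated assumptions: a `BTreeMap` stands in for each FST, the `(offset, length)` table format and the JSON shape are invented for the example, and none of the names below (`Index`, `Blob`, `lookup`, `build_demo`) come from the crate.

```rust
use std::collections::BTreeMap;

/// Stand-in for one FST index: key -> entry ID.
/// (The real crate uses FSTs; a BTreeMap keeps this sketch std-only.)
type Index = BTreeMap<String, u32>;

/// Hypothetical blob layout: a table of (offset, length) pairs plus raw entry bytes.
struct Blob {
    offsets: Vec<(usize, usize)>,
    data: Vec<u8>,
}

impl Blob {
    /// Append one serialized entry and return its ID (its slot in the offset table).
    fn push(&mut self, json: &str) -> u32 {
        let id = self.offsets.len() as u32;
        self.offsets.push((self.data.len(), json.len()));
        self.data.extend_from_slice(json.as_bytes());
        id
    }

    /// Resolve an entry ID to its serialized text via the offset table.
    fn get(&self, id: u32) -> Option<&str> {
        let &(start, len) = self.offsets.get(id as usize)?;
        std::str::from_utf8(&self.data[start..start + len]).ok()
    }
}

/// Try each index in turn, then fetch the matching entry from the blob.
fn lookup<'a>(indexes: &[&Index], blob: &'a Blob, term: &str) -> Option<&'a str> {
    let id = indexes.iter().find_map(|idx| idx.get(term).copied())?;
    blob.get(id)
}

/// Build three tiny indexes that all point at one illustrative entry.
fn build_demo() -> (Index, Index, Index, Blob) {
    let mut blob = Blob { offsets: Vec::new(), data: Vec::new() };
    let id = blob.push(r#"{"kanji":"猫","kana":"ねこ","gloss":"cat"}"#);
    let kanji = Index::from([("猫".to_string(), id)]);
    let kana = Index::from([("ねこ".to_string(), id)]);
    let romaji = Index::from([("neko".to_string(), id)]);
    (kanji, kana, romaji, blob)
}

fn main() {
    let (kanji, kana, romaji, blob) = build_demo();
    let indexes = [&kanji, &kana, &romaji];
    for term in ["猫", "ねこ", "neko", "inu"] {
        println!("{}: {:?}", term, lookup(&indexes, &blob, term));
    }
}
```

Keeping the indexes separate from the entry data means each writing system gets its own small key structure while the large entry payload is stored only once.
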
- Dict::load<P: AsRef<Path>>(base_dir: P) -> Result<Self> - Loads the dictionary from the specified directory.
- dict.lookup_exact(term: &str) -> Vec<Entry> - Performs exact lookup across all writing systems.

Entry Structure:
pub struct Entry {
    pub id: String,              // JMdict entry ID
    pub kanji: Vec<KanjiEntry>,  // Kanji forms
    pub kana: Vec<KanaEntry>,    // Kana readings
    pub sense: Vec<SenseEntry>,  // Meanings and metadata
}

pub struct KanjiEntry {
    pub common: bool,       // Is this a common kanji?
    pub text: String,       // The kanji text
    pub tags: Vec<String>,  // JMdict tags
}

pub struct KanaEntry {
    pub common: bool,                   // Is this a common reading?
    pub text: String,                   // The kana text
    pub tags: Vec<String>,              // JMdict tags
    pub applies_to_kanji: Vec<String>,  // Which kanji this applies to
}

pub struct SenseEntry {
    pub part_of_speech: Vec<String>,    // Grammatical information
    pub applies_to_kanji: Vec<String>,  // Which kanji this sense applies to
    pub applies_to_kana: Vec<String>,   // Which kana this sense applies to
    pub gloss: Vec<GlossEntry>,         // English translations
    // ... other JMdict fields
}
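
Since `kanji` and `kana` are both vectors and each form carries a `common` flag, callers usually need some rule for picking a single display form. The sketch below shows one possible rule: prefer a common kanji form, then any kanji form, then fall back to kana. The structs are trimmed local mirrors of the ones above so the example compiles on its own, and `headword` and `make_entry` are hypothetical helpers, not part of the crate.

```rust
// Trimmed local mirrors of the structs above; with the real crate you would
// use its own types instead.
struct KanjiEntry {
    common: bool,
    text: String,
}

struct KanaEntry {
    common: bool,
    text: String,
}

struct Entry {
    kanji: Vec<KanjiEntry>,
    kana: Vec<KanaEntry>,
}

/// Hypothetical helper: choose one form to display for an entry.
/// Returns None only when the entry has neither kanji nor kana forms.
fn headword(entry: &Entry) -> Option<&str> {
    let kanji = entry
        .kanji
        .iter()
        .find(|k| k.common)
        .or_else(|| entry.kanji.first());
    if let Some(k) = kanji {
        return Some(k.text.as_str());
    }
    entry
        .kana
        .iter()
        .find(|k| k.common)
        .or_else(|| entry.kana.first())
        .map(|k| k.text.as_str())
}

/// Build an illustrative entry from (text, common) pairs.
fn make_entry(kanji: &[(&str, bool)], kana: &[(&str, bool)]) -> Entry {
    Entry {
        kanji: kanji
            .iter()
            .map(|&(t, c)| KanjiEntry { common: c, text: t.to_string() })
            .collect(),
        kana: kana
            .iter()
            .map(|&(t, c)| KanaEntry { common: c, text: t.to_string() })
            .collect(),
    }
}

fn main() {
    let with_kanji = make_entry(&[("猫", true)], &[("ねこ", true)]);
    let kana_only = make_entry(&[], &[("ねこ", true)]);
    println!("{:?}", headword(&with_kanji));
    println!("{:?}", headword(&kana_only));
}
```

Returning `Option` instead of indexing with `[0]` avoids a panic on entries whose form lists are empty.
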
The build script implements a robust caching system to avoid re-downloading the large JMdict dataset. See CACHING.md and CACHE_QUICK_REFERENCE.md for details.
The build tool processes the JMdict JSON and creates FST indexes and a binary blob for instant retrieval.

Criterion (lookup_word.rs), MacBook, Rust 1.70+
lookup_exact 猫 (jmdict-fast)
    time: [4.06 µs]
lookup_word 猫 (jmdict)
    time: [511.96 µs]
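
To put the two timings in proportion, the ratio can be computed directly. The snippet below uses only the figures reported above; the `speedup` helper is a hypothetical name introduced for the arithmetic. It works out to roughly 126x for this single-word lookup on the stated machine, which should not be read as a general guarantee.

```rust
/// Ratio of two timings expressed in the same unit.
fn speedup(baseline_us: f64, candidate_us: f64) -> f64 {
    baseline_us / candidate_us
}

fn main() {
    // Figures copied from the Criterion output above (microseconds).
    let jmdict_us = 511.96;
    let jmdict_fast_us = 4.06;
    println!("speedup: {:.0}x", speedup(jmdict_us, jmdict_fast_us));
}
```
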
MIT License. See LICENSE for details.
Built with ❤️ and Rust 🦀