Crates.io | furigana |
lib.rs | furigana |
version | 0.1.1 |
source | src |
created_at | 2022-12-23 17:34:01.717592 |
updated_at | 2023-03-04 09:54:51.083436 |
description | Map furigana to a word given its reading. |
repository | https://github.com/Heliozoa/furigana |
id | 744637 |
size | 58,763 |
furigana is a Rust library for correctly mapping furigana to a word given its reading and, optionally, kanji reading data.
    for mapping in furigana::map_naive("物の怪", "もののけ") {
        println!("{mapping}");
    }
prints out the following mappings:
    物(もの)の怪(け)
    物(も)の怪(のけ)
Only the first mapping is actually valid, but from a word and its reading alone there is no way to know that.
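To illustrate why a word and its reading alone are ambiguous, here is a minimal sketch of how candidate mappings could be enumerated. It is not the library's implementation: `is_kana`, `naive_candidates` and the `Candidate` type are hypothetical names, consecutive kanji are treated as a single run, and hiragana and katakana are compared literally.

```rust
/// A candidate mapping: (kanji run, furigana) pairs in word order.
/// Hypothetical type for this sketch, not part of the furigana crate.
type Candidate = Vec<(String, String)>;

/// True for characters in the hiragana or katakana Unicode blocks.
fn is_kana(c: char) -> bool {
    matches!(c, '\u{3040}'..='\u{309F}' | '\u{30A0}'..='\u{30FF}')
}

/// Recursively enumerates every way of splitting `reading` across `word`.
fn naive_candidates(word: &[char], reading: &[char]) -> Vec<Candidate> {
    let Some(&first) = word.first() else {
        // The word is used up: valid only if the reading is used up too.
        return if reading.is_empty() { vec![Vec::new()] } else { Vec::new() };
    };
    if is_kana(first) {
        // Kana in the word must appear literally in the reading.
        return match reading.first() {
            Some(&r) if r == first => naive_candidates(&word[1..], &reading[1..]),
            _ => Vec::new(),
        };
    }
    // Take the whole run of non-kana characters as one segment.
    let run_len = word.iter().take_while(|c| !is_kana(**c)).count();
    let run: String = word[..run_len].iter().collect();
    let mut out = Vec::new();
    // Try every non-empty prefix of the reading as furigana for the run.
    for len in 1..=reading.len() {
        let furigana: String = reading[..len].iter().collect();
        for rest in naive_candidates(&word[run_len..], &reading[len..]) {
            let mut candidate = vec![(run.clone(), furigana.clone())];
            candidate.extend(rest);
            out.push(candidate);
        }
    }
    out
}

fn main() {
    let word: Vec<char> = "物の怪".chars().collect();
    let reading: Vec<char> = "もののけ".chars().collect();
    for candidate in naive_candidates(&word, &reading) {
        println!("{candidate:?}");
    }
}
```

The の in the middle of the word can line up with either の in the reading, which is why two candidates survive.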
If given information about kanji readings (for example, from KANJIDIC2), furigana::map is able to grade the potential mappings:
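    use std::collections::HashMap;

    // Known readings for each kanji, for example loaded from KANJIDIC2.
    let mut kanji_to_readings = HashMap::new();
    kanji_to_readings.insert("物".to_string(), vec!["もの".to_string()]);
    kanji_to_readings.insert("怪".to_string(), vec!["け".to_string()]);
    // Keep only the candidate mapping with the highest accuracy.
    let mapping = furigana::map("物の怪", "もののけ", &kanji_to_readings)
        .into_iter()
        .max_by_key(|f| f.accuracy)
        .unwrap();
    println!("{mapping}");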
assigns a high accuracy value to the correct mapping and a low value to the other one, so that only the correct mapping is printed:
    物(もの)の怪(け)
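As a rough illustration of how kanji reading data can separate the candidates, the sketch below scores each candidate by counting segments whose furigana is a known reading of that segment. This is an assumed scoring rule chosen for simplicity, not the library's actual accuracy calculation; `score`, `pair` and `Candidate` are hypothetical names.

```rust
use std::collections::HashMap;

/// A candidate mapping: (kanji, furigana) pairs in word order.
/// Hypothetical type for this sketch, not part of the furigana crate.
type Candidate = Vec<(String, String)>;

/// Hypothetical score: the number of segments whose furigana is a known reading.
fn score(candidate: &Candidate, known: &HashMap<String, Vec<String>>) -> usize {
    candidate
        .iter()
        .filter(|(kanji, furigana)| {
            known
                .get(kanji)
                .map_or(false, |readings| readings.contains(furigana))
        })
        .count()
}

/// Small helper to build an owned (kanji, furigana) pair.
fn pair(kanji: &str, furigana: &str) -> (String, String) {
    (kanji.to_string(), furigana.to_string())
}

fn main() {
    let known = HashMap::from([
        ("物".to_string(), vec!["もの".to_string()]),
        ("怪".to_string(), vec!["け".to_string()]),
    ]);
    // The two candidates for 物の怪 read as もののけ.
    let candidates: Vec<Candidate> = vec![
        vec![pair("物", "も"), pair("怪", "のけ")],
        vec![pair("物", "もの"), pair("怪", "け")],
    ];
    // Pick the candidate with the most known readings.
    let best = candidates
        .iter()
        .max_by_key(|c| score(c, &known))
        .unwrap();
    println!("{best:?}");
}
```

A real grading scheme would likely also need to handle sound changes such as rendaku and irregular readings, which a plain lookup like this one misses.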
The algorithm used is recursive and not optimised, so it may be inefficient for long, kanji-heavy inputs.
If the library fails to produce the correct mapping, or if its accuracy is lower than an incorrect mapping's, a GitHub issue is much appreciated!
Licensed under the Mozilla Public License Version 2.0.