furigana

Crates.io	furigana
lib.rs	furigana
version
source	src
created_at	2022-12-23 17:34:01.717592+00
updated_at	2025-03-07 11:46:24.485483+00
description	Map furigana to a word given its reading.
homepage
repository	https://github.com/Heliozoa/furigana
max_upload_size
id	744637

Martinez (Heliozoa)


furigana

Contains functionality for correctly mapping furigana to a word given a reading, and optionally kanji reading data.

Usage

for mapping in furigana::map_naive("物の怪", "もののけ") {
    println!("{mapping}");
}

prints out the following mappings:

物(も)の怪(のけ)

物(もの)の怪(け)

Only the mapping that reads 物 as もの and 怪 as け is actually valid, but based only on a word and its reading there's no way to know that.
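To illustrate why more than one candidate comes out, here is a minimal sketch of a naive enumeration. It is not the crate's implementation: the names `Segment`, `candidate_mappings` and `render` are hypothetical, kanji detection is assumed to be a simple check against the CJK Unified Ideographs block, and kana in the word are assumed to match the reading character for character.

```rust
/// One character of the word plus, for kanji, the reading assigned to it.
/// (Hypothetical type for this sketch, not part of the furigana crate.)
#[derive(Debug, Clone, PartialEq)]
pub struct Segment {
    pub text: String,
    pub reading: Option<String>,
}

/// Assumption: anything in the CJK Unified Ideographs block counts as kanji.
fn is_kanji(c: char) -> bool {
    ('\u{4E00}'..='\u{9FFF}').contains(&c)
}

/// Enumerates every way of splitting `reading` across the kanji in `word`.
pub fn candidate_mappings(word: &str, reading: &str) -> Vec<Vec<Segment>> {
    let word: Vec<char> = word.chars().collect();
    let reading: Vec<char> = reading.chars().collect();
    let mut out = Vec::new();
    walk(&word, &reading, &mut Vec::new(), &mut out);
    out
}

fn walk(word: &[char], reading: &[char], current: &mut Vec<Segment>, out: &mut Vec<Vec<Segment>>) {
    let (head, rest) = match word.split_first() {
        Some((head, rest)) => (*head, rest),
        None => {
            // Word exhausted: keep the candidate only if the reading is used up too.
            if reading.is_empty() {
                out.push(current.clone());
            }
            return;
        }
    };
    if is_kanji(head) {
        // Try every non-empty prefix of the remaining reading for this kanji.
        for len in 1..=reading.len() {
            let assigned: String = reading[..len].iter().collect();
            current.push(Segment { text: head.to_string(), reading: Some(assigned) });
            walk(rest, &reading[len..], current, out);
            current.pop();
        }
    } else if reading.first() == Some(&head) {
        // Kana must line up with the next character of the reading.
        current.push(Segment { text: head.to_string(), reading: None });
        walk(rest, &reading[1..], current, out);
        current.pop();
    }
}

/// Renders a candidate with readings in parentheses, e.g. 物(もの)の怪(け).
pub fn render(mapping: &[Segment]) -> String {
    let mut s = String::new();
    for seg in mapping {
        s.push_str(&seg.text);
        if let Some(r) = &seg.reading {
            s.push('(');
            s.push_str(r);
            s.push(')');
        }
    }
    s
}
```

Under these assumptions, `candidate_mappings("物の怪", "もののけ")` yields exactly two candidates, which `render` shows as 物(も)の怪(のけ) and 物(もの)の怪(け). Nothing in the word or reading alone distinguishes them.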

If given information about kanji readings (for example, from KANJIDIC2), furigana::map is able to grade the potential mappings:

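use std::collections::HashMap;

// Known readings per kanji, for example loaded from KANJIDIC2.
let mut kanji_to_readings = HashMap::new();
kanji_to_readings.insert("物".to_string(), vec!["もの".to_string()]);
kanji_to_readings.insert("怪".to_string(), vec!["け".to_string()]);
// Keep only the candidate mapping with the highest accuracy.
let mapping = furigana::map("物の怪", "もののけ", &kanji_to_readings)
    .into_iter()
    .max_by_key(|f| f.accuracy)
    .unwrap();
println!("{mapping}");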

assigns a high accuracy value to the correct reading and a low value to the other one, so that only the correct mapping is printed:

物(もの)の怪(け)
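How the crate computes accuracy is not described here. As a rough sketch of the idea only, one could count how many kanji in a candidate were assigned a reading that appears in the kanji reading data. The functions `score` and `best_candidate` below are hypothetical and do not reflect the crate's actual `accuracy` calculation; candidates are represented simply as (kanji, reading) pairs.

```rust
use std::collections::HashMap;

/// Hypothetical score: the number of (kanji, reading) pairs that are
/// listed as known readings in `kanji_to_readings`.
pub fn score(pairs: &[(&str, &str)], kanji_to_readings: &HashMap<String, Vec<String>>) -> usize {
    pairs
        .iter()
        .filter(|&&(kanji, reading)| {
            kanji_to_readings
                .get(kanji)
                .map_or(false, |known| known.iter().any(|k| k.as_str() == reading))
        })
        .count()
}

/// Picks the candidate with the highest score (the last one wins on ties,
/// which is how `Iterator::max_by_key` behaves).
pub fn best_candidate<'a, 'b>(
    candidates: &'a [Vec<(&'b str, &'b str)>],
    kanji_to_readings: &HashMap<String, Vec<String>>,
) -> Option<&'a Vec<(&'b str, &'b str)>> {
    candidates.iter().max_by_key(|c| score(c, kanji_to_readings))
}
```

With the reading data from the example above (物 as もの, 怪 as け), the candidate 物(も)の怪(のけ) scores 0 and 物(もの)の怪(け) scores 2, so the latter is selected.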

Notes

The algorithm used is recursive and not optimised, so it may be inefficient for long, kanji-heavy inputs.
Irregular readings such as おとな for 大人 and とおか for １０日 are handled case by case, so they may be mapped incorrectly in some cases. Issues on these are appreciated.
If the library fails to produce the correct mapping, or if its accuracy is lower than an incorrect mapping's, an issue is much appreciated!

License

Licensed under the Mozilla Public License Version 2.0.

Commit count: 18
