Crates.io | name-engine |
lib.rs | name-engine |
version | 0.1.0 |
source | src |
created_at | 2024-02-20 04:15:56.532993 |
updated_at | 2024-02-20 04:15:56.532993 |
description | A Rust library for computing Markov chains to generate random names based on pronunciation |
id | 1145966 |
size | 94,961 |
Preview: Generating English Place Names examples/england_evaluated.rs
name-engine is a basic library for computing Markov chains to generate names based on their pronunciation.
This can be used for various purposes, but primarily for generating place names.
This library computes Markov chains from a dataset of names. Each name must be split into units according to user-defined rules, for example by syllable. Each unit is treated as a state of the Markov chain.
The transition is defined as the connection between the pronunciations. For example:
- ŋ -> w in Ringwood /ˈrɪŋwʊd/ [(Ring /ˈrɪŋ/) (wood /wʊd/)]
- k -> ə in Beccles /ˈbɛkəlz/ [(Becc /ˈbɛk/) (les /əlz/)]
- k -> ə and m -> s in Berkhamsted /ˈbɜːrkəmstɛd/ [(Berk /ˈbɜːrk/) (ham /əm/) (sted /stɛd/)]
With the data above, the model can generate Berkles from (Berk /ˈbɜːrk/) and (les /əlz/) by following the transition k -> ə.
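The recombination described here can be sketched in plain Rust without the crate. This is a minimal illustration, not the library's own API: the `Syllable` struct, the hand-written first and last sounds, and the restriction to two-syllable names are all assumptions made for the example.

```rust
use std::collections::HashSet;

/// Hypothetical syllable type: spelling plus its first and last sound.
/// (Not the crate's own type; stress marks are ignored here.)
#[derive(Debug, Clone, Copy)]
struct Syllable {
    spelling: &'static str,
    first: &'static str,
    last: &'static str,
}

fn syl(spelling: &'static str, first: &'static str, last: &'static str) -> Syllable {
    Syllable { spelling, first, last }
}

/// The three names from the text, split into syllables by hand.
fn dataset() -> Vec<Vec<Syllable>> {
    vec![
        vec![syl("Ring", "r", "ŋ"), syl("wood", "w", "d")],
        vec![syl("Becc", "b", "k"), syl("les", "ə", "z")],
        vec![syl("Berk", "b", "k"), syl("ham", "ə", "m"), syl("sted", "s", "d")],
    ]
}

/// Collect every (last sound -> first sound) connection seen between
/// adjacent syllables in the dataset.
fn observed_transitions(names: &[Vec<Syllable>]) -> HashSet<(&'static str, &'static str)> {
    names
        .iter()
        .flat_map(|name| name.windows(2))
        .map(|pair| (pair[0].last, pair[1].first))
        .collect()
}

/// Join any name-initial syllable with any name-final syllable, but only
/// when the connection between them was observed in the dataset.
fn two_syllable_names(names: &[Vec<Syllable>]) -> Vec<String> {
    let transitions = observed_transitions(names);
    let mut out: Vec<String> = Vec::new();
    for head in names.iter().filter_map(|n| n.first()) {
        for tail in names.iter().filter_map(|n| n.last()) {
            if transitions.contains(&(head.last, tail.first)) {
                let name = format!("{}{}", head.spelling, tail.spelling);
                if !out.contains(&name) {
                    out.push(name);
                }
            }
        }
    }
    out
}

fn main() {
    for name in two_syllable_names(&dataset()) {
        println!("{name}");
    }
}
```

Besides the two original two-syllable names, the result contains Berkles, because Berk ends in k, les begins with ə, and k -> ə occurs in the data.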
The probability of the transition is calculated from the frequency of the connection in the dataset.
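One way to read this is as a frequency count normalized per preceding sound. The sketch below assumes exactly that; the library may normalize differently. The `Sound` and `Syllable` aliases and the function names are hypothetical, and each syllable is reduced to its (first sound, last sound) pair.

```rust
use std::collections::HashMap;

type Sound = &'static str;
/// Hypothetical syllable representation: (first sound, last sound).
type Syllable = (Sound, Sound);

/// The three names from the text: Ringwood, Beccles, Berkhamsted.
fn dataset() -> Vec<Vec<Syllable>> {
    vec![
        vec![("r", "ŋ"), ("w", "d")],
        vec![("b", "k"), ("ə", "z")],
        vec![("b", "k"), ("ə", "m"), ("s", "d")],
    ]
}

/// Count how often each (last sound -> first sound) connection occurs
/// between adjacent syllables.
fn transition_counts(names: &[Vec<Syllable>]) -> HashMap<(Sound, Sound), u32> {
    let mut counts = HashMap::new();
    for name in names {
        for pair in name.windows(2) {
            *counts.entry((pair[0].1, pair[1].0)).or_insert(0) += 1;
        }
    }
    counts
}

/// Assumed normalization: P(to | from) = count(from, to) / count(from, any).
fn transition_probabilities(names: &[Vec<Syllable>]) -> HashMap<(Sound, Sound), f64> {
    let counts = transition_counts(names);
    let mut totals: HashMap<Sound, u32> = HashMap::new();
    for (&(from, _), &n) in &counts {
        *totals.entry(from).or_insert(0) += n;
    }
    counts
        .iter()
        .map(|(&(from, to), &n)| ((from, to), f64::from(n) / f64::from(totals[from])))
        .collect()
}

fn main() {
    let mut rows: Vec<_> = transition_probabilities(&dataset()).into_iter().collect();
    rows.sort_by(|a, b| a.0.cmp(&b.0));
    for ((from, to), p) in rows {
        println!("{from} -> {to}: {p:.2}");
    }
}
```

With only three names, every sound has a single observed successor, so each probability is 1. A larger dataset spreads the probability mass across several successors.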
This library does:
This library DOES NOT:
NameGenerator::generate_verbose is useful.
This library only does the minimal processing necessary to generate names. To create a more practical name generator, some additional processing like the above will be required.
Run cargo doc --open to see the documentation.
If you want to try it out, see the examples in examples/. As a first step, examples/japanese.rs is a good one to read.
[dependencies]
name-engine = "0.1.0"
$ cargo run --example hokkaido
中富 nakatomi
初威冠 shoikappu
上沢 kamizawa
$ cargo run --example england
Stoneon /ˈstəʊnən/
Thatchingworth /ˈθætʃɪŋwɜːθ/
Brentgomley /ˈbrɛntɡʌmli/
$ cargo run --example england_evaluated
Oltham Abbey /ˈoʊlθəm ˈæbi/
Downbury /ˈdaʊnbəri/
Farhead /ˈfɑːrhɛd/
$ cargo run --example us_evaluated
Winfield /ˈwɪnfiːld/
Perton /ˈpɛrtən/
Kinbridge Falls /ˈkɪnbrɪdʒ fɔːlz/
For English and US place name data, some symbols are added for better results.
1. Spaces are replaced with + and treated as independent syllables.
2. An asterisk * is added at the beginning of the pronunciation of a syllable so that it can become the first syllable of the name or the syllable after +.
3. For a syllable followed by +, an asterisk * is added at the end of the pronunciation so that it can become the syllable before +.

Example
Tunbridge Wells /ˈtʌnbrɪdʒ ˈwɛlz/
(Tun, /*ˈtʌn/) (bridge, /brɪdʒ*/) (+, /+/) (Wells, /*ˈwɛlz/)
- (Tun /ˈtʌn/) -> (Tun /*ˈtʌn/) [2]
- (bridge /brɪdʒ/) -> (bridge /brɪdʒ*/) [3]
- (+ /+/) [1]
- (Wells /ˈwɛlz/) -> (Wells /*ˈwɛlz/) [2]

Moreover, some suffixes are treated as independent syllables, such as minster and bridge.
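The marking scheme shown in the Tunbridge Wells example can be sketched as a small preprocessing function. This is an illustration inferred from that example, not code from the crate: the function name `annotate` and the input shape (words, each a list of (spelling, pronunciation) syllables) are assumptions.

```rust
/// Annotate the syllables of a possibly multi-word name.
/// Hypothetical helper: `words` holds one Vec per word, each containing
/// (spelling, pronunciation) pairs without the surrounding slashes.
fn annotate(words: &[Vec<(&str, &str)>]) -> Vec<(String, String)> {
    let mut out = Vec::new();
    for (w, word) in words.iter().enumerate() {
        // The space between two words becomes its own "+" syllable.
        if w > 0 {
            out.push(("+".to_string(), "+".to_string()));
        }
        let is_last_word = w + 1 == words.len();
        for (s, &(spelling, pron)) in word.iter().enumerate() {
            let mut marked = String::new();
            // Leading "*": first syllable of the name or of a word after "+".
            if s == 0 {
                marked.push('*');
            }
            marked.push_str(pron);
            // Trailing "*": the syllable directly before a "+".
            if s + 1 == word.len() && !is_last_word {
                marked.push('*');
            }
            out.push((spelling.to_string(), marked));
        }
    }
    out
}

fn main() {
    let tunbridge_wells = vec![
        vec![("Tun", "ˈtʌn"), ("bridge", "brɪdʒ")],
        vec![("Wells", "ˈwɛlz")],
    ];
    for (spelling, pron) in annotate(&tunbridge_wells) {
        println!("({spelling}, /{pron}/)");
    }
}
```

Because the markers are part of the pronunciation string, a chain built on these strings only connects a syllable ending in * to +, and + to a syllable starting with *, which keeps word boundaries in generated names plausible.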
examples/assets/hokkaido.csv: Hokkaido Government Opendata, CC-BY4.0 (https://creativecommons.org/licenses/by/4.0/deed.ja)
Modified from the original data.
Source: https://www.pref.hokkaido.lg.jp/link/shichoson/aiueo.html
This project is licensed under the Mozilla Public License v2.0. See the LICENSE file for details.