tanaka

Crates.io: tanaka
lib.rs: tanaka
version: 0.1.0
source: src
created_at: 2023-12-06 23:44:10.592762
updated_at: 2023-12-06 23:44:10.592762
description: A Rust interface to the Tanaka Corpus of parallel Japanese-English sentences
homepage: https://gitlab.com/johngavingraham/tanaka
repository: https://gitlab.com/johngavingraham/tanaka.git
id: 1060527
size: 7,038,890
(johngavingraham)

README

Tanaka

A Rust interface to the Tanaka Corpus of parallel Japanese-English sentences.

The standard corpus is bundled with the crate: simply call examples() (or examples_subset()). The bundled data adds a few megabytes to the library; it can be excluded by disabling the respective feature flags.

use tanaka::Corpus;

// Load the bundled corpus and print its first example.
let corpus = Corpus::examples();
println!("{:?}", corpus.examples[0]);

Otherwise, load the version of the corpus you want into a string, and parse it:

use tanaka::Corpus;

// Two-line corpus entry: an "A:" sentence pair followed by its "B:" word list.
let text = "A: 彼は忙しいですか。	Is he busy?#ID=303692_100005\n\
            B: 彼(かれ)[01] は 忙しい(いそがしい) ですか";

let corpus = Corpus::parse(text).unwrap();
println!("{:?}", corpus.examples[0]);
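
To make the layout of the "A:" line in the snippet above easier to see, here is a minimal standalone sketch that splits such a line into its Japanese sentence, English translation and ID. It does not use the tanaka crate and is not how the crate parses internally; the names ALine and split_a_line are hypothetical and exist only for this illustration. It assumes the layout seen in the example: the prefix "A: ", a tab between the two sentences, and a trailing "#ID=" marker.

```rust
/// The three visible parts of one "A:" line (hypothetical type, not from the crate).
#[derive(Debug, PartialEq)]
struct ALine<'a> {
    japanese: &'a str,
    english: &'a str,
    id: &'a str,
}

/// Splits a line of the form `A: <japanese>\t<english>#ID=<id>`.
/// Returns `None` if the prefix, the tab, or the ID marker is missing.
fn split_a_line(line: &str) -> Option<ALine<'_>> {
    let rest = line.strip_prefix("A: ")?;
    // The Japanese and English sentences are separated by a single tab.
    let (japanese, tail) = rest.split_once('\t')?;
    // The ID marker comes last, so search for it from the right.
    let (english, id) = tail.rsplit_once("#ID=")?;
    Some(ALine { japanese, english, id })
}

fn main() {
    let line = "A: 彼は忙しいですか。\tIs he busy?#ID=303692_100005";
    let parsed = split_a_line(line).expect("line should match the A: layout");
    println!("{:?}", parsed);
}
```

The "B:" line, which lists the words of the Japanese sentence with readings and sense numbers, is left out of this sketch; Corpus::parse handles both lines together.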