conll

Crates.io: conll
lib.rs: conll
version: 0.2.0
source: src
created_at: 2023-08-26 05:47:20.478268
updated_at: 2023-08-26 07:02:07.768858
description: Parser for CoNLL(-U) Treebanks
homepage: https://github.com/pcranaway/conll
repository: https://github.com/pcranaway/conll
id: 955368
size: 17,458
alex (pcranaway)

conll

conll is a Rust crate for efficiently parsing treebanks in the CoNLL(-U) format.

Usage

You can either run the parse binary bundled with the crate or call the library directly from your own code:

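// `lines` holds the contents of a CoNLL-U file, one String per line.
let text = std::fs::read_to_string("nl_alpino-ud-dev.conllu").unwrap();
let lines: Vec<String> = text.lines().map(String::from).collect();

let treebank = conll::conllu::parser::parse(lines).unwrap();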
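For readers new to the input format, the following is a minimal sketch of how one CoNLL-U token line breaks down into its ten tab-separated fields. It is not this crate's implementation or API: the `Token` struct and `parse_token_line` function are hypothetical names used only for illustration, and the sample line in the usage note is made up.

```rust
/// The ten tab-separated fields of a CoNLL-U token line.
/// Hypothetical type for illustration; not part of the conll crate.
#[derive(Debug, PartialEq)]
struct Token {
    id: String,
    form: String,
    lemma: String,
    upos: String,
    xpos: String,
    feats: String,
    head: String,
    deprel: String,
    deps: String,
    misc: String,
}

/// Parses a single line. Returns None for blank lines (sentence
/// separators), comment lines starting with '#', and any line that
/// does not have exactly ten fields.
fn parse_token_line(line: &str) -> Option<Token> {
    if line.is_empty() || line.starts_with('#') {
        return None;
    }
    let fields: Vec<&str> = line.split('\t').collect();
    if fields.len() != 10 {
        return None;
    }
    Some(Token {
        id: fields[0].to_string(),
        form: fields[1].to_string(),
        lemma: fields[2].to_string(),
        upos: fields[3].to_string(),
        xpos: fields[4].to_string(),
        feats: fields[5].to_string(),
        head: fields[6].to_string(),
        deprel: fields[7].to_string(),
        deps: fields[8].to_string(),
        misc: fields[9].to_string(),
    })
}
```

Calling `parse_token_line("1\tDe\tde\tDET\t_\tDefinite=Def\t2\tdet\t_\t_")` yields a token whose `form` is `De` and whose `head` is `2`, while a comment line such as `# text = ...` yields `None`. All fields are kept as strings here to keep the sketch short; a real parser would typically convert `id` and `head` to numbers and split `feats` into key-value pairs.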

Performance

The CoNLL-U parser is quite fast. Here is the output of running the release binary under time on a 14 MB file.

$ time ./target/release/parse nl_alpino-ud-dev.conllu -s

real    0m0.074s
user    0m0.054s
sys     0m0.019s

For comparison, here is the same command on a 195 MB file.

$ time ./target/release/parse de_hdt-ud-train.conllu -s

real    0m5.006s
user    0m3.866s
sys     0m1.116s
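To compare the two runs on equal footing, the timings can be turned into a rough throughput figure. The sketch below assumes the rounded file sizes quoted above and uses the wall-clock ("real") time; `throughput_mb_per_s` is a hypothetical helper, not something provided by the crate.

```rust
/// Megabytes processed per second of wall-clock time.
/// Hypothetical helper for illustration only.
fn throughput_mb_per_s(size_mb: f64, real_seconds: f64) -> f64 {
    size_mb / real_seconds
}

/// Throughput for the two runs quoted above, as (small file, large file).
fn quoted_runs() -> (f64, f64) {
    (
        throughput_mb_per_s(14.0, 0.074),  // 14 MB file, real 0m0.074s
        throughput_mb_per_s(195.0, 5.006), // 195 MB file, real 0m5.006s
    )
}
```

With these inputs the 14 MB run works out to roughly 189 MB/s and the 195 MB run to roughly 39 MB/s. These are single measurements with rounded sizes, so treat the figures as indicative rather than as a benchmark.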