rs-conllu

Crates.io: rs-conllu
lib.rs: rs-conllu
version: 0.2.0
source: src
created_at: 2023-04-22 07:52:19.808149
updated_at: 2024-07-29 19:01:55.095674
description: A parser for the CoNLL-U format of the Universal Dependencies project.
id: 845862
size: 38,342
(davidhelbig)

This project aims to provide a parser for the CoNLL-U format of the Universal Dependencies project: https://universaldependencies.org/format.html.

Basic Usage

Parse a file in CoNLL-U format and iterate over the sentences it contains.

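use std::fs::File;

// Assumes `parse_file` is exported at the crate root.
use rs_conllu::parse_file;

let file = File::open("example.conllu").unwrap();

let doc = parse_file(file);

for sentence in doc {
    // `unwrap` panics if a sentence failed to parse; handle the error in real code.
    for token in sentence.unwrap() {
        println!("{}", token.form);
    }
}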

Features

  • Tested on version 2.11 UD treebanks
  • Handles different types of token ids (single, range, subordinate)
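
To illustrate the three kinds of token id, here is a minimal sketch of how the ID column of a CoNLL-U line could be classified. The `TokenId` enum and the `parse_id` function are hypothetical names made up for this example and are not the crate's own definitions. The sketch also assumes that "subordinate" refers to the decimal ids (such as 8.1) that the CoNLL-U format uses for empty nodes.

```rust
/// Hypothetical id type for illustration; not the crate's own definition.
#[derive(Debug, PartialEq)]
enum TokenId {
    /// A plain word index, e.g. "3".
    Single(usize),
    /// A multiword token span, e.g. "3-4".
    Range(usize, usize),
    /// A decimal id, e.g. "8.1" (assumed meaning of "subordinate").
    Subordinate { major: usize, minor: usize },
}

/// Classify the ID column of a CoNLL-U line.
/// Returns `None` if the field is not a valid id.
fn parse_id(field: &str) -> Option<TokenId> {
    if let Some((start, end)) = field.split_once('-') {
        // "3-4": both halves must be integers.
        Some(TokenId::Range(start.parse().ok()?, end.parse().ok()?))
    } else if let Some((major, minor)) = field.split_once('.') {
        // "8.1": integer part and sub-index.
        Some(TokenId::Subordinate {
            major: major.parse().ok()?,
            minor: minor.parse().ok()?,
        })
    } else {
        // "3": a single integer.
        Some(TokenId::Single(field.parse().ok()?))
    }
}
```

With this sketch, `parse_id("3-4")` yields `Some(TokenId::Range(3, 4))`, while a field that is not an id, such as `"_"`, yields `None`.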

Limitations

Parsing happens in a "flat" manner: each sentence is returned as a sequence of tokens, and the relations between tokens are not resolved into a tree.
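
If a tree view is needed, it can be built on top of the flat token sequence. The following is a rough sketch under stated assumptions, not part of the crate: `FlatToken` is a hypothetical, simplified stand-in holding only an id and the value of the HEAD column, and `children_by_head` is an illustrative helper name.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified token holding only the fields needed here.
struct FlatToken {
    id: usize,
    /// HEAD column: 0 marks the root, `None` means unspecified ("_").
    head: Option<usize>,
}

/// Group token ids by the id of their head (0 = artificial root).
/// Tokens without a head are skipped.
fn children_by_head(tokens: &[FlatToken]) -> BTreeMap<usize, Vec<usize>> {
    let mut children: BTreeMap<usize, Vec<usize>> = BTreeMap::new();
    for token in tokens {
        if let Some(head) = token.head {
            children.entry(head).or_default().push(token.id);
        }
    }
    children
}
```

For a three-token sentence in which token 2 is the root and tokens 1 and 3 depend on it, the resulting map holds `[2]` under key 0 and `[1, 3]` under key 2.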
