rs-conllu

version: 0.4.1
created_at: 2023-04-22 07:52:19.808149+00
updated_at: 2025-05-27 20:01:29.977133+00
description: A parser for the CoNLL-U format of the Universal Dependencies project.
repository: https://github.com/dahelb/rs-conllu
id: 845862
size: 45,947
David Helbig (dahelb)


This project aims to provide a parser for the CoNLL-U format of the Universal Dependencies project: https://universaldependencies.org/format.html.

Basic Usage

Parse a file in CoNLL-U format and iterate over the sentences it contains.

use rs_conllu::parse_file;
use std::error::Error;
use std::fs::File;

// Wrapped in main so that the `?` operator has a Result to return into.
fn main() -> Result<(), Box<dyn Error>> {
    let file = File::open("tests/example.conllu")?;

    let parsed = parse_file(file)?;

    // Iterate over the contained sentences.
    for sentence in parsed {
        // We can also iterate over the tokens in the sentence.
        for token in sentence {
            // Process token, e.g. access individual fields.
            println!("{}", token.form);
        }
    }

    Ok(())
}
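
For orientation: each token line in a CoNLL-U file has ten tab-separated fields (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC), and token.form in the usage example corresponds to the second of them. The following is a minimal, std-only sketch of that layout. It is not how rs-conllu is implemented; split_fields and named_fields are hypothetical helpers introduced only for illustration.

```rust
/// Names of the ten CoNLL-U columns, in file order.
const FIELD_NAMES: [&str; 10] = [
    "ID", "FORM", "LEMMA", "UPOS", "XPOS", "FEATS", "HEAD", "DEPREL", "DEPS", "MISC",
];

/// Hypothetical helper (not part of rs-conllu): split one token line into
/// its ten tab-separated fields. Returns None if the field count is not ten.
fn split_fields(line: &str) -> Option<[&str; 10]> {
    let mut fields = [""; 10];
    let mut parts = line.split('\t');
    for slot in fields.iter_mut() {
        // Fewer than ten fields: give up.
        *slot = parts.next()?;
    }
    // More than ten fields: give up as well.
    if parts.next().is_some() {
        return None;
    }
    Some(fields)
}

/// Hypothetical helper: pair every field value with its column name.
fn named_fields(line: &str) -> Option<Vec<(&'static str, &str)>> {
    let fields = split_fields(line)?;
    Some(
        FIELD_NAMES
            .iter()
            .copied()
            .zip(fields.iter().copied())
            .collect(),
    )
}
```

Calling named_fields on a well-formed token line yields ten (name, value) pairs, with the word form in the pair labelled "FORM".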

Features

  • Tested on the Universal Dependencies (UD) treebanks, version 2.11
  • Handles the different types of token IDs: single (e.g. 1), range for multiword tokens (e.g. 1-2), and empty nodes (e.g. 1.1)
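
To illustrate the three kinds of token IDs, here is a small std-only sketch that classifies an ID string. IdKind and classify_id are hypothetical names chosen for this example and are not claimed to match rs-conllu's own types; the sketch also does not validate every constraint of the format (for example, that a range end is greater than its start).

```rust
/// Hypothetical classification of CoNLL-U token IDs (not rs-conllu's own type).
#[derive(Debug, PartialEq)]
enum IdKind {
    /// A regular word, e.g. "3".
    Single(usize),
    /// A multiword token covering several words, e.g. "3-4".
    Range(usize, usize),
    /// An empty node, e.g. "3.1".
    Empty(usize, usize),
}

/// Classify an ID string; returns None if a numeric part fails to parse.
fn classify_id(id: &str) -> Option<IdKind> {
    if let Some((start, end)) = id.split_once('-') {
        // "start-end" marks a multiword token.
        Some(IdKind::Range(start.parse().ok()?, end.parse().ok()?))
    } else if let Some((major, minor)) = id.split_once('.') {
        // "major.minor" marks an empty node placed after word `major`.
        Some(IdKind::Empty(major.parse().ok()?, minor.parse().ok()?))
    } else {
        // A plain integer marks an ordinary word.
        Some(IdKind::Single(id.parse().ok()?))
    }
}
```

Under these assumptions, "3" is classified as Single, "3-4" as Range and "3.1" as Empty, while a non-numeric string yields None.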

Limitations

Parsing happens in a "flat" manner: each sentence is returned as a sequence of tokens, and the relations between tokens are not resolved into a tree structure.
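
If the relations are needed, they can be rebuilt from the flat token sequence by grouping tokens under their head. The sketch below shows one possible way to do this with the standard library only. FlatToken is a hypothetical stand-in with just the fields needed here, not rs-conllu's token type, and the sketch assumes every token has a plain integer ID and a head, with 0 denoting the root as in the CoNLL-U format.

```rust
use std::collections::BTreeMap;

/// Hypothetical flat token carrying only what this sketch needs.
struct FlatToken {
    id: usize,
    form: String,
    /// ID of the syntactic head; 0 denotes the root.
    head: usize,
}

/// Group token IDs by their head, yielding a map from head ID to child IDs.
fn children_by_head(tokens: &[FlatToken]) -> BTreeMap<usize, Vec<usize>> {
    let mut children: BTreeMap<usize, Vec<usize>> = BTreeMap::new();
    for token in tokens {
        children.entry(token.head).or_default().push(token.id);
    }
    children
}

/// Return the form of the first token attached to the root, if any.
fn root_form(tokens: &[FlatToken]) -> Option<&str> {
    tokens
        .iter()
        .find(|token| token.head == 0)
        .map(|token| token.form.as_str())
}
```

For a three-token sentence such as "They buy books", where "buy" is the root and the other two words depend on it, children_by_head maps 0 to [2] and 2 to [1, 3].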
