Crates.io | kma-rustlang-vadym-polishchuk-english-parser |
lib.rs | kma-rustlang-vadym-polishchuk-english-parser |
version | 0.2.0 |
source | src |
created_at | 2023-11-11 16:17:34.981989 |
updated_at | 2023-11-11 16:17:34.981989 |
description | Simple parser of English sentences created for KMA Rust course. |
id | 1032159 |
size | 14,815 |
Simple parser of English sentences created for the KMA Rust course. The parser can identify single words, numbers, punctuation symbols, whitespace, sentences, and whole texts.
To parse a file, run:
make run ARGS="-f test_files/test1.txt"
Output:
["Hello", ",", " ", "world", "!"]
Or to get help information:
make
The parser uses the peg library. Rules:
word() parses words that contain only alphabetical symbols.
number() parses numbers.
end_punctuation() parses punctuation marks that can end a sentence: ... | . | ! | ?
other_punctuation() parses punctuation marks that can appear inside a sentence: , | ; | : | -
whitespace() parses spaces and other indentation symbols such as '\t' | '\n' | '\r'
sentence() parses a whole sentence. It combines the rules above to parse the input string. A sentence must end with an end_punctuation mark.
text() parses multiple sentences.
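To illustrate how these rules fit together, here is a minimal sketch written in plain Rust with the standard library only. It is not the crate's peg grammar and does not reproduce its API. The names Kind, run_len, next_token, sentence and text are hypothetical helpers chosen to echo the rule names. The sketch also makes assumptions the description above does not confirm: words are ASCII letters only, numbers are unsigned runs of decimal digits, every whitespace character is its own token, and trailing whitespace after the last sentence is rejected.

```rust
/// Token kinds mirroring the rule names listed above (hypothetical type).
#[derive(Debug, PartialEq)]
enum Kind {
    Word,
    Number,
    EndPunctuation,
    OtherPunctuation,
    Whitespace,
}

/// Byte length of the longest prefix of `input` whose chars all satisfy `pred`.
fn run_len(input: &str, pred: impl Fn(char) -> bool) -> usize {
    input.chars().take_while(|c| pred(*c)).map(char::len_utf8).sum()
}

/// Read one token from the start of `input`.
/// Returns its kind and length in bytes, or None if nothing matches.
fn next_token(input: &str) -> Option<(Kind, usize)> {
    // "..." is checked before ".", otherwise an ellipsis would be read
    // as three separate full stops.
    if input.starts_with("...") {
        return Some((Kind::EndPunctuation, 3));
    }
    let first = input.chars().next()?;
    match first {
        '.' | '!' | '?' => Some((Kind::EndPunctuation, 1)),
        ',' | ';' | ':' | '-' => Some((Kind::OtherPunctuation, 1)),
        // Assumption: one whitespace character per token.
        ' ' | '\t' | '\n' | '\r' => Some((Kind::Whitespace, 1)),
        // Assumption: "alphabetical" means ASCII letters.
        c if c.is_ascii_alphabetic() => {
            Some((Kind::Word, run_len(input, |c| c.is_ascii_alphabetic())))
        }
        // Assumption: numbers are unsigned runs of decimal digits.
        c if c.is_ascii_digit() => {
            Some((Kind::Number, run_len(input, |c| c.is_ascii_digit())))
        }
        _ => None,
    }
}

/// Parse one sentence: tokens up to and including the first end-punctuation
/// mark. Returns the tokens and the unparsed remainder, or None if the input
/// runs out (or hits an unknown character) before an end-punctuation mark.
fn sentence(input: &str) -> Option<(Vec<&str>, &str)> {
    let mut tokens = Vec::new();
    let mut rest = input;
    loop {
        let (kind, len) = next_token(rest)?;
        tokens.push(&rest[..len]);
        rest = &rest[len..];
        if kind == Kind::EndPunctuation {
            return Some((tokens, rest));
        }
    }
}

/// Parse the whole input as a sequence of sentences and flatten their tokens.
fn text(input: &str) -> Option<Vec<&str>> {
    let mut tokens = Vec::new();
    let mut rest = input;
    while !rest.is_empty() {
        let (mut sentence_tokens, remainder) = sentence(rest)?;
        tokens.append(&mut sentence_tokens);
        rest = remainder;
    }
    Some(tokens)
}

fn main() {
    // prints: Some(["Hello", ",", " ", "world", "!"])
    println!("{:?}", text("Hello, world!"));
}
```

In this sketch an input such as "no end" yields None, because the sentence step never reaches an end-punctuation mark. A peg grammar would express the same alternatives declaratively, with ordered choice playing the role of the match arms.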