kma-rustlang-vadym-polishchuk-english-parser

Crates.io: kma-rustlang-vadym-polishchuk-english-parser
lib.rs: kma-rustlang-vadym-polishchuk-english-parser
version: 0.2.0
source: src
created_at: 2023-11-11 16:17:34.981989
updated_at: 2023-11-11 16:17:34.981989
description: Simple parser of English sentences created for KMA Rust course.
id: 1032159
size: 14,815
vadympk (vadimpk)


README

Description

A simple parser of English sentences created for the KMA Rust course. The parser can identify single words, numbers, punctuation symbols, whitespace, sentences, and whole texts.

Usage

make run ARGS="-f test_files/test1.txt"

Output:

["Hello", ",", " ", "world", "!"]

Or to get help information:

make

Technical

The parser uses the peg library. Rules:

  • word() rule is used to parse words that contain only alphabetical symbols
  • number() rule is used to parse numbers
  • end_punctuation() rule is used to parse punctuation marks that can end a sentence: ... | . | ! | ?
  • other_punctuation() rule is used to parse punctuation marks that can be inside a sentence: , | ; | : | -
  • whitespace() rule is used to parse spaces or other indentation symbols like '\t' | '\n' | '\r'
  • sentence() rule is used to parse a whole sentence. It combines the previous rules to parse the input string. A sentence must end with an end_punctuation mark
  • text() rule can parse multiple sentences
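
The rules above can be illustrated with a small standalone sketch. This is not the crate's actual peg grammar: it is a hand-written approximation using only the standard library, and the names Token, tokenize and sentences are hypothetical. It also assumes that a number is a run of ASCII digits and that a run of whitespace characters forms a single token, neither of which is confirmed by the README.

```rust
/// Token kinds mirroring the rule names listed above (hypothetical type).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Token {
    Word(String),
    Number(String),
    EndPunctuation(String),
    OtherPunctuation(String),
    Whitespace(String),
}

impl Token {
    /// Returns the matched text regardless of the token kind.
    pub fn as_str(&self) -> &str {
        match self {
            Token::Word(s)
            | Token::Number(s)
            | Token::EndPunctuation(s)
            | Token::OtherPunctuation(s)
            | Token::Whitespace(s) => s.as_str(),
        }
    }
}

/// Byte length of the longest prefix of `s` whose chars all satisfy `pred`.
fn take_while(s: &str, pred: impl Fn(char) -> bool) -> usize {
    s.char_indices()
        .find(|&(_, c)| !pred(c))
        .map_or(s.len(), |(i, _)| i)
}

fn is_ws(c: char) -> bool {
    matches!(c, ' ' | '\t' | '\n' | '\r')
}

/// Splits `input` into tokens. On failure, returns the byte offset of the
/// first character that no rule accepts.
pub fn tokenize(input: &str) -> Result<Vec<Token>, usize> {
    let mut tokens = Vec::new();
    let mut pos = 0;
    while pos < input.len() {
        let rest = &input[pos..];
        let first = rest.chars().next().unwrap(); // `rest` is non-empty here
        let (token, len) = if first.is_alphabetic() {
            let n = take_while(rest, char::is_alphabetic);
            (Token::Word(rest[..n].to_string()), n)
        } else if first.is_ascii_digit() {
            // Assumption: numbers are plain runs of ASCII digits.
            let n = take_while(rest, |c| c.is_ascii_digit());
            (Token::Number(rest[..n].to_string()), n)
        } else if is_ws(first) {
            // Assumption: consecutive whitespace is grouped into one token.
            let n = take_while(rest, is_ws);
            (Token::Whitespace(rest[..n].to_string()), n)
        } else if rest.starts_with("...") {
            // Try "..." before "." so the ellipsis is not split up.
            (Token::EndPunctuation("...".to_string()), 3)
        } else if matches!(first, '.' | '!' | '?') {
            (Token::EndPunctuation(first.to_string()), 1)
        } else if matches!(first, ',' | ';' | ':' | '-') {
            (Token::OtherPunctuation(first.to_string()), 1)
        } else {
            return Err(pos);
        };
        tokens.push(token);
        pos += len;
    }
    Ok(tokens)
}

/// Groups tokens into sentences, each ending with an end-punctuation token.
/// Trailing whitespace is ignored; any other unterminated tail is an error.
pub fn sentences(tokens: Vec<Token>) -> Result<Vec<Vec<Token>>, String> {
    let mut out = Vec::new();
    let mut current = Vec::new();
    for token in tokens {
        let ends = matches!(token, Token::EndPunctuation(_));
        current.push(token);
        if ends {
            out.push(std::mem::take(&mut current));
        }
    }
    if current.iter().any(|t| !matches!(t, Token::Whitespace(_))) {
        return Err("last sentence has no end punctuation".to_string());
    }
    Ok(out)
}
```

With this sketch, calling tokenize("Hello, world!") and mapping each token through as_str() gives the same five pieces as the Output example in the Usage section. Checking "..." before "." reflects the ordered-choice behaviour of PEG grammars, where the first matching alternative wins.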
