| Crates.io | xml_simple_parser |
| lib.rs | xml_simple_parser |
| version | 0.1.0 |
| created_at | 2025-11-11 20:12:39.693252+00 |
| updated_at | 2025-11-11 20:12:39.693252+00 |
| description | A simple XML parser implemented in Rust using the Pest parser generator. |
| homepage | |
| repository | https://github.com/ZhekaOst/xml_simple_parser |
| max_upload_size | |
| id | 1928130 |
| size | 24,883 |
This project implements a simplified XML parser in Rust using the pest library. It handles core XML structures: tags, attributes, and nested content.
This project includes a Makefile to simplify common development commands.
Check all (format, lint with clippy, and run tests):
make check
Run parser (uses test.xml):
make run
Run tests:
make test
Format code:
make fmt
Lint code:
make clippy
Show all commands:
make help
You can also run the program directly.
Parse a file:
cargo run -- parse --file <path/to/your-file.xml>
(Note: The parser will look for the file in the project root and in the src/ folder.)
Show author credits:
cargo run -- credits
Show built-in CLI help:
cargo run -- --help
The parser uses a Context-Free Grammar (CFG) to validate and structure the XML input. It recognizes the following elements via grammar rules:
name: Identifiers for tags and attributes.attribute: Key-value pairs (key="value").text_content: Literal data between tags.element: Recursive structure for tags, children, and attributes.xml: The complete document (ensuring a single root element).The output is a tree of tokens (Pairs) from the pest library.
Pairs are used to build an Abstract Syntax Tree (AST) (native Rust structs) for subsequent data manipulation or interpretation.grammar.pest)This is the core logic of the parser. It defines the exact syntax that the parser will recognize.
WHITESPACE = _{ " " | "\t" | "\r" | "\n" }
name = @{ ASCII_ALPHA ~ (ASCII_ALPHANUMERIC | "_")* }
attribute_value = { "\"" ~ (!("\"") ~ ANY)* ~ "\"" }
attribute = { name ~ "=" ~ attribute_value }
opening_tag = { "<" ~ name ~ attribute* ~ ">" }
closing_tag = { "</" ~ name ~ ">" }
text_content = @{ (!("<") ~ ANY)+ }
element = { opening_tag ~ (element | text_content)* ~ closing_tag }
xml = { SOI ~ element ~ EOI }