| Crates.io | php-parser |
| lib.rs | php-parser |
| version | 0.1.3 |
| created_at | 2025-12-03 12:15:05.322686+00 |
| updated_at | 2025-12-11 04:59:09.417755+00 |
| description | A fast PHP parser written in Rust |
| homepage | |
| repository | https://github.com/wudi/php-parser |
| max_upload_size | |
| id | 1963932 |
| size | 483,277 |
A production-grade, fault-tolerant, zero-copy PHP parser written in Rust.
bumpalo arena allocation for high performance and zero heap allocations for AST nodes.&[u8]).cargo add php-parser
cargo add bumpalo
If you want to use the latest version from the GitHub repository, add this to your Cargo.toml:
[dependencies]
php-parser = { git = "https://github.com/wudi/php-parser" }
bumpalo = "3.19.0"
Here is a basic example of how to parse a PHP script:
use bumpalo::Bump;
use php_parser::lexer::Lexer;
use php_parser::parser::Parser;
fn main() {
// The source code to parse (as bytes)
let source = b"<?php echo 'Hello, World!';";
// Create an arena for AST allocation
let arena = Bump::new();
// Initialize the lexer and parser
let lexer = Lexer::new(source);
let mut parser = Parser::new(lexer, &arena);
let program = parser.parse_program();
println!("{:#?}", program);
// php_parser::span::with_session_globals(source, || {
// println!("{:#?}", program);
// });
}
You can output the AST in S-expression format for easier visualization:
use bumpalo::Bump;
use php_parser::lexer::Lexer;
use php_parser::parser::Parser;
use php_parser::ast::sexpr::SExprFormatter;
fn main() {
let code = "<?php class Foo extends Bar implements Baz { public int $p = 1; function m($a) { return $a; } }";
let arena = Bump::new();
let lexer = Lexer::new(code.as_bytes());
let mut parser = Parser::new(lexer, &arena);
let program = parser.parse_program();
let mut formatter = SExprFormatter::new(code.as_bytes());
formatter.visit_program(&program);
let output = formatter.finish();
println!("{}", output);
}
Gives the output:
(program
(class "Foo" (extends Bar) (implements Baz)
(members
(property public int = (integer 1))
(method "m" (params ())
(body
(return (variable "")))))))
Test file run-tests.php from php-src with 140KB size, here are the benchmark results:
➜ php-parser git:(master) ✗ cargo run --release --bin bench_file -- run-tests.php
Finished `release` profile [optimized] target(s) in 0.05s
Running `target/release/bench_file run-tests.php`
Benchmarking: run-tests.php
File size: 139.63 KB
Warming up...
Running 200 iterations...
Profile written to profile.pb
Flamegraph written to flamegraph.svg
Total time: 134.267333ms
Average time: 671.336µs
Throughput: 203.11 MB/s
Table comparing with nikic/PHP-Parser v5.6.2
| Parser | Language | Time (ms) |
|---|---|---|
| nikic/PHP-Parser | PHP | 33 |
| tree-sitter-php | C | 15 |
| php-parser | Rust | 0.67 |
Machine specs: Apple M1 Pro, 32GB RAM
Run the full test suite:
cargo test
This project uses insta for snapshot testing. If you make changes to the parser that affect the AST output, you may need to review and accept the new snapshots:
cargo test
cargo insta review
To verify stability against real-world codebases (like WordPress or Laravel), use the corpus test runner:
cargo run --release --bin corpus_test -- /path/to/php/project
Lexer: Operates on &[u8] and handles PHP's complex lexical modes (Scripting, DoubleQuote, Heredoc).
Parser: A combination of Recursive Descent and Pratt parsing for expressions.
AST: All nodes are allocated in a Bump arena. Strings are stored as references to the original source (&'src [u8]) or arena-allocated slices.
This project is licensed under the MIT License. See the LICENSE file for details.