| Crates.io | sipha |
| lib.rs | sipha |
| version | 0.5.0 |
| created_at | 2025-11-29 13:08:21.462148+00 |
| updated_at | 2025-11-29 13:08:21.462148+00 |
| description | Core parsing infrastructure for Sipha parser library |
| homepage | https://github.com/nyaleph/sipha |
| repository | https://github.com/nyaleph/sipha |
| max_upload_size | |
| id | 1956670 |
| size | 723,818 |
A flexible, incremental parsing library for Rust with support for multiple parsing algorithms. Sipha is designed from the ground up to support incremental parsing, the ability to efficiently re-parse only the changed portions of your code, making it ideal for interactive applications like IDEs, editors, and language servers.
Read the Book - Comprehensive guide and documentation
Sipha 0.5.0 provides a stable foundation for incremental parsing with multiple backends. The core API is stable, though some advanced features may continue to evolve. We welcome feedback and contributions!
Sipha's standout feature is its incremental parsing capability. When you edit code, Sipha can:
This makes Sipha perfect for building language servers, IDEs, and other tools that need to parse code as users type.
Add Sipha to your Cargo.toml:
[dependencies]
sipha = "0.5.0"
Or with specific features:
[dependencies]
sipha = { version = "0.5.0", features = ["diagnostics", "unicode", "backend-ll"] }
Let's build a simple arithmetic expression parser step by step. This example will help you understand the core concepts.
First, define the tokens and non-terminals your parser will use. Sipha uses a unified SyntaxKind trait for both terminals (tokens) and non-terminals (grammar rules):
use sipha::syntax::SyntaxKind;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum ArithSyntaxKind {
    // Terminals (produced by lexer)
    Number,
    Plus,
    Minus,
    Multiply,
    Divide,
    LParen,
    RParen,
    Whitespace,
    Eof,
    // Non-terminals (produced by parser)
    Expr,
    Term,
    Factor,
}

impl SyntaxKind for ArithSyntaxKind {
    fn is_terminal(self) -> bool {
        !matches!(self, ArithSyntaxKind::Expr | ArithSyntaxKind::Term | ArithSyntaxKind::Factor)
    }

    fn is_trivia(self) -> bool {
        matches!(self, ArithSyntaxKind::Whitespace)
    }
}
Create a lexer to tokenize your input text:
use sipha::lexer::{LexerBuilder, Pattern, CharSet};

let lexer = LexerBuilder::new()
    .token(ArithSyntaxKind::Number, Pattern::Repeat {
        pattern: Box::new(Pattern::CharClass(CharSet::digits())),
        min: 1,
        max: None,
    })
    .token(ArithSyntaxKind::Plus, Pattern::Literal("+".into()))
    .token(ArithSyntaxKind::Minus, Pattern::Literal("-".into()))
    .token(ArithSyntaxKind::Multiply, Pattern::Literal("*".into()))
    .token(ArithSyntaxKind::Divide, Pattern::Literal("/".into()))
    .token(ArithSyntaxKind::LParen, Pattern::Literal("(".into()))
    .token(ArithSyntaxKind::RParen, Pattern::Literal(")".into()))
    .token(ArithSyntaxKind::Whitespace, Pattern::Repeat {
        pattern: Box::new(Pattern::CharClass(CharSet::whitespace())),
        min: 1,
        max: None,
    })
    .trivia(ArithSyntaxKind::Whitespace)
    .build(ArithSyntaxKind::Eof, ArithSyntaxKind::Number)
    .expect("Failed to build lexer");

let input = "42 + 10";
let tokens = lexer.tokenize(input)
    .expect("Failed to tokenize input");
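To make the lexer configuration above more concrete, here is a minimal, std-only sketch of the same idea: each token is the longest run of characters matching a rule, and whitespace is kept as a separate (trivia) token. It does not use Sipha's API and says nothing about how Sipha implements its lexer; the `Kind` enum and the `run_len` and `tokenize` functions are hypothetical names introduced only for illustration.

```rust
/// Hypothetical token kinds mirroring the terminals of `ArithSyntaxKind`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Kind {
    Number,
    Plus,
    Minus,
    Multiply,
    Divide,
    LParen,
    RParen,
    Whitespace,
}

/// Byte length of the leading run of characters in `s` that satisfy `pred`.
fn run_len(s: &str, pred: fn(char) -> bool) -> usize {
    s.find(|c: char| !pred(c)).unwrap_or(s.len())
}

/// Splits `input` into `(kind, text)` pairs. On failure, returns the byte
/// offset of the first character that no rule matches.
fn tokenize(input: &str) -> Result<Vec<(Kind, &str)>, usize> {
    let mut tokens = Vec::new();
    let mut pos = 0;
    while let Some(first) = input[pos..].chars().next() {
        let rest = &input[pos..];
        let (kind, len) = match first {
            c if c.is_ascii_digit() => (Kind::Number, run_len(rest, |c| c.is_ascii_digit())),
            c if c.is_whitespace() => (Kind::Whitespace, run_len(rest, char::is_whitespace)),
            '+' => (Kind::Plus, 1),
            '-' => (Kind::Minus, 1),
            '*' => (Kind::Multiply, 1),
            '/' => (Kind::Divide, 1),
            '(' => (Kind::LParen, 1),
            ')' => (Kind::RParen, 1),
            _ => return Err(pos),
        };
        tokens.push((kind, &rest[..len]));
        pos += len;
    }
    Ok(tokens)
}

fn main() {
    // Whitespace tokens are present in the stream; a parser would skip them as trivia.
    for (kind, text) in tokenize("42 + 10").expect("input should tokenize") {
        println!("{:?} {:?}", kind, text);
    }
}
```

Keeping whitespace in the token stream, rather than discarding it, is what lets a syntax tree reproduce the source text exactly; the parser simply skips tokens marked as trivia.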
Next, define your non-terminals and build the grammar:
use sipha::grammar::{GrammarBuilder, Token, NonTerminal, Expr};

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum ArithNonTerminal {
    Expr,
    Term,
    Factor,
}

impl NonTerminal for ArithNonTerminal {
    fn name(&self) -> &str {
        match self {
            ArithNonTerminal::Expr => "Expr",
            ArithNonTerminal::Term => "Term",
            ArithNonTerminal::Factor => "Factor",
        }
    }
}

// Build your grammar rules
let grammar = GrammarBuilder::new()
    .entry_point(ArithNonTerminal::Expr)
    // Add your grammar rules here
    .build()
    .expect("Failed to build grammar");
Finally, create a parser backend and parse the tokens:
use sipha::backend::ll::{LlParser, LlConfig};
use sipha::backend::ParserBackend;

let config = LlConfig::default();
let mut parser = LlParser::new(&grammar, config)
    .expect("Failed to create parser");

// Parse the token stream, starting from the grammar's entry point
let result = parser.parse(&tokens, ArithNonTerminal::Expr);
For a complete working example, see examples/basic_arithmetic.rs.
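The grammar rules themselves are left out of the snippet above. As a rough guide to what `Expr`, `Term`, and `Factor` conventionally stand for, the following std-only sketch hand-writes a recursive-descent evaluator for the usual precedence grammar. This is an assumption about the grammar's shape, not Sipha's rule syntax or the contents of `examples/basic_arithmetic.rs`; the `Parser` struct and `evaluate` function are hypothetical, and the sketch computes a value instead of building a syntax tree.

```rust
/// Hand-written recursive descent over the bytes of an arithmetic expression.
struct Parser<'a> {
    src: &'a [u8],
    pos: usize,
}

impl<'a> Parser<'a> {
    fn new(src: &'a str) -> Self {
        Parser { src: src.as_bytes(), pos: 0 }
    }

    /// Skips whitespace (trivia) and returns the next byte without consuming it.
    fn peek(&mut self) -> Option<u8> {
        while self.pos < self.src.len() && self.src[self.pos].is_ascii_whitespace() {
            self.pos += 1;
        }
        self.src.get(self.pos).copied()
    }

    /// Expr -> Term (('+' | '-') Term)*
    fn expr(&mut self) -> Result<i64, String> {
        let mut value = self.term()?;
        while let Some(op @ (b'+' | b'-')) = self.peek() {
            self.pos += 1;
            let rhs = self.term()?;
            value = if op == b'+' { value + rhs } else { value - rhs };
        }
        Ok(value)
    }

    /// Term -> Factor (('*' | '/') Factor)*
    fn term(&mut self) -> Result<i64, String> {
        let mut value = self.factor()?;
        while let Some(op @ (b'*' | b'/')) = self.peek() {
            self.pos += 1;
            let rhs = self.factor()?;
            if op == b'/' && rhs == 0 {
                return Err("division by zero".to_string());
            }
            value = if op == b'*' { value * rhs } else { value / rhs };
        }
        Ok(value)
    }

    /// Factor -> Number | '(' Expr ')'
    fn factor(&mut self) -> Result<i64, String> {
        match self.peek() {
            Some(b'(') => {
                self.pos += 1;
                let value = self.expr()?;
                if self.peek() != Some(b')') {
                    return Err(format!("expected ')' at byte {}", self.pos));
                }
                self.pos += 1;
                Ok(value)
            }
            Some(b) if b.is_ascii_digit() => {
                let start = self.pos;
                while self.pos < self.src.len() && self.src[self.pos].is_ascii_digit() {
                    self.pos += 1;
                }
                let text = std::str::from_utf8(&self.src[start..self.pos])
                    .expect("ASCII digits are valid UTF-8");
                text.parse::<i64>().map_err(|e| e.to_string())
            }
            _ => Err(format!("unexpected input at byte {}", self.pos)),
        }
    }
}

/// Evaluates a whole expression and rejects trailing input.
fn evaluate(input: &str) -> Result<i64, String> {
    let mut parser = Parser::new(input);
    let value = parser.expr()?;
    match parser.peek() {
        None => Ok(value),
        Some(_) => Err(format!("trailing input at byte {}", parser.pos)),
    }
}

fn main() {
    println!("{:?}", evaluate("42 + 10")); // Ok(52)
    println!("{:?}", evaluate("2 + 3 * (4 - 1)")); // Ok(11)
}
```

Splitting the grammar into three levels is what encodes precedence: `Term` binds tighter than `Expr`, and `Factor` handles numbers and parenthesised sub-expressions.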
Incremental parsing is Sipha's core strength. It allows you to efficiently update your parse tree when code changes, rather than re-parsing everything from scratch.
In interactive applications like IDEs, users edit code frequently. Traditional parsers re-parse the entire file on every change, which can be slow for large files. Incremental parsing:
use sipha::incremental::{IncrementalParser, TextEdit};
use sipha::syntax::{TextRange, TextSize};

// `entry_point` is the grammar's start non-terminal (ArithNonTerminal::Expr in the example above).

// Initial parse
let result1 = parser.parse_incremental(&tokens, None, &[], entry_point);

// After an edit (e.g., user changed "42" to "100")
let edits = vec![TextEdit {
    // "42" occupies bytes 0..2 of the original input
    range: TextRange::new(
        TextSize::from(0),
        TextSize::from(2)
    ),
    new_text: "100".into(),
}];

// `new_tokens` is the token stream of the edited text.
// Incremental re-parse - only affected regions are re-parsed
let result2 = parser.parse_incremental(
    &new_tokens,
    Some(&result1.root),
    &edits,
    entry_point
);
The parser automatically:
Incremental parsing is fully implemented in version 0.5.0 with complete node reuse and cache management:
The parser automatically integrates reusable nodes from previous parses and queries the persistent cache during parsing, providing significant performance improvements for interactive editing scenarios.
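To illustrate the bookkeeping behind the edit in the example above (replacing `42` with `100`), here is a small std-only sketch. It applies an edit to the source text and maps old byte offsets to new ones, which is the general idea that lets unchanged regions be reused after an edit. The `Edit` struct and both functions are hypothetical stand-ins, not Sipha's `TextEdit` or its internal reuse logic.

```rust
use std::ops::Range;

/// Hypothetical stand-in for a text edit: replace the bytes in `range` with `new_text`.
struct Edit {
    range: Range<usize>,
    new_text: String,
}

/// Applies `edit` to `old` and returns the edited text.
/// Assumes `edit.range` lies within `old` on character boundaries.
fn apply_edit(old: &str, edit: &Edit) -> String {
    let mut new = String::with_capacity(old.len() - edit.range.len() + edit.new_text.len());
    new.push_str(&old[..edit.range.start]);
    new.push_str(&edit.new_text);
    new.push_str(&old[edit.range.end..]);
    new
}

/// Maps a byte offset in the old text to the corresponding offset in the new text.
/// Returns `None` for offsets inside the replaced range: whatever covered them
/// has changed and must be re-parsed.
fn shift_offset(offset: usize, edit: &Edit) -> Option<usize> {
    if offset < edit.range.start {
        Some(offset) // before the edit: unchanged
    } else if offset >= edit.range.end {
        // after the edit: shifted by the change in length
        Some(offset - edit.range.len() + edit.new_text.len())
    } else {
        None
    }
}

fn main() {
    let edit = Edit { range: 0..2, new_text: "100".to_string() };
    let new_text = apply_edit("42 + 10", &edit);
    println!("{new_text}"); // 100 + 10
    // The '+' sat at byte 3 in the old text and sits at byte 4 in the new text.
    println!("{:?}", shift_offset(3, &edit)); // Some(4)
}
```

Anything whose old range maps cleanly through `shift_offset` lies outside the edit and is a candidate for reuse; only the region that maps to `None` needs fresh work.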
Sipha includes a GLR (Generalized LR) parser backend for handling ambiguous grammars. GLR parsing extends LR parsing to handle non-deterministic grammars by maintaining multiple parser stacks and forking on conflicts.
Use the GLR backend when:
use sipha::backend::glr::{GlrParser, GlrConfig};
use sipha::backend::ParserBackend;

// Create GLR parser
let config = GlrConfig::default();
let parser = GlrParser::new(&grammar, config)
    .expect("Failed to create GLR parser");

// Parse with GLR - returns a parse forest for ambiguous results
let result = parser.parse(&tokens, entry_point);

// Handle ambiguity if present
if let Some(forest) = result.forest {
    // Multiple parse trees exist - disambiguate
    let disambiguated = forest.disambiguate(|alternatives| {
        // Custom disambiguation logic
        alternatives.first().cloned()
    });
}
GLR parsers can produce parse forests when multiple valid parse trees exist. Sipha provides several disambiguation strategies:
For more details, see the backend::glr module documentation.
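To show what a parse forest represents, the following std-only sketch enumerates every parse tree of the ambiguous grammar `E -> E '-' E | number` for a short input, then picks one alternative. It is a brute-force illustration of ambiguity, not a GLR implementation and not Sipha's forest type; `Tree` and `all_parses` are hypothetical names.

```rust
/// One parse tree of the ambiguous grammar `E -> E '-' E | number`.
#[derive(Debug, Clone, PartialEq)]
enum Tree {
    Num(i64),
    Sub(Box<Tree>, Box<Tree>),
}

impl Tree {
    /// Evaluates the tree; different trees for the same input can give different values.
    fn eval(&self) -> i64 {
        match self {
            Tree::Num(n) => *n,
            Tree::Sub(left, right) => left.eval() - right.eval(),
        }
    }
}

/// Returns every parse tree for `nums[0] - nums[1] - ...`.
/// Each '-' is tried as the root operator, rightmost first, so the
/// left-associative tree comes first in the result.
fn all_parses(nums: &[i64]) -> Vec<Tree> {
    if nums.len() == 1 {
        return vec![Tree::Num(nums[0])];
    }
    let mut forest = Vec::new();
    for split in (1..nums.len()).rev() {
        for left in all_parses(&nums[..split]) {
            for right in all_parses(&nums[split..]) {
                forest.push(Tree::Sub(Box::new(left.clone()), Box::new(right)));
            }
        }
    }
    forest
}

fn main() {
    // "1 - 2 - 3" has two parse trees: (1 - 2) - 3 and 1 - (2 - 3).
    let forest = all_parses(&[1, 2, 3]);
    for tree in &forest {
        println!("{:?} = {}", tree, tree.eval());
    }
    // Disambiguate by taking the first alternative, as in the closure above.
    let chosen = forest.first().cloned();
    println!("chosen: {:?}", chosen.map(|tree| tree.eval()));
}
```

The two trees evaluate to -4 and 2, which is why a disambiguation rule (here, "take the first alternative", which is the left-associative one in this sketch) matters for more than tree shape.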
Sipha supports multiple parsing algorithms via feature flags:
LL(k) Parser (backend-ll): Top-down predictive parsing with configurable lookahead
LR Parser (backend-lr): Bottom-up shift-reduce parsing
GLR Parser (backend-glr): Generalized LR parsing for ambiguous grammars
Planned: PEG, Packrat, Earley parsers
Sipha uses an immutable green/red tree representation:
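As a rough illustration of the general green/red idea, the sketch below uses only the standard library. Green nodes store widths but no positions, so they can be shared between trees; red nodes wrap a green node together with an absolute offset computed on demand. This is a sketch of the technique under those assumptions, not Sipha's `GreenNode` or `SyntaxNode` types, and the names `Green`, `Red`, and `child_offsets` are hypothetical.

```rust
use std::rc::Rc;

/// Green element: immutable and position-independent, so it can be shared.
#[derive(Debug)]
enum Green {
    Token { text: String },
    Node { children: Vec<Rc<Green>>, width: usize },
}

impl Green {
    fn token(text: &str) -> Rc<Green> {
        Rc::new(Green::Token { text: text.to_string() })
    }

    /// Builds an interior node, caching the total width of its children.
    fn node(children: Vec<Rc<Green>>) -> Rc<Green> {
        let width = children.iter().map(|child| child.width()).sum();
        Rc::new(Green::Node { children, width })
    }

    fn width(&self) -> usize {
        match self {
            Green::Token { text } => text.len(),
            Green::Node { width, .. } => *width,
        }
    }
}

/// Red node: a green element plus its absolute offset in the source text.
struct Red {
    green: Rc<Green>,
    offset: usize,
}

impl Red {
    fn root(green: Rc<Green>) -> Red {
        Red { green, offset: 0 }
    }

    /// Creates red children lazily, accumulating offsets from the green widths.
    fn children(&self) -> Vec<Red> {
        let mut offset = self.offset;
        let mut out = Vec::new();
        if let Green::Node { children, .. } = &*self.green {
            for child in children {
                out.push(Red { green: Rc::clone(child), offset });
                offset += child.width();
            }
        }
        out
    }
}

/// Absolute offsets of the direct children of `root`.
fn child_offsets(root: &Red) -> Vec<usize> {
    root.children().iter().map(|child| child.offset).collect()
}

fn main() {
    let plus = Green::token("+");
    let ten = Green::token("10");
    // Tree for "42+10", then a tree for "100+10" that reuses the "+" and "10" tokens.
    let old = Green::node(vec![Green::token("42"), Rc::clone(&plus), Rc::clone(&ten)]);
    let new = Green::node(vec![Green::token("100"), Rc::clone(&plus), Rc::clone(&ten)]);
    println!("{:?}", child_offsets(&Red::root(old))); // [0, 2, 3]
    println!("{:?}", child_offsets(&Red::root(new))); // [0, 3, 4]
}
```

Because the shared `+` and `10` tokens carry no position, the same allocations appear in both trees at different offsets, which is the property that makes this representation a natural fit for node reuse.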
Sipha provides configurable error recovery strategies:
The repository includes several examples to help you get started:
Run examples with:
cargo run --example basic_arithmetic
cargo run --example simple_calculator
syntax: Syntax tree types (green/red trees, nodes, tokens)
grammar: Grammar definition and validation
lexer: Tokenization and lexing
parser: Parser traits and interfaces
backend: Parser backend implementations (LL, LR, etc.)
error: Error types and diagnostics
incremental: Incremental parsing support

SyntaxKind: Trait for syntax node kinds
Token: Trait for tokens
NonTerminal: Trait for grammar non-terminals
GrammarBuilder: Build grammars declaratively
LexerBuilder: Build lexers with pattern matching
ParserBackend: Unified interface for all parser backends
SyntaxNode: Red tree node for traversal
GreenNode: Green tree node for storage

Incremental parsing provides significant performance benefits:
For batch parsing (non-interactive), Sipha is competitive with other Rust parsing libraries. The incremental parsing capabilities make it particularly well-suited for language servers and editors.
| Feature | Sipha | pest | nom | lalrpop |
|---|---|---|---|---|
| Incremental parsing | β | β | β | β |
| Multiple backends | β | β | β | β |
| Syntax trees | β | β | β | β |
| Error recovery | β | β | β | β |
| Grammar DSL | β | β | β | β |
| Zero-copy | Partial | β | β | β |
When to use Sipha:
When to consider alternatives:
Sipha continues to evolve. Planned features for future releases include:
Contributions are welcome! Sipha 0.5.0 provides a solid foundation, and there are many opportunities to help:
# Clone the repository
git clone https://github.com/yourusername/sipha.git
cd sipha
# Run tests
cargo test
# Run examples
cargo run --example basic_arithmetic
# Build documentation
cargo doc --open
This project is licensed under the MIT License - see the LICENSE file for details.
Sipha is inspired by:
Note: Sipha 0.5.0 provides a complete incremental parsing solution. The core API is stable, and we continue to add features and improvements based on user feedback.