| Crates.io | rexile |
| lib.rs | rexile |
| version | 0.4.6 |
| created_at | 2026-01-23 06:05:07.51379+00 |
| updated_at | 2026-01-25 16:39:48.591328+00 |
| description | A blazing-fast regex engine with 10-100x faster compilation and competitive matching performance - now with dot wildcard, non-greedy quantifiers, DOTALL mode, and non-capturing groups |
| homepage | https://github.com/KSD-CO/rexile |
| repository | https://github.com/KSD-CO/rexile |
| max_upload_size | |
| id | 2063577 |
| size | 656,750 |
A blazing-fast regex engine with 10-100x faster compilation speed
ReXile is a lightweight regex alternative that achieves exceptional compilation speed while maintaining competitive matching performance:
- `.`, `.*`, `.+` implementation with backtracking
- `memchr` and `aho-corasick` for SIMD primitives

Key Features:

- Quantifiers (`*`, `+`, `?`)
- Non-greedy quantifiers (`*?`, `+?`, `??`) - NEW in v0.2.1
- Dot wildcard (`.`, `.*`, `.+`) with backtracking
- DOTALL mode (`(?s)`) - Dot matches newlines - NEW in v0.2.1
- Non-capturing groups (`(?:...)`) with alternations - NEW in v0.2.1
- Escape sequences (`\d`, `\w`, `\s`, etc.)
- Word boundaries (`\b`, `\B`)
- Anchors (`^`, `$`)

ReXile is a high-performance regex engine optimized for fast compilation:

- regex crate
- `memchr` + `aho-corasick` for SIMD primitives

Compilation Speed (vs regex crate):

- `[a-zA-Z_]\w*`: 104.7x faster
- `\d+`: 46.5x faster
- `(\w+)\s*(>=|<=|==|!=|>|<)\s*(.+)`: 40.7x faster
- `.*test.*`: 15.3x faster

Matching Speed:

- Simple patterns (`\d+`, `\w+`): 1.4-1.9x faster ✅
Use Case Example (Load 1000 GRL rules):
Memory Comparison:
When to Use ReXile:
use rexile::Pattern;
// Literal matching with SIMD acceleration
let pattern = Pattern::new("hello").unwrap();
assert!(pattern.is_match("hello world"));
assert_eq!(pattern.find("say hello"), Some((4, 9)));
// Multi-pattern matching (aho-corasick fast path)
let multi = Pattern::new("foo|bar|baz").unwrap();
assert!(multi.is_match("the bar is open"));
// Dot wildcard matching (with backtracking)
let dot = Pattern::new("a.c").unwrap();
assert!(dot.is_match("abc")); // . matches 'b'
assert!(dot.is_match("a_c")); // . matches '_'
// Greedy quantifiers with dot
let greedy = Pattern::new("a.*c").unwrap();
assert!(greedy.is_match("abc")); // .* matches 'b'
assert!(greedy.is_match("a12345c")); // .* matches '12345'
let plus = Pattern::new("a.+c").unwrap();
assert!(plus.is_match("abc")); // .+ matches 'b' (requires at least one char)
assert!(!plus.is_match("ac")); // .+ needs at least 1 character
// Non-greedy quantifiers (NEW in v0.2.1)
let lazy = Pattern::new(r"start\{.*?\}").unwrap();
assert_eq!(lazy.find("start{abc}end{xyz}"), Some((0, 10))); // Matches "start{abc}", not greedy
// DOTALL mode - dot matches newlines (NEW in v0.2.1)
let dotall = Pattern::new(r"(?s)rule\s+.*?\}").unwrap();
let multiline = "rule test {\n content\n}";
assert!(dotall.is_match(multiline)); // (?s) makes .* match across newlines
// Non-capturing groups with alternation (NEW in v0.2.1)
let group = Pattern::new(r#"(?:"test"|foo)"#).unwrap();
assert!(group.is_match("\"test\"")); // Matches quoted "test"
assert!(group.is_match("foo")); // Or matches foo
// Digit matching (DigitRun fast path - 1.4-1.9x faster than regex!)
let digits = Pattern::new("\\d+").unwrap();
let matches = digits.find_all("Order #12345 costs $67.89");
// Returns: [(7, 12), (20, 22), (23, 25)]
// Identifier matching (IdentifierRun fast path)
let ident = Pattern::new("[a-zA-Z_]\\w*").unwrap();
assert!(ident.is_match("variable_name_123"));
// Quoted strings (QuotedString fast path - 1.4-1.9x faster!)
let quoted = Pattern::new("\"[^\"]+\"").unwrap();
assert!(quoted.is_match("say \"hello world\""));
// Word boundaries
let word = Pattern::new("\\btest\\b").unwrap();
assert!(word.is_match("this is a test"));
assert!(!word.is_match("testing"));
// Anchors
let exact = Pattern::new("^hello$").unwrap();
assert!(exact.is_match("hello"));
assert!(!exact.is_match("hello world"));
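The word-boundary behaviour in the examples above can be reasoned about with a small standalone sketch. This is not ReXile's implementation: `is_word_byte`, `is_word_boundary`, and `find_whole_word` are hypothetical helpers, and the sketch assumes ASCII-only word characters (letters, digits, underscore).

```rust
/// ASCII word character: letter, digit, or underscore (assumed definition of `\w`).
fn is_word_byte(b: u8) -> bool {
    b.is_ascii_alphanumeric() || b == b'_'
}

/// True if offset `i` sits between a word byte and a non-word byte,
/// treating the start and end of the text as non-word.
fn is_word_boundary(text: &[u8], i: usize) -> bool {
    let before = i > 0 && is_word_byte(text[i - 1]);
    let after = i < text.len() && is_word_byte(text[i]);
    before != after
}

/// Naive whole-word search for a literal, roughly what `\bword\b` asks for.
fn find_whole_word(text: &str, word: &str) -> Option<(usize, usize)> {
    let bytes = text.as_bytes();
    text.match_indices(word)
        .map(|(start, m)| (start, start + m.len()))
        .find(|&(s, e)| is_word_boundary(bytes, s) && is_word_boundary(bytes, e))
}

fn main() {
    // "test" stands alone here, so both ends are boundaries.
    assert_eq!(find_whole_word("this is a test", "test"), Some((10, 14)));
    // In "testing" the end of "test" is followed by a word byte, so no match.
    assert_eq!(find_whole_word("testing", "test"), None);
}
```

A boundary is simply a position where "is a word byte" changes value, which is why `testing` fails: the byte after `test` is still a word byte.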
For patterns used repeatedly in hot loops:
use rexile;
// Automatically cached - compile once, reuse forever
assert!(rexile::is_match("test", "this is a test").unwrap());
assert_eq!(rexile::find("world", "hello world").unwrap(), Some((6, 11)));
// Perfect for parsers and lexers
let log_lines = ["INFO: started", "ERROR: disk full"];
for line in log_lines {
    if rexile::is_match("ERROR", line).unwrap() {
        // handle error
    }
}
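To illustrate what "compile once, reuse" can look like, here is a minimal std-only sketch of a process-wide pattern cache. It is an assumption about the general technique, not ReXile's actual internals: `Compiled`, `cache`, `get_or_compile`, and `cached_is_match` are hypothetical names, and literal substring search stands in for real matching.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex, OnceLock};

/// Stand-in for a compiled pattern (hypothetical; not ReXile's real type).
#[derive(Debug)]
struct Compiled {
    source: String,
}

impl Compiled {
    fn compile(source: &str) -> Compiled {
        // A real engine would parse the pattern here; the sketch only stores it.
        Compiled { source: source.to_string() }
    }

    fn is_match(&self, text: &str) -> bool {
        // Literal substring search stands in for real matching.
        text.contains(&self.source)
    }
}

/// Process-wide cache keyed by pattern source.
fn cache() -> &'static Mutex<HashMap<String, Arc<Compiled>>> {
    static CACHE: OnceLock<Mutex<HashMap<String, Arc<Compiled>>>> = OnceLock::new();
    CACHE.get_or_init(|| Mutex::new(HashMap::new()))
}

/// Returns the cached pattern, compiling it only on first use.
fn get_or_compile(source: &str) -> Arc<Compiled> {
    let mut map = cache().lock().unwrap();
    map.entry(source.to_string())
        .or_insert_with(|| Arc::new(Compiled::compile(source)))
        .clone()
}

fn cached_is_match(source: &str, text: &str) -> bool {
    get_or_compile(source).is_match(text)
}

fn main() {
    assert!(cached_is_match("ERROR", "ERROR: disk full"));
    // The second lookup returns the same Arc instead of compiling again.
    let a = get_or_compile("ERROR");
    let b = get_or_compile("ERROR");
    assert!(Arc::ptr_eq(&a, &b));
}
```

With this shape, the first call pays the compile cost and later calls pay only a hash lookup and a lock, which is the trade-off the cached API is describing.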
ReXile uses JIT-style specialized implementations for common patterns:
| Fast Path | Pattern Example | Performance vs regex |
|---|---|---|
| Literal | `"hello"` | Competitive (SIMD) |
| LiteralPlusWhitespace | `"rule "` | Competitive |
| DigitRun | `\d+` | 1.4-1.9x faster ✨ |
| IdentifierRun | `[a-zA-Z_]\w*` | 104.7x faster compilation |
| QuotedString | `"[^"]+"` | 1.4-1.9x faster ✨ |
| WordRun | `\w+` | Competitive |
| DotWildcard | `.`, `.*`, `.+` | With backtracking |
| Alternation | `foo\|bar\|baz` | 2x slower (acceptable) |
| LiteralWhitespaceQuoted | Complex | Competitive |
| LiteralWhitespaceDigits | Complex | Competitive |
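A rough sketch of how fast-path selection could work is shown below. It is deliberately simplified and is an assumption, not ReXile's code: it inspects the pattern string directly rather than an AST, covers only a few rows of the table, and `FastPath`, `is_plain_literal`, and `detect` are hypothetical names.

```rust
/// Hypothetical fast-path tags, named after rows of the fast-path table.
#[derive(Debug, PartialEq)]
enum FastPath {
    Literal,
    DigitRun,
    WordRun,
    IdentifierRun,
    Alternation,
    General, // no specialization: fall back to the general matcher
}

/// True if the pattern is non-empty and contains no regex metacharacters.
fn is_plain_literal(p: &str) -> bool {
    !p.is_empty() && !p.chars().any(|c| r"\.^$*+?()[]{}|".contains(c))
}

/// Picks a specialized matcher for well-known pattern shapes.
fn detect(pattern: &str) -> FastPath {
    match pattern {
        r"\d+" => FastPath::DigitRun,
        r"\w+" => FastPath::WordRun,
        r"[a-zA-Z_]\w*" => FastPath::IdentifierRun,
        p if is_plain_literal(p) => FastPath::Literal,
        // Two or more plain literals separated by `|`.
        p if p.split('|').count() > 1 && p.split('|').all(is_plain_literal) => {
            FastPath::Alternation
        }
        _ => FastPath::General,
    }
}

fn main() {
    assert_eq!(detect(r"\d+"), FastPath::DigitRun);
    assert_eq!(detect("hello"), FastPath::Literal);
    assert_eq!(detect("foo|bar|baz"), FastPath::Alternation);
    assert_eq!(detect("a.*c"), FastPath::General);
}
```

Because detection is a handful of cheap checks, this style of dispatch adds very little to compilation time, which fits the compile-speed focus described here.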
| Feature | Example | Status |
|---|---|---|
| Literal strings | `hello`, `world` | ✅ Supported |
| Alternation | `foo\|bar\|baz` | ✅ Supported (aho-corasick) |
| Start anchor | `^start` | ✅ Supported |
| End anchor | `end$` | ✅ Supported |
| Exact match | `^exact$` | ✅ Supported |
| Character classes | `[a-z]`, `[0-9]`, `[^abc]` | ✅ Supported |
| Quantifiers | `*`, `+`, `?` | ✅ Supported |
| Non-greedy quantifiers | `.*?`, `+?`, `??` | ✅ Supported (v0.2.1) |
| Dot wildcard | `.`, `.*`, `.+` | ✅ Supported (v0.2.0) |
| DOTALL mode | `(?s)` - dot matches newlines | ✅ Supported (v0.2.1) |
| Escape sequences | `\d`, `\w`, `\s`, `\.`, `\n`, `\t` | ✅ Supported |
| Sequences | `ab+c*`, `\d+\w*` | ✅ Supported |
| Non-capturing groups | `(?:abc\|def)` | ✅ Supported (v0.2.1) |
| Capturing groups | Extract `(group)` | ✅ Supported (v0.2.0) |
| Word boundaries | `\b`, `\B` | ✅ Supported |
| Bounded quantifiers | `{n}`, `{n,m}` | 🚧 Planned |
| Lookahead/lookbehind | `(?=...)`, `(?<=...)` | 🚧 Planned |
| Backreferences | `\1`, `\2` | 🚧 Planned |
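Until bounded quantifiers land, a pattern such as `\d{2,4}` can in principle be rewritten by hand using only repetition and `?`, since `x{min,max}` is equivalent to `min` copies of `x` followed by `max - min` optional copies. The helper below is a hypothetical sketch of that rewrite (`expand_bounded` is not part of ReXile), and it assumes the engine accepts `?` after the atom in question, which is worth checking against your patterns.

```rust
/// Expands `atom{min,max}` into `min` required copies of `atom`
/// followed by `max - min` optional copies.
/// `atom` must be a single atom such as `a`, `\d`, or `[a-z]`.
fn expand_bounded(atom: &str, min: usize, max: usize) -> String {
    assert!(min <= max, "min must not exceed max");
    let mut out = atom.repeat(min);
    for _ in min..max {
        out.push_str(atom);
        out.push('?');
    }
    out
}

fn main() {
    // \d{2,4} becomes two required digits and two optional ones.
    assert_eq!(expand_bounded(r"\d", 2, 4), r"\d\d\d?\d?");
    // x{3} is just three copies.
    assert_eq!(expand_bounded("x", 3, 3), "xxx");
}
```

The expansion grows linearly with `max`, so it is only practical for small bounds.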
Pattern Compilation Benchmark (vs regex crate):
| Pattern | rexile | regex | Speedup |
|---|---|---|---|
| `[a-zA-Z_]\w*` | 95.2 ns | 9.97 µs | 104.7x faster |
| `\d+` | 86.7 ns | 4.03 µs | 46.5x faster |
| `(\w+)\s*(>=\|<=\|==\|!=\|>\|<)\s*(.+)` | 471 ns | 19.2 µs | 40.7x faster |
| `.*test.*` | 148 ns | 2.27 µs | 15.3x faster |
Overall: 10-100x faster compilation, which suits dynamically generated patterns.
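The speedup column is the `regex` compile time divided by the `rexile` compile time. The short sketch below recomputes it from the rounded timings in the benchmark table (`speedup` is a helper introduced here for illustration). Because the inputs are already rounded, a recomputed value can differ from the published one in the last digit, as happens for the third row.

```rust
/// Speedup = baseline time / candidate time, both in nanoseconds.
fn speedup(regex_ns: f64, rexile_ns: f64) -> f64 {
    regex_ns / rexile_ns
}

fn main() {
    // (pattern, rexile ns, regex ns), with microseconds converted to nanoseconds.
    let rows = [
        (r"[a-zA-Z_]\w*", 95.2, 9_970.0),
        (r"\d+", 86.7, 4_030.0),
        (r"(\w+)\s*(>=|<=|==|!=|>|<)\s*(.+)", 471.0, 19_200.0),
        (r".*test.*", 148.0, 2_270.0),
    ];
    for (pattern, rexile_ns, regex_ns) in rows {
        println!("{pattern}: {:.1}x", speedup(regex_ns, rexile_ns));
    }
}
```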
Simple Patterns (Fast paths):
- `\d+` on "12345": 1.4-1.9x faster ✅
- `\w+` on "variable": 1.4-1.9x faster ✅
- `"[^"]+"` on quoted strings: Competitive ✅

Complex Patterns (Backtracking):

- `a.+c` on "abc": 2-5x slower (acceptable)
- `.*test.*` on long strings: 2-10x slower (acceptable)

Loading 1000 GRL Rules:
Test 1: Pattern Compilation (10 patterns):
Test 2: Search Operations (5 patterns × 139KB corpus):
Test 3: Stress Test (50 patterns × 500KB corpus):
- ✅ Simple patterns (`\d+`, `\w+`) - 1.4-1.9x faster matching
- ✅ Fast compilation - 10-100x faster pattern compilation (huge win!)
- ✅ Identifiers (`[a-zA-Z_]\w*`) - 104.7x faster compilation
- ✅ Memory efficiency - 15x less for compilation, 5x less peak
- ✅ Instant startup - Load 1000 patterns in 0.02s vs 2s (100x faster)
- ✅ Dot wildcards - Full `.`, `.*`, `.+` support with backtracking
- ⚠️ Complex patterns with backtracking - ReXile 2-10x slower (acceptable trade-off)
- ⚠️ Alternations (`when|then`) - ReXile 2x slower
- ⚠️ Hot-path matching - For performance-critical matching, regex may be better
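To see why dot-wildcard patterns cost more, consider a minimal greedy backtracking matcher. This is a textbook-style sketch, not ReXile's matcher: `match_here` and `find` are hypothetical names, and only literal bytes, `.`, and `*` / `+` after a single atom are supported. As in the default (non-DOTALL) mode, `.` does not match a newline.

```rust
/// Matches `pat` at the start of `text` and returns the match length.
/// Supports literal bytes, `.`, and greedy `*` / `+` after a single atom.
fn match_here(pat: &[u8], text: &[u8]) -> Option<usize> {
    let Some(&atom) = pat.first() else {
        return Some(0); // empty pattern matches the empty string
    };
    let atom_matches = |b: u8| (atom == b'.' && b != b'\n') || atom == b;
    match pat.get(1) {
        Some(&q) if q == b'*' || q == b'+' => {
            let min = if q == b'+' { 1 } else { 0 };
            // Greedy: consume as many bytes as the atom allows...
            let mut n = text.iter().take_while(|&&b| atom_matches(b)).count();
            // ...then give bytes back one at a time until the rest matches.
            loop {
                if n < min {
                    return None;
                }
                if let Some(rest) = match_here(&pat[2..], &text[n..]) {
                    return Some(n + rest);
                }
                if n == 0 {
                    return None;
                }
                n -= 1;
            }
        }
        _ => match text.first() {
            Some(&b) if atom_matches(b) => {
                match_here(&pat[1..], &text[1..]).map(|r| r + 1)
            }
            _ => None,
        },
    }
}

/// Unanchored search: try every start offset and return (start, end).
fn find(pat: &str, text: &str) -> Option<(usize, usize)> {
    let (p, t) = (pat.as_bytes(), text.as_bytes());
    (0..=t.len()).find_map(|s| match_here(p, &t[s..]).map(|len| (s, s + len)))
}

fn main() {
    assert_eq!(find("a.*c", "abc"), Some((0, 3)));
    assert_eq!(find("a.+c", "ac"), None); // `.+` needs at least one byte
    assert_eq!(find("a.*c", "xa12345c!"), Some((1, 8)));
}
```

Each time the tail of the pattern fails, the matcher retries with one byte fewer, so the same input bytes can be examined many times. That repeated work is the kind of cost behind the slower numbers for backtracking patterns.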
Pattern → Parser → AST → Fast Path Detection → Specialized Matcher
    ↓
    DigitRun (memchr SIMD scanning)
    IdentifierRun (direct byte scanning)
    QuotedString (memchr + validation)
    Alternation (aho-corasick automaton)
    Literal (memchr SIMD)
    ... 5 more fast paths
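As a concrete picture of what a specialized matcher does, here is a sketch of a digit-run scanner in plain Rust. It is an illustration under assumptions: `find_digit_runs` is a hypothetical function, and it uses a simple byte loop rather than the `memchr` SIMD scanning named above. The point is that `\d+` needs no general regex machinery at all.

```rust
/// Finds every maximal run of ASCII digits as (start, end) byte offsets.
fn find_digit_runs(text: &str) -> Vec<(usize, usize)> {
    let bytes = text.as_bytes();
    let mut runs = Vec::new();
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i].is_ascii_digit() {
            let start = i;
            // Extend the run while digits continue.
            while i < bytes.len() && bytes[i].is_ascii_digit() {
                i += 1;
            }
            runs.push((start, i));
        } else {
            i += 1;
        }
    }
    runs
}

fn main() {
    let runs = find_digit_runs("Order #12345 costs $67.89");
    assert_eq!(runs, vec![(7, 12), (20, 22), (23, 25)]);
}
```

Every byte is visited once and nothing is ever re-scanned, in contrast to the backtracking needed for dot wildcards.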
Run benchmarks yourself:
cargo run --release --example per_file_grl_benchmark
cargo run --release --example memory_comparison
Add to your Cargo.toml:
[dependencies]
rexile = "0.4"
let p = Pattern::new("needle").unwrap();
assert!(p.is_match("needle in a haystack"));
assert_eq!(p.find("where is the needle?"), Some((13, 19)));
// Find all occurrences
let matches = p.find_all("needle and needle");
assert_eq!(matches, vec![(0, 6), (11, 17)]);
// Fast multi-pattern search using aho-corasick
let keywords = Pattern::new("import|export|function|class").unwrap();
assert!(keywords.is_match("export default function"));
// Must start with pattern
let starts = Pattern::new("^Hello").unwrap();
assert!(starts.is_match("Hello World"));
assert!(!starts.is_match("Say Hello"));
// Must end with pattern
let ends = Pattern::new("World$").unwrap();
assert!(ends.is_match("Hello World"));
assert!(!ends.is_match("World Peace"));
// Exact match
let exact = Pattern::new("^exact$").unwrap();
assert!(exact.is_match("exact"));
assert!(!exact.is_match("not exact"));
// First call compiles and caches
rexile::is_match("keyword", "find keyword here").unwrap();
// Subsequent calls reuse cached pattern (zero compile cost)
rexile::is_match("keyword", "another keyword").unwrap();
rexile::is_match("keyword", "more keyword text").unwrap();
More examples: See the examples/ directory for:

- basic_usage.rs - Core API walkthrough
- log_processing.rs - Log analysis patterns
- performance.rs - Performance comparison

Run examples with:
cargo run --example basic_usage
cargo run --example log_processing
ReXile is production-ready for:
- `\d+`, `\w+` (1.4-1.9x faster matching)
- `[a-zA-Z_]\w*` (104.7x faster compilation!)
- `.`, `.*`, `.+` with proper backtracking
- `rule\s+`, `function\s+`
- `\p{L}` - not yet supported

Contributions welcome! ReXile is actively maintained and evolving.

Current focus:

- Dot wildcard (`.`, `.*`, `.+`) with backtracking - v0.2.0
- Non-greedy quantifiers (`.*?`, `+?`, `??`) - v0.2.1
- DOTALL mode (`(?s)`) for multiline matching - v0.2.1
- Non-capturing groups (`(?:...)`) with alternations - v0.2.1
- `{n,m}`, lookahead, Unicode support

How to contribute:

- `cargo test`
- `cargo run --release --example per_file_grl_benchmark`

Priority areas:

- Bounded quantifiers (`{n}`, `{n,m}`)

Licensed under either of:
at your option.
Built on top of:
- memchr by Andrew Gallant - SIMD-accelerated substring search
- aho-corasick by Andrew Gallant - Multi-pattern matching automaton

Developed for the rust-rule-engine project, providing fast pattern matching for GRL (Grule Rule Language) parsing and business rule evaluation.
Performance Philosophy: ReXile achieves competitive performance through intelligent specialization rather than complex JIT compilation:
Status: ✅ Production Ready (v0.2.1)

- ✅ Compilation Speed: 10-100x faster than regex crate
- ✅ Matching Speed: 1.4-1.9x faster on simple patterns
- ✅ Memory: 15x less compilation, 5x less peak
- ✅ Features: Core regex + dot wildcard + capturing groups + non-greedy + DOTALL + non-capturing groups
- ✅ Testing: 84 unit tests + 13 group integration tests passing
- ✅ Real-world validated: GRL parsing, rule engines, DSL compilers