Crates.io | rulex |
lib.rs | rulex |
version | 0.4.4 |
source | src |
created_at | 2022-03-11 15:42:35.89039 |
updated_at | 2022-08-24 19:23:11.422444 |
description | DEPRECATED: Use pomsky instead. A new regular expression language |
homepage | https://pomsky-lang.org |
repository | https://github.com/rulex-rs/pomsky |
max_upload_size | |
id | 548261 |
size | 247,924 |
⚠️ DEPRECATED ⚠️ Use the pomsky crate instead. Rulex was renamed to pomsky.
A new, portable, regular expression language. Read the book to get started!
On the left are rulex expressions (rulexes for short), on the right is the compiled regex:
# String
'hello world' # hello world
# Greedy repetition
'hello'{1,5} # (?:hello){1,5}
'hello'* # (?:hello)*
'hello'+ # (?:hello)+
# Lazy repetition
'hello'{1,5} lazy # (?:hello){1,5}?
'hello'* lazy # (?:hello)*?
'hello'+ lazy # (?:hello)+?
# Alternation
'hello' | 'world' # hello|world
# Character classes
['aeiou'] # [aeiou]
['p'-'s'] # [p-s]
# Named character classes
[.] [w] [s] [n] # .\w\s\n
# Combined
[w 'a' 't'-'z' U+15] # [\wat-z\x15]
# Negated character classes
!['a' 't'-'z'] # [^at-z]
# Unicode
[Greek] U+30F Grapheme # \p{Greek}\u030F\X
# Boundaries
<% %> # ^$
% 'hello' !% # \bhello\B
# Non-capturing groups
'terri' ('fic' | 'ble') # terri(?:fic|ble)
# Capturing groups
:('test') # (test)
:name('test') # (?P<name>test)
# Lookahead/lookbehind
>> 'foo' | 'bar' # (?=foo|bar)
<< 'foo' | 'bar' # (?<=foo|bar)
!>> 'foo' | 'bar' # (?!foo|bar)
!<< 'foo' | 'bar' # (?<!foo|bar)
# Backreferences
:('test') ::1 # (test)\1
:name('test') ::name # (?P<name>test)\1
# Ranges
range '0'-'999' # 0|[1-9][0-9]{0,2}
range '0'-'255' # 0|1[0-9]{0,2}|2(?:[0-4][0-9]?|5[0-5]?|[6-9])?|[3-9][0-9]?
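The two range regexes above can be checked against a real engine. The following is a minimal sketch, assuming Python's standard `re` module as the target engine. The pattern strings are copied verbatim from the right-hand column. The helper `matched_numbers` is a hypothetical name made up for this illustration and is not part of rulex.

```python
import re

# Regexes copied from the right-hand column of the "Ranges" examples.
RANGE_0_999 = r"0|[1-9][0-9]{0,2}"
RANGE_0_255 = r"0|1[0-9]{0,2}|2(?:[0-4][0-9]?|5[0-5]?|[6-9])?|[3-9][0-9]?"


def matched_numbers(pattern, upper=2000):
    """Return every n in [0, upper) whose decimal form fully matches pattern.

    Hypothetical helper for this illustration; not part of rulex.
    """
    compiled = re.compile(pattern)
    return [n for n in range(upper) if compiled.fullmatch(str(n))]


if __name__ == "__main__":
    # Each comparison checks that the regex accepts exactly the intended range.
    print(matched_numbers(RANGE_0_255) == list(range(256)))
    print(matched_numbers(RANGE_0_999) == list(range(1000)))
```

Using `fullmatch` rather than `search` matters here: the patterns are unanchored alternations, so a plain search would find `25` inside `256`.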
# Variables
let operator = '+' | '-' | '*' | '/';
let number = '-'? [digit]+;
number (operator number)*
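To see what the variables buy you, here is a rough sketch of the same composition done by hand in Python. The regex fragments are written manually for illustration and are not the output rulex emits. The sketch assumes that `[digit]` corresponds to `\d`. The names `OPERATOR`, `NUMBER`, `EXPRESSION` and `is_expression` are hypothetical.

```python
import re

# Hand-written counterparts of the two `let` bindings; an illustration only,
# not the regex that rulex itself produces.
OPERATOR = r"(?:\+|-|\*|/)"  # let operator = '+' | '-' | '*' | '/';
NUMBER = r"(?:-?\d+)"        # let number = '-'? [digit]+;  (assumes [digit] ~ \d)

# number (operator number)*
EXPRESSION = re.compile(f"{NUMBER}(?:{OPERATOR}{NUMBER})*")


def is_expression(text):
    """True if the whole string is a number followed by operator/number pairs."""
    return EXPRESSION.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in ["12", "1+2*-3", "1+", "1 + 2"]:
        print(candidate, is_expression(candidate))
```

Each fragment is wrapped in a non-capturing group so it can be substituted safely wherever the variable is used. The pattern contains no whitespace, so an input such as `1 + 2` is rejected.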
Read the book to get started, or check out the CLI program, the Rust library and the procedural macro.
Normal regexes are very concise, but when they get longer, they get increasingly difficult to
understand. By default, they don't have comments, and whitespace is significant. Then there's the
plethora of sigils and backslash escapes that follow no discernible system:
(?<=) (?P<>) .?? \N \p{} \k<> \g''
and so on. And with various inconsistencies between regex
implementations, it's the perfect recipe for confusion.
Rulex solves these problems with a new, simpler but also more powerful syntax.
Rulex is currently compatible with PCRE, JavaScript, Java, .NET, Python, Ruby and Rust. The regex flavor must be specified during compilation, so rulex can ensure that the produced regex works as desired on the targeted regex engine.
Important note for JavaScript users: Don't forget to enable the u flag. This is required for Unicode support. All other major regex engines support Unicode by default.
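As a point of comparison, Python's `re` module is one engine that is Unicode-aware by default for `str` patterns. The sketch below only illustrates that default for Python and says nothing about how rulex compiles for other flavors. The helper `is_word` is a hypothetical name used for this example.

```python
import re

WORD = r"\w+"


def is_word(text, ascii_only=False):
    """True if text consists solely of word characters.

    With ascii_only=True the re.ASCII flag restricts \\w to [a-zA-Z0-9_].
    """
    flags = re.ASCII if ascii_only else 0
    return re.fullmatch(WORD, text, flags) is not None


if __name__ == "__main__":
    greek = "\u03b1\u03b2\u03b3"  # the three Greek letters alpha, beta, gamma
    print(is_word(greek))
    print(is_word(greek, ascii_only=True))
```

Without any flag, `\w` accepts the Greek letters. Opting out of Unicode needs the explicit `re.ASCII` flag, which is the reverse of JavaScript, where Unicode support is opted into with the u flag.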
Rulex looks for mistakes and displays helpful diagnostics.
You can find the Roadmap here.
You can contribute by using rulex and providing feedback. If you find a bug or have a question, please create an issue.
I also gladly accept code contributions. To make sure that CI succeeds, please run cargo fmt, cargo clippy and cargo test before creating a pull request.
Dual-licensed under the MIT license or the Apache 2.0 license.