# temex ## `temex` - Regex-like temporal expressions for evaluating systems that change over time This library implements *temexes*, which are structures that combine formulas of propositional logic with regular expression operators to provide a temporal logic designed for simplicity and ease of use. ## Table of contents 1. [Introduction](#introduction) 2. [Overview](#overview) 3. [Temex syntax](#temex-syntax) * [Propositional formula syntax](#propositional-formula-syntax) * [Regex operator syntax](#regex-operator-syntax) 4. [Supported trace formats](#supported-trace-formats) 5. [Usage](#usage) 6. [Examples](#examples) * [Example 1: Testing for a match in a polars DataFrame using `is_match`](#example-1-testing-for-a-match-in-a-polars-dataframe-using-is_match) * [Example 2: Finding the location of a match using `find`](#example-2-finding-the-location-of-a-match-using-find) * [Example 3: Find all nonoverlapping matches using `find_iter`](#example-3-find-all-nonoverlapping-matches-using-find_iter) 7. [License](#license) --- ## Introduction Temporal logic has useful applications in a wide range of contexts, in and outside of computer science. A detailed examination is beyond the scope of this document, but a good introduction is available [here](https://plato.stanford.edu/entries/logic-temporal/). Although temporal logic has many use cases, most tools that implement some form of temporal logic are designed for formal methods specialists and logicians, not for a general audience. As a result, they can seem arcane and unapproachable to a layman. The purpose of `temex` is to provide a simple tool that allows generalist software engineers and data analysts to harness the power of temporal logic. It is designed to be easy to understand and use, requiring only a basic knowledge of propositional logic and regular expressions. ## Overview A *truth assignment* is a mapping of boolean variables to truth values (i.e., `true` or `false`). When all the boolean variables that occur in a formula of propositional logic are assigned a value, then that formula evaluates to either `true` or `false`. We can think of a formula as defining a set of truth assignments: namely, the truth assignments that make that formula evaluate to `true`. A regular expression is either a character, or it is a composite expression in which subexpressions are joined by regex operators. A regular expression defines a set of sequences of characters: those sequences that match the regular expression. We can think of truth assignments as characters, and we can think of a propositional formula as a regular expression in which the truth assignments it defines are joined by alternation (i.e., the regex operator `|`). So, for instance, the formula `not (p and q)` can be thought of as `{p = true, q = false} | {p = false, q = true} | {p = false, q = false}`. If we consider truth assignments as characters and propositional formulas as alternations over these characters, then if we join propositional formulas with regex operators and interpret the result according to the ordinary semantics of regular expressions we have something that defines a sequence of truth assignments. We call this a *temex*, for temporal expression. We can use a temex to search sequences of truth assignments (henceforth called *traces*) in much the same way we can use regular expressions to search strings. `temex` implements this functionality. ## Temex syntax Temex atoms are propositional formulas enclosed in square brackets (`[` and `]`). They may be joined with regex operators in the same ways characters can be joined with regex operators in a regular expression. #### Propositional formula syntax A propositional formula may be a boolean variable (which begins with an alphabetic character and may contain alphabetic or numeric characters as well as underscores), or a composite formula, in which subformulas are joined with propositional connectives. `temex` uses the English words `not`, `and`, and `or` for negation, conjunction, and disjunction; `not` has the highest operator precedence, while `and` and `or` have equal precedence and are right associative. Examples: * `f123` * `is_sunny and not is_raining` * `cpu_GT_80 and not (mem_LT_40 or io_LT_20)` #### Regex operator syntax `temex`'s regex operators inherit the syntax of the [regex](https://crates.io/crates/regex) crate, which `temex` uses to implement its regex operators. In the following forms, `phi`, `phi_1`, and `phi_2` are propositional formulas, and `n` and `m` are natural numbers:
[phi_1][phi_2] concatenation (phi_1 followed by phi_2) [phi_1]|[phi_2] alternation (phi_1 or phi_2, prefer phi_1) [phi]* zero or more of phi (greedy) [phi]+ one or more of phi (greedy) [phi]? zero or one of phi (greedy) [phi]*? zero or more of phi (ungreedy/lazy) [phi]+? one or more of phi (ungreedy/lazy) [phi]?? zero or one of phi (ungreedy/lazy) [phi]{n,m} at least n phi and at most m phi (greedy) [phi]{n,} at least n phi (greedy) [phi]{n} exactly n phi [phi]{n,m}? at least n phi and at most m phi (ungreedy/lazy) [phi]{n,}? at least n phi (ungreedy/lazy) [phi]{n}? exactly n phi## Supported trace formats The following trace formats are supported: * [polars](https://crates.io/crates/polars) dataframes in which all of the columns are boolean. Column names are interpreted as the names of boolean variables and the rows of the dataframe are interpreted as truth assignments. * CSVs encoded as UTF8 strings or files. The first line should contain the names of boolean variables and the subsequent lines should contain values of either `1` or `0`, for `true` or `false`, respectively. The lines after the header are interpreted as truth assignments. When matching against a trace, the variable names in the `temex` pattern and the trace must be the same; if there are extra variables in the trace then searching will result in an error. ## Usage Prior to performing any search operation, the temex pattern and trace must be compiled. Compiling the temex pattern is done with [Temex::new](Temex::new), while compiling the trace is done with [TemexTrace::try_from](TemexTrace::try_from). Searches may be performed using methods on the temex; a complete listing is available [here](temex::Temex). ## Examples #### Example 1: Testing for a match in a polars DataFrame using `is_match` This example shows how to perform a `temex` search in a [polars](https://crates.io/crates/polars) DataFrame. The DataFrame must be converted to a TemexTrace before searching. The example also shows how to convert a TemexTrace back to a DataFrame. Note the use of `^` and `$`, which are anchors that match to the beginning and end of the trace, respectively: ```rust use polars::df; use polars::prelude::*; use temex::{Temex, TemexTrace}; let df = df! [ "p1" => [true, true, true], "p2" => [false, false, false], "p3" => [true, false, true] ] .unwrap(); let trace = TemexTrace::try_from(df.clone()).unwrap(); let te = Temex::new("^[p1 and (p2 or p3)][p1][p1 or p2 and not p3]$").unwrap(); assert!(te.is_match(&trace).unwrap()); // We can convert the TemexTrace back to a DataFrame let df_from_trace = polars::frame::DataFrame::try_from(trace).unwrap(); assert_eq!(df, df_from_trace); ``` ##### Example 2: Finding the location of a match using `find` Here we use the `find` method to find the range of the left-most first match in the trace: ```rust let trace_str = "CPU_core1_GT_80, CPU_core2_GT_80, mem_usage_GT_40\n\ 0, 0, 0\n\ 1, 0, 0\n\ 0, 1, 0\n\ 0, 0, 1\n\ 1, 0, 1\n\ 0, 1, 1\n\ 1, 1, 0\n\ 1, 1, 0\n\ 1, 1, 0\n"; let trace = trace_str.parse::