# Absolut

Absolut stands for "**A**utogenerated **B**ytewise **S**IMD-**O**ptimized **L**ook-**U**p **T**ables". The following is a breakdown of this jargon:

- **Bytewise Lookup Table**: One-to-one mappings between sets of bytes.
- **SIMD-Optimized**: Said lookup tables are implemented using SIMD (Single Instruction Multiple Data) instructions, such as `PSHUFB` on x86_64 and `TBL` on AArch64.
- **Autogenerated**: This crate utilizes [procedural macros](https://doc.rust-lang.org/reference/procedural-macros.html) to generate (if possible) SIMD lookup tables given a human-readable byte-to-byte mapping.

## Why?

SIMD instructions allow for greater data parallelism when performing table lookups on bytes. This has proved incredibly useful for [high-performance data processing](https://arxiv.org/abs/1902.08318). Unfortunately, SIMD table lookup instructions (or byte shuffling instructions) operate on tables too small to cover the entire 8-bit integer space. These tables typically have a size of 16 on x86_64, while on AArch64 tables of up to 64 elements are supported.

This library facilitates the generation of SIMD lookup tables from high-level descriptions of byte-to-byte mappings. The goal is to avoid the need to [hardcode manually-computed](https://github.com/simd-lite/simd-json/blob/main/src/impls/sse42/stage1.rs#L22) SIMD lookup tables, thus enabling a wider audience to utilize these techniques more easily.

## How?

Absolut is essentially a set of procedural macros that accept byte-to-byte mapping descriptions in the form of Rust enums:

```rust
#[absolut::one_hot]
pub enum JsonTable {
    #[matches(b',')]
    Comma,
    #[matches(b':')]
    Colon,
    #[matches(b'[', b']', b'{', b'}')]
    Brackets,
    #[matches(b'\r', b'\n', b'\t')]
    Control,
    #[matches(b' ')]
    Space,
    #[wildcard]
    Other,
}
```

The above `JsonTable` enum encodes the following one-to-one mapping:

| Input                    | Output   |
|--------------------------|----------|
| `0x2C`                   | Comma    |
| `0x3A`                   | Colon    |
| `0x5B, 0x5D, 0x7B, 0x7D` | Brackets |
| `0x0D, 0x0A, 0x09`       | Control  |
| `0x20`                   | Space    |
| `*`                      | Other    |

Where `*` denotes all other bytes not explicitly mapped.

Mapping results needn't be explicitly defined, as Absolut will solve for them automatically. In the previous code snippet, the expression `JsonTable::Space as u8` evaluates to the output byte produced by a table lookup on `0x20`.

Absolut supports multiple techniques for constructing SIMD lookup tables, called _algorithms_. Each algorithm is implemented as a procedural macro that accepts byte-to-byte mappings described using enums with attribute-annotated variants, as illustrated [above with the `absolut::one_hot` algorithm](#how).

## Known issues

### Error messages

In case a byte-to-byte mapping cannot be implemented using a given Absolut algorithm (i.e. the table is _unsatisfiable_), the resulting error messages are not useful for understanding _why_ the algorithm failed to solve for the table. Unless users are at least vaguely familiar with how the algorithm in question works, it is difficult for them to figure out how to change the mapping so that it becomes satisfiable _and_ stays useful for their purposes.

### SIMD lookup routines

Absolut currently does not provide SIMD implementations of lookup routines for the generated lookup tables. However, the library tests contain lookup routines for SSSE3 and NEON.
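
To give a rough idea of what such a lookup routine computes, here is a scalar sketch of a two-table nibble lookup for the `JsonTable` mapping shown earlier. It is not taken from the Absolut tests and does not use anything Absolut generates: the bit assignments, the `LOW` and `HIGH` tables, and the `classify` and `classify_block` functions are all hypothetical and written by hand for illustration, and the values Absolut actually solves for may differ. It assumes each class gets one bit, with `OTHER` represented as zero.

```rust
// Hand-assigned class bits (an assumption; not Absolut's actual output).
const COMMA: u8 = 1 << 0;
const COLON: u8 = 1 << 1;
const BRACKETS: u8 = 1 << 2;
const CONTROL: u8 = 1 << 3;
const SPACE: u8 = 1 << 4;
const OTHER: u8 = 0;

// Indexed by the low nibble of the input byte.
const LOW: [u8; 16] = [
    SPACE, 0, 0, 0, 0, 0, 0, 0,            // 0x0: ' '
    0, CONTROL, COLON | CONTROL, BRACKETS, // 0x9: '\t', 0xA: ':' '\n', 0xB: '[' '{'
    COMMA, BRACKETS | CONTROL, 0, 0,       // 0xC: ',', 0xD: ']' '}' '\r'
];

// Indexed by the high nibble of the input byte.
const HIGH: [u8; 16] = [
    CONTROL, 0, COMMA | SPACE, COLON, // 0x0_: control, 0x2_: ',' ' ', 0x3_: ':'
    0, BRACKETS, 0, BRACKETS,         // 0x5_: '[' ']', 0x7_: '{' '}'
    0, 0, 0, 0, 0, 0, 0, 0,
];

/// Classifies one byte: a class bit survives only if both nibbles agree on it.
fn classify(byte: u8) -> u8 {
    LOW[(byte & 0x0F) as usize] & HIGH[(byte >> 4) as usize]
}

/// Scalar stand-in for one 16-byte SIMD register: every lane is looked up independently.
fn classify_block(input: &[u8; 16]) -> [u8; 16] {
    let mut out = [OTHER; 16];
    for (o, &b) in out.iter_mut().zip(input.iter()) {
        *o = classify(b);
    }
    out
}

fn main() {
    let classes = classify_block(b"{\"a\": [1, 22]}\r\n");
    assert_eq!(classes[0], BRACKETS); // '{'
    assert_eq!(classes[2], OTHER);    // 'a'
    println!("{:?}", classes);
}
```

In a real SIMD routine, the two array indexings in `classify` would each become a single byte shuffle (for example `PSHUFB` or `TBL`) applied to all lanes at once, followed by a vector AND. Each table has only 16 entries, which is what makes it fit the instruction-level table sizes mentioned under [Why?](#why).
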
## License

Absolut is open-source software licensed under the terms of the [MIT License](https://opensource.org/licenses/MIT).