# LLHD Language Reference

This document specifies LLHD, the low-level hardware description language. It outlines the architecture, structure, and instruction set, and provides usage examples.

@[toc]

---

## Modules

At the root of the LLHD hierarchy, a module represents an entire design. It is equivalent to a single LLHD assembly file on disk, or one in-memory design graph. Modules consist of functions, processes, entities, and external unit declarations as outlined in the following sections. Two or more modules can be combined using the linker, which substitutes external declarations (`declare ...`) with an actual unit definition. A module is called *self-contained* if it contains no external unit declarations.

## Names

Names in LLHD follow a scheme similar to LLVM. The language distinguishes between global names, local names, and anonymous names. Global names are visible outside of the module. Local names are visible only within the module, function, process, or entity they are defined in. Anonymous names are purely numeric local names whose numbering is not preserved between the in-memory and on-disk representations of the IR.

Example | Regex                | Description
------- | -------------------- | ---
`@foo`  | `@[a-zA-Z0-9_\.\\]+` | Global name visible outside of the module, function, process, or entity.
`%foo`  | `%[a-zA-Z0-9_\.\\]+` | Local name visible only within module, function, process, or entity.
`%42`   | `%[0-9]+`            | Anonymous local name.

Note that basic block names are local names introduced without an explicit leading `%`, but are otherwise referred to like other local names (with the leading `%`).

Names are UTF-8 encoded. Arbitrary code points beyond letters and numbers may be represented as sequences of `\xx` bytes, where `xx` is the lower- or uppercase hexadecimal representation of the byte. For example, the local name `foo$bar` is encoded as `%foo\24bar`.

## Units

Designs in LLHD are represented by three different constructs (called "units"): functions, processes, and entities.
These capture different concerns arising from the need to model silicon hardware. This is in contrast to IRs targeting machine code generation, which generally consist only of functions.

The language differentiates between how instructions are executed in a unit:

- *Control-flow* units consist of basic blocks, where execution follows a clear control-flow path. This is equivalent to what one would find in LLVM's IR.
- *Data-flow* units consist only of an unordered set of instructions which form a data-flow graph. Execution of instructions is implied by the propagation of value changes through the graph.

Furthermore, it differentiates how time passes during the execution of a unit:

- *Immediate* units execute in zero time. They may not contain any instructions that suspend execution or manipulate signals. These units are ephemeral in the sense that their execution starts and terminates in between time steps. As such, no immediate units coexist or persist across time steps.
- *Timed* units coexist and persist during the entire execution of the IR. They represent reactions to changes in signals and may suspend execution or interact with signals (probe value, schedule state changes).

The following table provides an overview of the three IR units, which are detailed in the following sections:

Unit         | Paradigm     | Timing    | Models
------------ | ------------ | --------- | ---
**Function** | control-flow | immediate | Ephemeral computation in zero time
**Process**  | control-flow | timed     | Behavioural circuit description
**Entity**   | data-flow    | timed     | Structural circuit description

### Functions

Functions represent *control-flow* executing *immediately* and consist of a sequence of basic blocks and instructions:

    func <name> (<type> <arg>, ...) <return_type> {
        ...
    }

A function has a local or global name, input arguments, and a return type. The first basic block in a function is the entry block. Functions must contain at least one basic block.
Terminator instructions must either branch to another basic block or be the `ret` instruction. The argument to `ret` must be of the return type `<return_type>`. Functions are called using the `call` instruction. Functions may not contain instructions that suspend execution (`wait` and `halt`), may not interact with signals (`prb`, `drv`, `sig`), and may not instantiate entities/processes (`inst`).

##### Example

The following function computes the N-th Fibonacci number for a 32-bit signed integer N:

    func @fib (i32 %N) i32 {
    entry:
        %one = const i32 1
        %0 = sle i32 %N, %one        ; %0 = (N <= 1), signed comparison
        br %0, %recursive, %base
    base:
        ret i32 %one
    recursive:
        %two = const i32 2
        %1 = sub i32 %N, %one        ; N - 1
        %2 = sub i32 %N, %two        ; N - 2
        %3 = call i32 @fib (i32 %1)
        %4 = call i32 @fib (i32 %2)
        %5 = add i32 %3, %4          ; fib(N-1) + fib(N-2)
        ret i32 %5
    }

### Processes

Processes represent *control-flow* executing in a *timed* fashion and consist of a sequence of basic blocks and instructions. They are used to represent a procedural description of how a circuit's output signals change in reaction to changing input signals.

    proc <name> (<type> <arg>, ...) -> (<type> <arg>, ...) {
        ...
    }

A process has a local or global name, input arguments, and output arguments. Input arguments may be used with the `prb` instruction. Output arguments must be of signal type (`T$`) and may be used with the `drv` instruction. The first basic block in a process is the entry block. Processes must contain at least one basic block. Terminator instructions must either branch to another basic block or be the `halt` instruction. Processes are instantiated in entities using the `inst` instruction. Processes may not contain instructions that return execution (`ret`) and may not instantiate entities/processes (`inst`).

Processes may be used to behaviorally model a circuit, as is commonly done in higher-level hardware description languages such as SystemVerilog or VHDL. As such they may represent a richer and more abstract set of behaviors beyond what actual hardware can achieve.
One of the tasks of a synthesizer is to transform processes into entities, resolving implicitly modeled state-keeping elements and combinatorial transfer functions into explicit register and gate instances. LLHD aims to provide a standard way for such transformations to occur.

##### Example

The following process computes the butterfly operation in an FFT combinatorially with a 1ns delay:

    proc @bfly (i32$ %x0, i32$ %x1) -> (i32$ %y0, i32$ %y1) {
    entry:
        %x0v = prb i32$ %x0
        %x1v = prb i32$ %x1
        %0 = add i32 %x0v, %x1v
        %1 = sub i32 %x0v, %x1v
        %d = const time 1ns
        drv i32$ %y0, %0, %d
        drv i32$ %y1, %1, %d
        wait %entry, %x0, %x1
    }

### Entities

Entities represent *data-flow* executing in a *timed* fashion and consist of a set of instructions. They are used to represent hierarchy in a design, as well as a data-flow description of how a circuit's output signals change in reaction to changing input signals.

    entity <name> (<type> <arg>, ...) -> (<type> <arg>, ...) {
        ...
    }

Ultimately, every design consists of at least one top-level entity, which may in turn call functions or instantiate processes and entities to form a design hierarchy. There are no basic blocks in an entity. All instructions are considered to execute in a schedule implicitly defined by their data dependencies. Dependency cycles are forbidden (except for the ones formed by probing and driving a signal). The order of instructions is purely cosmetic and does not affect behaviour.

##### Example

The following entity computes the butterfly operation in an FFT combinatorially with a 1ns delay:

    entity @bfly (i32$ %x0, i32$ %x1) -> (i32$ %y0, i32$ %y1) {
        %x0v = prb i32$ %x0
        %x1v = prb i32$ %x1
        %0 = add i32 %x0v, %x1v
        %1 = sub i32 %x0v, %x1v
        %d = const time 1ns
        drv i32$ %y0, %0, %d
        drv i32$ %y1, %1, %d
    }

### External Units

External units allow an LLHD module to refer to functions, processes, and entities defined outside of the module itself. The linker can then be used to resolve these declarations to actual definitions in another module.
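
The resolution of external declarations can be pictured with a small Python model. This is only a hedged sketch of the idea, not the behaviour of any actual LLHD linker: the `Module` class, its two fields, and the `link` function are hypothetical names introduced for illustration, and unit bodies are reduced to placeholder strings.

```python
from dataclasses import dataclass, field


@dataclass
class Module:
    """Toy stand-in for an LLHD module (hypothetical, for illustration only)."""
    definitions: dict = field(default_factory=dict)  # unit name -> unit body
    declarations: set = field(default_factory=set)   # names of external units

    def is_self_contained(self) -> bool:
        # Self-contained means no external unit declarations remain.
        return not self.declarations


def link(*modules: Module) -> Module:
    """Combine modules, replacing declarations with matching definitions."""
    linked = Module()
    for module in modules:
        for name, body in module.definitions.items():
            if name in linked.definitions:
                raise ValueError(f"duplicate definition of {name}")
            linked.definitions[name] = body
        linked.declarations |= module.declarations
    # A declaration is resolved as soon as some module defines that name.
    linked.declarations.difference_update(linked.definitions)
    return linked


# Usage: @top refers to @bfly, which a second module defines.
top = Module(definitions={"@top": "entity ..."}, declarations={"@bfly"})
lib = Module(definitions={"@bfly": "proc ..."})
design = link(top, lib)
```

In this model `top` on its own is not self-contained, while `design` is, because the only outstanding declaration found a definition in `lib`.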

    declare <name> (<type>, ...) <return_type>       ; function declaration
    declare <name> (<type>, ...) -> (<type>, ...)    ; process/entity declaration

### Basic Blocks

A basic block has a name and consists of a sequence of instructions. The `<name>` introduced is a local name that must match `[a-zA-Z0-9_\.\\]+` without an explicit leading `%`. The created basic block is referred to by the `phi`, `br` and `wait` instructions using the full `%<name>` form of the label. The last instruction in a basic block must be a terminator; all other instructions must *not* be a terminator. This ensures that no control flow transfer occurs within a basic block, but rather control enters at the top and leaves at the bottom. A basic block may not be empty. Functions and processes contain at least one basic block.

    <name>:
        ...

## Type System

### Overview

The following table shows the types available in LLHD. These are outlined in more detail in the following sections.

Type            | Description
--------------- | ---
`void`          | The unit type (e.g. instruction that yields no result).
`time`          | A simulation time value.
`iN`            | Integer of `N` bits, signed or unsigned.
`nN`            | Enumeration of `N` distinct values.
`lN`            | Logical value of `N` bits (IEEE 1164).
`T*`            | Pointer to a value of type `T`.
`T$`            | Signal of a value of type `T`.
`[N x T]`       | Array containing `N` elements of type `T`.
`{T0,T1,...}`   | Structured data containing fields of types `T0`, `T1`, etc.

Note that arbitrary combinations of signal types `T$` and pointer types `T*` are allowed. These are intended to support advanced features of higher-level HDLs and map to defined simulation behaviours. Not all such combinations are expected to describe synthesizable circuits.

### Void Type (`void`)

The `void` type is used to represent the absence of a value. Instructions that do not return a value are of type `void`. There is no way to construct a `void` value.
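
The notation in the type overview table composes recursively: a signal or pointer wraps any type, and arrays and structs contain further types. As a rough illustration, the following Python sketch turns a type string into an English description. It is an assumption-laden toy, not LLHD's parser: `describe` and `split_fields` are hypothetical helpers, and only the spellings shown in the overview table are handled.

```python
import re


def split_fields(body):
    """Split a struct body at commas that are not nested in [] or {}."""
    fields, depth, start = [], 0, 0
    for pos, ch in enumerate(body):
        if ch in "[{":
            depth += 1
        elif ch in "]}":
            depth -= 1
        elif ch == "," and depth == 0:
            fields.append(body[start:pos])
            start = pos + 1
    fields.append(body[start:])
    return [f.strip() for f in fields if f.strip()]


def describe(ty):
    """Return an English description of a type string such as 'i8$*'."""
    ty = ty.strip()
    if ty.endswith("*"):                      # T* wraps any type T
        return "pointer to " + describe(ty[:-1])
    if ty.endswith("$"):                      # T$ wraps any type T
        return "signal of " + describe(ty[:-1])
    if ty in ("void", "time"):
        return ty
    scalar = re.fullmatch(r"([inl])([0-9]+)", ty)
    if scalar:
        kind, n = scalar.groups()
        if kind == "i":
            return f"{n}-bit integer"
        if kind == "n":
            return f"enumeration of {n} values"
        return f"{n}-bit logic value"
    array = re.fullmatch(r"\[\s*([0-9]+)\s+x\s+(.+)\]", ty)
    if array:
        return f"array of {array.group(1)} x {describe(array.group(2))}"
    if ty.startswith("{") and ty.endswith("}"):
        fields = [describe(f) for f in split_fields(ty[1:-1])]
        return "struct of (" + ", ".join(fields) + ")"
    raise ValueError(f"not a recognised type: {ty!r}")
```

For example, `describe("i8$*")` yields `"pointer to signal of 8-bit integer"`, which shows how signal and pointer types nest.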

### Time Type (`time`)

The `time` type represents a simulation time value as a combination of a real time value in seconds, a delta value representing infinitesimal time steps, and an epsilon value representing an absolute time slot within a delta step (used to model SystemVerilog scheduling regions). It may be constructed using the `const time` instruction, for example:

    %0 = const time 1ns 2d 3e

### Integer Type (`iN`)

The `iN` type represents an integer value of `N` bits, where `N` can be any positive number. There is no sign associated with an integer value. Rather, separate instructions are available to perform signed and unsigned operations, where applicable. Integer values may be constructed using the `const iN` instruction, for example:

    %0 = const i1 1
    %1 = const i32 9001
    %2 = const i1234 42

### Enumeration Type (`nN`)

The `nN` type represents an enumeration value which may take one of `N` distinct states. This type is useful for modeling sum types such as the enumerations in VHDL, and may allow for more detailed circuit analysis due to the non-power-of-two number of states the value can take. The values for `nN` range from `0` to `N-1`. Enumeration values may be constructed using the `const nN` instruction, for example:

    %0 = const n1 0  ; 0 is the only state in n1
    %1 = const n4 3  ; 3 is the last state in n4

### Logic Type (`lN`)

The `lN` type represents a collection of `N` wires each carrying one of the nine logic values defined by IEEE 1164.
This type is useful to model the actual behavior of a logic circuit, where individual bits may be in other states than just `0` and `1`:

| Symbol | Meaning                           |
| ------ | --------------------------------- |
| `U`    | uninitialized                     |
| `X`    | strong drive, unknown logic value |
| `0`    | strong drive, logic zero          |
| `1`    | strong drive, logic one           |
| `Z`    | high impedance                    |
| `W`    | weak drive, unknown logic value   |
| `L`    | weak drive, logic zero            |
| `H`    | weak drive, logic one             |
| `-`    | don't care                        |

This type allows for the modeling of high-impedance and wired-AND/-OR signal lines. It is not used directly in arithmetic; instead, conversion instructions should be used to translate between `lN` and the equivalent `iN`, explicitly handling states not representable in `iN`. Typically this would involve mapping an addition result to `X` when any of the input bits is `X`. Logic values may be constructed using the `const lN` instruction, for example:

    %0 = const l1 "U"
    %1 = const l8 "01XZHWLU"

### Pointer Type (`T*`)

The `T*` type represents a pointer to a memory location which holds a value of type `T`. LLHD offers a very limited memory model where pointers may be used to load and store data in distinct memory slots. No bit casts or reinterpretation casts are possible. Pointers are obtained by allocating variables on the stack, which may then be accessed by load and store instructions:

    %init = const i8 42
    %ptr = var i8 %init
    %0 = ld i8* %ptr
    %1 = mul i8 %0, %0
    st i8* %ptr, %1

*Note:* It is not yet clear whether LLHD will provide `alloc` and `free` instructions to create and destroy memory slots in an infinite heap data structure.

### Signal Type (`T$`)

The `T$` type represents a physical signal which carries a value of type `T`. Signals correspond directly to wires in a physical design, and are used to model propagation delays and timing. Signals are used to carry values across time steps in the LLHD execution model.
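
The role of a signal as a value carrier across time steps can be pictured with a small Python model. This is a hedged sketch, not LLHD's actual execution model: the `Signal` class and its `advance` helper are assumptions made for illustration, the `prb` and `drv` methods merely borrow the instruction names, time is a plain integer count of nanoseconds, and delta and epsilon steps are ignored.

```python
import heapq


class Signal:
    """Toy signal: a current value plus a queue of scheduled changes."""

    def __init__(self, init):
        self.value = init    # value visible to probes right now
        self.now = 0         # current time in ns (assumed integer time base)
        self._pending = []   # min-heap of (due_time, sequence, new_value)
        self._seq = 0        # tie-breaker so equal times keep drive order

    def prb(self):
        """Probe: read the current value without changing anything."""
        return self.value

    def drv(self, value, delay_ns):
        """Drive: schedule `value` to appear `delay_ns` after the current time."""
        heapq.heappush(self._pending, (self.now + delay_ns, self._seq, value))
        self._seq += 1

    def advance(self, until_ns):
        """Move time forward, applying every change due up to `until_ns`."""
        while self._pending and self._pending[0][0] <= until_ns:
            _, _, self.value = heapq.heappop(self._pending)
        self.now = until_ns


wire = Signal(42)
# Assumption: an 8-bit product wraps modulo 2**8.
wire.drv(wire.prb() * wire.prb() % 256, 1)
before = wire.prb()  # 42: the drive has not taken effect yet
wire.advance(1)
after = wire.prb()   # 228: the driven value becomes visible after 1 ns
```

The point of the sketch is the separation between reading and writing: a probe always sees the present value, while a drive only schedules a change for a later time step.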

Signals are obtained by creating them in an entity, which may then be probed for the current value and driven to a new value:

    %init = const i8 42
    %wire = sig i8 %init
    %0 = prb i8$ %wire
    %1 = mul i8 %0, %0
    %1ns = const time 1ns
    drv i8$ %wire, %1, %1ns

### Array Type (`[N x T]`)

The `[N x T]` type represents a collection of `N` values of type `T`, where `N` can be any non-negative number, including zero. All elements of an array have the same type. An array may be constructed using the `[...]` instruction:

    %0 = const i16 1
    %1 = const i16 42
    %2 = const i16 9001
    %3 = [i16 %0, %1, %2]  ; [1, 42, 9001]
    %4 = [3 x i16 %0]      ; [1, 1, 1]

Individual values may be obtained or modified with the `extf`/`insf` instructions. Subranges of the array may be obtained or modified with the `exts`/`inss` instructions.

### Struct Type (`{T0,T1,...}`)

The `{T0,T1,...}` type represents a struct of field types `T0`, `T1`, etc. Fields in LLHD structs are unnamed and accessed by their respective index, starting from zero. A struct may be constructed using the `{...}` instruction:

    %0 = const i1 1
    %1 = const i8 42
    %2 = const time 10ns
    %3 = {i1 %0, i8 %1, time %2}  ; {1, 42, 10ns}

Individual fields may be obtained or modified with the `extf`/`insf` instructions.

## Instructions

### Overview

The following table shows the full instruction set of LLHD. The flags indicate whether an instruction

- **F**: can appear in a function,
- **P**: can appear in a process,
- **E**: can appear in an entity, or
- **T**: is a terminator.

Instruction                 | Flags   | Description
----------------------------|---------|------------
**Values**                  |         |
`const`                     | F P E   | Construct a constant value
`alias`                     | F P E   | Assign a new name to a value
`[...]`                     | F P E   | Construct an array
`{...}`                     | F P E   | Construct a struct
`insf` `inss`               | F P E   | Insert elements, fields, or bits
`extf` `exts`               | F P E   | Extract elements, fields, or bits
`mux`                       | F P E   | Choose from an array of values
**Bitwise**                 |         |
`not`                       | F P E   | Unary logic
`and` `or` `xor`            | F P E   | Binary logic
`shl` `shr`                 | F P E   | Shift left or right
**Arithmetic**              |         |
`neg`                       | F P E   | Unary arithmetic
`add` `sub`                 | F P E   | Binary arithmetic
`smul` `sdiv` `smod` `srem` | F P E   | Binary signed arithmetic
`umul` `udiv` `umod` `urem` | F P E   | Binary unsigned arithmetic
**Comparison**              |         |
`eq` `neq`                  | F P E   | Equality operators
`slt` `sgt` `sle` `sge`     | F P E   | Signed relational operators
`ult` `ugt` `ule` `uge`     | F P E   | Unsigned relational operators
**Control Flow**            |         |
`phi`                       | F P     | Reconvergence node
`br`                        | F P T   | Branch to a different block
`call`                      | F P E   | Call a function
`ret`                       | F T     | Return from a function
`wait`                      | P T     | Suspend execution
`halt`                      | P T     | Terminate execution
**Memory**                  |         |
`var`                       | F P     | Allocate memory
`ld`                        | F P     | Load value from memory
`st`                        | F P     | Store value in memory
**Signals**                 |         |
`sig`                       | E       | Create a signal
`prb`                       | E P     | Probe value on signal
`drv`                       | E P     | Drive value of signal
**Structural**              |         |
`reg`                       | E       | Create a storage element
`del`                       | E       | Delay a signal
`con`                       | E       | Connect two signals
`inst`                      | E       | Instantiate a process/entity

### Working with Values

#### Constant Value (`const`)

The `const` instruction is used to introduce a constant value into the IR. The first version constructs a constant integer value, the second a constant integer signal, and the third a constant time value.

    %result = const time