| Crates.io | thalir |
| lib.rs | thalir |
| version | 0.1.0 |
| created_at | 2025-10-22 05:27:26.103082+00 |
| updated_at | 2025-10-22 05:27:26.103082+00 |
| description | ThalIR - Solidity intermediate representation for smart contract security auditing |
| homepage | https://github.com/tameshi-dev/thalir |
| repository | https://github.com/tameshi-dev/thalir |
| max_upload_size | |
| id | 1895008 |
| size | 43,668 |
An intermediate representation for smart contract security auditing. Extends Cranelift IR with EVM-specific semantics while preserving Cranelift's SSA form and design principles.
Tameshi (tameshi.dev) is the reference auditing platform using ThalIR.
Note on Syntax: This document uses
ext.sol.*notation to describe ThalIR's conceptual extension namespace for EVM operations. The actual grammar implementation uses shorter opcodes likestorage_load,mapping_load,get_context, etc. Code examples show the implemented syntax.
Source-level Solidity presents challenges for systematic security analysis:
ThalIR provides a canonical, explicit representation optimized for vulnerability detection rather than code generation.
Pattern Recognition
Represent security-relevant operation sequences explicitly (storage writes after external calls, unchecked arithmetic, access control checks).
Explicit Semantics
Storage operations, external calls, and overflow behavior are first-class IR concepts rather than implicit.
Canonical Form
SSA with block parameters and multi-value returns provides uniform representation regardless of source-level syntax variations.
Obfuscation Support
Name obfuscation and metadata stripping for confidential auditing while preserving security-relevant behavior.
ThalIR is a superset of Cranelift IR with the following additions:
Storage operations:
storage_load %key - read from persistent storagestorage_store %key, %val - write to persistent storagemapping_load %mapping, %key - load from mappingmapping_store %mapping, %key, %val - store to mappingExternal calls:
call - external call with explicit reentrancy pointsstaticcall - call that can't modify statedelegatecall - call using caller's storage contextEvents:
log operations for event emissionABI encoding:
abi.encode - encode argumentsabi.decode - decode return dataEnvironment access:
get_context msg.sender - transaction senderget_context msg.value - ether valueget_context block.timestamp - block timestampget_context block.number - block numberExplicit overflow behavior:
Checked arithmetic (Solidity ≥0.8):
add.trap, sub.trap, mul.trap - trap on overflowUnchecked arithmetic (Solidity unchecked blocks):
add.wrap, sub.wrap, mul.wrap - wrap silentlyNo-overflow assertions:
add.nsw, add.nuw - poison on signed/unsigned overflowAddress spaces for EVM memory regions:
Address spaces prevent accidental aliasing and enable precise memory analysis.
Solidity:
function transfer(address to, uint256 amount) public {
balances[msg.sender] -= amount;
balances[to] += amount;
}
ThalIR:
function %transfer(i160, i256) public {
block0(v0: i160, v1: i256):
v2 = get_context msg.sender
v3 = mapping_load balances, v2
v4 = isub.i256 v3, v1
mapping_store balances, v2, v4
v5 = mapping_load balances, v0
v6 = iadd.i256 v5, v1
mapping_store balances, v0, v6
return
}
The representation makes control flow and data dependencies explicit through SSA values and block parameters.
ThalIR's explicit representation enables detection of the following vulnerability patterns:
Look for external calls followed by state writes:
%ok = call_ext %target, %value
; risky: state update after external call
mapping_store balances, %key, %new_balance
Classic reentrancy: the called contract can reenter before the balance update, seeing stale state.
Scan for .wrap variants in security-critical paths:
%new_bal = sub.wrap %balance, %amount : i256 ; DANGER: can underflow!
storage_store %bal_key, %new_bal
Solidity ≥0.8 defaults to checked arithmetic, but unchecked {} blocks use .wrap and can silently underflow.
Missing guard checks before privileged operations:
function %withdraw(i256) public {
block0(v0: i256):
; missing: icmp sender, owner
; missing: brif check, block_allowed, block_revert
v1 = storage_load slot0
v2 = call_ext v1, v0
return
}
No sender check means anyone can call withdraw. In ThalIR, the absence of icmp + brif before the call is a red flag.
Incorrect storage key derivation can alias different variables:
; Two mappings using same slot → collision!
%key1 = mapping_key 0, %addr1
%key2 = mapping_key 0, %addr2
If both mappings use slot 0, they'll overwrite each other. ThalIR makes slots explicit so checkers can detect this.
ThalIR uses Cranelift's type system with Solidity-specific conventions:
i1, i8, i16, i32, i64, i128, i256 - two's complement integersi256address = i160 (zero-extended to i256 when needed)bytesN = integer of width 8*N bitsbool - logical true/false (distinct from i1 for clarity)ptr or ptr[as=0] - default pointer (transient memory)ptr[as=1] - calldata pointerptr[as=2] - code pointerAddress space N distinguishes memory regions for alias analysis.
%z = add.wrap %x, %y : i256 ; wrapping add (unchecked)
%z = add.trap %x, %y : i256 ; trapping add (checked)
%z = sub.trap %x, %y : i256 ; checked subtraction
%z = mul.wrap %x, %y : i256 ; unchecked multiplication
%b = icmp.eq %x, %y : i256 -> bool
%b = icmp.slt %x, %y : i256 -> bool ; signed less-than
%b = icmp.ult %x, %y : i256 -> bool ; unsigned less-than
Variants: eq, ne, ult, ule, ugt, uge, slt, sle, sgt, sge.
br block1(%arg1, %arg2)
br_if %cond, then:block1(%arg1), else:block2(%arg2)
ret %v0, %v1, ...
call @function(%arg1, %arg2)
trap [reason="assertion failed"]
%v = storage_load %key
storage_store %key, %val
%v = mapping_load %mapping, %key
mapping_store %mapping, %key, %val
%ret = call_ext %addr, %value
%ret = staticcall %addr
%ret = delegatecall %addr
ThalIR programs must satisfy:
Type correctness: Operand and result types match instruction signatures. Block parameters match branch arguments.
SSA dominance: Every value is defined exactly once before use. Uses must be dominated by definitions.
Control flow: Every block ends with a terminator (br, br_if, ret, trap). No fall-through.
Memory: Load/store types must be sized. Alignment must be a power of two. Address space consistency enforced.
function %sum(i32) -> i32 {
block0(v0: i32):
v1 = iconst.i32 0
v2 = iconst.i32 1
jump block1(v1, v2)
block1(v3: i32, v4: i32):
v5 = icmp slt v4, v0
brif v5, block2(v3, v4), block3(v3)
block2(v6: i32, v7: i32):
v8 = iadd v6, v7
v9 = iadd v7, 1
jump block1(v8, v9)
block3(v10: i32):
return v10
}
function %withdraw(i256) public {
block0(v0: i256):
v1 = get_context msg.sender
v2 = mapping_load balances, v1
v3 = isub.i256 v2, v0
; DANGEROUS: external call before state update!
v4 = call_ext v1, v0
; Reentrancy window: sender can call back before this line
mapping_store balances, v1, v3
return
}
Auditor's note: Move the mapping_store before the call_ext to prevent reentrancy.
use thalir::{transform_solidity_to_ir, ThalIREmitter};
let solidity = "contract Token { ... }";
let contracts = transform_solidity_to_ir(solidity)?;
let emitter = ThalIREmitter::new(contracts);
let ir_text = emitter.emit_to_string(false);
println!("{}", ir_text);
use thalir::obfuscation::{ObfuscationPass, ObfuscationConfig};
let config = ObfuscationConfig::standard();
let mapping = ObfuscationPass::new(config)
.obfuscate_contract(&mut contract)?;
mapping.save("mapping.json")?;
This transforms:
contract MyToken {
function transfer(address to, uint256 amount) { ... }
}
Into:
contract c_a1b2c3 {
function f_d4e5f6(address p0, uint256 p1) { ... }
}
The mapping file lets you convert findings back to original names:
{
"c_a1b2c3": "MyToken",
"f_d4e5f6": "transfer"
}
Auditors can analyze the obfuscated IR without seeing proprietary logic, then report vulnerabilities using deobfuscated names.
| Feature | Cranelift IR | ThalIR |
|---|---|---|
| SSA with block params | ✓ | ✓ |
| Multi-value returns | ✓ | ✓ |
| Explicit traps | ✓ | ✓ (enhanced) |
| Memory model | C11 atomics | C11 atomics |
| Storage operations | ✗ (use memory) | ✓ |
| External calls | ✗ (use indirect call) | ✓ |
| Overflow semantics | wrap/trap | wrap/trap/nsw/nuw |
| Address spaces | ✓ (basic) | ✓ (EVM-specific) |
| Obfuscation | ✗ | ✓ (built-in) |
| Auditing focus | ✗ (compiler focus) | ✓ |
Mozilla Public License 2.0 (MPL-2.0)
Copyright (c) 2025 Gianluca Brigandi