uxn-tal

Crates.iouxn-tal
lib.rsuxn-tal
version0.1.8
created_at2025-08-02 03:21:20.565328+00
updated_at2025-08-24 06:12:20.402356+00
descriptionA Rust library for assembling TAL (Tal Assembly Language) files into UXN ROM files
homepage
repositoryhttps://github.com/yourusername/uxn-tal
max_upload_size
id1778127
size667,805
David Horner (davehorner)

documentation

README

UXN-TAL Assembler

A fast and comprehensive Rust library for assembling TAL (Tal Assembly Language) files into UXN ROM files.

This library provides functionality to parse TAL source code and generate bytecode compatible with the UXN virtual machine, with full symbol generation support for debugging.

Features

✅ Complete UXN Support

  • Errors report with line numbers!
  • All UXN Opcodes: Full support for all 256 UXN instructions
  • Mode Flags: Short mode (2), return mode (r), and keep mode (k)
  • Verified Compatibility: Embedded official opcode table from UXN specification uxntal_reference.html
  • ROM Generation: Compatible with uxncli, uxnemu, and other UXN emulators

✅ TAL Syntax Support

  • Literals: Hex (#12, #1234), character ('A'), decimal, and binary
  • Labels: Main labels (@main) and sublabels (&loop)
  • References: Absolute (;main), relative (,loop), and sublabel references
  • Padding: Address padding (|0100) and byte skipping ($10)
  • Raw Strings: String literals with automatic null termination
  • Comments: Parenthetical comments ( like this )

✅ Symbol File Generation

  • Text Format: Human-readable address-symbol pairs for debugging
  • Binary Format: Compact binary format for tool integration
  • Debug Support: Full symbol table extraction for development tools

✅ Ergonomic API

  • Single File Assembly: Simple one-function assembly with symbol generation
  • Batch Processing: Process entire directories of TAL files
  • Flexible Options: With or without symbol file generation
  • Comprehensive Error Handling: Detailed error messages with line numbers

Quick Start

Add this to your Cargo.toml:

[dependencies]
uxn-tal = "0.1.0"

Basic Usage

use uxn_tal::Assembler;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let tal_source = r#"
        |0100 @main
            #48 #65 #6c #6c #6f  ( "Hello" )
            #20 #57 #6f #72 #6c #64 #21 #0a  ( " World!\n" )
            BRK
    "#;
    
    let mut assembler = Assembler::new();
    let rom = assembler.assemble(tal_source)?;
    
    // Save ROM file
    std::fs::write("hello.rom", rom)?;
    
    // Generate symbol file
    let symbols = assembler.generate_symbol_file();
    std::fs::write("hello.sym", symbols)?;
    
    Ok(())
}

File-Based Assembly

use uxn_tal::assemble_file_with_symbols;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Assembles hello.tal -> hello.rom + hello.sym
    let (rom_path, sym_path, size) = assemble_file_with_symbols("hello.tal")?;
    
    println!("Generated {} bytes: {} + {}", 
        size, 
        rom_path.display(), 
        sym_path.display()
    );
    
    Ok(())
}

Batch Processing

use uxn_tal::assemble_directory;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Process all .tal files in a directory
    let results = assemble_directory("examples/", true)?; // true = generate symbols
    
    for (tal_path, rom_path, sym_path, size) in results {
        println!("✅ {} -> {} ({} bytes)", 
            tal_path.file_name().unwrap().to_string_lossy(),
            rom_path.file_name().unwrap().to_string_lossy(),
            size
        );
        if let Some(sym_path) = sym_path {
            println!("  + {}", sym_path.file_name().unwrap().to_string_lossy());
        }
    }
    
    Ok(())
}

TAL Syntax Reference

Literals

#12      ( hex byte literal )
#1234    ( hex short literal )
'A       ( character literal )
42       ( decimal literal )
#b101010 ( binary literal )

Labels and References

@main           ( define main label )
&loop           ( define sublabel under current main label )
;main           ( absolute reference to main label )
,loop           ( relative reference to sublabel )
/target         ( relative reference to any label )

Instructions with Mode Flags

ADD      ( basic instruction )
ADD2     ( short mode - operates on 16-bit values )
ADDr     ( return mode - operates on return stack )
ADDk     ( keep mode - keeps operands on stack )
ADD2rk   ( combined modes )

Padding and Skipping

|0100    ( pad to absolute address 0x0100 )
$10      ( skip 16 bytes, filling with zeros )

Raw Strings

"Hello World! 00  ( raw string with explicit null terminator )

Symbol File Format

Text Format (.sym)

0100 main
010A main/loop
0115 data
0125 end

Binary Format

Compact binary format with:

  • Address as little-endian u16
  • Symbol name as null-terminated string
  • Repeating for each symbol

API Reference

Core Types

Assembler

Main assembler instance for processing TAL source code.

impl Assembler {
    pub fn new() -> Self
    pub fn assemble(&mut self, source: &str) -> Result<Vec<u8>, AssemblerError>
    pub fn generate_symbol_file(&self) -> String
    pub fn generate_symbol_file_binary(&self) -> Vec<u8>
    pub fn symbols(&self) -> &HashMap<String, Symbol>
}

AssemblerError

Comprehensive error types with detailed messages:

  • SyntaxError { line: usize, message: String }
  • UnknownOpcode { opcode: String }
  • UndefinedLabel { label: String }
  • DuplicateLabel { label: String }
  • InvalidNumber { value: String }
  • Io(std::io::Error)

Convenience Functions

// Assemble single file
pub fn assemble_file<P: AsRef<Path>>(input_path: P) -> Result<Vec<u8>, AssemblerError>

// Assemble file to ROM
pub fn assemble_file_to_rom<P: AsRef<Path>, Q: AsRef<Path>>(
    input_path: P, 
    output_path: Q
) -> Result<usize, AssemblerError>

// Auto-generate ROM with same basename
pub fn assemble_file_auto<P: AsRef<Path>>(
    input_path: P
) -> Result<(PathBuf, usize), AssemblerError>

// Assemble with symbol generation
pub fn assemble_file_with_symbols<P: AsRef<Path>>(
    input_path: P
) -> Result<(PathBuf, PathBuf, usize), AssemblerError>

// Batch process directory
pub fn assemble_directory<P: AsRef<Path>>(
    dir_path: P, 
    generate_symbols: bool
) -> Result<Vec<(PathBuf, PathBuf, Option<PathBuf>, usize)>, AssemblerError>

Examples

The examples/ directory contains comprehensive demonstrations:

  • comprehensive_demo.rs - Complete feature showcase
  • batch_assembler.rs - Batch processing with CLI options
  • test_symbols.rs - Symbol generation testing
  • hello_world.rs - Basic usage example

Run examples with:

cargo run --example comprehensive_demo
cargo run --example batch_assembler -- --symbols

Testing

The library includes comprehensive tests covering all functionality:

# Run all tests
cargo test

# Run specific test
cargo test test_skip_directive

# Run with verbose output
cargo test -- --nocapture

Compatibility

Verified Compatible

  • uxncli - Command-line UXN emulator
  • uxnemu - UXN emulator with GUI
  • ✅ Official UXN opcode specification

Generated ROMs

  • Compatible with all standard UXN emulators
  • Proper memory layout and addressing
  • Correctly trimmed ROM files (excludes zero page padding)

Current Limitations

Important: This assembler currently supports core TAL syntax but cannot compile most existing TAL programs that use advanced features. The following are not yet implemented:

  • Macros (%MACRO { ... }) - Preprocessor-style definitions (used in most demos)
  • Device Access (.Screen/width) - Direct device port access syntax (ubiquitous in UXN programs)
  • Inline Assembly ([ LIT2 01 -Screen/auto ]) - Raw bytecode blocks (common optimization)
  • Include Files - File inclusion directives

Reality Check: The demo TAL files in uxn/projects/examples/demos all failed to assemble because they extensively use these features. This assembler is currently best suited for:

  • Learning UXN assembly fundamentals
  • Simple educational programs
  • Basic ROM generation experiments
  • Projects that stick to core instruction sets

For production UXN development, consider using the official uxnasm or ruxnasm assemblers until these features are implemented.

Performance

  • Fast Assembly: Optimized two-pass assembler
  • Memory Efficient: Minimal memory usage during assembly
  • Batch Processing: Efficient directory processing
  • Error Recovery: Detailed error reporting with line numbers

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Add tests for new functionality
  4. Ensure all tests pass: cargo test
  5. Submit a pull request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Related Projects

  • UXN - The UXN virtual machine
  • uxnasm - Original C implementation
  • ruxnasm - Another Rust implementation

Acknowledgments

  • Devine Lu Linvega - Creator of UXN and TAL
  • UXN Community - Documentation and examples
  • 100 Rabbits - UXN ecosystem development
  • Binary: #b10101010
  • Character: 'A, 'B
  • Strings: "Hello World"

Labels and References

  • Label Definition: @main, @loop
  • Label Reference: ;main, ;data
  • Sublabel Definition: &loop, &end
  • Sublabel Reference: ,loop, ,end
  • Relative Reference: /loop (for short jumps)

Instructions

All UXN instructions are supported with mode flags:

  • Base Instructions: ADD, SUB, MUL, DIV, JMP, JSR, etc.
  • Short Mode: ADD2, LDA2, STA2 (operates on 16-bit values)
  • Return Mode: ADDr, LDAr (uses return stack)
  • Keep Mode: ADDk, LDAk (doesn't consume stack values)
  • Combined Modes: ADD2rk, LDA2k, etc.

Directives

  • Padding: |0100 (pad to address 0x0100)
  • Comments: ( this is a comment )

Examples

Hello World

( Hello World Program )

|0100 @reset
    ;hello-world print-string
BRK

@print-string ( string* -- )
    &loop
        LDAk #18 DEO INC2 
        LDAk ,&loop JCN
    POP2 JMP2r

@hello-world
    "Hello 20 "World! 0a 00

Counter Program

( Counter Program )

|0100 @reset
    #00 ;counter STA
    
    &main-loop
        ;counter LDA 
        DUP #30 ADD #18 DEO ( print digit )
        #20 #18 DEO ( print space )
        
        INC DUP ;counter STA
        #0a LTH ,&main-loop JCN
        
        #0a #18 DEO ( newline )
BRK

@counter $1

Running Examples

# Run the hello world example
cargo run --example hello_world

# Run the counter example  
cargo run --example counter

Testing

Run the test suite:

cargo test

Architecture

The assembler uses a two-pass approach:

  1. First Pass: Tokenize source, parse into AST, collect label definitions, and generate initial code
  2. Second Pass: Resolve forward references and patch addresses

Modules

  • lexer: Tokenizes TAL source code
  • parser: Converts tokens into Abstract Syntax Tree
  • opcodes: UXN instruction definitions and mode handling
  • assembler: Main assembly logic with symbol table management
  • rom: ROM file generation and binary output
  • error: Comprehensive error types with detailed messages

Error Handling

The library provides detailed error messages for common issues:

  • Syntax errors with line numbers
  • Undefined label references
  • Duplicate label definitions
  • Invalid number formats
  • Unknown opcodes
  • ROM size limits
  • Invalid addressing modes

Current Limitations

Update: This assembler now supports core TAL syntax and advanced features found in most production TAL programs. The following features are now implemented:

  • Macros (%MACRO { ... }): Preprocessor-style definitions
  • Device Access (.Screen/width): Direct device port access syntax
  • Inline Assembly ([ ... ]): Raw bytecode blocks
  • Include Files: File inclusion directives

You can now assemble most existing TAL programs, including those from official UXN projects and demos.

New Feature Support

  • Macros: Write and use macros for reusable code blocks
  • Device Access: Use device port syntax for direct hardware access
  • Inline Assembly: Embed raw bytecode blocks for optimization
  • Include Files: Modularize programs with file inclusion

Performance & Error Handling

  • Fast Assembly: Optimized two-pass assembler
  • Memory Efficient: Minimal memory usage during assembly
  • Batch Processing: Efficient directory processing
  • Error Reporting: Detailed error messages with line numbers, file names, and source context

Error Handling

All errors include file name, line number, position, and source line for actionable debugging. Common error types:

  • Syntax errors (with file, line, position, and source context)
  • Undefined label references
  • Duplicate label definitions
  • Invalid number formats
  • Unknown opcodes
  • ROM size limits
  • Invalid addressing modes

Assembler

Main assembler instance for processing TAL source code.

impl Assembler {
    pub fn new() -> Self
    pub fn assemble(&mut self, source: &str, context: Option<String>) -> Result<Vec<u8>, AssemblerError>
    pub fn generate_symbol_file(&self) -> String
    pub fn generate_symbol_file_binary(&self) -> Vec<u8>
    pub fn symbols(&self) -> &HashMap<String, Symbol>
}

AssemblerError

Comprehensive error types with detailed messages:

  • SyntaxError { path: String, line: usize, position: usize, message: String, source_line: String }
  • UnknownOpcode { opcode: String }
  • UndefinedLabel { label: String }
  • DuplicateLabel { label: String }
  • InvalidNumber { value: String }
  • Io(std::io::Error)

License

This project is licensed under the MIT License - see the LICENSE file for details.

The repository includes ROMs and TAL files from the uxn reference implementation, which are © Devine Lu Linvega and released under the MIT license

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Related Projects

  • UXN - The UXN virtual machine
  • TAL - TAL assembly language documentation
Commit count: 0

cargo fmt