ansi-escape-sequences

Crates.io: ansi-escape-sequences
lib.rs: ansi-escape-sequences
version: 0.1.0
created_at: 2025-09-08 08:50:45.913372+00
updated_at: 2025-09-08 08:50:45.913372+00
description: High-performance Rust library for detecting, matching, and processing ANSI escape sequences in terminal text with zero-allocation static regex patterns
homepage: https://github.com/sabry-awad97/ansi-escape-sequences
repository: https://github.com/sabry-awad97/ansi-escape-sequences
id: 1829012
size: 87,454
Sabry Awad (sabry-awad97)

documentation

https://docs.rs/ansi-escape-sequences

README

ANSI Escape Sequences - Professional ANSI Escape Sequence Processing


A high-performance, zero-allocation Rust library for detecting, matching, and processing ANSI escape sequences in terminal text. This crate provides comprehensive support for ANSI/VT100 terminal control sequences with optimized regex patterns and convenient APIs.

🚀 Features

  • High Performance: Pre-compiled regex patterns with zero-allocation static references
  • Comprehensive Coverage: Supports CSI, OSC, and other ANSI escape sequences
  • Flexible API: Multiple matching modes (global, first-only, owned)
  • Minimal Dependencies: Only depends on the regex crate
  • Memory Efficient: Uses LazyLock for optimal memory usage
  • Thread Safe: All operations are thread-safe and can be used in concurrent environments
  • Rich API: Convenience functions for common operations (detection, stripping, extraction)
  • Standards Compliant: Follows ANSI X3.64, ISO/IEC 6429, and VT100/VT220 specifications

📦 Installation

Add this to your Cargo.toml:

[dependencies]
ansi-escape-sequences = "0.1"

🔧 Quick Start

Basic Usage

use ansi_escape_sequences::{ansi_regex, strip_ansi, has_ansi};

// Check if text contains ANSI sequences
let colored_text = "\u{001B}[31mHello\u{001B}[0m World";
assert!(has_ansi(colored_text));

// Strip all ANSI sequences
let clean_text = strip_ansi(colored_text);
assert_eq!(clean_text, "Hello World");

// Get a regex for custom processing
let regex = ansi_regex(None);
let matches: Vec<_> = regex.find_iter(colored_text).collect();
assert_eq!(matches.len(), 2); // Found 2 ANSI sequences

Advanced Configuration

use ansi_escape_sequences::{ansi_regex, ansi_regex_owned, AnsiRegexOptions};

// Match only the first ANSI sequence
let options = AnsiRegexOptions::new().only_first();
let regex = ansi_regex(Some(options));

// Global matching (default behavior)
let options = AnsiRegexOptions::new().global();
let regex = ansi_regex(Some(options));

// Get an owned regex instance (useful for storing in structs)
let owned_regex = ansi_regex_owned(None);

// Find all ANSI sequences with detailed information
let colored_text = "\u{001B}[31mHello\u{001B}[0m World";
let sequences = AnsiRegexOptions::find_ansi_sequences(colored_text);
for seq in sequences {
    println!("Found ANSI sequence: {:?} at position {}",
             seq.as_str(), seq.start());
}

📖 ANSI Escape Sequences Overview

ANSI escape sequences are special character sequences used to control terminal formatting, cursor positioning, and other terminal behaviors. They typically start with the ESC character (\u{001B} or \x1B) followed by specific control characters.

Common ANSI Sequence Types:

  • CSI (Control Sequence Introducer): ESC[ followed by parameters and a final character
    • Example: \u{001B}[31m (red foreground color)
    • Example: \u{001B}[2J (clear screen)
  • OSC (Operating System Command): ESC] followed by data and terminated by ST or BEL
    • Example: \u{001B}]0;Window Title\u{0007} (set window title)
  • C1 Control Characters: Direct 8-bit control characters like \u{009B} (CSI)
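
To make the CSI layout concrete, here is a minimal sketch that splits a single 7-bit CSI sequence into its parameter string and final byte. The helper name split_csi is hypothetical and not part of this crate. The byte ranges follow the usual ECMA-48 layout: parameter bytes 0x30 to 0x3F, intermediate bytes 0x20 to 0x2F, final byte 0x40 to 0x7E.

```rust
/// Splits a 7-bit CSI sequence such as "\x1b[1;31m" into its parameter
/// string and final byte. Hypothetical helper for illustration only.
fn split_csi(seq: &str) -> Option<(&str, char)> {
    // A 7-bit CSI sequence starts with ESC (0x1B) followed by '['.
    let body = seq.strip_prefix("\u{001B}[")?;
    let bytes = body.as_bytes();

    // Parameter bytes occupy 0x30..=0x3F (digits, ';', ':', '<', '=', '>', '?').
    let params_end = bytes
        .iter()
        .position(|b| !(0x30..=0x3F).contains(b))
        .unwrap_or(bytes.len());

    // Optional intermediate bytes occupy 0x20..=0x2F.
    let mut idx = params_end;
    while idx < bytes.len() && (0x20..=0x2F).contains(&bytes[idx]) {
        idx += 1;
    }

    // Exactly one final byte in 0x40..=0x7E must close the sequence.
    match bytes.get(idx) {
        Some(&b) if (0x40..=0x7E).contains(&b) && idx + 1 == bytes.len() => {
            Some((&body[..params_end], b as char))
        }
        _ => None,
    }
}
```

For the two CSI examples above, this returns the parameters "31" with final byte 'm', and "2" with final byte 'J'.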

🎯 Use Cases

  • Log Processing: Clean ANSI codes from log files for storage or analysis
  • Terminal Emulators: Parse and process terminal control sequences
  • Text Processing: Extract plain text content from terminal-formatted strings
  • CLI Tools: Handle colored output in command-line applications
  • Web Applications: Convert terminal output for web display
  • Testing: Validate terminal output in automated tests

⚡ Performance Notes

  • Static Compilation: Regex patterns are compiled once using LazyLock and reused
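  • Zero Allocation: Static references avoid repeated allocations for the regex itself; functions that return a String, such as strip_ansi(), still allocate their result
  • Optimized Patterns: Carefully crafted regex patterns minimize backtracking
  • Thread Safety: All operations are thread-safe; the LazyLock initialization runs once, and later accesses only read the initialized value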
  • Efficient Detection: has_ansi() stops at first match for optimal performance
  • Bounded Matching: Parameter lengths are limited to prevent ReDoS attacks
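
The compile-once behavior described above can be sketched with the standard library alone. This is not the crate's source: a lookup table stands in for the compiled regex, and the names INIT_CALLS, FINAL_BYTE_TABLE, final_byte_table, and is_csi_final_byte are all hypothetical. It assumes Rust 1.80 or later, where std::sync::LazyLock is stable.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::LazyLock;

/// Counts how often the initializer runs, to make the one-time cost visible.
static INIT_CALLS: AtomicUsize = AtomicUsize::new(0);

/// Stand-in for an expensive value such as a compiled regex: a table
/// marking the CSI final bytes 0x40..=0x7E. Built on first access only.
static FINAL_BYTE_TABLE: LazyLock<[bool; 256]> = LazyLock::new(|| {
    INIT_CALLS.fetch_add(1, Ordering::SeqCst);
    let mut table = [false; 256];
    for byte in 0x40..=0x7E_usize {
        table[byte] = true;
    }
    table
});

/// Hands out a `&'static` reference; the table is never rebuilt per call.
fn final_byte_table() -> &'static [bool; 256] {
    &*FINAL_BYTE_TABLE
}

/// Looks up whether `byte` can terminate a CSI sequence.
fn is_csi_final_byte(byte: u8) -> bool {
    final_byte_table()[byte as usize]
}
```

However many times is_csi_final_byte is called, and from however many threads, the initializer closure runs once and every caller sees the same static value.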

Performance Tips

  • Use has_ansi() for quick detection before processing
  • Use only_first() option when you only need to detect presence
  • Static regex references (ansi_regex()) are faster than owned instances
  • For high-frequency operations, fetch the regex reference once outside the hot loop and reuse it
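
The detect-before-processing tip can be taken one step further by borrowing the input when there is nothing to strip. The sketch below uses only the standard library and handles 7-bit CSI sequences only. The function strip_csi_cow is hypothetical; the crate's strip_ansi() returns a String and covers more sequence types.

```rust
use std::borrow::Cow;

/// Strips 7-bit CSI sequences (ESC '[' ... final byte) from `text`.
/// Hypothetical helper; simplified compared with a full ANSI matcher.
fn strip_csi_cow(text: &str) -> Cow<'_, str> {
    // Fast path: without an ESC character there is nothing to strip,
    // so the input is returned borrowed and nothing is allocated.
    if !text.contains('\u{001B}') {
        return Cow::Borrowed(text);
    }

    let mut out = String::with_capacity(text.len());
    let mut chars = text.chars().peekable();
    while let Some(c) = chars.next() {
        if c == '\u{001B}' && chars.peek() == Some(&'[') {
            chars.next(); // consume '['
            // Skip everything up to and including the final byte (0x40..=0x7E).
            // Simplification: an unterminated sequence swallows the rest.
            for next in chars.by_ref() {
                if ('\u{40}'..='\u{7E}').contains(&next) {
                    break;
                }
            }
        } else {
            out.push(c);
        }
    }
    Cow::Owned(out)
}
```

Plain lines, which are the common case in most logs, come back as Cow::Borrowed; only lines that contain an escape character pay for a new String.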

🔍 Supported ANSI Sequences

This library recognizes and matches:

  • CSI sequences: ESC[ + parameters + final byte (colors, cursor movement, screen clearing)
  • OSC sequences: ESC] + data + string terminator (window titles, hyperlinks)
  • C1 control characters: Direct 8-bit equivalents (\u{009B} for CSI)
  • String terminators: BEL (\u{0007}), ESC\ (\u{001B}\u{005C}), ST (\u{009C})
  • Parameter formats: Both semicolon (;) and colon (:) separated parameters
  • Extended sequences: RGB colors, indexed colors, underline variants, and more
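
As an illustration of the three string terminators listed above, the following sketch extracts the payload of one OSC sequence. It assumes the input is exactly one complete 7-bit OSC sequence. The helper osc_payload is hypothetical and not part of the crate.

```rust
/// Returns the payload of a single OSC sequence (ESC ']' data terminator).
/// Hypothetical helper; expects `seq` to be exactly one complete sequence.
fn osc_payload(seq: &str) -> Option<&str> {
    let body = seq.strip_prefix("\u{001B}]")?;
    // Accept any of the three terminators: BEL, ESC '\', or the C1 ST.
    for terminator in ["\u{0007}", "\u{001B}\u{005C}", "\u{009C}"] {
        if let Some(payload) = body.strip_suffix(terminator) {
            return Some(payload);
        }
    }
    // No recognized terminator: treat the sequence as incomplete.
    None
}
```

For the window-title example, "\u{001B}]0;Window Title\u{0007}", the payload is "0;Window Title".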

📚 API Reference

Core Functions

ansi_regex(options: Option<AnsiRegexOptions>) -> &'static Regex

Returns a static reference to a pre-compiled regex for maximum performance.

ansi_regex_owned(options: Option<AnsiRegexOptions>) -> Regex

Returns an owned regex instance, useful for storing in structs or when static references aren't suitable.

strip_ansi(text: &str) -> String

Removes all ANSI escape sequences from text, returning clean plain text.

has_ansi(text: &str) -> bool

Fast detection of ANSI sequences - returns true if any are found.

Configuration Options

AnsiRegexOptions

  • new() - Create default options (global matching)
  • only_first() - Configure for first-match-only (more efficient for detection)
  • global() - Configure for global matching (finds all sequences)
  • find_ansi_sequences(text: &str) - Extract all ANSI sequences with position info

Examples

use ansi_escape_sequences::{ansi_regex, ansi_regex_owned, strip_ansi, has_ansi, AnsiRegexOptions};

// Quick detection
let text = "\u{001B}[31mcolored\u{001B}[0m text";
if has_ansi(text) {
    let clean = strip_ansi(text);
    println!("Cleaned: {}", clean);
}

// Custom processing with static regex
let regex = ansi_regex(None);
let count = regex.find_iter(text).count();

// Owned regex for struct storage
struct TextProcessor {
    ansi_regex: regex::Regex,
}

impl TextProcessor {
    fn new() -> Self {
        Self {
            ansi_regex: ansi_regex_owned(None),
        }
    }
}

// Detailed sequence analysis
let sequences = AnsiRegexOptions::find_ansi_sequences(text);
for seq in sequences {
    println!("ANSI sequence '{}' at position {}", seq.as_str(), seq.start());
}

🎯 Examples

Running the Comprehensive Demo

The library includes a comprehensive example showcasing real-world usage patterns:

cargo run --example comprehensive_demo

This demo covers:

  • Basic detection and cleaning operations
  • Advanced sequence processing and analysis
  • Log processing pipelines
  • Performance benchmarking
  • CLI output processing and HTML conversion

Real-world Usage Patterns

use ansi_escape_sequences::{ansi_regex, strip_ansi, has_ansi, AnsiRegexOptions};

// Log file processing
fn clean_log_file(content: &str) -> String {
    content.lines()
        .map(|line| if has_ansi(line) { strip_ansi(line) } else { line.to_string() })
        .collect::<Vec<_>>()
        .join("\n")
}

// Terminal output analysis
fn analyze_terminal_output(output: &str) {
    let sequences = AnsiRegexOptions::find_ansi_sequences(output);
    println!("Found {} ANSI sequences", sequences.len());

    for seq in sequences {
        // Classify by the final byte; `contains` would also match letters inside OSC payloads
        let seq_type = if seq.as_str().ends_with('m') { "Color/Style" }
                      else if seq.as_str().ends_with('H') { "Cursor Position" }
                      else { "Other" };
        println!("  {}: {}", seq_type, seq.as_str());
    }
}

// Web application: Convert ANSI to HTML
fn ansi_to_html(text: &str) -> String {
    let regex = ansi_regex(None);
    regex.replace_all(text, |caps: &regex::Captures| {
        match caps.get(0).unwrap().as_str() {
            "\u{001B}[31m" => "<span style='color: red'>",
            "\u{001B}[32m" => "<span style='color: green'>",
            "\u{001B}[0m" => "</span>",
            _ => "",
        }
    }).to_string()
}
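
The HTML conversion above hard-codes two colors. One way to extend it, sketched here with the standard library only, is a small table for the eight standard SGR foreground codes (30 to 37). The helpers sgr_foreground_css and sgr_to_html are hypothetical names. Like the example above, the sketch emits a closing tag on reset without tracking whether a span is open.

```rust
/// Maps a standard SGR foreground code (30..=37) to a CSS color name.
/// Hypothetical helper for illustration.
fn sgr_foreground_css(code: u8) -> Option<&'static str> {
    const NAMES: [&str; 8] = [
        "black", "red", "green", "yellow", "blue", "magenta", "cyan", "white",
    ];
    match code {
        30..=37 => Some(NAMES[usize::from(code - 30)]),
        _ => None,
    }
}

/// Turns one SGR sequence such as "\x1b[31m" into an HTML fragment.
/// Sequences it does not understand become the empty string.
fn sgr_to_html(seq: &str) -> String {
    let params = seq
        .strip_prefix("\u{001B}[")
        .and_then(|rest| rest.strip_suffix('m'));
    match params {
        // An empty parameter or "0" resets all attributes.
        Some("") | Some("0") => "</span>".to_string(),
        // Single-parameter foreground colors only; "1;31" is not handled.
        Some(p) => p
            .parse::<u8>()
            .ok()
            .and_then(sgr_foreground_css)
            .map(|name| format!("<span style='color: {name}'>"))
            .unwrap_or_default(),
        None => String::new(),
    }
}
```

A function like sgr_to_html could replace the match expression inside the replace_all closure, since the closure may return any type that implements AsRef<str>, including String.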

📚 Documentation

For detailed API documentation, visit docs.rs/ansi-escape-sequences.

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

📄 License

This project is licensed under either of

at your option.

🙏 Acknowledgments

This library is inspired by the JavaScript ansi-regex library and aims to provide similar functionality for Rust applications.
