east-asian-width

Crates.io	east-asian-width
lib.rs	east-asian-width
version	0.1.0
created_at	2025-09-07 17:14:59.860933+00
updated_at	2025-09-07 17:14:59.860933+00
description	Determine the display width of Unicode characters in East Asian contexts
homepage	https://github.com/sabry-awad97/east-asian-width
repository	https://github.com/sabry-awad97/east-asian-width
id	1828340
size	129,092

Sabry Awad (sabry-awad97)

documentation

https://docs.rs/east-asian-width

README

East Asian Width

A fast, zero-dependency Rust library for determining the display width of Unicode characters in East Asian contexts. This is essential for terminal applications, text editors, and other software that needs to properly align text containing CJK (Chinese, Japanese, Korean) characters.

Features

Fast lookups: Pre-generated lookup tables for O(1) character width determination
Unicode compliant: Based on the official Unicode East Asian Width property
Flexible API: Support for both simple and configurable width calculations
Zero dependencies: No runtime dependencies for the core library
Comprehensive: Handles all Unicode East Asian Width categories
Safe: Validates Unicode code points and provides fallible APIs

Installation

Add this to your Cargo.toml:

[dependencies]
east-asian-width = "0.1.0"

Quick Start

use east_asian_width::{east_asian_width, DisplayWidth};

// Basic usage - get display width
let width = east_asian_width('字' as u32); // Chinese character
assert_eq!(width, DisplayWidth::Wide);
assert_eq!(width.as_u8(), 2); // Width is 2 columns

// ASCII characters are narrow
let width = east_asian_width('A' as u32);
assert_eq!(width, DisplayWidth::Narrow);
assert_eq!(width.as_u8(), 1); // Width is 1 column

Usage

Basic Width Calculation

The primary function east_asian_width() returns a DisplayWidth enum:

use east_asian_width::{east_asian_width, DisplayWidth};

// Wide characters (CJK ideographs, fullwidth chars, etc.)
assert_eq!(east_asian_width(0x4E00), DisplayWidth::Wide);  // 一 (CJK)
assert_eq!(east_asian_width(0xFF21), DisplayWidth::Wide);  // Ａ (fullwidth A)

// Narrow characters (ASCII, halfwidth, etc.)
assert_eq!(east_asian_width(0x0041), DisplayWidth::Narrow); // A (ASCII)
assert_eq!(east_asian_width(0xFF61), DisplayWidth::Narrow); // ｡ (halfwidth)

Handling Ambiguous Characters

Some characters are classified as "ambiguous" and can be displayed as either narrow or wide depending on context:

use east_asian_width::east_asian_width;

let ambiguous_char = 0x00A1; // ¡ (inverted exclamation mark)

// Default: treat ambiguous as narrow
assert_eq!(east_asian_width(ambiguous_char).as_u8(), 1);

// Explicitly treat ambiguous as wide
assert_eq!(east_asian_width((ambiguous_char, true)).as_u8(), 2);
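
The calls above pass either a bare code point or a `(code_point, bool)` tuple to the same function. One common way to support that call shape in Rust is an `impl Into<...>` parameter backed by two `From` implementations. The sketch below illustrates the pattern only; `WidthQuery` and `sketch_width` are hypothetical names, the width table is deliberately tiny, and nothing here is taken from this crate's source.

```rust
/// Hypothetical query type: a code point plus the ambiguous-width policy.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct WidthQuery {
    pub code_point: u32,
    pub ambiguous_as_wide: bool,
}

impl From<u32> for WidthQuery {
    fn from(code_point: u32) -> Self {
        // A bare code point means "treat ambiguous characters as narrow".
        WidthQuery { code_point, ambiguous_as_wide: false }
    }
}

impl From<(u32, bool)> for WidthQuery {
    fn from((code_point, ambiguous_as_wide): (u32, bool)) -> Self {
        WidthQuery { code_point, ambiguous_as_wide }
    }
}

/// Stand-in width function (assumption: illustration only).
/// It knows one Wide range and one Ambiguous character, nothing else.
pub fn sketch_width(query: impl Into<WidthQuery>) -> u8 {
    let q = query.into();
    match q.code_point {
        0x4E00..=0x9FFF => 2,               // CJK Unified Ideographs (Wide)
        0x00A1 if q.ambiguous_as_wide => 2, // ¡ is Ambiguous
        _ => 1,
    }
}
```

With this shape, `sketch_width(0x00A1u32)` and `sketch_width((0x00A1u32, true))` both compile, and only the second opts in to the wide interpretation.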

Getting Character Categories

You can also get the specific East Asian Width category:

use east_asian_width::east_asian_width_type;

assert_eq!(east_asian_width_type(0x4E00), "wide");      // CJK ideograph
assert_eq!(east_asian_width_type(0xFF21), "fullwidth"); // Fullwidth A
assert_eq!(east_asian_width_type(0x0041), "narrow");    // ASCII A
assert_eq!(east_asian_width_type(0x00A1), "ambiguous"); // ¡

Error Handling

For applications that need to handle invalid input gracefully, use the fallible try_ variants:

use east_asian_width::{try_east_asian_width, try_east_asian_width_type};

// These functions return Result<T, EastAsianWidthError>
match try_east_asian_width(0x110000) { // Invalid code point
    Ok(width) => println!("Width: {}", width.as_u8()),
    Err(e) => println!("Error: {}", e),
}
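
To make "invalid code point" concrete: Unicode code points end at U+10FFFF, so `0x110000` is the first value out of range. The following is a minimal sketch of such a range check using only the standard library. `InvalidCodePoint` and `check_code_point` are hypothetical names chosen for illustration; the crate's own `EastAsianWidthError` may be structured differently and may reject more values.

```rust
use std::fmt;

/// Hypothetical error type, named here for illustration only.
#[derive(Debug, PartialEq, Eq)]
pub struct InvalidCodePoint(pub u32);

impl fmt::Display for InvalidCodePoint {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "invalid Unicode code point: U+{:04X}", self.0)
    }
}

impl std::error::Error for InvalidCodePoint {}

/// Largest valid Unicode code point.
const MAX_CODE_POINT: u32 = 0x10FFFF;

/// Returns the code point unchanged if it lies in the Unicode code space.
pub fn check_code_point(cp: u32) -> Result<u32, InvalidCodePoint> {
    if cp <= MAX_CODE_POINT {
        Ok(cp)
    } else {
        Err(InvalidCodePoint(cp))
    }
}
```

A stricter check could go through `char::from_u32`, which also rejects surrogate code points (U+D800 to U+DFFF); whether that is desirable depends on the input being handled.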

Working with Strings

Calculate the total display width of a string:

use east_asian_width::east_asian_width;

fn string_width(s: &str) -> usize {
    s.chars()
        .map(|c| east_asian_width(c as u32).as_usize())
        .sum()
}

assert_eq!(string_width("Hello"), 5);      // ASCII: 5 chars × 1 = 5
assert_eq!(string_width("こんにちは"), 10);   // Japanese: 5 chars × 2 = 10
assert_eq!(string_width("Hello世界"), 9);   // Mixed: 5×1 + 2×2 = 9
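
A typical use of a string width is column alignment. As a sketch of that idea, the helper below pads a string with spaces up to a target number of terminal columns. It takes the per-character width function as a parameter so the block stays self-contained; `pad_to_columns` and `demo_width` are hypothetical names, and `demo_width` is a stand-in that only covers Hiragana and CJK ideographs. In real code the closure would wrap the crate's width function instead.

```rust
/// Pads `text` on the right with spaces until it occupies `columns` cells.
/// `width_of` is any per-character width function that returns a column count.
/// Text that is already wider than `columns` is returned unchanged.
pub fn pad_to_columns(text: &str, columns: usize, width_of: impl Fn(char) -> usize) -> String {
    let used: usize = text.chars().map(&width_of).sum();
    let padding = columns.saturating_sub(used);
    let mut out = String::with_capacity(text.len() + padding);
    out.push_str(text);
    out.extend(std::iter::repeat(' ').take(padding));
    out
}

/// Stand-in width function (assumption: covers only the scripts used below).
pub fn demo_width(c: char) -> usize {
    match c as u32 {
        0x3041..=0x3096 | 0x4E00..=0x9FFF => 2, // Hiragana letters, CJK Unified Ideographs
        _ => 1,
    }
}
```

Note that padding is computed from display columns, not from `str::len()` or the character count, which is what keeps mixed ASCII and CJK rows aligned.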

Character Categories

The library recognizes six East Asian Width categories defined by Unicode:

Category	Description	Display Width	Examples
Narrow (Na)	Characters that are always narrow	1	Basic Latin letters
Neutral (N)	Characters that do not occur in legacy East Asian character sets	1	Most symbols, punctuation
Halfwidth (H)	Narrow in East Asian context	1	Halfwidth Katakana
Ambiguous (A)	Context-dependent width	1 or 2*	Greek letters, some symbols
Wide (W)	Characters that are always wide	2	CJK ideographs, Hiragana
Fullwidth (F)	Wide in East Asian context	2	Fullwidth ASCII variants

*Ambiguous characters default to width 1 but can be configured to width 2.
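
The table can be read as a mapping from category to column count, with a single switch for the Ambiguous row. The sketch below encodes exactly that mapping. `Category` and `columns` are hypothetical names used for illustration; they are not the crate's types.

```rust
/// The six East Asian Width categories from the table above (hypothetical enum).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Category {
    Narrow,
    Neutral,
    Halfwidth,
    Ambiguous,
    Wide,
    Fullwidth,
}

impl Category {
    /// Column count for this category.
    /// `ambiguous_as_wide` only affects `Category::Ambiguous`.
    pub fn columns(self, ambiguous_as_wide: bool) -> u8 {
        match self {
            Category::Narrow | Category::Neutral | Category::Halfwidth => 1,
            Category::Ambiguous => {
                if ambiguous_as_wide {
                    2
                } else {
                    1
                }
            }
            Category::Wide | Category::Fullwidth => 2,
        }
    }
}
```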

Performance

This library is optimized for performance:

O(1) lookups: Uses pre-generated range checks, not hash tables
Zero allocations: All functions work with stack data only
Minimal binary size: Lookup tables are efficiently encoded
No dependencies: Zero runtime dependencies for core functionality
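
To illustrate what a range-based lookup without hashing or allocation can look like, here is a minimal sketch over a sorted, static table of inclusive ranges. It is not the crate's generated code: the table is a deliberately tiny, incomplete sample, `WIDE_RANGES` and `is_wide_sketch` are hypothetical names, and this version uses a binary search, which costs O(log n) in the number of ranges rather than O(1).

```rust
use std::cmp::Ordering;

/// Deliberately tiny, incomplete sample of double-width ranges.
/// Inclusive (start, end) pairs, sorted by start and non-overlapping.
const WIDE_RANGES: &[(u32, u32)] = &[
    (0x3041, 0x3096), // Hiragana letters
    (0x4E00, 0x9FFF), // CJK Unified Ideographs
    (0xAC00, 0xD7A3), // Hangul Syllables
    (0xFF01, 0xFF60), // Fullwidth ASCII variants and punctuation
];

/// Returns true if `cp` falls inside one of the sample ranges.
/// Works on static data only; no heap allocation takes place.
pub fn is_wide_sketch(cp: u32) -> bool {
    WIDE_RANGES
        .binary_search_by(|&(start, end)| {
            if cp < start {
                Ordering::Greater // this range lies above the code point
            } else if cp > end {
                Ordering::Less // this range lies below the code point
            } else {
                Ordering::Equal // the code point is inside this range
            }
        })
        .is_ok()
}
```

A generated table for the full Unicode property would hold many more ranges, but the lookup logic would stay the same size.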

Unicode Compliance

The library is based on the official Unicode East Asian Width property from the Unicode Character Database. The lookup tables are automatically generated from the latest Unicode data to ensure accuracy and completeness.

Development

Building

# Build the library
cargo build --release

# Run tests
cargo test

# Generate documentation
cargo doc --open

Regenerating Lookup Tables

The lookup tables are generated from the official Unicode data:

# This requires the 'build-deps' feature
make generate
# or
cargo run --bin generate --features build-deps

Development Workflow

# Format, lint, test, and generate docs
make prepare-release

# Quick development check
make quick

License

Licensed under either of

Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)

at your option.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

Documentation

API Reference - Comprehensive API documentation
Unicode Reference - Detailed Unicode East Asian Width information
Performance Guide - Performance characteristics and optimization tips

Examples

Run the included examples to see the library in action:

# Terminal width calculation example
cargo run --example terminal_width

# Text alignment example
cargo run --example text_alignment

Related Projects

unicode-width - Similar functionality with different API design
wcswidth - POSIX wcwidth implementation
textwrap - Text wrapping with East Asian width support

Commit count: 5