utf8-fix

Crates.io: utf8-fix
lib.rs: utf8-fix
version: 0.1.1
created_at: 2025-11-17 18:41:22.820697+00
updated_at: 2025-11-18 07:12:01.502643+00
description: Fix invalid UTF-8 sequences in-place while preserving buffer size - useful for fuzzing and mutation testing
repository: https://gitlab.com/andrzej1_1/utf8-fix
id: 1937319
size: 10,589
(koxu1996)


= utf8-fix

Rust library for fixing invalid UTF-8 sequences in-place.

== Overview

This library processes byte slices containing potentially invalid UTF-8 data and fixes common encoding errors:

  • Overlong encodings
  • UTF-16 surrogates
  • Out-of-range codepoints

Valid UTF-8 sequences are left unchanged. The buffer size stays the same: invalid codepoints are replaced with valid ones of the same byte length.
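To make the three error classes concrete, here is a minimal sketch that uses only the standard library. It shows one byte sequence per class that `std::str::from_utf8` rejects, next to a valid sequence of the same length. The replacement bytes are hand-picked for illustration and are an assumption; they are not necessarily what this crate would produce.

```rust
use std::str;

/// True when `bytes` is well-formed UTF-8 according to the standard library.
fn is_valid(bytes: &[u8]) -> bool {
    str::from_utf8(bytes).is_ok()
}

/// (label, invalid input, hand-picked valid replacement of the same length).
/// The replacements are illustrative only; they are not the crate's output.
const CASES: &[(&str, &[u8], &[u8])] = &[
    // 0xC0 0x80 encodes U+0000 in two bytes instead of one.
    ("overlong encoding of U+0000", &[0xC0, 0x80], &[0xC2, 0x80]),
    // 0xED 0xA0 0x80 decodes to the surrogate U+D800.
    ("UTF-16 surrogate U+D800", &[0xED, 0xA0, 0x80], &[0xED, 0x9F, 0x80]),
    // 0xF4 0x90 0x80 0x80 decodes to U+110000, past the U+10FFFF limit.
    ("out-of-range U+110000", &[0xF4, 0x90, 0x80, 0x80], &[0xF4, 0x8F, 0x80, 0x80]),
];

fn main() {
    for &(label, invalid, replacement) in CASES {
        assert!(!is_valid(invalid));
        assert!(is_valid(replacement));
        // Same byte length, so the surrounding buffer would not need to grow or shrink.
        assert_eq!(invalid.len(), replacement.len());
        println!("{label}: {invalid:02X?} -> {replacement:02X?}");
    }
}
```

Because every replacement has the same length as the sequence it replaces, a fix of this kind can be done on a `&mut [u8]` without reallocating.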

== Use Cases

This library is particularly useful for:

  • Fuzzing and mutation testing: Generate valid UTF-8 test inputs from arbitrary byte sequences while preserving the original buffer size.
  • Data sanitization: Clean up corrupted UTF-8 data without changing the data length.
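For either use case, a test harness will probably want to check the two properties described in the overview: the output is valid UTF-8, and already-valid input is not modified. The sketch below is one way to write such a check. `check_fix_invariants` and `placeholder_fix` are hypothetical names, and `placeholder_fix` is a deterministic stand-in that overwrites offending bytes with `?`; it is not this crate's algorithm. In a real harness the closure would wrap the crate's own function instead.

```rust
use std::str;

/// Runs `fix` on a copy of `input` and checks two properties:
/// the result is valid UTF-8, and valid input comes back unchanged.
/// Taking `&mut [u8]` means the length cannot change by construction.
fn check_fix_invariants<F>(input: &[u8], mut fix: F) -> Result<Vec<u8>, String>
where
    F: FnMut(&mut [u8]),
{
    let mut buf = input.to_vec();
    fix(&mut buf);
    if str::from_utf8(&buf).is_err() {
        return Err(format!("output is not valid UTF-8: {:02X?}", buf));
    }
    if str::from_utf8(input).is_ok() && buf.as_slice() != input {
        return Err("valid input was modified".to_string());
    }
    Ok(buf)
}

/// Stand-in fixer (an assumption, NOT the crate's implementation):
/// overwrites each offending byte with b'?' and keeps valid sequences.
fn placeholder_fix(buf: &mut [u8]) {
    let mut pos = 0;
    loop {
        // Validate the remainder; everything before `pos` is already valid.
        let err = match str::from_utf8(&buf[pos..]) {
            Ok(_) => return,
            Err(e) => e,
        };
        let bad = pos + err.valid_up_to();
        buf[bad] = b'?';
        pos = bad + 1;
    }
}

fn main() {
    // Invalid input: an overlong two-byte sequence between two ASCII letters.
    let fixed = check_fix_invariants(&[b'a', 0xC0, 0x80, b'b'], placeholder_fix).unwrap();
    assert_eq!(fixed, b"a??b".to_vec());

    // Valid non-ASCII input must pass through untouched.
    let same = check_fix_invariants("héllo".as_bytes(), placeholder_fix).unwrap();
    assert_eq!(same, "héllo".as_bytes().to_vec());
}
```

The closure parameter keeps the check independent of any particular fixer, so the same function can drive a fuzz target or a unit test.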

== Usage

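[source,rust]
----
use utf8_fix::fix_utf8_string;
use rand::thread_rng;

// This buffer is already valid UTF-8, so it is left unchanged.
let mut data = b"Hello, world!".to_vec();
let mut rng = thread_rng();

// Fix the buffer in place; its length stays the same.
fix_utf8_string(&mut data, &mut rng);
----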

== Features

This library is `no_std` by default. One optional feature is available:

  • `std`: Enables standard library support, including `Display` and `Error` trait implementations.

To use with std:

[source,toml]
----
[dependencies]
utf8-fix = { version = "0.1.1", features = ["std"] }
----
