| Crates.io | newline_normalizer |
| lib.rs | newline_normalizer |
| version | 0.1.6 |
| created_at | 2025-04-25 18:35:23.62159+00 |
| updated_at | 2025-04-25 22:07:02.329623+00 |
| description | Zero-copy newline normalization to \n or \r\n with SIMD acceleration. |
| repository | https://github.com/digitalcortex/newline_normalizer |
| id | 1649347 |
| size | 36,755 |
A library for normalizing text into Unix (\n) or DOS (\r\n) newline formats, using fast SIMD search and zero-copy when possible.

- Works directly on str: call .to_unix_newlines() and .to_dos_newlines() directly.
- Returns Cow<str>: skips allocation if no changes are needed.
- Converts \r and \r\n into consistent Unix (\n) or DOS (\r\n) newlines.

use newline_normalizer::{ToUnixNewlines, ToDosNewlines};
let unix = "line1\r\nline2\rline3".to_unix_newlines();
assert_eq!(unix, "line1\nline2\nline3");
let dos = "line1\nline2\nline3".to_dos_newlines();
assert_eq!(dos, "line1\r\nline2\r\nline3");
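
For comparison, the same conversions can be written with nothing but the standard library. The sketch below is a naive baseline, not this crate's implementation: the function names `naive_to_unix` and `naive_to_dos` are hypothetical, and it assumes that a lone \r counts as a line break for DOS output as well. Unlike the crate, it always allocates, even when the input is already normalized.

```rust
/// Naive baseline (hypothetical helper): always allocates a new String.
fn naive_to_unix(input: &str) -> String {
    // Collapse "\r\n" first so the lone-'\r' pass cannot produce double newlines.
    input.replace("\r\n", "\n").replace('\r', "\n")
}

/// Naive baseline (hypothetical helper): normalize to '\n', then expand to "\r\n".
fn naive_to_dos(input: &str) -> String {
    naive_to_unix(input).replace('\n', "\r\n")
}

fn main() {
    assert_eq!(naive_to_unix("line1\r\nline2\rline3"), "line1\nline2\nline3");
    assert_eq!(naive_to_dos("line1\nline2\nline3"), "line1\r\nline2\r\nline3");
    // Input that is already in DOS format comes back unchanged, but is still copied.
    assert_eq!(naive_to_dos("a\r\nb"), "a\r\nb");
}
```

Going through the Unix form first keeps the DOS conversion from turning an existing \r\n into \r\r\n.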
Benchmarks are in the /benches folder.
Run them using:
cargo bench --bench to_unix
cargo bench --bench to_dos
All suggestions on how to improve the benchmarks are welcome.
Hardware: AMD Ryzen 9 9900X 12-Core Processor with 64 GB RAM.
Rust version: rustc 1.86.0 (05f9846f8 2025-03-31)
Benchmark framework: Criterion
Converting to DOS newlines (\r\n):

| Case | newline-converter | This crate (newline_normalizer) |
|---|---|---|
| Small Unicode paragraph | ~685.46 ns | ~88.789 ns |
| Small Unicode paragraph pre-normalized | ~151.39 ns | ~58.350 ns |
| The Adventures of Sherlock Holmes (608kb) | ~345.27 µs | ~138.26 µs |
| The Adventures of Sherlock Holmes (608kb) pre-normalized | ~342.91 µs | ~137.54 µs |
Note: Pre-normalized means the input already has correct line endings and does not require changes.
Converting to Unix newlines (\n):

| Case | newline-converter | regex replace all | This crate (newline_normalizer) |
|---|---|---|---|
| Small Unicode paragraph | ~1.0858 µs | ~101.42 ns | ~24.464 ns |
| Small Unicode paragraph pre-normalized | ~164.41 ns | ~20.744 ns | ~4.6608 ns |
| The Adventures of Sherlock Holmes (608kb) | ~680.12 µs | ~289.84 µs | ~89.150 µs |
| The Adventures of Sherlock Holmes (608kb) pre-normalized | ~318.83 µs | ~7.7864 µs | ~2.5146 µs |
When the input is already normalized, newline_normalizer can skip allocations and return a borrowed reference. It returns Cow::Borrowed, avoiding an allocation of a new string when the input does not change.

This crate does not alter Unicode content. It only rewrites newline boundaries.
All valid UTF-8 sequences are preserved.
This crate does not currently normalize U+2028 (LINE SEPARATOR) or U+2029 (PARAGRAPH SEPARATOR). Only ASCII newline formats are converted.
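
The borrowed-versus-owned behavior and the Unicode guarantees described above can be illustrated with a small scalar sketch. This is an assumption-laden illustration, not the crate's code: `to_unix_sketch` is a hypothetical name, and the real implementation uses SIMD search rather than a character loop. It shows a fast path that borrows when no \r is present, a slow path that rewrites only \r and \r\n, and multi-byte characters (including U+2028) passing through untouched.

```rust
use std::borrow::Cow;

/// Hypothetical scalar sketch of zero-copy Unix newline normalization.
fn to_unix_sketch(input: &str) -> Cow<'_, str> {
    // Fast path: without any '\r' there is nothing to rewrite, so borrow the input.
    let Some(first_cr) = input.find('\r') else {
        return Cow::Borrowed(input);
    };

    let mut out = String::with_capacity(input.len());
    // '\r' is ASCII, so its byte index is always a valid char boundary.
    out.push_str(&input[..first_cr]);

    let mut chars = input[first_cr..].chars().peekable();
    while let Some(c) = chars.next() {
        if c == '\r' {
            // Treat "\r\n" as a single line break: skip the '\n' that follows.
            if chars.peek() == Some(&'\n') {
                chars.next();
            }
            out.push('\n');
        } else {
            // Every other character, multi-byte or not, is copied unchanged.
            out.push(c);
        }
    }
    Cow::Owned(out)
}

fn main() {
    // Already normalized: borrowed, and U+2028 is left as it is.
    let clean = "héllo\nwörld \u{2028} 日本語";
    let result = to_unix_sketch(clean);
    assert!(matches!(result, Cow::Borrowed(_)));
    assert_eq!(result, clean);

    // Mixed endings: a new String is built, non-ASCII text is preserved.
    let mixed = "héllo\r\nwörld\r日本語";
    let result = to_unix_sketch(mixed);
    assert!(matches!(result, Cow::Owned(_)));
    assert_eq!(result, "héllo\nwörld\n日本語");
}
```

Callers that only read the result can use the Cow through deref as a &str; calling .into_owned() forces a String when ownership is needed.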
This project is licensed under the MIT License.