Crates.io | unicode-bom |
lib.rs | unicode-bom |
version | 2.0.3 |
source | src |
created_at | 2018-10-12 15:47:27.154514 |
updated_at | 2023-11-13 18:51:54.829815 |
description | Unicode byte-order mark detection for files and byte arrays. |
repository | https://gitlab.com/philbooth/unicode-bom |
id | 89467 |
size | 37,650 |
Unicode byte-order mark detection for Rust projects.
unicode-bom will read the first few bytes from an array or a file on disk, then determine whether a byte-order mark is present.
It won't check the rest of the data to determine whether it's actually valid according to the indicated encoding.
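As a rough illustration of what detection by prefix matching involves, the sketch below compares the start of a byte slice against a handful of well-known signatures. It is not the crate's implementation: `sniff_bom` and the `SIGNATURES` table are hypothetical names, and only the UTF-8, UTF-16 and UTF-32 marks are covered.

```rust
/// Hypothetical helper, not part of unicode-bom: returns the name and byte
/// length of a recognised BOM at the start of `bytes`, if there is one.
fn sniff_bom(bytes: &[u8]) -> Option<(&'static str, usize)> {
    // Longer signatures are listed first, because the UTF-32 (little-endian)
    // BOM begins with the same two bytes as the UTF-16 (little-endian) BOM.
    const SIGNATURES: [(&str, &[u8]); 5] = [
        ("UTF-32BE", &[0x00, 0x00, 0xFE, 0xFF]),
        ("UTF-32LE", &[0xFF, 0xFE, 0x00, 0x00]),
        ("UTF-8", &[0xEF, 0xBB, 0xBF]),
        ("UTF-16BE", &[0xFE, 0xFF]),
        ("UTF-16LE", &[0xFF, 0xFE]),
    ];

    for (name, signature) in SIGNATURES {
        // Only the leading bytes are inspected; nothing after them is read.
        if bytes.starts_with(signature) {
            return Some((name, signature.len()));
        }
    }
    None
}
```

Calling `sniff_bom(&[0xEF, 0xBB, 0xBF, b'h', b'i'])` returns `Some(("UTF-8", 3))`, while input without a recognised mark returns `None`. The ordering of the table matters: checking the two-byte UTF-16 signature before the four-byte UTF-32 one would misreport UTF-32 (little-endian) input.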
Add it to your dependencies in Cargo.toml:
[dependencies]
unicode-bom = "2"
For more detailed information see the API docs, but the general gist is as follows:
use unicode_bom::Bom;
// The BOM can be parsed from a file on disk via the `FromStr` trait...
let bom: Bom = "foo.txt".parse().unwrap();
match bom {
    Bom::Null => {
        // No BOM was detected
    }
    Bom::Bocu1 => {
        // BOCU-1 BOM was detected
    }
    Bom::Gb18030 => {
        // GB 18030 BOM was detected
    }
    Bom::Scsu => {
        // SCSU BOM was detected
    }
    Bom::UtfEbcdic => {
        // UTF-EBCDIC BOM was detected
    }
    Bom::Utf1 => {
        // UTF-1 BOM was detected
    }
    Bom::Utf7 => {
        // UTF-7 BOM was detected
    }
    Bom::Utf8 => {
        // UTF-8 BOM was detected
    }
    Bom::Utf16Be => {
        // UTF-16 (big-endian) BOM was detected
    }
    Bom::Utf16Le => {
        // UTF-16 (little-endian) BOM was detected
    }
    Bom::Utf32Be => {
        // UTF-32 (big-endian) BOM was detected
    }
    Bom::Utf32Le => {
        // UTF-32 (little-endian) BOM was detected
    }
}
// ...or you can detect the BOM in a byte array
let bytes = [0u8, 0u8, 0xfeu8, 0xffu8];
let bom = Bom::from(&bytes[0..]);
assert_eq!(bom, Bom::Utf32Be);
assert_eq!(bom.len(), 4);
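Because detection stops at the mark itself, checking that the remaining bytes are valid is left to the caller. The following is a minimal sketch of that follow-up step for the UTF-8 case only, using just the standard library. `decode_utf8_after_bom` and `UTF8_BOM` are hypothetical names and not part of unicode-bom.

```rust
use std::str::Utf8Error;

/// The three-byte UTF-8 byte-order mark.
const UTF8_BOM: [u8; 3] = [0xEF, 0xBB, 0xBF];

/// Hypothetical helper, not part of unicode-bom: skips a leading UTF-8 BOM
/// if one is present, then validates the remaining bytes as UTF-8.
fn decode_utf8_after_bom(bytes: &[u8]) -> Result<&str, Utf8Error> {
    let payload = if bytes.starts_with(&UTF8_BOM) {
        // Drop the mark so it does not end up in the decoded text.
        &bytes[UTF8_BOM.len()..]
    } else {
        bytes
    };
    // The presence of a BOM says nothing about the payload, so validate it.
    std::str::from_utf8(payload)
}
```

Input that starts with a UTF-8 BOM but continues with invalid bytes yields an `Err`, which is the situation that BOM detection alone would not catch.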
If you don't already have Rust installed, get that first using rustup:
curl https://sh.rustup.rs -sSf | sh
Then you can build the project:
cargo b
And run the tests:
cargo t