compact-enc-det

Crates.io: compact-enc-det
lib.rs: compact-enc-det
version: 0.1.0
created_at: 2025-12-12 14:19:25.373507+00
updated_at: 2025-12-12 14:19:25.373507+00
description: Rust bindings for Compact Encoding Detection (CED) library - detect character encodings in text
homepage: https://github.com/greenhat616/compact_enc_det_rs
repository: https://github.com/greenhat616/compact_enc_det_rs
max_upload_size:
id: 1981609
size: 21,554
owner: Jonson Petard (greenhat616)

documentation

https://docs.rs/compact-enc-det

README

compact-enc-det

High-level Rust bindings for Compact Encoding Detection (CED) - a library for detecting character encodings in text.

Features

  • Safe, ergonomic Rust API
  • Type-safe enums for encodings, languages, and corpus types
  • Support for 75+ character encodings
  • Reliability confidence scores for detection results
  • Optional hints for improved detection accuracy

Installation

[dependencies]
compact-enc-det = "0.1"

Usage

use compact_enc_det::{detect_encoding, DetectHints, Encoding};

fn main() {
    let text = "Hello, world! 你好世界";
    let detection = detect_encoding(text.as_bytes(), DetectHints::default());

    println!("Detected: {:?}", detection.encoding);
    println!("MIME name: {}", detection.mime_name);
    println!("Reliable: {}", detection.is_reliable);

    assert_eq!(detection.encoding, Encoding::UTF8);
}
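
The `is_reliable` flag printed above suggests a natural pattern: trust the detected charset only when the detector reports it as reliable, and otherwise fall back to a default chosen by the caller. The following is a minimal sketch of that idea using only the standard library. The `Detection` struct here is a local stand-in that mirrors two of the fields read in the example (`mime_name`, `is_reliable`); it is not the crate's actual type, the `String` field type is an assumption, and `charset_or_fallback` is a hypothetical helper.

```rust
// Local stand-in mirroring two fields read in the usage example.
// This is NOT the crate's real result type; the field types are assumptions.
struct Detection {
    mime_name: String,
    is_reliable: bool,
}

/// Returns the detected MIME name when the result is marked reliable,
/// otherwise the caller-supplied fallback charset.
fn charset_or_fallback<'a>(detection: &'a Detection, fallback: &'a str) -> &'a str {
    if detection.is_reliable {
        &detection.mime_name
    } else {
        fallback
    }
}

fn main() {
    let confident = Detection {
        mime_name: "UTF-8".to_string(),
        is_reliable: true,
    };
    let unsure = Detection {
        mime_name: "Shift_JIS".to_string(),
        is_reliable: false,
    };

    // A reliable result is used as-is; an unreliable one yields the fallback.
    println!("{}", charset_or_fallback(&confident, "windows-1252"));
    println!("{}", charset_or_fallback(&unsure, "windows-1252"));
}
```

Which fallback is appropriate depends on where the text comes from; `windows-1252` is used here only as an illustrative default.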

Detection with Hints

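use compact_enc_det::{detect_encoding, DetectHints, Language, TextCorpusType};

// Placeholder input: substitute the raw bytes you want to classify.
let japanese_text = "こんにちは、世界".as_bytes();

// Hints are optional; fields not set here take their default values.
let hints = DetectHints {
    url_hint: "https://example.jp/page.html",
    language_hint: Some(Language::JAPANESE),
    corpus_type: TextCorpusType::WEB_CORPUS,
    ..Default::default()
};

let detection = detect_encoding(japanese_text, hints);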
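
Hints can help because the same bytes are often valid in more than one encoding, so a detector has to pick the most plausible reading. The standard-library sketch below illustrates that ambiguity with two bytes that decode cleanly both as UTF-8 and as ISO-8859-1 (Latin-1). It does not use this crate, and `decode_latin1` is a hypothetical helper written for the illustration.

```rust
/// Decodes bytes as ISO-8859-1 (Latin-1): every byte maps to the
/// Unicode code point with the same numeric value.
fn decode_latin1(bytes: &[u8]) -> String {
    bytes.iter().map(|&b| b as char).collect()
}

fn main() {
    // Two bytes that form a valid sequence in both encodings.
    let bytes: [u8; 2] = [0xC3, 0xA9];

    // Read as UTF-8, the pair is a single character, U+00E9.
    let as_utf8 = std::str::from_utf8(&bytes).expect("bytes are valid UTF-8");

    // Read as Latin-1, the same pair is two characters, U+00C3 and U+00A9.
    let as_latin1 = decode_latin1(&bytes);

    println!("UTF-8:   {as_utf8}");
    println!("Latin-1: {as_latin1}");
}
```

Neither reading is wrong at the byte level, which is why context such as a URL, an expected language, or the kind of corpus can tip the decision.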

Documentation

For complete documentation, see docs.rs/compact-enc-det.

License

MIT License - see LICENSE file for details.

The underlying C++ library is licensed under Apache License 2.0.
