tesseract-rs

Crates.io: tesseract-rs
lib.rs: tesseract-rs
version: 0.1.18
source: src
created_at: 2024-10-14 16:15:58.696921
updated_at: 2024-10-22 17:33:11.619772
description: Rust bindings for Tesseract OCR with optional built-in compilation
repository: https://github.com/cafercangundogdu/tesseract-rs
id: 1408445
size: 148,247
Cafer Can Gündoğdu (cafercangundogdu)
documentation: https://docs.rs/tesseract-rs

README

tesseract-rs

tesseract-rs provides Rust bindings for Tesseract OCR and can compile the Tesseract and Leptonica libraries as part of its own build. The project aims to offer a safe, idiomatic Rust interface to Tesseract's functionality while hiding the complexity of building the underlying C and C++ libraries.

Features

  • Safe Rust bindings for Tesseract OCR
  • Built-in compilation of Tesseract and Leptonica
  • Automatic download of Tesseract training data
  • High-level Rust API for common OCR tasks

Installation

Add this to your Cargo.toml:

[dependencies]
tesseract-rs = { version = "0.1.18", features = ["build-tesseract"] }

System Requirements

To build this crate, you need:

  • A C++ compiler (e.g., gcc, clang)
  • CMake
  • Internet connection (for downloading Tesseract training data)
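Before a long first build, it can help to confirm that the required tools are reachable. The following is a minimal sketch and not part of tesseract-rs: the helper names `tool_available` and `missing_tools` are hypothetical, and the check assumes each tool accepts a `--version` flag (true for gcc, clang, and cmake, but not for every compiler).

```rust
use std::process::{Command, Stdio};

/// Hypothetical helper: returns true if `tool --version` can be spawned
/// and exits successfully. Output is discarded.
fn tool_available(tool: &str) -> bool {
    Command::new(tool)
        .arg("--version")
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .status()
        .map(|status| status.success())
        .unwrap_or(false) // spawning failed, e.g. the tool is not on PATH
}

/// Hypothetical helper: returns the subset of `required` that could not be run.
fn missing_tools<'a>(required: &[&'a str]) -> Vec<&'a str> {
    required
        .iter()
        .copied()
        .filter(|tool| !tool_available(tool))
        .collect()
}
```

Calling `missing_tools(&["cmake", "clang"])` returns an empty vector when both tools run, and otherwise lists the ones to install.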

Usage

Here's a basic example of how to use tesseract-rs:

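use tesseract_rs::TesseractAPI;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create the API handle and load the English model from the tessdata directory.
    let api = TesseractAPI::new();
    api.init("path/to/tessdata", "eng")?;

    // A 3x3 image with one byte per pixel, roughly depicting a "1".
    let image_data: Vec<u8> = vec![
        0xFF, 0x00, 0xFF,
        0x00, 0x00, 0xFF,
        0x00, 0x00, 0xFF,
    ];

    // Arguments after the buffer: width, height, bytes per pixel, bytes per line.
    api.set_image(&image_data, 3, 3, 1, 3);
    // Restrict recognition to digits.
    api.set_variable("tessedit_char_whitelist", "0123456789")?;

    // Run recognition and print the result.
    let text = api.get_utf8_text()?;
    println!("Recognized text: {}", text);

    Ok(())
}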
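The four integers passed to `set_image` in the example describe how the raw buffer is laid out. Assuming they follow the order of Tesseract's own `SetImage` (width, height, bytes per pixel, bytes per line), a mismatch between them and the buffer length is an easy mistake to make. The sketch below is a hypothetical pre-check, not part of tesseract-rs, that validates the layout before the buffer is handed to the API.

```rust
/// Hypothetical helper (not part of tesseract-rs): checks that a raw pixel
/// buffer of `data_len` bytes is large enough for the described layout.
fn check_image_layout(
    data_len: usize,
    width: usize,
    height: usize,
    bytes_per_pixel: usize,
    bytes_per_line: usize,
) -> Result<(), String> {
    if width == 0 || height == 0 || bytes_per_pixel == 0 {
        return Err("width, height and bytes_per_pixel must be non-zero".to_string());
    }
    // Each row must hold at least `width` pixels; it may be longer if padded.
    let min_line = width
        .checked_mul(bytes_per_pixel)
        .ok_or_else(|| "row size overflows usize".to_string())?;
    if bytes_per_line < min_line {
        return Err(format!(
            "bytes_per_line is {}, but a row needs at least {} bytes",
            bytes_per_line, min_line
        ));
    }
    // The buffer must cover `height` rows of `bytes_per_line` bytes each.
    let required = bytes_per_line
        .checked_mul(height)
        .ok_or_else(|| "buffer size overflows usize".to_string())?;
    if data_len < required {
        return Err(format!(
            "buffer has {} bytes, but the layout needs {}",
            data_len, required
        ));
    }
    Ok(())
}
```

For the 3x3 example above, `check_image_layout(image_data.len(), 3, 3, 1, 3)` succeeds, while passing a buffer of 8 bytes or a `bytes_per_line` of 2 yields an error.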

Building

The crate will automatically download and compile Tesseract and Leptonica during the build process. This may take some time on the first build.

Documentation

For more detailed information, see the API documentation at https://docs.rs/tesseract-rs.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contribution

Contributions are welcome! Please feel free to submit a Pull Request.

Acknowledgements

This project uses Tesseract OCR and Leptonica. We are grateful to the maintainers and contributors of these projects.

Commit count: 48
