Crates.io | ocrmypdf-rs |
lib.rs | ocrmypdf-rs |
version | 0.0.8 |
source | src |
created_at | 2024-05-21 15:49:03.430598 |
updated_at | 2024-05-21 18:14:04.133619 |
description | A sdk for the ocrmypdf command line tool |
homepage | https://github.com/piazin/ocrmypdf-rs |
repository | https://github.com/piazin/ocrmypdf-rs |
max_upload_size | |
id | 1246956 |
size | 100,281 |
A Rust library that adds “layers” of text to images in PDFs, making scanned image PDFs searchable using ocrmypdf, which is a Python application and library.
For everything to work correctly, ocrmypdf must be installed on your operating system.
Debian or Ubuntu users can simply use the following:
sudo apt install ocrmypdf
For installation instructions on other operating systems, see the installation documentation.
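Since the library depends on the external ocrmypdf binary, it can be useful to check that it is reachable before running any OCR. The following is a minimal sketch using only the standard library; `is_installed` is a hypothetical helper and not part of ocrmypdf-rs. It assumes the binary is on `PATH` and responds to `--version`.

```rust
use std::process::{Command, Stdio};

/// Hypothetical helper (not part of ocrmypdf-rs): returns true if
/// `program --version` can be spawned and exits successfully.
fn is_installed(program: &str) -> bool {
    Command::new(program)
        .arg("--version")
        // Discard the version text; only the exit status matters here.
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .status()
        .map(|status| status.success())
        // A spawn error (for example, binary not found) counts as "not installed".
        .unwrap_or(false)
}

fn main() {
    if is_installed("ocrmypdf") {
        println!("ocrmypdf found on PATH");
    } else {
        eprintln!("ocrmypdf not found; install it before using ocrmypdf-rs");
    }
}
```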
Install ocrmypdf-rs with Cargo by adding it to your Cargo.toml:
[dependencies]
ocrmypdf-rs = "0.0.8"
use ocrmypdf_rs::{Ocr, OcrMyPdf};
fn main() {
    // Create the wrapper without arguments or paths, then set them with the chained setters.
    let mut ocr = OcrMyPdf::new(None, None, None);
    ocr.set_input_path("input.pdf".into())
        .set_output_path("output.pdf".into())
        .set_args(vec!["--force-ocr".into()])
        .execute();
}
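For intuition, the example above corresponds to the command line `ocrmypdf --force-ocr input.pdf output.pdf`. The sketch below shows how such an invocation could be assembled with `std::process::Command`. It is an illustration under assumptions, not the crate's actual implementation: `build_command` is a hypothetical helper, and it only builds and prints the command rather than running it.

```rust
use std::process::Command;

/// Hypothetical helper (not part of ocrmypdf-rs): builds
/// `ocrmypdf [args...] <input> <output>` without running it.
fn build_command(args: &[String], input: &str, output: &str) -> Command {
    let mut cmd = Command::new("ocrmypdf");
    // Options come first, followed by the input and output paths.
    cmd.args(args).arg(input).arg(output);
    cmd
}

fn main() {
    let args = vec!["--force-ocr".to_string()];
    let cmd = build_command(&args, "input.pdf", "output.pdf");
    // Debug-print the command instead of spawning it.
    println!("{:?}", cmd);
}
```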
When instantiating the OcrMyPdf structure, it is possible to pass the following parameters:
- args: Option<Vec<String>> (see the documentation for the available arguments)
- input_path: Option<String> (input PDF path)
- output_path: Option<String> (output PDF path)

[!TIP] 💡 If the input_path or output_path fields are provided, there is no need to provide them at runtime.
use ocrmypdf_rs::{Ocr, OcrMyPdf};
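fn main() {
    // "-l por" selects Portuguese as the OCR language.
    let args: Vec<String> = vec!["-l por".into()];
    let input_path = "input.pdf";
    let output_path = "output.pdf";
    // Arguments and paths are passed up front, so no setters are needed before execute().
    let mut ocr = OcrMyPdf::new(
        Some(args),
        Some(input_path.into()),
        Some(output_path.into()),
    );
    ocr.execute();
}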
[!NOTE] For the -l por argument to work, the selected additional language must be installed; see how to install it.
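ocrmypdf relies on Tesseract for OCR, so one way to confirm that a language such as `por` is available is to inspect the output of `tesseract --list-langs`. The sketch below is a hedged example: `parse_langs` is a hypothetical helper, it assumes the output is one header line followed by one language code per line, and it assumes the list is printed to stdout (some older Tesseract versions may print it to stderr). On Debian or Ubuntu, the Portuguese data is typically provided by the `tesseract-ocr-por` package.

```rust
use std::process::Command;

/// Hypothetical helper: parses text in the shape printed by
/// `tesseract --list-langs`, assumed to be a header line followed by
/// one language code per line.
fn parse_langs(output: &str) -> Vec<String> {
    output
        .lines()
        .skip(1) // skip the header line
        .map(str::trim)
        .filter(|line| !line.is_empty())
        .map(String::from)
        .collect()
}

fn main() {
    match Command::new("tesseract").arg("--list-langs").output() {
        Ok(out) => {
            let text = String::from_utf8_lossy(&out.stdout);
            let langs = parse_langs(&text);
            if langs.iter().any(|lang| lang == "por") {
                println!("Portuguese (por) is available");
            } else {
                println!("por not listed; installed languages: {:?}", langs);
            }
        }
        Err(err) => eprintln!("could not run tesseract: {err}"),
    }
}
```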