Crates.io | textract |
lib.rs | textract |
version | 0.1.0 |
source | src |
created_at | 2022-12-05 09:15:26.046222 |
updated_at | 2022-12-05 09:15:26.046222 |
description | Rust library to extract text from various types of files. |
homepage | |
repository | |
max_upload_size | |
id | 730118 |
size | 1,625,301 |
Rust library for extracting text from various file types. Supported file extensions:
txt odf ods odt pptx xlsx pdf
Use cargo to install textract.
// There is a PDF file at ./tmp.pdf
let content = textract::extract("tmp.pdf", "pdf").unwrap();
// content now holds the raw text of the PDF; use it however you like.
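Since extract takes the file type as a second argument, it can help to derive that argument from the path and reject unsupported types up front. The following is a minimal sketch using only the standard library; supported_extension and SUPPORTED are hypothetical names for illustration and are not part of textract.

```rust
use std::path::Path;

/// Extensions listed as supported in this README.
const SUPPORTED: [&str; 7] = ["txt", "odf", "ods", "odt", "pptx", "xlsx", "pdf"];

/// Hypothetical helper (not part of textract): returns the lowercased
/// extension of `path` if it appears in SUPPORTED, otherwise None.
fn supported_extension(path: &str) -> Option<String> {
    // Path::extension yields None when the file name has no extension.
    let ext = Path::new(path).extension()?.to_str()?.to_ascii_lowercase();
    if SUPPORTED.iter().any(|&s| s == ext.as_str()) {
        Some(ext)
    } else {
        None
    }
}
```

With textract added as a dependency, the returned extension could then be passed as the second argument, for example textract::extract(path, &ext), assuming the two-string call shape shown above.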
main.rs contains an example of using the textract library.
The command-line usage is simple:
textract tmp.pdf pdf
This library is in beta and currently supports only a few file types, but textract will keep adding support for more file types, since this project is part of