dotext_ed8fc7b

Crates.io: dotext_ed8fc7b
lib.rs: dotext_ed8fc7b
version: 0.1.2
source: src
created_at: 2022-12-05 09:08:03.022232
updated_at: 2022-12-05 09:08:03.022232
description: A simple Rust library for extracting readable text from specific document formats such as Word documents (docx). Only a few formats are supported so far; more are planned.
homepage:
repository: https://github.com/anvie/dotext
max_upload_size:
id: 730113
size: 1,580,830
Krishna Kanhaiya (kcubeterm)

README

Document File Text Extractor

A simple Rust library for extracting readable text from specific document formats such as Word documents (docx). Only a few formats are supported so far; more are planned.

Supported Documents

  • Microsoft Word (docx)
  • Microsoft Excel (xlsx)
  • Microsoft PowerPoint (pptx)
  • OpenOffice Writer (odt)
  • OpenOffice Spreadsheet (ods)
  • OpenDocument Presentation (odp)
  • PDF
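
The list above can be turned into a quick extension check before a file is handed to the library. This is a minimal sketch using only the standard library; `supported_format` is a hypothetical helper, not part of the crate, and the extensions are taken directly from the list.

```rust
use std::path::Path;

/// Hypothetical helper (not part of the crate): maps a file extension to the
/// matching entry in the supported-document list, or `None` if it is not listed.
fn supported_format(path: &str) -> Option<&'static str> {
    // Lower-case the extension so "REPORT.DOCX" is treated like "report.docx".
    let ext = Path::new(path).extension()?.to_str()?.to_ascii_lowercase();
    match ext.as_str() {
        "docx" => Some("Microsoft Word"),
        "xlsx" => Some("Microsoft Excel"),
        "pptx" => Some("Microsoft PowerPoint"),
        "odt" => Some("OpenOffice Writer"),
        "ods" => Some("OpenOffice Spreadsheet"),
        "odp" => Some("OpenDocument Presentation"),
        "pdf" => Some("PDF"),
        _ => None,
    }
}
```

A check like this only inspects the file name; it does not confirm that the file contents are valid for that format.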

Usage

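// Assumption: the library is importable as `dotext`; adjust the path if your Cargo.toml names it differently.
use dotext::*;
use std::io::Read; // provides read_to_string

fn main() {
    let mut file = Docx::open("samples/sample.docx").expect("cannot open file");
    let mut isi = String::new(); // receives the extracted text
    file.read_to_string(&mut isi).expect("cannot read document text");
    println!("CONTENT:");
    println!("----------BEGIN----------");
    println!("{}", isi);
    println!("----------EOF----------");
}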
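
The usage snippet calls `read_to_string` on the opened document, which suggests the document type behaves as a `std::io::Read` reader. Under that assumption, the following sketch wraps the read in a small generic function that returns the error instead of discarding it. `extract_text` is a hypothetical name, not an API of the crate, and the block depends only on the standard library.

```rust
use std::io::{self, Read};

/// Hypothetical helper: drains any reader into a `String`.
/// Returns an error if reading fails or the bytes are not valid UTF-8.
fn extract_text<R: Read>(mut reader: R) -> io::Result<String> {
    let mut text = String::new();
    reader.read_to_string(&mut text)?;
    Ok(text)
}
```

If the value returned by `Docx::open` does implement `Read`, as the snippet implies, it could be passed to `extract_text` directly; any in-memory byte slice works the same way, which makes the helper easy to test without sample files.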

Test

$ cargo test

Or run the example:

$ cargo run --example readdocx data/sample.docx

[] Robin Sy.

Commit count: 36