Crates.io | indicator-extractor |
lib.rs | indicator-extractor |
version | 0.2.0 |
source | src |
created_at | 2024-08-30 17:05:09.27116 |
updated_at | 2024-09-02 17:30:02.9872 |
description | Extract indicators (IP, domain, email, hashes, etc.) from a string or a PDF file |
homepage | |
repository | https://github.com/BenJeau/indicator-extractor |
max_upload_size | |
id | 1357923 |
size | 715,230 |
A Rust library that extracts indicators (IP, domain, email, hashes, etc.) from a string or a PDF file.
A WebAssembly build is available on npm and can be installed as follows:
npm install indicator-extractor
Then you can use it like this:
import {
  extractIndicators,
  extractIndicatorsBytes,
  parsePdf,
  Indicator,
} from "indicator-extractor/indicator_extractor";
// Extract indicators from a string
const indicators: Indicator[] = extractIndicators("https://github.com");
console.log(indicators); // [{"kind":"url","value":"https://github.com"}]
// Or if you prefer bytes
const extractPdf: Indicator[] = extractIndicatorsBytes(new Uint8Array());
// You can also parse a PDF file to get its text
const pdfData: string = parsePdf(new Uint8Array());
// Where you can then use `extractIndicators` on the text
const pdfIndicators: Indicator[] = extractIndicators(pdfData);
The crate is available on crates.io and can be installed as follows:
cargo add indicator-extractor
use indicator_extractor::parser::extract_indicators;
let result = extract_indicators("https://github.com".as_bytes());
println!("{:?}", result); // Ok(([], [Indicator::Url("https://github.com")]))
// To extract indicators from a PDF file, first extract its text, then parse that text
use indicator_extractor::{data::{PdfExtractor, DataExtractor}, parser::extract_indicators};
let pdf_data = std::fs::read("./somewhere/pdf_file_path.pdf").unwrap();
let pdf_string = PdfExtractor.extract(&pdf_data);
let result = extract_indicators(pdf_string.as_bytes());
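To give a rough idea of what "extracting indicators" from text means, the following is a minimal, standard-library-only sketch. It is not the crate's implementation: `ToyIndicator`, `toy_extract`, and `is_email_like` are hypothetical names introduced here for illustration, and the whitespace-token approach is an assumption that is far simpler than a real parser.

```rust
use std::net::Ipv4Addr;

/// Hypothetical, simplified indicator type for illustration only;
/// it is not the crate's `Indicator` enum.
#[derive(Debug, PartialEq)]
enum ToyIndicator {
    Url(String),
    Ipv4(Ipv4Addr),
    Email(String),
}

/// Naive whitespace-token classifier (an assumption for this sketch,
/// not how indicator-extractor parses its input).
fn toy_extract(input: &str) -> Vec<ToyIndicator> {
    input
        .split_whitespace()
        // Strip a few common punctuation characters surrounding a token.
        .map(|t| t.trim_matches(|c: char| matches!(c, ',' | ';' | '(' | ')' | '<' | '>' | '"')))
        .filter_map(|token| {
            if token.starts_with("http://") || token.starts_with("https://") {
                Some(ToyIndicator::Url(token.to_string()))
            } else if let Ok(ip) = token.parse::<Ipv4Addr>() {
                Some(ToyIndicator::Ipv4(ip))
            } else if is_email_like(token) {
                Some(ToyIndicator::Email(token.to_string()))
            } else {
                None
            }
        })
        .collect()
}

/// Very rough email check: a non-empty local part, a single '@',
/// and a domain containing a dot that is not at either end.
fn is_email_like(token: &str) -> bool {
    match token.split_once('@') {
        Some((local, domain)) => {
            !local.is_empty()
                && !domain.contains('@')
                && domain.contains('.')
                && !domain.starts_with('.')
                && !domain.ends_with('.')
        }
        None => false,
    }
}

fn main() {
    let found = toy_extract("Contact admin@example.com about 192.168.1.1 or https://github.com");
    println!("{:?}", found);
}
```

A real extractor also has to handle indicators embedded inside other text, defanged values, hashes, and many more kinds, which is why using the crate's `extract_indicators` is preferable to a hand-rolled classifier like this one.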