| Crates.io | uninews |
| lib.rs | uninews |
| version | 0.2.3 |
| created_at | 2025-02-16 00:55:30.008673+00 |
| updated_at | 2025-08-07 23:52:01.044334+00 |
| description | A universal news scraper for extracting content from various news blogs and news sites. |
| homepage | |
| repository | https://github.com/CloudLLM-ai/uninews |
| max_upload_size | |
| id | 1557242 |
| size | 77,848 |
Uninews is a universal news smart scraper written in Rust.
It downloads a news article from a given URL, cleans the HTML content, and leverages CloudLLM (via OpenAI) to convert the content into Markdown format.
Uninews can also translate articles into other languages while preserving their formatting, which makes it suitable for multilingual content processing.
The final output (via API) is a JSON object containing the article's title, the Markdown-formatted content (translated if specified), and a featured image URL.
It can be used both as a library and as a command-line tool on Linux, macOS, and Windows.
When used as a command-line tool, it outputs the final Markdown with the contents of the news article or blog post in the requested language.
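As a rough illustration of the HTML cleaning step mentioned above, the sketch below picks the contents of the `<article>` tag, falls back to `<body>`, and strips a few element types. It is a simplified stand-in built on plain string search, not the crate's actual implementation. The helper names (`inner_of`, `strip_blocks`, `clean_article`) and the list of removed tags are assumptions made for this example.

```rust
/// Returns the inner HTML of the first `<tag ...>...</tag>` pair, if present.
/// Naive on purpose: case-sensitive, prefix match on the tag name, no nesting.
fn inner_of<'a>(html: &'a str, tag: &str) -> Option<&'a str> {
    let open = format!("<{tag}");
    let close = format!("</{tag}>");
    let start = html.find(&open)?;
    // Skip past the end of the opening tag, attributes included.
    let body_start = start + html[start..].find('>')? + 1;
    let body_end = body_start + html[body_start..].find(&close)?;
    Some(&html[body_start..body_end])
}

/// Removes every `<tag ...>...</tag>` block, for example `script` or `style`.
fn strip_blocks(html: &str, tag: &str) -> String {
    let open = format!("<{tag}");
    let close = format!("</{tag}>");
    let mut out = String::with_capacity(html.len());
    let mut rest = html;
    while let Some(start) = rest.find(&open) {
        out.push_str(&rest[..start]);
        match rest[start..].find(&close) {
            Some(end) => rest = &rest[start + end + close.len()..],
            None => {
                // Unclosed block: drop the remainder.
                rest = "";
                break;
            }
        }
    }
    out.push_str(rest);
    out
}

/// Prefers `<article>`, falls back to `<body>`, then to the whole input.
fn clean_article(html: &str) -> String {
    let main = inner_of(html, "article")
        .or_else(|| inner_of(html, "body"))
        .unwrap_or(html);
    let mut cleaned = main.to_string();
    // Assumed set of "unwanted elements"; the real list may differ.
    for tag in ["script", "style", "nav"] {
        cleaned = strip_blocks(&cleaned, tag);
    }
    cleaned.trim().to_string()
}

fn main() {
    let html = "<html><body><nav>menu</nav><article><h1>Title</h1>\
                <script>track()</script><p>Text</p></article></body></html>";
    println!("{}", clean_article(html));
}
```

A production scraper would normally use a real HTML parser instead of string search, since the approach above does not handle nested or malformed markup.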
uninews --help
A universal news scraper for extracting content from various news blogs and news sites.
Usage: uninews [OPTIONS] <URL>
Arguments:
<URL> The URL of the news article to scrape
Options:
-l, --language <LANGUAGE> Optional output language (default: english) [default: english]
-j, --json Output the result as JSON instead of human-readable text
-h, --help Print help
-V, --version Print version
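If the CLI is driven from another Rust program, the flags listed above can be assembled with `std::process::Command`. This is a minimal sketch that only builds the invocation and prints it. The helper name `uninews_command`, the example URL, and the language value `spanish` are assumptions, and running the command would additionally require `uninews` on the `PATH`.

```rust
use std::process::Command;

/// Builds (but does not run) a `uninews` invocation from the documented flags.
fn uninews_command(url: &str, language: Option<&str>, json: bool) -> Command {
    let mut cmd = Command::new("uninews");
    if let Some(lang) = language {
        cmd.arg("--language").arg(lang);
    }
    if json {
        cmd.arg("--json");
    }
    // The URL is the single positional argument.
    cmd.arg(url);
    cmd
}

fn main() {
    let cmd = uninews_command("https://example.com/some-post", Some("spanish"), true);
    let args: Vec<String> = cmd
        .get_args()
        .map(|a| a.to_string_lossy().into_owned())
        .collect();
    println!("uninews {}", args.join(" "));
    // To execute it, make `cmd` mutable and call `cmd.output()`.
}
```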
Uninews extracts the main content by targeting the <article> tag (or falling back to <body>) and removing unwanted elements.
The universal_scrape function is exposed for easy integration into other Rust projects.
The universal_scrape function accepts an optional language parameter that sets the output language of the scraped article; if omitted, it defaults to English.
You need to have Rust and Cargo installed on your system.
If you do have Rust installed, follow these steps:
cargo install uninews
If you don't have Rust installed, follow these steps to install Rust and build from source:
On Unix/macOS:
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
rustc --version
cargo --version
git clone https://github.com/gubatron/uninews.git
cd uninews
make build
make install
# make sure to either export the OPEN_AI_SECRET token before running it
export OPEN_AI_SECRET=sk-xxxxxxxxxxxxxxxxxxxxxxxxxx
uninews <some post url>
# or you can set it on the same statement and not export it
OPEN_AI_SECRET=sk-xxxxxxxxxxxxxxxxxxxxxxxxxx uninews [-l <some language name>] <some post url>
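A program that wraps uninews may want to check for `OPEN_AI_SECRET` up front and report a clear message when it is missing. The sketch below shows one way to do that with the standard library only. The function name `validate_secret` and the error messages are assumptions for this example; they do not describe how uninews itself reacts to a missing variable.

```rust
use std::env;

/// Accepts the raw environment value and returns the trimmed secret,
/// or a message explaining what is wrong.
fn validate_secret(raw: Option<String>) -> Result<String, String> {
    match raw {
        Some(value) if !value.trim().is_empty() => Ok(value.trim().to_string()),
        Some(_) => Err("OPEN_AI_SECRET is set but empty".to_string()),
        None => Err("OPEN_AI_SECRET is not set".to_string()),
    }
}

fn main() {
    // `env::var` fails when the variable is missing or not valid Unicode.
    match validate_secret(env::var("OPEN_AI_SECRET").ok()) {
        Ok(_) => println!("OPEN_AI_SECRET found"),
        Err(msg) => eprintln!("{msg}"),
    }
}
```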
Command line usage
A universal news scraper for extracting content from various news blogs and news sites.
Usage: uninews [OPTIONS] <URL>
Arguments:
<URL> The URL of the news article to scrape
Options:
-l, --language <LANGUAGE> Optional output language (default: english) [default: english]
-j, --json Output the result as JSON instead of human-readable text
-h, --help Print help
-V, --version Print version
Integrating it with your Rust project
uninews requires the OPEN_AI_SECRET environment variable to be set. You can set it in your code before calling the universal_scrape function.
If you've loaded your OPEN_AI_SECRET from a file or some other means, you can set it like this so uninews won't break:
// Note: in the Rust 2024 edition, set_var is unsafe and must be wrapped in an unsafe block.
std::env::set_var("OPEN_AI_SECRET", my_open_ai_secret);
use uninews::{universal_scrape, Post};
// Scrape the URL and convert its content to Markdown in the requested language.
let post = universal_scrape(&args.url, &args.language, Some(cloudllm::clients::openai::Model::GPT41Mini)).await;
if !post.error.is_empty() {
eprintln!("Error during scraping: {}", post.error);
return;
}
// Print the title and Markdown-formatted content.
println!("{}\n\n{}", post.title, post.content);
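The snippet above signals failure through a non-empty `error` string. Callers who prefer `Result`-based handling could wrap that check as sketched below. `ScrapedPost` is a hypothetical stand-in that mirrors only the three fields used in the snippet (`title`, `content`, `error`); it is not the crate's `Post` type, which may carry more fields.

```rust
/// Hypothetical stand-in with just the fields used in the snippet above.
#[derive(Debug, Clone, PartialEq)]
struct ScrapedPost {
    title: String,
    content: String,
    error: String,
}

/// Turns the "empty string means success" convention into a `Result`.
fn into_result(post: ScrapedPost) -> Result<ScrapedPost, String> {
    if post.error.is_empty() {
        Ok(post)
    } else {
        Err(post.error)
    }
}

/// Formats the title and Markdown body the same way as the snippet above.
fn render(post: &ScrapedPost) -> String {
    format!("{}\n\n{}", post.title, post.content)
}

fn main() {
    let post = ScrapedPost {
        title: "Example title".to_string(),
        content: "Some *Markdown* content.".to_string(),
        error: String::new(),
    };
    match into_result(post) {
        Ok(post) => println!("{}", render(&post)),
        Err(err) => eprintln!("Error during scraping: {err}"),
    }
}
```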
Licensed under the MIT License.
Copyright (c) 2025 Ángel León