| Crates.io | websearch |
| lib.rs | websearch |
| version | 0.1.1 |
| created_at | 2025-09-14 22:54:00.4481+00 |
| updated_at | 2025-09-15 09:00:01.108722+00 |
| description | High-performance Rust library and CLI tool for multi-provider web search. Supports Google, ArXiv, DuckDuckGo, Tavily AI, and more with aggregation strategies. |
| homepage | https://github.com/xynehq/websearch |
| repository | https://github.com/xynehq/websearch |
| max_upload_size | |
| id | 1839242 |
| size | 378,269 |
A high-performance Rust library and command-line tool for searching across multiple web search providers. Use it as an SDK in your Rust applications or as a standalone CLI binary for direct command-line searches. Initially based on the PlustOrg/search-sdk TypeScript library, this Rust implementation includes significant additional features and enhancements.
cargo install command gets you bothwebsearch "your query" - that's it!| Provider | Status | API Key Required | Notes |
|---|---|---|---|
| Google Custom Search | ✅ Complete | Yes | Requires API key + Search Engine ID |
| DuckDuckGo | ✅ Complete | No | HTML scraping (text search) |
| Brave Search | ✅ Complete | Yes | High-quality independent search |
| SerpAPI | ✅ Complete | Yes | Google, Bing, Yahoo via SerpAPI |
| Tavily | ✅ Complete | Yes | AI-powered search optimized for LLMs |
| Exa | ✅ Complete | Yes | Semantic search with embeddings |
| SearXNG | ✅ Complete | No | Self-hosted privacy-focused search |
| ArXiv | ✅ Complete | No | Academic papers and research |
Install both the Rust library and CLI binary with a single command:
# Install both library and CLI tool
cargo install --git https://github.com/xynehq/websearch.git
# Verify installation
websearch --version
websearch "hello world" --max-results 1
# Install from GitHub (gets you the latest features)
cargo install --git https://github.com/xynehq/websearch.git
# Test the CLI immediately
websearch "rust programming" --provider duckduckgo --max-results 3
# Install from crates.io (when published)
cargo install websearch
# Test the installation
websearch --help
# Clone and install from source
git clone https://github.com/xynehq/websearch.git
cd websearch
cargo install --path .
# Run tests to verify everything works
cargo test
After installation, you have access to:
✅ CLI Binary: websearch command available globally
✅ Rust Library: Add websearch = "0.1.1" to your Cargo.toml
✅ All Providers: Google, Tavily, DuckDuckGo, ArXiv, and more
✅ No API Keys Needed: Start searching immediately with DuckDuckGo
# Check CLI is installed
websearch --version
# Test search (no API keys needed)
websearch "test query" --max-results 1
# See all available providers
websearch providers
# Test as library in your Rust project
echo '[dependencies]
websearch = "0.1.1"
tokio = { version = "1.0", features = ["full"] }' >> Cargo.toml
Common Issues:
~/.cargo/bin to your PATHrustup update stablecargo install --git https://github.com/xynehq/websearch.git --offlinePlatform Support:
xcode-select --installDockerfile in repository# Search with default provider (DuckDuckGo - no API key needed)
websearch "rust async programming"
# Search with specific provider
websearch "quantum computing" --provider arxiv --max-results 5
# Multi-provider aggregation
websearch multi "artificial intelligence" --strategy aggregate --max-results 3
# List available providers and their status
websearch providers
use websearch::{web_search, providers::GoogleProvider, SearchOptions};
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
// Initialize with Google provider
let google = GoogleProvider::new("YOUR_API_KEY", "YOUR_SEARCH_ENGINE_ID")?;
// Perform search
let results = web_search(SearchOptions {
query: "Rust programming language".to_string(),
max_results: Some(5),
provider: Box::new(google),
..Default::default()
}).await?;
// Process results
for result in results {
println!("{}: {}", result.title, result.url);
if let Some(snippet) = result.snippet {
println!(" {}", snippet);
}
}
Ok(())
}
use websearch::{web_search, providers::GoogleProvider, SearchOptions, types::SafeSearch};
let google = GoogleProvider::new("YOUR_API_KEY", "YOUR_CX_ID")?;
let results = web_search(SearchOptions {
query: "machine learning tutorials".to_string(),
max_results: Some(10),
language: Some("en".to_string()),
region: Some("US".to_string()),
safe_search: Some(SafeSearch::Moderate),
provider: Box::new(google),
..Default::default()
}).await?;
use websearch::{web_search, providers::DuckDuckGoProvider, SearchOptions};
let duckduckgo = DuckDuckGoProvider::new();
let results = web_search(SearchOptions {
query: "privacy-focused search engines".to_string(),
max_results: Some(5),
provider: Box::new(duckduckgo),
..Default::default()
}).await?;
use websearch::{web_search, providers::TavilyProvider, SearchOptions};
// Basic search
let tavily = TavilyProvider::new("tvly-dev-YOUR_API_KEY")?;
// Advanced search with more comprehensive results
let tavily_advanced = TavilyProvider::new_advanced("tvly-dev-YOUR_API_KEY")?
.with_answer(true) // Include AI-generated answers
.with_images(false); // Exclude image results
let results = web_search(SearchOptions {
query: "latest developments in AI and machine learning 2024".to_string(),
max_results: Some(5),
provider: Box::new(tavily_advanced),
..Default::default()
}).await?;
use websearch::{web_search, providers::SerpApiProvider, SearchOptions};
let serpapi = SerpApiProvider::new("YOUR_SERPAPI_KEY")?
.with_engine("google")? // google, bing, yahoo, etc.
.with_location("United States");
let results = web_search(SearchOptions {
query: "machine learning frameworks".to_string(),
max_results: Some(10),
provider: Box::new(serpapi),
..Default::default()
}).await?;
use websearch::{web_search, providers::ExaProvider, SearchOptions};
let exa = ExaProvider::new("YOUR_EXA_API_KEY")?
.with_model("embeddings")? // "keyword" or "embeddings"
.with_contents(true); // Include full content
let results = web_search(SearchOptions {
query: "semantic search technology".to_string(),
max_results: Some(5),
provider: Box::new(exa),
..Default::default()
}).await?;
The SearchOptions struct provides comprehensive configuration:
pub struct SearchOptions {
pub query: String, // Search query
pub id_list: Option<String>, // ArXiv-specific: comma-separated IDs
pub max_results: Option<u32>, // Maximum results (default: 10)
pub language: Option<String>, // Language code (e.g., "en")
pub region: Option<String>, // Region code (e.g., "US")
pub safe_search: Option<SafeSearch>, // Off, Moderate, Strict
pub page: Option<u32>, // Page number for pagination
pub start: Option<u32>, // Start index (ArXiv)
pub sort_by: Option<SortBy>, // Sort order (ArXiv)
pub sort_order: Option<SortOrder>, // Ascending/Descending
pub timeout: Option<u64>, // Request timeout in milliseconds
pub debug: Option<DebugOptions>, // Debug configuration
pub provider: Box<dyn SearchProvider>, // Search provider instance
}
All providers return results in this standardized format:
pub struct SearchResult {
pub url: String, // Result URL
pub title: String, // Page title
pub snippet: Option<String>, // Description/excerpt
pub domain: Option<String>, // Source domain
pub published_date: Option<String>, // Publication date
pub provider: Option<String>, // Provider name
pub raw: Option<serde_json::Value>, // Raw provider data
}
The SDK provides comprehensive error handling with troubleshooting hints:
use websearch::{web_search, SearchOptions, error::SearchError};
match web_search(options).await {
Ok(results) => {
println!("Found {} results", results.len());
}
Err(SearchError::AuthenticationError(msg)) => {
eprintln!("Auth failed: {}", msg);
}
Err(SearchError::RateLimit(msg)) => {
eprintln!("Rate limited: {}", msg);
}
Err(SearchError::HttpError { message, status_code, .. }) => {
eprintln!("HTTP error {}: {}", status_code.unwrap_or(0), message);
}
Err(e) => {
eprintln!("Search failed: {}", e);
}
}
Enable detailed logging for development:
use websearch::{SearchOptions, types::DebugOptions};
let results = web_search(SearchOptions {
query: "test query".to_string(),
debug: Some(DebugOptions {
enabled: true,
log_requests: true,
log_responses: true,
}),
provider: Box::new(provider),
..Default::default()
}).await?;
WebSearch provides a powerful CLI tool for searching from the command line with a simple, intuitive interface:
The CLI uses a simplified structure:
websearch "query" searches using DuckDuckGo (no API key required)websearch "query" --provider google searches with a specific providerwebsearch multi "query" --strategy aggregate for advanced multi-provider searcheswebsearch providers to see all available search enginesAfter installation, you can immediately start searching:
# Quick test with DuckDuckGo (no API key needed)
websearch "rust programming" --provider duckduckgo --max-results 3
# List all available providers
websearch providers
# Get help for any command
websearch --help
# Search with DuckDuckGo (no API key required) - default provider
websearch "rust programming" --max-results 5
# Search with Google (requires API keys)
export GOOGLE_API_KEY="your_key"
export GOOGLE_CX="your_search_engine_id"
websearch "machine learning" --provider google --max-results 10 --format table
# Search with Tavily AI (requires API key)
export TAVILY_API_KEY="tvly-dev-your_key"
websearch "latest AI developments" --provider tavily --format json
# Aggregate results from multiple providers
websearch multi "artificial intelligence" --strategy aggregate --max-results 5
# Use failover strategy (try providers in order until one succeeds)
websearch multi "quantum computing" --strategy failover --providers google,tavily,duckduckgo
# Load balance across available providers
websearch multi "blockchain technology" --strategy load-balance --stats
# Search ArXiv by paper IDs
websearch "" --provider arxiv --arxiv-ids "2301.00001,2301.00002" --max-results 3
# Search ArXiv by query
websearch "quantum machine learning" --provider arxiv --sort-by submitted-date
# List all available providers and their status
websearch providers
# Output shows which providers are available:
# ✅ DuckDuckGo - No API key required
# ❌ Google - Requires GOOGLE_API_KEY and GOOGLE_CX
# ❌ Tavily - Requires TAVILY_API_KEY (AI-powered search)
--help - Show help information--version - Show version information--provider - Search provider (google, tavily, exa, serpapi, duckduckgo, brave, searxng, arxiv) [default: duckduckgo]--max-results - Maximum number of results [default: 10]--language - Language code (e.g., en, es, fr)--region - Region code (e.g., US, UK, DE)--safe-search - Safe search setting (off, moderate, strict)--format - Output format (table, json, simple) [default: table]--debug - Enable debug output--raw - Show raw provider response--arxiv-ids - Comma-separated ArXiv paper IDs (for ArXiv provider)--sort-by - Sort by field (relevance, submitted-date, last-updated-date)--sort-order - Sort order (ascending, descending)multi subcommand)--strategy - Multi-provider strategy (aggregate, failover, load-balance, race)--providers - Specific providers to use--stats - Show provider performance statisticsSet these environment variables to enable different providers:
# Google Custom Search
export GOOGLE_API_KEY="your_google_api_key"
export GOOGLE_CX="your_custom_search_engine_id"
# Tavily AI Search (Recommended for AI/LLM applications)
export TAVILY_API_KEY="tvly-dev-your_api_key"
# SerpAPI (Google, Bing, Yahoo)
export SERPAPI_API_KEY="your_serpapi_key"
# Exa Semantic Search
export EXA_API_KEY="your_exa_api_key"
# Brave Search
export BRAVE_API_KEY="your_brave_api_key"
# SearXNG
export SEARXNG_URL="https://your-searxng-instance.com"
# DuckDuckGo and ArXiv work without API keys
Search Results from duckduckgo
────────────────────────────────────────────────────────────────────────────────
1. Rust Programming Language
🔗 https://www.rust-lang.org/
🌐 rust-lang.org
📄 Rust is a fast, reliable, and productive programming language...
🔍 Provider: duckduckgo
1. Rust Programming Language
https://www.rust-lang.org/
Rust is a fast, reliable, and productive programming language...
[
{
"url": "https://www.rust-lang.org/",
"title": "Rust Programming Language",
"snippet": "Rust is a fast, reliable, and productive programming language...",
"domain": "rust-lang.org",
"provider": "duckduckgo"
}
]
The CLI includes comprehensive automated tests:
# Run CLI integration tests
cargo test --test cli_tests
# Test specific functionality
cargo test --test cli_tests test_providers_command
cargo test --test cli_tests test_duckduckgo_search_dry_run
This Rust implementation provides significant performance improvements over the TypeScript version:
reqwestSet up environment variables for the providers you want to use:
# Google Custom Search
export GOOGLE_API_KEY="your_google_api_key"
export GOOGLE_CX="your_custom_search_engine_id"
# Tavily AI Search
export TAVILY_API_KEY="tvly-dev-your_api_key"
# SerpAPI
export SERPAPI_API_KEY="your_serpapi_key"
# Exa Search
export EXA_API_KEY="your_exa_api_key"
# Run examples
cargo run --example tavily_search # AI-powered search
cargo run --example google_search # Google Custom Search
cargo run --example serpapi_test # SerpAPI
cargo run --example basic_search # DuckDuckGo (no key needed)
# Check compilation
cargo check
# Run tests
cargo test
# Run example with DuckDuckGo (no API key needed)
cargo run --example basic_search
# Build optimized release
cargo build --release
cargo test passesThe SDK follows a clean architecture with these core components:
types.rs: Core types and traitserror.rs: Comprehensive error handlingproviders/: Individual search provider implementationsutils/: HTTP client and debugging utilitieslib.rs: Main API with the web_search() functionMIT License - See the TypeScript version's LICENSE file for details.
The SDK includes comprehensive test coverage:
# Run all tests
cargo test
# Run unit tests only
cargo test --lib
# Run integration tests
cargo test --test integration_tests
# Run Tavily integration tests
cargo test --test tavily_integration_tests
# Run with test script
./test.sh
Test Coverage:
This Rust implementation was initially based on the excellent PlustOrg/search-sdk TypeScript library. While maintaining the same core API design and provider support, this version has evolved beyond a simple port to include additional functionality.
Performance Improvements:
Additional Functionality:
Rust-Specific Benefits:
This Rust port maintains conceptual API compatibility with the TypeScript version while adapting to Rust idioms:
// TypeScript version
const results = await webSearch({
query: 'rust programming',
maxResults: 5,
provider: googleProvider
});
// Rust version
let results = web_search(SearchOptions {
query: "rust programming".to_string(),
max_results: Some(5),
provider: Box::new(google_provider),
..Default::default()
}).await?;
# Install once, get both CLI and library
cargo install --git https://github.com/xynehq/websearch.git
# Start searching immediately (no API keys needed)
websearch "your search query"
# Or use in your Rust project
echo 'websearch = "0.1.1"' >> Cargo.toml
Perfect for:
This Rust implementation was initially based on PlustOrg/search-sdk and has evolved to include additional features while maintaining API compatibility and leveraging Rust's performance and safety benefits.