snm-brightdata-client

version: 0.4.0
description: A compact Bright Data wrapper client implemented in Rust with Actix Web
repository: https://github.com/snmmaurya/snm-brightdata-client
author: Sandeep Maurya (snmmaurya)

README

🌐 SNM BrightData Client

A powerful Rust crate providing MCP-compatible integration with BrightData's web scraping and data extraction services. Built with Actix Web for high-performance web scraping, search, data extraction, and screenshot capabilities.


✨ Features

  • 🔍 Web Search: Search across Google, Bing, Yandex, and DuckDuckGo
  • 🌐 Website Scraping: Extract content in markdown, raw HTML, or structured formats
  • 📊 Data Extraction: Intelligent data extraction from any webpage
  • 📸 Screenshots: Capture website screenshots using BrightData Browser
  • 🤖 MCP Compatible: Full Model Context Protocol support for AI integrations
  • Multiple Interfaces: Library, CLI, and HTTP server
  • 🔒 Authentication: Secure token-based authentication
  • 📈 Rate Limiting: Built-in rate limiting and error handling
  • 🚀 High Performance: Built with Actix Web for production workloads

🚀 Quick Start

Installation

Add to your Cargo.toml:

[dependencies]
snm-brightdata-client = "0.4.0"

Environment Setup

# BrightData Configuration
export BRIGHTDATA_API_TOKEN="your_api_token"
export BRIGHTDATA_BASE_URL="https://api.brightdata.com"
export WEB_UNLOCKER_ZONE="your_zone_name"
export BROWSER_ZONE="your_browser_zone"

# Proxy Credentials (optional)
export BRIGHTDATA_PROXY_USERNAME="your_username"
export BRIGHTDATA_PROXY_PASSWORD="your_password"

# Server Configuration
export MCP_AUTH_TOKEN="your_secure_token"
export PORT="8080"
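
If you want to fail fast when a required variable is missing, a minimal pre-flight check (plain std::env, not part of this crate's API) might look like this:

use std::env;

// Illustrative helper, not part of snm-brightdata-client:
// fail fast if a required BrightData variable is missing.
fn check_required_env() -> Result<(), String> {
    for var in ["BRIGHTDATA_API_TOKEN", "WEB_UNLOCKER_ZONE", "BROWSER_ZONE"] {
        env::var(var).map_err(|_| format!("missing required env var: {var}"))?;
    }
    Ok(())
}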

📖 Usage

As a Library

use snm_brightdata_client::{BrightDataClient, BrightDataConfig};
use snm_brightdata_client::tool::{ToolResolver, Tool};
use serde_json::json;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Initialize client
    let config = BrightDataConfig::from_env()?;
    let client = BrightDataClient::new(config);

    // Use tools directly
    let resolver = ToolResolver::default();
    let search_tool = resolver.resolve("search_web").unwrap();
    
    let result = search_tool.execute(json!({
        "query": "Rust programming language",
        "engine": "google"
    })).await?;

    println!("Search results: {:#?}", result);
    Ok(())
}
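
The same resolver pattern extends to the other tools. As a sketch, scraping a page looks like this (the parameters mirror the scrape_website schema documented under Available Tools below):

use snm_brightdata_client::tool::{ToolResolver, Tool};
use serde_json::json;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Resolve the scraping tool by name, just like search_web above.
    let resolver = ToolResolver::default();
    let scrape_tool = resolver.resolve("scrape_website").unwrap();

    let page = scrape_tool.execute(json!({
        "url": "https://example.com",
        "format": "markdown"
    })).await?;

    println!("Scraped content: {:#?}", page);
    Ok(())
}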

CLI Usage

# Search the web
snm_cli search "Bitcoin price today" --engine google

# Scrape a website
snm_cli scrape https://example.com --format markdown

# Extract data
snm_cli extract https://example.com --format json

# Take screenshot
snm_cli screenshot https://example.com --width 1920 --height 1080

HTTP Server

# Start the server
cargo run --bin snm_server

# Health check
curl http://localhost:8080/health

# List available tools
curl http://localhost:8080/tools

# Use tools via API
curl -X POST http://localhost:8080/invoke \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -d '{
    "tool": "search_web",
    "parameters": {
      "query": "Rust web scraping",
      "engine": "google"
    }
  }'
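
The same /invoke call can also be made programmatically. A minimal sketch, assuming you add reqwest (with its json feature) and tokio to your own project; neither is part of this crate:

use serde_json::{json, Value};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = reqwest::Client::new();

    // POST the same payload the curl example sends, with the bearer token.
    let resp: Value = client
        .post("http://localhost:8080/invoke")
        .bearer_auth("YOUR_TOKEN")
        .json(&json!({
            "tool": "search_web",
            "parameters": { "query": "Rust web scraping", "engine": "google" }
        }))
        .send()
        .await?
        .json()
        .await?;

    println!("{resp:#}");
    Ok(())
}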

🛠️ Available Tools

🔍 Search Web (search_web)

Search across multiple search engines with BrightData's unblocking capabilities.

{
  "tool": "search_web",
  "parameters": {
    "query": "your search query",
    "engine": "google"  // google, bing, yandex, duckduckgo
  }
}

🌐 Scrape Website (scrape_website)

Extract content from any website, bypassing anti-bot protections.

{
  "tool": "scrape_website",
  "parameters": {
    "url": "https://example.com",
    "format": "markdown"  // raw, markdown
  }
}

📊 Extract Data (extract_data)

Intelligent data extraction from webpages.

{
  "tool": "extract_data",
  "parameters": {
    "url": "https://example.com"
  }
}

📸 Take Screenshot (take_screenshot)

Capture high-quality screenshots of websites.

{
  "tool": "take_screenshot",
  "parameters": {
    "url": "https://example.com"
  }
}

🤖 MCP Integration

This crate is fully compatible with the Model Context Protocol (MCP), making it easy to integrate with AI systems like Claude.

MCP Server Configuration

{
  "type": "url",
  "url": "https://your-server.com/sse",
  "name": "brightdata-mcp",
  "authorization_token": "your_token",
  "tool_configuration": {
    "enabled": true,
    "allowed_tools": ["search_web", "scrape_website", "extract_data", "take_screenshot"]
  }
}

Example with Claude

curl https://api.anthropic.com/v1/messages \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "anthropic-beta: mcp-client-2025-04-04" \
  -d '{
    "model": "claude-sonnet-4-20250514",
    "max_tokens": 2000,
    "messages": [
      {
        "role": "user",
        "content": "Search for the latest news about Rust programming language"
      }
    ],
    "mcp_servers": [
      {
        "type": "url",
        "url": "https://your-server.com/sse",
        "name": "brightdata-mcp",
        "authorization_token": "your_token"
      }
    ]
  }'

🏗️ API Reference

HTTP Endpoints

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/health` | GET | Health check |
| `/tools` | GET | List available tools |
| `/invoke` | POST | Direct tool invocation |
| `/sse` | POST | Server-Sent Events streaming |
| `/mcp` | POST | MCP JSON-RPC protocol |
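
Requests to /mcp follow MCP's JSON-RPC framing. Assuming the server implements the protocol's standard methods, a tools/list request body looks like:

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/list"
}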

Response Format

All tools return MCP-compatible responses:

{
  "content": [
    {
      "type": "text",
      "text": "Response content here"
    }
  ],
  "is_error": false,
  "raw_value": {
    // Original response data
  }
}
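
If you consume this response from typed Rust code, a deserialization target might look like the sketch below. The field names mirror the JSON above; the crate may expose its own response types instead.

use serde::Deserialize;
use serde_json::Value;

// Illustrative shape only; check the crate's types before relying on this.
#[derive(Debug, Deserialize)]
struct ToolResponse {
    content: Vec<ContentBlock>,
    is_error: bool,
    // Original response data, kept as untyped JSON.
    raw_value: Value,
}

#[derive(Debug, Deserialize)]
struct ContentBlock {
    #[serde(rename = "type")]
    kind: String,
    text: String,
}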

⚙️ Configuration

BrightData Setup

  1. Sign up for a BrightData account
  2. Create zones for Web Unlocker and Browser
  3. Get API credentials from your dashboard
  4. Set environment variables as shown above

Zone Configuration

  • Web Unlocker Zone: For web scraping and search
  • Browser Zone: For screenshots and JavaScript rendering

🔧 Development

Building

# Build library
cargo build

# Build with all features
cargo build --all-features

# Run tests
cargo test

# Run with debug logging
RUST_LOG=debug cargo run --bin snm_server

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Add tests for new functionality
  4. Ensure all tests pass
  5. Submit a pull request

📊 Performance

  • Concurrent Requests: Supports high-concurrency workloads
  • Rate Limiting: Built-in limit of 10 requests/minute per tool (configurable)
  • Timeout Handling: Configurable timeouts for different operations
  • Error Recovery: Automatic retry mechanisms with backoff (sketched below)
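
The crate's exact recovery logic is internal; the sketch below just illustrates the general retry-with-exponential-backoff pattern the last item describes (names here are hypothetical, not this crate's API):

use std::time::Duration;
use tokio::time::sleep;

// Retry an async operation, doubling the delay after each failure.
async fn with_backoff<T, E, F, Fut>(mut op: F, max_attempts: u32) -> Result<T, E>
where
    F: FnMut() -> Fut,
    Fut: std::future::Future<Output = Result<T, E>>,
{
    let mut delay = Duration::from_millis(500);
    let mut attempt = 0;
    loop {
        match op().await {
            Ok(value) => return Ok(value),
            Err(err) if attempt + 1 >= max_attempts => return Err(err),
            Err(_) => {
                attempt += 1;
                sleep(delay).await;
                delay *= 2; // exponential backoff
            }
        }
    }
}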

🛡️ Security

  • Token Authentication: Secure API access
  • Rate Limiting: Prevents abuse
  • Input Validation: Comprehensive parameter validation
  • CORS Support: Configurable cross-origin requests

📝 Examples

Check out the examples/ directory for:

  • Basic usage examples
  • Integration patterns
  • Advanced configurations
  • Error handling strategies

🤝 Integration Examples

With Anthropic Claude

Use as an MCP server to enhance Claude with web scraping capabilities.

With Custom Applications

Integrate into your Rust applications for:

  • E-commerce price monitoring
  • Content aggregation
  • Market research
  • Competitive analysis

📋 Requirements

  • Rust: 1.70 or later
  • BrightData Account: With API access
  • Network Access: HTTPS outbound connections

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

  • BrightData for providing robust web scraping infrastructure
  • Actix Web for high-performance HTTP server framework
  • Anthropic for MCP protocol specification

📞 Support

🚀 Roadmap

  • Additional search engines
  • Enhanced data extraction templates
  • WebSocket support for real-time scraping
  • GraphQL API interface
  • Kubernetes deployment examples
  • Advanced proxy rotation
  • Machine learning integration for content classification

Made with ❤️ by SNM Maurya
