| Crates.io | mcp-server-fetch |
| lib.rs | mcp-server-fetch |
| version | 0.1.0 |
| created_at | 2025-09-22 21:38:48.25881+00 |
| updated_at | 2025-09-22 21:38:48.25881+00 |
| description | A powerful Model Context Protocol (MCP) server for web content fetching and HTTP operations |
| homepage | https://github.com/sabry-awad97/rust-mcp-servers |
| repository | https://github.com/sabry-awad97/rust-mcp-servers |
| max_upload_size | |
| id | 1850672 |
| size | 106,721 |
A powerful Model Context Protocol (MCP) server that provides secure web content fetching with robots.txt compliance, HTML-to-markdown conversion, content truncation, and comprehensive HTTP operations.
cargo install mcp-server-fetch
# Start the MCP server (communicates via stdio)
mcp-server-fetch
# Use custom user agent
mcp-server-fetch --user-agent "MyApp/1.0"
# Ignore robots.txt restrictions
mcp-server-fetch --ignore-robots-txt
# Use HTTP proxy
mcp-server-fetch --proxy-url "http://proxy.example.com:8080"
# Enable debug logging
LOG_LEVEL=debug mcp-server-fetch
# Install and run the MCP Inspector to test the server
npx @modelcontextprotocol/inspector mcp-server-fetch
Add to your Claude Desktop MCP configuration:
{
"mcpServers": {
"fetch": {
"command": "mcp-server-fetch",
"args": ["--user-agent", "Claude-Desktop/1.0"]
}
}
}
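The args array accepts any of the command-line options listed below, so a setup behind a corporate proxy might look like the following (an illustrative variant, not a required configuration):
{
  "mcpServers": {
    "fetch": {
      "command": "mcp-server-fetch",
      "args": ["--user-agent", "Claude-Desktop/1.0", "--proxy-url", "http://proxy.example.com:8080"]
    }
  }
}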
fetch
Fetches a URL from the internet and optionally extracts its contents as markdown. This tool provides internet access capabilities with intelligent content processing.
Parameters:
url (string): The URL to fetch
max_length (optional number): Maximum number of characters to return (default: 5000, max: 1,000,000)
start_index (optional number): Starting character index for content extraction (default: 0)
raw (optional boolean): Return raw HTML content without markdown conversion (default: false)
Example Request:
{
"url": "https://example.com/article",
"max_length": 10000,
"start_index": 0,
"raw": false
}
Example Response:
{
"content": [
{
"type": "text",
"text": "Contents of https://example.com/article:\n\n# Article Title\n\nThis is the converted markdown content..."
}
]
}
Content Truncation:
When content exceeds the max_length limit, the response includes continuation instructions:
Content truncated. Call the fetch tool with a start_index of 5000 to get more content.
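For reference, the examples in this document show only the tool arguments. Over stdio the server speaks standard MCP JSON-RPC, so a complete tools/call request wraps those arguments in the usual envelope; the id and framing below are generic MCP conventions, not specific to this server. Continuing from the truncation message above:
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "fetch",
    "arguments": {
      "url": "https://example.com/article",
      "max_length": 5000,
      "start_index": 5000
    }
  }
}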
Robots.txt Compliance:
The server automatically checks robots.txt before fetching autonomously; use the --ignore-robots-txt flag to bypass these restrictions.
fetch
Manual URL fetching prompt that retrieves and processes web content for immediate use in conversations.
Parameters:
url (string): The URL to fetch
Example Usage:
Use the fetch prompt with URL: https://news.example.com/latest
Response:
Returns a prompt message containing the fetched and processed content, ready for use in the conversation context.
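Programmatic clients invoke the prompt through a standard MCP prompts/get request; the envelope below is a generic sketch of that call (prompt argument values are strings per the MCP spec):
{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "prompts/get",
  "params": {
    "name": "fetch",
    "arguments": {
      "url": "https://news.example.com/latest"
    }
  }
}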
mcp-server-fetch [OPTIONS]
Options:
--user-agent <USER_AGENT> Custom User-Agent string to use for requests
--ignore-robots-txt Ignore robots.txt restrictions
--proxy-url <PROXY_URL> Proxy URL to use for requests (e.g., http://proxy:8080)
-h, --help Print help information
-V, --version Print version information
LOG_LEVEL: Set logging level (trace, debug, info, warn, error)
The server uses different user agents depending on the context:
Autonomous requests: ModelContextProtocol/1.0 (Autonomous; +https://github.com/modelcontextprotocol/servers)
User-specified requests: ModelContextProtocol/1.0 (User-Specified; +https://github.com/modelcontextprotocol/servers)
The server also implements several security measures, including the robots.txt compliance described above.
Once configured, you can ask Claude:
"Fetch the latest news from https://news.example.com"
"Get the content from this documentation page: https://docs.example.com/api"
"Retrieve the raw HTML from https://example.com without markdown conversion"
"Fetch the first 2000 characters from this long article: https://blog.example.com/long-post"
# Test the server interactively
npx @modelcontextprotocol/inspector mcp-server-fetch
# Try these operations:
# 1. Use fetch tool with different URLs
# 2. Test content truncation with max_length
# 3. Try raw HTML mode
# 4. Test robots.txt compliance
# 5. Use the fetch prompt for immediate content
Fetching Large Content in Chunks:
{
"url": "https://example.com/large-document",
"max_length": 5000,
"start_index": 0
}
Follow up with:
{
"url": "https://example.com/large-document",
"max_length": 5000,
"start_index": 5000
}
Raw HTML Extraction:
{
"url": "https://example.com/complex-page",
"raw": true,
"max_length": 10000
}
Custom Configuration:
# Production setup with custom user agent and proxy
mcp-server-fetch \
--user-agent "MyCompany-AI/2.0 (+https://mycompany.com/bot)" \
--proxy-url "http://corporate-proxy:8080"
The server provides detailed error messages for common issues.
Example Error Response:
{
"error": {
"code": -32602,
"message": "Robots.txt disallows autonomous fetching of this URL",
"data": {
"url": "https://example.com/restricted",
"robots_txt_url": "https://example.com/robots.txt",
"user_agent": "ModelContextProtocol/1.0 (Autonomous)"
}
}
}
Run the comprehensive test suite:
# Run all tests
cargo test
# Run specific test categories
cargo test fetch_service
cargo test validation
cargo test server
# Run with coverage
cargo tarpaulin --out html
# Test with real URLs (requires internet)
cargo test --features integration-tests
# Test robots.txt compliance
cargo test robots_txt_tests
# Test content processing
cargo test content_processing
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
Build and test locally with cargo build and cargo test. This project follows SOLID principles and Domain-Driven Design.
Run cargo fmt and cargo clippy before submitting.
This project is licensed under the MIT License - see the LICENSE file for details.