backdisco

crates.io: backdisco, version 0.3.0 (published 2026-01-23 by johnnyDEP)
Description: Discover backend origins from CDN frontends using LLM-assisted pattern analysis and brute force enumeration

README

backDisco

backDisco is a tool that discovers exposed backend origins from CDN frontends using LLM-assisted pattern analysis and brute force enumeration.

Overview

Given a known pattern of CDN frontend and backend hostname pairs, backDisco uses an LLM to identify naming patterns and then applies those patterns to discover additional backend origins from a list of target frontends. It can also perform brute force subdomain enumeration based on extracted patterns.

Features

  • Multi-API LLM Support: Automatic detection and support for OpenAI-compatible, Ollama, and Anthropic APIs
  • LLM-Powered Pattern Analysis: Automatically identifies naming patterns between frontend and backend hostnames
  • Pattern-Based Discovery: Applies discovered patterns to target frontends to generate backend candidates
  • Position-Aware Candidate Generation: Parses hostnames by both '.' and '-' delimiters for context-aware expansion
  • Batched LLM Expansion: Efficiently expands word lists using configurable batch processing
  • SAN Extraction: Extracts Subject Alternative Names from target certificates to discover additional hosts
  • Brute Force Enumeration: Generates subdomain candidates based on backend URL patterns with structured generation
  • LLM Word Expansion: Uses LLM to generate related words for brute force wordlists at specific positions
  • Model Selection: Interactive model selection when model is not specified
  • Concurrent Verification: Efficiently tests candidates with configurable concurrency
  • DNS and HTTP Verification: Validates discovered backends via DNS resolution and HTTP/HTTPS checks

Requirements

  • Rust (edition 2021)
  • Access to an LLM API endpoint (OpenAI-compatible, Ollama, or Anthropic)
  • Network access to target hosts

Installation

From crates.io

cargo install backdisco

From source

git clone <repository-url>
cd backdisco
cargo build --release

The binary will be located at target/release/backdisco.

Configuration

backDisco supports multiple LLM API types with automatic detection:

  • OpenAI-compatible APIs: Standard OpenAI API format (e.g., https://api.openai.com/v1)
  • Ollama: Local or remote Ollama instances (e.g., http://localhost:11434/v1 or http://localhost:11434/api)
  • Anthropic: Claude API endpoints (e.g., https://api.anthropic.com/v1)

The LLM endpoint URL and model can be specified via command-line arguments:

  • --llmurl: LLM API base URL (defaults to http://localhost:11434/v1)
  • --model: Model name (if not provided, the tool will fetch available models and prompt for selection)

The API type is automatically detected from the URL using pattern matching, so you don't need to specify it manually.
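The README does not spell out the detection rules, but a URL-based heuristic of this kind can be sketched as follows (the function name and match rules here are illustrative assumptions, not backdisco's actual code):

```python
def detect_api_type(url: str) -> str:
    """Guess the LLM API flavor from its base URL (illustrative heuristic)."""
    u = url.lower()
    if "anthropic" in u:
        return "anthropic"
    # Ollama's default port is 11434; it also serves a native /api route
    if ":11434" in u or u.rstrip("/").endswith("/api"):
        return "ollama"
    # Fall back to the OpenAI-compatible format
    return "openai"

print(detect_api_type("https://api.anthropic.com/v1"))  # anthropic
print(detect_api_type("http://localhost:11434/v1"))     # ollama
print(detect_api_type("https://api.openai.com/v1"))     # openai
```

Because detection keys off the URL alone, pointing `--llmurl` at a non-default Ollama port may require the `/api` path form for correct detection.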

Usage

Basic Usage

backdisco \
  --front "cdn.example.com" \
  --back "api.internal.example.com" \
  --targets targets.txt

With Custom LLM Configuration

Specify a custom LLM endpoint and model:

backdisco \
  --front "cdn.example.com" \
  --back "api.internal.example.com" \
  --targets targets.txt \
  --llmurl "https://api.openai.com/v1" \
  --model "gpt-4"

If --model is not specified, the tool will fetch available models and prompt for selection:

backdisco \
  --front "cdn.example.com" \
  --back "api.internal.example.com" \
  --targets targets.txt \
  --llmurl "http://localhost:11434/v1"

With SAN Extraction

Extract Subject Alternative Names from target certificates:

backdisco \
  --front "cdn.example.com" \
  --back "api.internal.example.com" \
  --targets targets.txt \
  --extract-sans

With Brute Force Enumeration

Enable brute force subdomain enumeration with position-aware expansion:

backdisco \
  --front "cdn.example.com" \
  --back "api.internal.example.com" \
  --targets targets.txt \
  --brute \
  --llm-expand 5 \
  --llm-batch-size 20

The --brute flag enables structured candidate generation that:

  • Parses hostnames by both '.' and '-' delimiters to preserve positional context
  • Expands words at each position using LLM (e.g., "dev" → ["dev", "prod", "test", "staging"])
  • Generates candidates using cartesian products of position expansions
  • Processes expansions in batches for efficiency
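The dual-delimiter parsing step above can be sketched like this (a simplified stand-in for the crate's parser; the function and its signature are assumptions for illustration):

```python
def parse_structure(hostname: str, base_domain: str) -> list[str]:
    """Split the subdomain part of a hostname into positional words,
    breaking on both '.' and '-' so each word keeps its position."""
    sub = hostname[: -len(base_domain)].rstrip(".")
    words = []
    for label in sub.split("."):
        words.extend(label.split("-"))
    return words

print(parse_structure("api-v2.internal.example.com", "example.com"))
# ['api', 'v2', 'internal']
```

Keeping `api` and `v2` as separate positions is what lets the LLM expand each one with position-specific context (service names vs. version tags) rather than treating `api-v2` as a single opaque token.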

Complete Example

backdisco \
  --front "cdn.example.com" \
  --back "api.internal.example.com" \
  --targets targets.txt \
  --output results.txt \
  --llmurl "http://localhost:11434/v1" \
  --model "llama2" \
  --extract-sans \
  --brute \
  --llm-expand 5 \
  --llm-batch-size 20 \
  --max-depth 3 \
  --concurrency 50 \
  --timeout 5 \
  --gen-wordlist-output candidates.txt \
  --verbose 2

Command-Line Options

Option Short Description
--front -f Known frontend hostname (required)
--back -b Known backend hostname (required)
--targets -t File with target frontends, one per line (required)
--output -o Output file (defaults to stdout)
--verbose -v Verbosity level 0-2 (default: 1)
--dns-only Skip HTTP checks, DNS only
--timeout HTTP timeout in seconds (default: 5)
--concurrency Concurrent check limit (default: 50)
--extract-sans Extract SANs from target certificates and add to target list
--no-sans Skip SAN extraction (opposite of --extract-sans)
--brute Enable brute force subdomain enumeration with position-aware expansion
--llm-expand Number of related words to generate per seed word using LLM (default: 5, 0 to disable)
--llm-batch-size Batch size for LLM position expansion (default: 20)
--max-depth Override maximum subdomain depth for brute forcing (default: derived from backend URL)
--llmurl LLM API base URL (default: http://localhost:11434/v1)
--model LLM model name (if not provided, will fetch and prompt for selection)
--gen-wordlist-output Output file for generated candidate list (one hostname per line)

How It Works

  1. LLM Configuration:

    • If --llmurl is provided, uses that endpoint; otherwise uses default
    • If --model is provided, uses that model; otherwise fetches available models and prompts for selection
    • Automatically detects API type (OpenAI-compatible, Ollama, or Anthropic) from the URL
  2. Pre-flight Check: Verifies connectivity to the LLM endpoint and model availability

  3. Backend Analysis:

    • Extracts subdomain words, depth, and base domain from the known backend hostname
    • Parses hostname structure by both '.' and '-' delimiters for position-aware processing
  4. Target Loading: Reads target frontends from the specified file

  5. SAN Extraction (optional): Extracts Subject Alternative Names from target certificates to discover additional hosts and wildcard patterns

  6. Pattern Discovery: Queries the LLM to identify naming patterns between the frontend and backend hostnames

  7. Pattern Application: Applies discovered patterns to target frontends to generate backend candidates

  8. Brute Force (optional):

    • Parses hostname structures from backend, targets, and SANs
    • Groups words by position across all structures
    • Expands words at each position using batched LLM calls (context-aware: environment, service type, etc.)
    • Generates structured candidates using cartesian products of position expansions
    • Filters candidates to match backend base domain
  9. Verification: Tests all candidates via DNS resolution and HTTP/HTTPS checks with configurable concurrency

  10. Output: Reports live backends with DNS and HTTP status information
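The base-domain filter in step 8 amounts to a suffix check; a minimal sketch (a hypothetical helper, not the crate's actual function):

```python
def matches_base_domain(candidate: str, base: str) -> bool:
    """Keep only candidates inside the backend's base domain."""
    return candidate == base or candidate.endswith("." + base)

candidates = [
    "api.internal.example.com",
    "api.internal.evilexample.com",  # suffix trick: rejected by the "." check
    "example.com",
]
print([c for c in candidates if matches_base_domain(c, "example.com")])
# ['api.internal.example.com', 'example.com']
```

Note the prepended `.` in the suffix test: without it, `evilexample.com` would slip past a naive `endswith(base)` check.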

Output Format

The tool provides colored output with different verbosity levels:

  • Level 0: Minimal output
  • Level 1 (default): Standard output with progress and results
  • Level 2: Detailed debug information including pattern details, SAN extraction results, and failed candidates

Example output:

[+] LIVE: api.internal.example.com
    DNS:   192.168.1.100
    HTTPS: 200 OK
    HTTP:  301 Moved Permanently

Examples

Example 1: Simple Pattern Discovery

Given:

  • Frontend: cdn.example.com
  • Backend: api.internal.example.com
  • Pattern discovered: Replace cdn with api.internal

Applied to targets:

  • cdn.target1.com → api.internal.target1.com
  • cdn.target2.com → api.internal.target2.com
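A substitution pattern like this can be sketched as a single string replacement (illustrative only; backdisco's real patterns are produced by the LLM and may be richer than a literal find/replace):

```python
def apply_pattern(frontend: str, find: str, replace: str) -> str:
    """Apply a discovered find/replace pattern to a target frontend."""
    return frontend.replace(find, replace, 1)

for target in ["cdn.target1.com", "cdn.target2.com"]:
    print(apply_pattern(target, "cdn", "api.internal"))
# api.internal.target1.com
# api.internal.target2.com
```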

Example 2: SAN Extraction

If cdn.example.com has a certificate with SANs:

  • *.internal.example.com
  • api.internal.example.com
  • admin.internal.example.com

These will be added to the target list for pattern application.
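For reference, extracting DNS names from a parsed certificate looks roughly like this (a sketch against the `subjectAltName` structure Python's `ssl.SSLSocket.getpeercert()` returns; backdisco performs its own certificate fetch in Rust):

```python
def dns_sans(cert: dict) -> list[str]:
    """Pull DNS-type Subject Alternative Names out of a peercert dict."""
    return [value for (kind, value) in cert.get("subjectAltName", ()) if kind == "DNS"]

cert = {
    "subjectAltName": (
        ("DNS", "*.internal.example.com"),
        ("DNS", "api.internal.example.com"),
        ("DNS", "admin.internal.example.com"),
    )
}
print(dns_sans(cert))
# ['*.internal.example.com', 'api.internal.example.com', 'admin.internal.example.com']
```

Wildcard entries like `*.internal.example.com` are useful as pattern hints even though they are not directly resolvable hostnames.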

Example 3: Position-Aware Brute Force with LLM Expansion

Backend: api-v2.internal.example.com

  • Parsed structure: segments ["api", "v2", "internal"], base "example.com"
  • Position-aware expansion:
    • Position 0 (service): ["api"] → ["api", "rest", "graphql", "rpc", "service", "gateway"]
    • Position 1 (version): ["v2"] → ["v2", "v1", "v3", "beta", "alpha", "prod"]
    • Position 2 (environment): ["internal"] → ["internal", "int", "private", "corp"]
  • Generated candidates (cartesian product):
    • api.v2.internal.example.com, rest.v1.internal.example.com, graphql.prod.int.example.com, etc.
  • All candidates filtered to match base domain example.com
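The cartesian expansion in this example can be sketched with `itertools.product` (using a toy position list rather than the full LLM-expanded one):

```python
from itertools import product

# Trimmed position expansions for illustration
positions = [
    ["api", "rest"],        # position 0: service
    ["v2", "v1"],           # position 1: version
    ["internal", "int"],    # position 2: environment
]
base = "example.com"

candidates = [".".join(words) + "." + base for words in product(*positions)]
print(len(candidates))   # 8 (2 * 2 * 2)
print(candidates[0])     # api.v2.internal.example.com
```

The candidate count is the product of the per-position list sizes, which is why `--llm-expand` (words per seed) directly controls how large the brute-force space grows.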

License

This project is licensed under the MIT License. See the LICENSE file for details.

Contributing

Contributions are welcome! Please ensure your code follows Rust best practices and includes appropriate error handling.
