backdisco

crates.io: backdisco, version 0.3.0 (published 2026-01-23 by johnnyDEP)
Description: Discover backend origins from CDN frontends using LLM-assisted pattern analysis and brute force enumeration

README

backDisco

backDisco is a tool that discovers exposed backend origins from CDN frontends using LLM-assisted pattern analysis and brute force enumeration.

Overview

Given a known pattern of CDN frontend and backend hostname pairs, backDisco uses an LLM to identify naming patterns and then applies those patterns to discover additional backend origins from a list of target frontends. It can also perform brute force subdomain enumeration based on extracted patterns.

Features

  • Multi-API LLM Support: Automatic detection and support for OpenAI-compatible, Ollama, and Anthropic APIs
  • LLM-Powered Pattern Analysis: Automatically identifies naming patterns between frontend and backend hostnames
  • Pattern-Based Discovery: Applies discovered patterns to target frontends to generate backend candidates
  • Position-Aware Candidate Generation: Parses hostnames by both '.' and '-' delimiters for context-aware expansion
  • Batched LLM Expansion: Efficiently expands word lists using configurable batch processing
  • SAN Extraction: Extracts Subject Alternative Names from target certificates to discover additional hosts
  • Brute Force Enumeration: Generates subdomain candidates based on backend URL patterns with structured generation
  • LLM Word Expansion: Uses LLM to generate related words for brute force wordlists at specific positions
  • Model Selection: Interactive model selection when model is not specified
  • Concurrent Verification: Efficiently tests candidates with configurable concurrency
  • DNS and HTTP Verification: Validates discovered backends via DNS resolution and HTTP/HTTPS checks

Requirements

  • Rust (edition 2021)
  • Access to an LLM API endpoint (OpenAI-compatible, Ollama, or Anthropic)
  • Network access to target hosts

Installation

From crates.io

cargo install backdisco

From source

git clone <repository-url>
cd backdisco
cargo build --release

The binary will be located at target/release/backdisco.

Configuration

backDisco supports multiple LLM API types with automatic detection:

  • OpenAI-compatible APIs: Standard OpenAI API format (e.g., https://api.openai.com/v1)
  • Ollama: Local or remote Ollama instances (e.g., http://localhost:11434/v1 or http://localhost:11434/api)
  • Anthropic: Claude API endpoints (e.g., https://api.anthropic.com/v1)

The LLM endpoint URL and model can be specified via command-line arguments:

  • --llmurl: LLM API base URL (defaults to http://localhost:11434/v1)
  • --model: Model name (if not provided, the tool will fetch available models and prompt for selection)

The API type is automatically detected from the URL using pattern matching, so you don't need to specify it manually.
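The README does not spell out the detection rules, but a URL-based heuristic of this kind can be sketched as follows (the function name and match rules here are illustrative assumptions, not backdisco's actual code):

```python
def detect_api_type(url: str) -> str:
    """Guess the LLM API flavor from its base URL (illustrative heuristic)."""
    u = url.lower()
    if "anthropic" in u:
        return "anthropic"
    # Ollama's default port is 11434; it also serves a native /api route
    if ":11434" in u or u.rstrip("/").endswith("/api"):
        return "ollama"
    # Fall back to the OpenAI-compatible format
    return "openai"

print(detect_api_type("https://api.anthropic.com/v1"))  # anthropic
print(detect_api_type("http://localhost:11434/v1"))     # ollama
print(detect_api_type("https://api.openai.com/v1"))     # openai
```

Because detection keys off the URL alone, pointing `--llmurl` at a non-default Ollama port may require the `/api` path form for correct detection.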

Usage

Basic Usage

backdisco \
  --front "cdn.example.com" \
  --back "api.internal.example.com" \
  --targets targets.txt

With Custom LLM Configuration

Specify a custom LLM endpoint and model:

backdisco \
  --front "cdn.example.com" \
  --back "api.internal.example.com" \
  --targets targets.txt \
  --llmurl "https://api.openai.com/v1" \
  --model "gpt-4"

If --model is not specified, the tool will fetch available models and prompt for selection:

backdisco \
  --front "cdn.example.com" \
  --back "api.internal.example.com" \
  --targets targets.txt \
  --llmurl "http://localhost:11434/v1"

With SAN Extraction

Extract Subject Alternative Names from target certificates:

backdisco \
  --front "cdn.example.com" \
  --back "api.internal.example.com" \
  --targets targets.txt \
  --extract-sans

With Brute Force Enumeration

Enable brute force subdomain enumeration with position-aware expansion:

backdisco \
  --front "cdn.example.com" \
  --back "api.internal.example.com" \
  --targets targets.txt \
  --brute \
  --llm-expand 5 \
  --llm-batch-size 20

The --brute flag enables structured candidate generation that:

  • Parses hostnames by both '.' and '-' delimiters to preserve positional context
  • Expands words at each position using LLM (e.g., "dev" → ["dev", "prod", "test", "staging"])
  • Generates candidates using cartesian products of position expansions
  • Processes expansions in batches for efficiency
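The dual-delimiter parsing step above can be sketched like this (a simplified stand-in for the crate's parser; the function and its signature are assumptions for illustration):

```python
def parse_structure(hostname: str, base_domain: str) -> list[str]:
    """Split the subdomain part of a hostname into positional words,
    breaking on both '.' and '-' so each word keeps its position."""
    sub = hostname[: -len(base_domain)].rstrip(".")
    words = []
    for label in sub.split("."):
        words.extend(label.split("-"))
    return words

print(parse_structure("api-v2.internal.example.com", "example.com"))
# ['api', 'v2', 'internal']
```

Keeping `api` and `v2` as separate positions is what lets the LLM expand each one with position-specific context (service names vs. version tags) rather than treating `api-v2` as a single opaque token.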

Complete Example

backdisco \
  --front "cdn.example.com" \
  --back "api.internal.example.com" \
  --targets targets.txt \
  --output results.txt \
  --llmurl "http://localhost:11434/v1" \
  --model "llama2" \
  --extract-sans \
  --brute \
  --llm-expand 5 \
  --llm-batch-size 20 \
  --max-depth 3 \
  --concurrency 50 \
  --timeout 5 \
  --gen-wordlist-output candidates.txt \
  --verbose 2

Command-Line Options

Option Short Description
--front -f Known frontend hostname (required)
--back -b Known backend hostname (required)
--targets -t File with target frontends, one per line (required)
--output -o Output file (defaults to stdout)
--verbose -v Verbosity level 0-2 (default: 1)
--dns-only Skip HTTP checks, DNS only
--timeout HTTP timeout in seconds (default: 5)
--concurrency Concurrent check limit (default: 50)
--extract-sans Extract SANs from target certificates and add to target list
--no-sans Skip SAN extraction (opposite of --extract-sans)
--brute Enable brute force subdomain enumeration with position-aware expansion
--llm-expand Number of related words to generate per seed word using LLM (default: 5, 0 to disable)
--llm-batch-size Batch size for LLM position expansion (default: 20)
--max-depth Override maximum subdomain depth for brute forcing (default: derived from backend URL)
--llmurl LLM API base URL (default: http://localhost:11434/v1)
--model LLM model name (if not provided, will fetch and prompt for selection)
--gen-wordlist-output Output file for generated candidate list (one hostname per line)

How It Works

  1. LLM Configuration:

    • If --llmurl is provided, uses that endpoint; otherwise uses default
    • If --model is provided, uses that model; otherwise fetches available models and prompts for selection
    • Automatically detects API type (OpenAI-compatible, Ollama, or Anthropic) from the URL
  2. Pre-flight Check: Verifies connectivity to the LLM endpoint and model availability

  3. Backend Analysis:

    • Extracts subdomain words, depth, and base domain from the known backend hostname
    • Parses hostname structure by both '.' and '-' delimiters for position-aware processing
  4. Target Loading: Reads target frontends from the specified file

  5. SAN Extraction (optional): Extracts Subject Alternative Names from target certificates to discover additional hosts and wildcard patterns

  6. Pattern Discovery: Queries the LLM to identify naming patterns between the frontend and backend hostnames

  7. Pattern Application: Applies discovered patterns to target frontends to generate backend candidates

  8. Brute Force (optional):

    • Parses hostname structures from backend, targets, and SANs
    • Groups words by position across all structures
    • Expands words at each position using batched LLM calls (context-aware: environment, service type, etc.)
    • Generates structured candidates using cartesian products of position expansions
    • Filters candidates to match backend base domain
  9. Verification: Tests all candidates via DNS resolution and HTTP/HTTPS checks with configurable concurrency

  10. Output: Reports live backends with DNS and HTTP status information
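The base-domain filter in step 8 amounts to a suffix check; a minimal sketch (a hypothetical helper, not the crate's actual function):

```python
def matches_base_domain(candidate: str, base: str) -> bool:
    """Keep only candidates inside the backend's base domain."""
    return candidate == base or candidate.endswith("." + base)

candidates = [
    "api.internal.example.com",
    "api.internal.evilexample.com",  # suffix trick: rejected by the "." check
    "example.com",
]
print([c for c in candidates if matches_base_domain(c, "example.com")])
# ['api.internal.example.com', 'example.com']
```

Note the prepended `.` in the suffix test: without it, `evilexample.com` would slip past a naive `endswith(base)` check.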

Output Format

The tool provides colored output with different verbosity levels:

  • Level 0: Minimal output
  • Level 1 (default): Standard output with progress and results
  • Level 2: Detailed debug information including pattern details, SAN extraction results, and failed candidates

Example output:

[+] LIVE: api.internal.example.com
    DNS:   192.168.1.100
    HTTPS: 200 OK
    HTTP:  301 Moved Permanently

Examples

Example 1: Simple Pattern Discovery

Given:

  • Frontend: cdn.example.com
  • Backend: api.internal.example.com
  • Pattern discovered: Replace cdn with api.internal

Applied to targets:

  • cdn.target1.com → api.internal.target1.com
  • cdn.target2.com → api.internal.target2.com
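A substitution pattern like this can be sketched as a single string replacement (illustrative only; backdisco's real patterns are produced by the LLM and may be richer than a literal find/replace):

```python
def apply_pattern(frontend: str, find: str, replace: str) -> str:
    """Apply a discovered find/replace pattern to a target frontend."""
    return frontend.replace(find, replace, 1)

for target in ["cdn.target1.com", "cdn.target2.com"]:
    print(apply_pattern(target, "cdn", "api.internal"))
# api.internal.target1.com
# api.internal.target2.com
```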

Example 2: SAN Extraction

If cdn.example.com has a certificate with SANs:

  • *.internal.example.com
  • api.internal.example.com
  • admin.internal.example.com

These will be added to the target list for pattern application.
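For reference, extracting DNS names from a parsed certificate looks roughly like this (a sketch against the `subjectAltName` structure Python's `ssl.SSLSocket.getpeercert()` returns; backdisco performs its own certificate fetch in Rust):

```python
def dns_sans(cert: dict) -> list[str]:
    """Pull DNS-type Subject Alternative Names out of a peercert dict."""
    return [value for (kind, value) in cert.get("subjectAltName", ()) if kind == "DNS"]

cert = {
    "subjectAltName": (
        ("DNS", "*.internal.example.com"),
        ("DNS", "api.internal.example.com"),
        ("DNS", "admin.internal.example.com"),
    )
}
print(dns_sans(cert))
# ['*.internal.example.com', 'api.internal.example.com', 'admin.internal.example.com']
```

Wildcard entries like `*.internal.example.com` are useful as pattern hints even though they are not directly resolvable hostnames.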

Example 3: Position-Aware Brute Force with LLM Expansion

Backend: api-v2.internal.example.com

  • Parsed structure: segments ["api", "v2", "internal"], base "example.com"
  • Position-aware expansion:
    • Position 0 (service): ["api"] → ["api", "rest", "graphql", "rpc", "service", "gateway"]
    • Position 1 (version): ["v2"] → ["v2", "v1", "v3", "beta", "alpha", "prod"]
    • Position 2 (environment): ["internal"] → ["internal", "int", "private", "corp"]
  • Generated candidates (cartesian product):
    • api.v2.internal.example.com, rest.v1.internal.example.com, graphql.prod.int.example.com, etc.
  • All candidates filtered to match base domain example.com
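The cartesian expansion in this example can be sketched with `itertools.product` (using a toy position list rather than the full LLM-expanded one):

```python
from itertools import product

# Trimmed position expansions for illustration
positions = [
    ["api", "rest"],        # position 0: service
    ["v2", "v1"],           # position 1: version
    ["internal", "int"],    # position 2: environment
]
base = "example.com"

candidates = [".".join(words) + "." + base for words in product(*positions)]
print(len(candidates))   # 8 (2 * 2 * 2)
print(candidates[0])     # api.v2.internal.example.com
```

The candidate count is the product of the per-position list sizes, which is why `--llm-expand` (words per seed) directly controls how large the brute-force space grows.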

License

This project is licensed under the MIT License. See the LICENSE file for details.

Contributing

Contributions are welcome! Please ensure your code follows Rust best practices and includes appropriate error handling.
