| Crates.io | scribe-scaling |
| lib.rs | scribe-scaling |
| version | 0.5.1 |
| created_at | 2025-09-13 19:29:18.426355+00 |
| updated_at | 2025-12-01 20:16:27.527825+00 |
| description | High-performance scaling optimizations for large repositories |
| homepage | https://github.com/sibyllinesoft/scribe |
| repository | https://github.com/sibyllinesoft/scribe |
| max_upload_size | |
| id | 1838057 |
| size | 311,319 |
High-performance scaling optimizations for large repository analysis in Scribe.
scribe-scaling is the performance and optimization layer that lets Scribe handle repositories of any size, from small projects to enterprise codebases with 100k+ files. It implements a streaming architecture, intelligent caching, parallel processing, and context positioning to keep analysis under a second on small repositories and under 30 seconds on massive ones.
```
Repository → Streaming Scan → Parallel Analysis → Selection + Positioning → Caching  →  Output
    ↓              ↓                  ↓                      ↓                  ↓           ↓
 Metadata     Progressive      Multi-threaded         3-Tier Context        Blake3     Optimized
  First         Loading          AST/Scoring         HEAD/MIDDLE/TAIL      Signature     Bundle
                   ↓                  ↓                                        ↓
             Backpressure       Work Stealing                              LRU + Disk
               Control          Load Balance                                 Cache
```
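The "Blake3 Signature" stage in the diagram refers to content-hash cache keys: a file's cached analysis stays valid until its bytes change. The sketch below illustrates that idea with the `blake3` crate; the `cache_key` helper is illustrative, not part of scribe-scaling's API.

```rust
use std::path::Path;

// Hypothetical helper: derive a cache key from file contents, so cached
// analysis results can be reused until the file's bytes actually change.
fn cache_key(path: &Path) -> std::io::Result<String> {
    let bytes = std::fs::read(path)?;
    let signature = blake3::hash(&bytes); // fast content hash
    Ok(format!("{}:{}", path.display(), signature.to_hex()))
}
```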
- `ScalingSelector`: Main entry point for scaled repository processing
- `StreamingScanner`: Progressive file system traversal
- `ParallelAnalyzer`: Multi-threaded file analysis
- `ContextPositioner`: Optimizes file order for LLM recall
- `CacheManager`: Persistent caching system
- `AdaptiveConfig`: Auto-tuning configuration
```rust
use scribe_scaling::{PerformanceTarget, ScalingConfig, ScalingSelector};

let config = ScalingConfig {
    performance_target: PerformanceTarget::Balanced,
    enable_caching: true,
    enable_positioning: true,
    ..Default::default()
};

let selector = ScalingSelector::new(config);
let result = selector.select_and_process(repo_path).await?;

println!(
    "Analyzed {} files in {:?}",
    result.files_processed, result.elapsed_time
);
println!("Cache hit rate: {:.1}%", result.cache_stats.hit_rate * 100.0);
```
```rust
use scribe_scaling::{ContextPositioningConfig, ScalingConfig, ScalingSelector};

let mut config = ScalingConfig::default();
config.positioning = ContextPositioningConfig {
    enable_positioning: true,
    head_percentage: 0.20, // 20% high-priority files at HEAD
    tail_percentage: 0.20, // 20% core files at TAIL
    centrality_weight: 0.5,
    query_relevance_weight: 0.3,
    relatedness_weight: 0.2,
};

let selector = ScalingSelector::new(config);
let result = selector
    .select_and_process_with_query(
        repo_path,
        Some("authentication middleware"), // query hint
    )
    .await?;

if result.has_context_positioning() {
    let ordered = result.get_optimally_ordered_files();
    println!("HEAD: {} files", ordered.head.len());
    println!("MIDDLE: {} files", ordered.middle.len());
    println!("TAIL: {} files", ordered.tail.len());
}
```
```rust
use futures::StreamExt; // for `.next()` on the file stream
use indicatif::{ProgressBar, ProgressStyle};
use scribe_scaling::{ScanConfig, StreamingScanner};

let scanner = StreamingScanner::new(ScanConfig::default());

let progress = ProgressBar::new_spinner();
progress.set_style(
    ProgressStyle::default_spinner()
        .template("{spinner} [{elapsed}] {msg} ({pos} files)")
        .expect("valid progress template"), // indicatif 0.17+ returns a Result here
);

let threshold = 0.75; // example score cutoff
let mut file_stream = scanner.scan_streaming(repo_path).await?;

while let Some(file) = file_stream.next().await {
    progress.set_message(format!("Scanning {}", file.path.display()));
    progress.inc(1);

    // Process file metadata immediately; load and analyze content
    // only for high-scoring files.
    if file.score > threshold {
        let content = file.load_content().await?;
        analyze(content).await?; // your analysis step
    }
}

progress.finish_with_message("Scan complete");
```
```rust
use scribe_scaling::{PerformanceTarget, ScalingConfig};

// Fast mode: prioritize speed over completeness.
let fast_config = ScalingConfig {
    performance_target: PerformanceTarget::Speed,
    parallel_threads: Some(num_cpus::get()),
    cache_size_mb: 500,
    max_file_size: 500_000, // skip large files
    enable_positioning: false,
    ..Default::default()
};

// Quality mode: prioritize completeness and quality.
let quality_config = ScalingConfig {
    performance_target: PerformanceTarget::Quality,
    parallel_threads: Some(4), // fewer threads, more thorough
    cache_size_mb: 2000,
    max_file_size: 5_000_000,
    enable_positioning: true,
    ..Default::default()
};

// Balanced mode (default).
let balanced_config = ScalingConfig::default();
```
```rust
use scribe_scaling::cache::{CacheConfig, CacheManager};
use std::path::PathBuf;

let cache_config = CacheConfig {
    cache_dir: PathBuf::from(".scribe-cache"),
    max_size_mb: 1000,
    compression_level: 6,
    ttl_hours: 24,
};

let cache = CacheManager::new(cache_config)?;

// Check cache status.
let stats = cache.stats();
println!("Cache entries: {}", stats.entry_count);
println!("Total size: {} MB", stats.size_mb);
println!("Hit rate: {:.1}%", stats.hit_rate * 100.0);

// Clear old entries.
cache.evict_expired()?;

// Clear the entire cache.
cache.clear()?;
```
| Size | Files | Time Target | Memory Target | Strategy |
|---|---|---|---|---|
| Small | ≤1k | <1s | <50MB | In-memory, minimal caching |
| Medium | 1k-10k | <5s | <200MB | Parallel + caching |
| Large | 10k-100k | <15s | <1GB | Streaming + aggressive caching |
| Enterprise | 100k+ | <30s | <2GB | Full optimization suite |
The time and memory targets above are based on internal benchmarks.
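As a rough sketch of how these tiers could map onto a `ScalingConfig` (the thresholds and values below are illustrative, not built-in behavior; this is roughly the kind of decision `AdaptiveConfig` is described as automating):

```rust
use scribe_scaling::{PerformanceTarget, ScalingConfig};

// Illustrative mapping from repository size to a configuration, mirroring the
// tiers in the table above. The thresholds and chosen values are examples only.
fn config_for_repo_size(file_count: usize) -> ScalingConfig {
    let (target, cache_size_mb) = match file_count {
        0..=1_000 => (PerformanceTarget::Speed, 100),             // small: minimal caching
        1_001..=10_000 => (PerformanceTarget::Balanced, 500),     // medium: parallel + caching
        10_001..=100_000 => (PerformanceTarget::Balanced, 1_000), // large: aggressive caching
        _ => (PerformanceTarget::Balanced, 2_000),                // enterprise: full optimization suite
    };
    ScalingConfig {
        performance_target: target,
        cache_size_mb,
        ..Default::default()
    }
}
```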
Transformer models don't attend equally to all tokens: content near the beginning and end of a long context is generally recalled more reliably than content buried in the middle (the "lost in the middle" effect).
Strategy: place the most important files where LLMs attend best, using the steps below (a sketch follows the list).
1. Compute PageRank centrality for all files
2. Score query relevance (if a query is provided)
3. Combined score = `centrality_weight * centrality + query_relevance_weight * relevance`
4. Sort by combined score
5. Top 20% → HEAD
6. Middle 60% → MIDDLE
7. Bottom 20% with highest centrality → TAIL
8. Group related files within each tier
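A minimal sketch of steps 3–7 (step 8's grouping is omitted). The `FileScore` and `Tiers` types and the `assign_tiers` function are illustrative stand-ins, not the crate's actual types; only the scoring formula and the 20/60/20 split come from the steps above.

```rust
use std::path::PathBuf;

// Illustrative types; the crate's real types differ.
struct FileScore {
    path: PathBuf,
    centrality: f64,      // PageRank centrality (step 1)
    query_relevance: f64, // query relevance, 0.0 when no query was given (step 2)
}

struct Tiers {
    head: Vec<FileScore>,
    middle: Vec<FileScore>,
    tail: Vec<FileScore>,
}

fn assign_tiers(
    mut files: Vec<FileScore>,
    centrality_weight: f64,
    query_relevance_weight: f64,
) -> Tiers {
    let n = files.len();
    let head_n = ((n as f64) * 0.20).ceil() as usize;
    let tail_n = ((n as f64) * 0.20).floor() as usize;

    // Steps 3–4: sort by combined score, highest first.
    let combined = |f: &FileScore| {
        centrality_weight * f.centrality + query_relevance_weight * f.query_relevance
    };
    files.sort_by(|a, b| combined(b).total_cmp(&combined(a)));

    // Step 5: the top 20% go to HEAD.
    let mut rest = files.split_off(head_n.min(n));
    let head = files;

    // Step 7: of the remaining files, the highest-centrality ones go to TAIL.
    rest.sort_by(|a, b| b.centrality.total_cmp(&a.centrality));
    let middle = rest.split_off(tail_n.min(rest.len()));
    let tail = rest;

    // Step 6: everything else stays in MIDDLE (step 8's related-file grouping is omitted here).
    Tiers { head, middle, tail }
}
```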
ScalingConfig

| Field | Type | Default | Description |
|---|---|---|---|
| `performance_target` | `PerformanceTarget` | `Balanced` | `Speed`, `Balanced`, or `Quality` |
| `parallel_threads` | `Option<usize>` | CPU count | Thread pool size |
| `cache_size_mb` | `usize` | `1000` | Maximum cache size |
| `enable_caching` | `bool` | `true` | Enable persistent cache |
| `enable_positioning` | `bool` | `true` | Enable context positioning |
| `max_file_size` | `usize` | `1_000_000` | Skip files larger than this |
ContextPositioningConfig

| Field | Type | Default | Description |
|---|---|---|---|
| `enable_positioning` | `bool` | `true` | Enable/disable positioning |
| `head_percentage` | `f64` | `0.20` | Percentage for HEAD section |
| `tail_percentage` | `f64` | `0.20` | Percentage for TAIL section |
| `centrality_weight` | `f64` | `0.4` | Weight for centrality scoring |
| `query_relevance_weight` | `f64` | `0.3` | Weight for query matching |
| `relatedness_weight` | `f64` | `0.3` | Weight for file grouping |
scribe-scaling is used by:
- `docs/context-positioning.md`: Detailed context positioning documentation
- `scribe-selection`: File selection algorithms that scaling optimizes
- `scribe-graph`: PageRank computation used by positioning
- `../../WHY_SCRIBE.md`: Philosophy on performance and intelligence