concept-analyzer

version: 0.1.1
created_at: 2025-07-24 21:49:50 UTC
updated_at: 2025-07-24 22:00:28 UTC
description: A unified pipeline that analyzes code repositories and extracts first-principles instructions for AI agents
homepage: https://github.com/CodifyInc/concept-analyzer
repository: https://github.com/CodifyInc/concept-analyzer
documentation: https://docs.rs/concept-analyzer
owner: ArionHardison
size: 148,248

README

Concept Analyzer - First Principles Extractor

A unified pipeline that analyzes code repositories and extracts first-principles instructions for AI agents to recreate systems from scratch.

Features

  • Single API Call: Point at a repository and get results in S3
  • Parallel Processing: Uses the pipelines crate for efficient parallel analysis
  • First-Principles Output: Strips away implementation details to focus on core concepts
  • Demo-Friendly: Detailed progress logging for presentations
  • AI-Optimized: Output designed for agent consumption

Installation

cargo build --release
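
To use the crate as a library in your own project, it can also be pulled in from the crates.io release:

cargo add concept-analyzer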

Usage

CLI

# Set environment variables
export LLM_API_KEY="your-openai-api-key"
export AWS_REGION="us-east-1"
export AWS_ACCESS_KEY_ID="your-access-key"
export AWS_SECRET_ACCESS_KEY="your-secret-key"

# Run analysis
cargo run --bin analyze-repo -- /path/to/repo my-bucket results/project.json

As a Library

use concept_analyzer::analyze_repository;
use std::path::Path;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Same key the CLI reads from LLM_API_KEY
    let api_key = std::env::var("LLM_API_KEY")?;

    let s3_url = analyze_repository(
        Path::new("/path/to/repo"),
        "my-bucket",
        "results/project.json",
        api_key,
        Some(8), // Use 8 workers
    ).await?;

    println!("Results published to {s3_url}");
    Ok(())
}

HTTP API

# Start the API server
cargo run --features api

# Make a request
curl -X POST http://localhost:3000/analyze \
  -H "Content-Type: application/json" \
  -d '{
    "repo_path": "/path/to/repo",
    "s3_bucket": "my-bucket",
    "s3_key": "results/project.json"
  }'
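
The same request can be issued programmatically. This is a minimal sketch, assuming the reqwest crate (with its json feature) and serde_json, posting the same payload to the default address shown above:

use serde_json::json;

#[tokio::main]
async fn main() -> Result<(), reqwest::Error> {
    let client = reqwest::Client::new();

    // Same payload as the curl example above
    let response = client
        .post("http://localhost:3000/analyze")
        .json(&json!({
            "repo_path": "/path/to/repo",
            "s3_bucket": "my-bucket",
            "s3_key": "results/project.json"
        }))
        .send()
        .await?;

    println!("status: {}", response.status());
    Ok(())
}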

Output Format

The S3 output contains a JSON document with:

{
  "project_name": "my-app",
  "core_purpose": "A web application that manages user tasks with real-time collaboration",
  "essential_concepts": [
    {
      "name": "TaskManager",
      "purpose": "Manages the lifecycle of user tasks",
      "core_responsibilities": ["Create tasks", "Update tasks", "Track status"],
      "interfaces": ["createTask()", "updateTask()", "deleteTask()"]
    }
  ],
  "concept_relationships": [
    {
      "from": "TaskManager",
      "to": "UserService",
      "relationship_type": "depends_on",
      "reason": "Needs user information for task assignment"
    }
  ],
  "implementation_order": ["Database", "UserService", "TaskManager"],
  "critical_gaps": [
    {
      "missing_functionality": "Task prioritization",
      "impact": "Users cannot order tasks effectively",
      "suggested_solution": "Implement scoring algorithm"
    }
  ],
  "rebuild_instructions": [
    {
      "step_number": 1,
      "concept": "Database",
      "implementation_details": "Set up PostgreSQL with user and task tables",
      "dependencies": [],
      "validation_criteria": ["Can connect", "Tables created"]
    }
  ]
}
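
If you fetch the report from S3 and want typed access in Rust, serde structs mirroring the sample above might look like this. The struct names are hypothetical and the field types are inferred from the example JSON, not taken from the crate's public API:

use serde::Deserialize;

#[derive(Debug, Deserialize)]
struct AnalysisReport {
    project_name: String,
    core_purpose: String,
    essential_concepts: Vec<Concept>,
    concept_relationships: Vec<Relationship>,
    implementation_order: Vec<String>,
    critical_gaps: Vec<Gap>,
    rebuild_instructions: Vec<RebuildStep>,
}

#[derive(Debug, Deserialize)]
struct Concept {
    name: String,
    purpose: String,
    core_responsibilities: Vec<String>,
    interfaces: Vec<String>,
}

#[derive(Debug, Deserialize)]
struct Relationship {
    from: String,
    to: String,
    relationship_type: String,
    reason: String,
}

#[derive(Debug, Deserialize)]
struct Gap {
    missing_functionality: String,
    impact: String,
    suggested_solution: String,
}

#[derive(Debug, Deserialize)]
struct RebuildStep {
    step_number: u32,
    concept: String,
    implementation_details: String,
    dependencies: Vec<String>,
    validation_criteria: Vec<String>,
}

Deserializing the downloaded document is then a one-liner: let report: AnalysisReport = serde_json::from_str(&body)?;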

Demo Mode

When running the CLI, you'll see detailed progress:

🎯 CONCEPT ANALYZER - First Principles Extractor
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

🔍 STARTING REPOSITORY ANALYSIS
  📁 Repository: /path/to/repo
  🪣 S3 Target: s3://my-bucket/results.json
  ⚙️  Workers: 8

📋 STAGE 1/6: File Collection
  └─ Scanning repository for source files...
  ✓ Collected 150 files in 3 batches (125ms)

🔬 STAGE 2-4: Parallel Analysis Pipeline
  ├─ Stage 2: Extracting abstractions from code
  ├─ Stage 3: Analyzing relationships between concepts
  └─ Stage 4: Processing with 8 parallel workers
  ✓ Found 25 abstractions and 45 relationships (2341ms)

🧬 STAGE 5/6: Synthesizing First Principles
  ├─ Extracting essential concepts
  ├─ Simplifying relationships
  ├─ Determining build order
  └─ Generating rebuild instructions
  ✓ Synthesis complete (1523ms)

☁️  STAGE 6/6: Publishing to S3
  └─ Uploading to s3://my-bucket/results.json
  ✓ Published successfully (234ms)

✅ ANALYSIS COMPLETE!
  ⏱️  Total time: 4.22s
  📊 Output size: 12.34 KB
  🔗 Results: s3://my-bucket/results.json

Architecture

The pipeline consists of 6 stages:

  1. File Collection: Scans repository for source files
  2. Abstraction Extraction: Uses LLM to identify high-level concepts
  3. Relationship Analysis: Maps dependencies between concepts
  4. Parallel Processing: Processes batches concurrently
  5. Synthesis: Generates first-principles output
  6. Publishing: Uploads results to S3
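
Orchestration is delegated to the pipelines crate; the sketch below is not the crate's real internals, just a minimal illustration of the fan-out shape described above, using plain tokio tasks and channels (all names hypothetical):

use tokio::sync::mpsc;

struct FileBatch(Vec<String>);

#[tokio::main]
async fn main() {
    // Stage 1: file collection feeds batches into a channel.
    let (tx, mut rx) = mpsc::channel::<FileBatch>(16);
    tokio::spawn(async move {
        // A real scan would walk the repository; one batch stands in here.
        tx.send(FileBatch(vec!["src/lib.rs".to_string()])).await.ok();
        // tx drops here, closing the channel.
    });

    // Stages 2-4: fan each batch out to a worker task.
    let mut workers = Vec::new();
    while let Some(FileBatch(files)) = rx.recv().await {
        workers.push(tokio::spawn(async move {
            // An LLM call would extract abstractions and relationships here.
            files.len()
        }));
    }

    // Stage 5: synthesis joins the workers; Stage 6 would publish to S3.
    let mut total = 0;
    for worker in workers {
        total += worker.await.unwrap();
    }
    println!("analyzed {total} files");
}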

Dependencies

  • pipelines: For parallel processing
  • aws-sdk-s3: For S3 publishing
  • tokio: Async runtime
  • axum: Web framework (optional)
  • clap: CLI parsing
  • serde: JSON serialization