| Crates.io | scribe-graph |
| lib.rs | scribe-graph |
| version | 0.5.1 |
| created_at | 2025-09-13 06:23:30.880816+00 |
| updated_at | 2025-12-01 20:15:27.848536+00 |
| description | Graph-based code representation and analysis for Scribe |
| homepage | https://github.com/sibyllinesoft/scribe |
| repository | https://github.com/sibyllinesoft/scribe |
| id | 1837302 |
| size | 236,910 |
Graph-based code representation and centrality analysis for Scribe.
scribe-graph builds dependency graphs from repository code and computes centrality metrics to identify the most important files. It implements graph algorithms (PageRank, strongly connected components) adapted for code analysis, so that Scribe can prioritize files by their structural importance.
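FileMetadata → Import Extraction → Dependency Graph → Centrality Analysis → Selection

| Stage | Details |
|---|---|
| FileMetadata | AST parse |
| Import Extraction | Language-specific patterns |
| Dependency Graph | petgraph DiGraph (Node: File, Edge: Import) |
| Centrality Analysis | PageRank/SCC scoring and ranking |
| Selection | Coverage set computation |
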
DependencyGraph

Main graph structure built on petgraph::DiGraph:

- HashMap

ImportExtractor

Language-specific import parsing:

- `import`, `from ... import` statements
- `import`, `require()`, ES modules
- `use`, `mod`, `extern crate` declarations
- `import` blocks with package resolution
- `import` statements with wildcard support

CentralityComputer

Implements graph centrality algorithms.

CoveringSetSelector

Computes minimal file sets for entity understanding.

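
To make the import-parsing step concrete, here is a minimal sketch for the Rust case (`use`, `mod`, `extern crate`). It is a line-based heuristic written for illustration only: the function name `extract_rust_imports` is hypothetical, it is not the crate's `ImportExtractor`, and it does not use the AST-backed parsing the pipeline describes.

```rust
/// Hypothetical helper (not part of scribe-graph): collect the targets of
/// `use`, `mod`, and `extern crate` declarations from Rust source text.
fn extract_rust_imports(source: &str) -> Vec<String> {
    let mut imports = Vec::new();
    for line in source.lines() {
        // Ignore indentation and a plain `pub ` visibility modifier.
        let line = line.trim();
        let line = line.strip_prefix("pub ").unwrap_or(line).trim_start();
        for keyword in ["use ", "mod ", "extern crate "] {
            if let Some(rest) = line.strip_prefix(keyword) {
                // Keep the text up to the terminating `;`. Inline modules
                // (`mod x { ... }`) have no `;` on this line and are skipped.
                if let Some(end) = rest.find(';') {
                    imports.push(rest[..end].trim().to_string());
                }
                break;
            }
        }
    }
    imports
}

fn main() {
    let source = "use std::collections::HashMap;\npub mod parser;\nextern crate serde;\n";
    for import in extract_rust_imports(source) {
        println!("{}", import);
    }
}
```

A real extractor would also need to handle multi-line `use` groups, `pub(crate)` visibility, and comments, which is one reason to work from an AST rather than raw lines.
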
```rust
use scribe_graph::{DependencyGraph, ImportExtractor};

let extractor = ImportExtractor::new();
let mut graph = DependencyGraph::new();

// Add each file as a node, then one weighted edge per import it declares.
for file in files {
    let imports = extractor.extract_imports(&file)?;
    graph.add_node(file.path.clone(), file.clone());
    for import in imports {
        graph.add_edge(file.path.clone(), import.target, import.weight);
    }
}

println!(
    "Graph has {} nodes and {} edges",
    graph.node_count(),
    graph.edge_count()
);
```

```rust
use scribe_graph::centrality::{PageRankConfig, compute_pagerank};

let config = PageRankConfig {
    damping: 0.85,
    max_iterations: 100,
    tolerance: 1e-6,
    parallel: true,
};
let scores = compute_pagerank(&graph, config)?;

// Get top 10 most central files (sorted by score, highest first)
let mut ranked: Vec<_> = scores.iter().collect();
ranked.sort_by(|a, b| b.1.partial_cmp(a.1).unwrap());
for (file, score) in ranked.iter().take(10) {
    println!("{}: {:.6}", file.display(), score);
}
```

```rust
use scribe_graph::covering::{CoveringSetConfig, compute_covering_set};
// EntityType must also be in scope; its module path is not shown here.

let config = CoveringSetConfig {
    entity_name: "authenticate_user".to_string(),
    entity_type: EntityType::Function,
    max_depth: Some(3), // Option<usize>; None means no depth limit
    include_dependents: false, // for understanding
    importance_threshold: 0.01,
};
let result = compute_covering_set(&graph, config)?;

println!("Selected {} files to understand function:", result.files.len());
for (file, reason) in result.files {
    println!("  {}: {:?}", file.display(), reason);
}
```

```rust
use std::path::PathBuf;

use scribe_graph::traversal::compute_transitive_dependencies;

let target = PathBuf::from("src/auth/login.rs");
// Follow imports from the target, at most 3 levels deep.
let deps = compute_transitive_dependencies(&graph, &target, Some(3))?;

println!("Dependencies of {}: ", target.display());
for (depth, dep) in deps {
    println!("  [depth {}] {}", depth, dep.display());
}
```

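
For readers who want to see what a depth-limited transitive walk involves, the following sketch does a breadth-first traversal over a plain `HashMap` using only the standard library. The `Imports` alias and the `transitive_dependencies` function are hypothetical stand-ins chosen for this example; they do not reproduce `compute_transitive_dependencies` or its return type.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Hypothetical stand-in for a dependency graph: file -> files it imports.
type Imports<'a> = HashMap<&'a str, Vec<&'a str>>;

/// Breadth-first walk over imports, returning (depth, file) pairs in the
/// order they are discovered. `max_depth = None` means no limit.
fn transitive_dependencies<'a>(
    imports: &Imports<'a>,
    target: &'a str,
    max_depth: Option<usize>,
) -> Vec<(usize, &'a str)> {
    let mut seen = HashSet::from([target]);
    let mut queue = VecDeque::from([(0usize, target)]);
    let mut found = Vec::new();
    while let Some((depth, file)) = queue.pop_front() {
        // Do not expand files that already sit at the depth limit.
        if max_depth.map_or(false, |limit| depth >= limit) {
            continue;
        }
        if let Some(deps) = imports.get(file) {
            for &dep in deps {
                // `insert` returns false for files we have already visited,
                // which also protects against import cycles.
                if seen.insert(dep) {
                    found.push((depth + 1, dep));
                    queue.push_back((depth + 1, dep));
                }
            }
        }
    }
    found
}

fn main() {
    let mut imports: Imports = HashMap::new();
    imports.insert("login.rs", vec!["session.rs", "crypto.rs"]);
    imports.insert("session.rs", vec!["db.rs"]);
    imports.insert("db.rs", vec!["config.rs"]);
    for (depth, dep) in transitive_dependencies(&imports, "login.rs", Some(2)) {
        println!("[depth {}] {}", depth, dep);
    }
}
```

Breadth-first order means each file is reported at its shortest distance from the target, which is what a depth limit needs in order to be meaningful.
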
Traditional PageRank flows importance FROM pages that link TO you. For code:
Key Insight: Important files are those that many other files depend on, not files that depend on many things.
Implementation: We reverse edge direction when computing PageRank:
- If A imports B, we add edge B → A in the PageRank graph
- Files that many others import (utils.rs, config.py) get high scores

Example:

src/main.rs imports utils/helpers.rs
src/api.rs imports utils/helpers.rs
src/db.rs imports utils/helpers.rs
→ helpers.rs gets high PageRank (many files depend on it)
→ main.rs gets lower PageRank (no other file depends on it)
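
The example above can be checked with a small power-iteration PageRank written against the standard library. This is a sketch under stated assumptions, not the crate's implementation: the `pagerank` function is hypothetical, it runs a fixed number of iterations, and each file simply passes its score to the files it imports, which is the net effect the text describes. How the crate orients edges internally is not reproduced here.

```rust
use std::collections::HashMap;

/// Power-iteration PageRank. An edge (a, b) means "a imports b", and a
/// passes its score on to b. Hypothetical helper for illustration only.
fn pagerank<'a>(
    nodes: &[&'a str],
    edges: &[(&'a str, &'a str)],
    damping: f64,
    iterations: usize,
) -> HashMap<String, f64> {
    let n = nodes.len();
    let index: HashMap<&str, usize> =
        nodes.iter().enumerate().map(|(i, &name)| (name, i)).collect();

    // Number of imports declared by each file.
    let mut out_degree = vec![0usize; n];
    for &(from, _) in edges {
        out_degree[index[from]] += 1;
    }

    let mut rank = vec![1.0 / n as f64; n];
    for _ in 0..iterations {
        // Files that import nothing spread their score uniformly, so the
        // scores keep summing to 1.
        let dangling: f64 = (0..n)
            .filter(|&i| out_degree[i] == 0)
            .map(|i| rank[i])
            .sum();
        let base = (1.0 - damping) / n as f64 + damping * dangling / n as f64;
        let mut next = vec![base; n];
        for &(from, to) in edges {
            let i = index[from];
            next[index[to]] += damping * rank[i] / out_degree[i] as f64;
        }
        rank = next;
    }
    nodes
        .iter()
        .map(|&name| (name.to_string(), rank[index[name]]))
        .collect()
}

fn main() {
    let nodes = ["src/main.rs", "src/api.rs", "src/db.rs", "utils/helpers.rs"];
    let edges = [
        ("src/main.rs", "utils/helpers.rs"),
        ("src/api.rs", "utils/helpers.rs"),
        ("src/db.rs", "utils/helpers.rs"),
    ];
    let ranks = pagerank(&nodes, &edges, 0.85, 100);
    for name in nodes {
        println!("{}: {:.4}", name, ranks[name]);
    }
}
```

With these three imports, `utils/helpers.rs` ends up with a higher score than any of the files that import it, and the three importers tie with each other.
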
PageRankConfig

| Field | Type | Default | Description |
|---|---|---|---|
| damping | f64 | 0.85 | Damping factor (probability of following links) |
| max_iterations | usize | 100 | Maximum iterations before stopping |
| tolerance | f64 | 1e-6 | Convergence threshold (L1 norm delta) |
| parallel | bool | true | Use parallel computation |

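
The way `max_iterations` and `tolerance` interact can be sketched as a generic stopping rule: iterate until the L1 distance between successive score vectors falls below the tolerance, or until the iteration cap is reached. The names `l1_delta` and `iterate_until_converged` are hypothetical, and the update step in `main` is a toy function standing in for a real PageRank update.

```rust
/// L1 distance between two score vectors of equal length.
fn l1_delta(prev: &[f64], next: &[f64]) -> f64 {
    prev.iter().zip(next).map(|(a, b)| (a - b).abs()).sum()
}

/// Repeats `step` until the L1 delta drops below `tolerance` or
/// `max_iterations` is reached. Returns the scores and iterations used.
fn iterate_until_converged<F>(
    mut scores: Vec<f64>,
    step: F,
    max_iterations: usize,
    tolerance: f64,
) -> (Vec<f64>, usize)
where
    F: Fn(&[f64]) -> Vec<f64>,
{
    for iteration in 1..=max_iterations {
        let next = step(&scores);
        let delta = l1_delta(&scores, &next);
        scores = next;
        if delta < tolerance {
            return (scores, iteration);
        }
    }
    (scores, max_iterations)
}

fn main() {
    // Toy update whose fixed point is 0.5 for every entry.
    let (scores, iterations) = iterate_until_converged(
        vec![1.0, 0.0],
        |s: &[f64]| s.iter().map(|&x| 0.5 * x + 0.25).collect(),
        100,
        1e-6,
    );
    println!("stopped after {} iterations: {:?}", iterations, scores);
}
```

A tighter tolerance buys more precise scores at the cost of more iterations; the cap guarantees termination even if the update never settles.
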
CoveringSetConfig

| Field | Type | Default | Description |
|---|---|---|---|
| entity_name | String | - | Function/class/module to target |
| entity_type | EntityType | - | Function, Class, Module, etc. |
| max_depth | Option<usize> | None | Limit transitive dependency depth |
| include_dependents | bool | false | Include files that depend on target |
| importance_threshold | f64 | 0.01 | Exclude files below centrality score |

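
To illustrate how `include_dependents` and `importance_threshold` shape a selection, the sketch below looks one hop out from a target file and keeps only neighbours whose score clears the threshold. Everything here is an assumption made for the example: the `covering_candidates` function is hypothetical, it ignores `max_depth` and entity resolution, and it is not the algorithm used by `CoveringSetSelector`.

```rust
use std::collections::HashMap;

/// One-hop selection around `target`. An edge (a, b) means "a imports b".
/// Hypothetical helper for illustration only.
fn covering_candidates<'a>(
    edges: &[(&'a str, &'a str)],
    target: &'a str,
    include_dependents: bool,
    scores: &HashMap<&str, f64>,
    importance_threshold: f64,
) -> Vec<&'a str> {
    // The target itself is always part of the selection.
    let mut selected = vec![target];
    for &(importer, imported) in edges {
        let candidate = if importer == target {
            Some(imported) // a direct dependency of the target
        } else if include_dependents && imported == target {
            Some(importer) // a file that depends on the target
        } else {
            None
        };
        if let Some(file) = candidate {
            // Files without a score are treated as having score 0.
            let score = scores.get(file).copied().unwrap_or(0.0);
            if score >= importance_threshold && !selected.contains(&file) {
                selected.push(file);
            }
        }
    }
    selected
}

fn main() {
    let edges = [
        ("login.rs", "session.rs"),
        ("login.rs", "legacy.rs"),
        ("api.rs", "login.rs"),
    ];
    let scores = HashMap::from([("session.rs", 0.30), ("legacy.rs", 0.005), ("api.rs", 0.20)]);
    let files = covering_candidates(&edges, "login.rs", false, &scores, 0.01);
    println!("{:?}", files);
}
```

Leaving `include_dependents` off keeps the set focused on what the target needs in order to be understood; turning it on adds the callers, which is more useful when assessing the impact of a change.
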
scribe-graph is used by:

- scribe-analysis: Provides AST parsing for import extraction
- scribe-selection: Uses centrality scores for file ranking
- scribe-core: Shared types and configuration
- ../../scribe-rs/scribe-scaling/docs/context-positioning.md: Context optimization using centrality