| Crates.io | codegraph-c |
| lib.rs | codegraph-c |
| version | 0.1.2 |
| created_at | 2025-12-16 06:38:16.873645+00 |
| updated_at | 2026-01-02 00:28:22.816998+00 |
| description | C parser for CodeGraph - extracts code entities and relationships from C source files |
| homepage | |
| repository | https://github.com/anvanster/codegraph |
| max_upload_size | |
| id | 1987330 |
| size | 320,438 |
C language parser for the CodeGraph framework.
codegraph-c provides robust C source code parsing with specialized support for Linux kernel and system-level code. It uses tree-sitter for fault-tolerant AST generation and includes a sophisticated preprocessing pipeline that handles:
- GCC extensions (`__attribute__`, `__asm__`, `typeof`, etc.)
- Linux kernel macros (`__init`, `__exit`, `container_of`, etc.)
- Preprocessor conditionals (`#if 0`, `#ifdef`, etc.)

It integrates with `codegraph` for code analysis:

```rust
use codegraph_c::CParser;
use codegraph_parser_api::CodeParser;
use codegraph::CodeGraph;
use std::path::Path;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // In-memory graph that is passed to the parser.
    let mut graph = CodeGraph::in_memory()?;
    let parser = CParser::new();

    let file_info = parser.parse_file(Path::new("main.c"), &mut graph)?;
    println!("Parsed {} functions", file_info.functions.len());
    Ok(())
}
```
For code with kernel-specific constructs:
```rust
use codegraph_c::{CParser, ExtractionOptions};
use codegraph_c::extractor::extract_with_options;
use codegraph_parser_api::ParserConfig;
use std::path::Path;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let source = r#"
#include <linux/module.h>

static __init int my_init(void) {
    return 0;
}
"#;

    let options = ExtractionOptions::for_kernel_code();
    let result = extract_with_options(
        source,
        Path::new("driver.c"),
        &ParserConfig::default(),
        &options,
    )?;

    println!("Extracted {} functions", result.ir.functions.len());
    Ok(())
}
```
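Annotations such as `__init` are macros rather than C keywords, so they can confuse a parser that never sees the kernel headers. As a rough illustration of what neutralizing them can look like, the sketch below replaces each annotation with spaces of equal length, so byte offsets in the processed source still match the original. The function name `blank_annotations` and the short `ANNOTATIONS` list are assumptions made for this example; this is not codegraph-c's actual implementation.

```rust
/// Hypothetical list of kernel annotations to blank out (illustrative only).
const ANNOTATIONS: &[&str] = &["__init", "__exit", "__user", "__iomem"];

/// Returns true for bytes that can appear inside a C identifier.
fn is_ident_byte(b: u8) -> bool {
    b.is_ascii_alphanumeric() || b == b'_'
}

/// Replaces whole-identifier occurrences of the annotations with spaces,
/// keeping the output the same length as the input.
fn blank_annotations(source: &str) -> String {
    let bytes = source.as_bytes();
    let mut out = String::with_capacity(source.len());
    let mut i = 0;
    while i < bytes.len() {
        if is_ident_byte(bytes[i]) {
            // Scan to the end of this identifier-like token.
            let start = i;
            while i < bytes.len() && is_ident_byte(bytes[i]) {
                i += 1;
            }
            let token = &source[start..i];
            if ANNOTATIONS.contains(&token) {
                out.push_str(&" ".repeat(token.len()));
            } else {
                out.push_str(token);
            }
        } else {
            // Copy one full character (it may be multi-byte UTF-8).
            let ch = source[i..].chars().next().unwrap();
            out.push(ch);
            i += ch.len_utf8();
        }
    }
    out
}
```

Calling `blank_annotations("static __init int my_init(void)")` leaves `my_init` intact because only whole identifiers are compared. The sketch is deliberately naive: it would also blank a matching word inside a string literal or comment.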
For maximum control over preprocessing:
```rust
use codegraph_c::pipeline::{Pipeline, PipelineConfig};

fn main() {
    let pipeline = Pipeline::new();
    let config = PipelineConfig::for_kernel_code();

    let source = r#"
#include <linux/module.h>
MODULE_LICENSE("GPL");
static __init int my_init(void) { return 0; }
"#;

    let result = pipeline.process(source, &config);

    println!(
        "Platform: {} (confidence: {:.0}%)",
        result.platform.platform_id,
        result.platform.confidence * 100.0
    );
    println!("Processed source ready for parsing");
}
```
The parser uses a layered processing pipeline:
```
              Source Code
                   │
                   ▼
┌─────────────────────────────────────┐
│  1. Platform Detection              │
│     Detect Linux/FreeBSD/Darwin     │
└─────────────────────────────────────┘
                   │
                   ▼
┌─────────────────────────────────────┐
│  2. Conditional Evaluation          │
│     Strip #if 0 blocks              │
└─────────────────────────────────────┘
                   │
                   ▼
┌─────────────────────────────────────┐
│  3. GCC Neutralization              │
│     Handle __attribute__, etc.      │
└─────────────────────────────────────┘
                   │
                   ▼
┌─────────────────────────────────────┐
│  4. Macro Expansion                 │
│     Expand kernel macros            │
└─────────────────────────────────────┘
                   │
                   ▼
┌─────────────────────────────────────┐
│  5. tree-sitter Parsing             │
│     Fault-tolerant AST              │
└─────────────────────────────────────┘
                   │
                   ▼
┌─────────────────────────────────────┐
│  6. Entity Extraction               │
│     Functions, structs, calls       │
└─────────────────────────────────────┘
```
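To make stage 2 more concrete, here is a minimal, standard-library-only sketch of stripping `#if 0` blocks. It is an illustration under stated assumptions, not the crate's code: the names `Frame` and `strip_if_zero` are hypothetical, removed lines are replaced by empty lines so line numbers stay stable, and `#elif` branches that follow `#if 0` are not handled.

```rust
/// State of one open preprocessor conditional (hypothetical helper type).
#[derive(Clone, Copy, PartialEq)]
enum Frame {
    /// A conditional that is not evaluated (`#ifdef FOO`, `#if X`, ...).
    Other,
    /// The dead branch of `#if 0`.
    Dead,
    /// The `#else` branch of `#if 0`, which is live.
    Live,
}

/// Blanks out code guarded by `#if 0`, keeping any `#else` branch.
fn strip_if_zero(source: &str) -> String {
    let mut stack: Vec<Frame> = Vec::new();
    let mut out: Vec<&str> = Vec::new();

    for line in source.lines() {
        // Tokens after '#'; empty when the line is not a directive.
        let mut tokens = line
            .trim_start()
            .strip_prefix('#')
            .unwrap_or("")
            .split_whitespace();
        let keyword = tokens.next().unwrap_or("");

        // Was this line inside a dead region before the directive took effect?
        let in_dead_code = stack.contains(&Frame::Dead);
        // Set when the line is a marker of an `#if 0` conditional itself.
        let mut drop_directive = false;

        match keyword {
            "if" | "ifdef" | "ifndef" => {
                let is_zero = keyword == "if" && tokens.next() == Some("0");
                stack.push(if is_zero { Frame::Dead } else { Frame::Other });
                drop_directive = is_zero;
            }
            "else" => {
                if let Some(top) = stack.last_mut() {
                    if *top == Frame::Dead {
                        *top = Frame::Live;
                        drop_directive = true;
                    }
                }
            }
            "endif" => {
                let popped = stack.pop();
                drop_directive = matches!(popped, Some(Frame::Dead | Frame::Live));
            }
            _ => {}
        }

        let keep = !in_dead_code && !drop_directive;
        out.push(if keep { line } else { "" });
    }
    out.join("\n")
}
```

Conditionals that are not `#if 0`, such as `#ifdef FOO`, pass through unchanged, including their directive lines. Keeping the line count fixed is a design choice for this sketch: positions in the processed text can then be mapped back to the original file without a separate offset table.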
The preprocessing stages handle the following types, annotations, and macros:

- `u8`, `u16`, `u32`, `u64`, `s8`, `s16`, `s32`, `s64`
- `size_t`, `ssize_t`, `uintptr_t`, `intptr_t`
- `__le16`, `__le32`, `__be16`, `__be32` (kernel types)
- `bool`, `_Bool`
- `__init`, `__exit`, `__user`, `__kernel`
- `__iomem`, `__force`, `__percpu`, `__rcu`
- `__must_check`, `__always_inline`, `__noinline`
- `__section(...)`, `__aligned(...)`
- `__attribute__((...))`
- `__extension__`
- `__asm__`, `__asm volatile`
- `typeof()`, `__typeof__()`
- `({ ... })` (GCC statement expressions)
- `container_of()`, `offsetof()`
- `likely()`, `unlikely()`
- `BUILD_BUG_ON()`, `WARN_ON()`
- `list_for_each()` and iterator macros

```bash
# Run all tests
cargo test -p codegraph-c

# Run integration tests
cargo test -p codegraph-c --test integration_tests
```
The parser achieves the following clean parse rates on real-world kernel and systems code:
| Codebase | Files | Clean Parse Rate |
|---|---|---|
| ICE Driver | 84 | 75%+ |
| i915 Graphics | 526 | 70%+ |
| NVMe CLI | 156 | 85%+ |
| netperf | 47 | 94%+ |
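The table does not spell out how the "Clean Parse Rate" column is computed. One plausible reading, assumed here, is the percentage of files whose syntax tree contains no error nodes. The sketch below shows that calculation; the function name `clean_parse_rate` and the per-file error counts passed to it are hypothetical and do not come from the crate.

```rust
/// Percentage of files with zero error nodes, or `None` for an empty input.
/// Each element of `error_counts` is a hypothetical per-file error-node count.
fn clean_parse_rate(error_counts: &[usize]) -> Option<f64> {
    if error_counts.is_empty() {
        return None;
    }
    let clean = error_counts.iter().filter(|&&n| n == 0).count();
    Some(clean as f64 * 100.0 / error_counts.len() as f64)
}
```

For example, four files where one contains error nodes give `clean_parse_rate(&[0, 0, 2, 0]) == Some(75.0)`.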
Licensed under MIT OR Apache-2.0.