pith

Crates.io: pith
lib.rs: pith
version: 0.1.0
created_at: 2026-01-15 16:28:12.027325+00
updated_at: 2026-01-15 16:28:12.027325+00
description: Generate optimized codebase context for LLMs
repository: https://github.com/moradology/pith
id: 2045954
size: 2,273,983
Nathan Zimmerman (moradology)

Generate structured codebase context for LLM consumption.

What is pith?

Pith extracts the essential structure of a codebase into a format optimized for large language models. Rather than dumping raw source files, pith produces codemaps: AST-extracted declarations, imports, and signatures that capture what exists in your code without the implementation noise.

The output is designed around three goals:

  • Token efficiency: Codemaps capture the shape of code (what functions exist, what they accept and return) using fewer tokens than full source
  • Structural clarity: Organized sections (<file_map>, <codemaps>, <selected_files>) help LLMs parse and reason about code structure
  • Selective disclosure: Include full source only for files you're actively working on; use codemaps for surrounding context

Pith is both a CLI tool and a Rust library.

Quick Start

# Generate context for the current directory and copy it to the clipboard (pbcopy is macOS-only)
pith context . | pbcopy

# Then paste into Claude, ChatGPT, or your LLM of choice

Supported Languages

Language      Extensions
Rust          .rs
TypeScript    .ts, .tsx
JavaScript    .js, .jsx, .mjs, .cjs
Python        .py, .pyi
Go            .go

Extraction uses tree-sitter for accurate parsing.
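
As a rough illustration of the table above, the extension-to-language mapping could be expressed as a simple lookup. This is a minimal sketch, not pith's actual implementation: the `Lang` enum and both functions are hypothetical names introduced here, and `Lang` is a local stand-in rather than pith's own `Language` type.

```rust
use std::path::Path;

// Hypothetical stand-in for the languages in the table above.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Lang {
    Rust,
    TypeScript,
    JavaScript,
    Python,
    Go,
}

/// Map a file extension (without the leading dot) to a language.
/// Extensions outside the table yield `None`.
fn language_for_extension(ext: &str) -> Option<Lang> {
    match ext {
        "rs" => Some(Lang::Rust),
        "ts" | "tsx" => Some(Lang::TypeScript),
        "js" | "jsx" | "mjs" | "cjs" => Some(Lang::JavaScript),
        "py" | "pyi" => Some(Lang::Python),
        "go" => Some(Lang::Go),
        _ => None,
    }
}

/// Look up the language for a path via its final extension.
fn language_for_path(path: &str) -> Option<Lang> {
    Path::new(path)
        .extension()
        .and_then(|e| e.to_str())
        .and_then(language_for_extension)
}

fn main() {
    for path in ["src/lib.rs", "web/app.tsx", "tools/build.mjs", "main.c"] {
        println!("{path}: {:?}", language_for_path(path));
    }
}
```

Files whose extension is not in the table (such as `main.c` here) map to `None`, which matches the language coverage noted under Limitations.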

CLI Usage

Commands

pith tree <PATH>       # Display file tree with metadata
pith codemap <PATH>    # Extract API signatures only
pith context <PATH>    # Full context: tree + codemaps + selected files
pith tokens <PATH>     # Count tokens for budget planning
pith languages         # Show supported languages

Key Options

--select <PATTERN>     # Include full source for matching files (glob)
--lang <LANG>          # Filter to specific language(s)
--json                 # Output as JSON (for programmatic use)
--include-docs         # Include doc comments in codemaps
--include-private      # Include private/internal items

Example: Generate context with selected files

pith context ./src --select "**/api/*.rs"

Example Output

<file_map>
src/
├── api/
│   ├── handlers.rs [rust, 245 lines, 6.2KB] *+
│   └── routes.rs [rust, 89 lines, 2.1KB] *+
├── db/
│   └── queries.rs [rust, 156 lines, 4.1KB] +
└── lib.rs [rust, 42 lines, 1.0KB] +

Legend: * = selected, + = has codemap
</file_map>

<codemaps>
## src/db/queries.rs

### Imports
- use sqlx::{Pool, Postgres}
- use crate::models::{User, Post}

### Declarations

#### pub async fn get_user (pool: &Pool<Postgres>, id: i64) -> Result<User> (lines 12-18)

#### pub async fn list_posts (pool: &Pool<Postgres>, limit: i32) -> Result<Vec<Post>> (lines 20-31)

---

## src/lib.rs

### Imports
- use api::{handlers, routes}

### Declarations

#### pub fn create_app () -> Router (lines 8-15)

</codemaps>

<selected_files>
--- src/api/handlers.rs (245 lines, 1,823 tokens) ---
// Full file content here...

--- src/api/routes.rs (89 lines, 672 tokens) ---
// Full file content here...
</selected_files>

<token_summary>
Total: 3,241 tokens

Component breakdown:
- File tree: 89 tokens
- Codemaps: 657 tokens
- Selected files: 2,495 tokens
</token_summary>
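
The figures in the example summary are additive: the two selected files account for 1,823 + 672 = 2,495 tokens, and 89 + 657 + 2,495 gives the 3,241 total. The sketch below reproduces that arithmetic and checks it against a context limit. `TokenSummary` and `fits_budget` are hypothetical names for illustration, not part of pith, and the window sizes are arbitrary.

```rust
// Illustrative only: the numbers come from the example <token_summary>;
// the struct and its methods are hypothetical, not pith's API.
struct TokenSummary {
    file_tree: u32,
    codemaps: u32,
    selected_files: u32,
}

impl TokenSummary {
    /// Sum of the three components.
    fn total(&self) -> u32 {
        self.file_tree + self.codemaps + self.selected_files
    }

    /// True if the context plus a reserved reply allowance fits the window.
    fn fits_budget(&self, context_window: u32, reserved_for_reply: u32) -> bool {
        self.total().saturating_add(reserved_for_reply) <= context_window
    }
}

fn main() {
    let summary = TokenSummary {
        file_tree: 89,
        codemaps: 657,
        selected_files: 1_823 + 672, // handlers.rs + routes.rs
    };
    println!("Total: {} tokens", summary.total());
    // 8_000 and 1_000 are arbitrary values chosen for illustration.
    println!("Fits: {}", summary.fits_budget(8_000, 1_000));
}
```

Reserving part of the window for the model's reply is a design choice of this sketch; the summary itself only reports what the generated context consumes.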

Understanding the Output

  • <file_map>: Directory tree with metadata. * marks selected files (full content included), + marks files with codemaps.
  • <codemaps>: Per-file API signatures extracted via tree-sitter. Shows imports, function signatures, struct definitions, etc. without implementation bodies.
  • <selected_files>: Full source content for files matching --select patterns.
  • <token_summary>: Token counts for budget planning against context limits.
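
If you post-process the text output, the <file_map> entries in the example follow a regular shape: tree glyphs, file name, bracketed metadata, then the legend markers. The following is a minimal sketch of reading one such line, assuming exactly the layout shown in the example output. `FileEntry` and `parse_entry` are hypothetical helpers, not something pith provides, and the format may differ between versions.

```rust
// Hypothetical helper for reading one <file_map> line as shown in the
// example output; not part of pith.
#[derive(Debug, PartialEq)]
struct FileEntry {
    name: String,
    language: String,
    lines: u32,
    selected: bool,    // '*' marker
    has_codemap: bool, // '+' marker
}

/// Parse a line such as
/// "│   ├── handlers.rs [rust, 245 lines, 6.2KB] *+".
/// Returns `None` for directories and other lines without metadata.
fn parse_entry(line: &str) -> Option<FileEntry> {
    // Split "<tree glyphs> name [meta] markers" around the brackets.
    let (before, rest) = line.split_once('[')?;
    let (meta, markers) = rest.split_once(']')?;

    // The file name is the last whitespace-separated token before '['
    // (this sketch does not handle names containing spaces).
    let name = before.split_whitespace().last()?.to_string();

    // Metadata fields: language, "<n> lines", size (size is ignored here).
    let mut fields = meta.split(',').map(str::trim);
    let language = fields.next()?.to_string();
    let lines = fields.next()?.strip_suffix(" lines")?.parse().ok()?;

    Some(FileEntry {
        name,
        language,
        lines,
        selected: markers.contains('*'),
        has_codemap: markers.contains('+'),
    })
}

fn main() {
    let line = "│   ├── handlers.rs [rust, 245 lines, 6.2KB] *+";
    println!("{:?}", parse_entry(line));
}
```

For structured consumption, the --json option listed under Key Options is likely the more robust route than parsing the text form.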

Library Usage

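use pith::{Pith, Language};

// Assumes pith's error type implements std::error::Error.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Build codemaps for the Rust and TypeScript files under ./my-project,
    // including doc comments.
    let result = Pith::new("./my-project")
        .languages(&[Language::Rust, Language::TypeScript])
        .include_docs(true)
        .build()?;

    println!("Files: {}", result.codemaps.len());
    println!("Tokens: {}", result.total_tokens());
    Ok(())
}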

Pith automatically respects .gitignore and detects binary/minified/generated files.

Limitations

  • Language coverage: Currently supports Rust, TypeScript, JavaScript, Python, and Go. Other languages (C/C++, Java, Ruby, and so on) are not supported.
  • Partial parsing: Syntactically invalid code may produce incomplete codemaps.
  • No semantic analysis: Type resolution is not performed. Import paths are extracted as-is.

This is a young project, and the API may change between versions.

Contributing

Issues and pull requests are welcome. Please run cargo test before submitting.

License

MIT
