Crates.io | diffx-core |
lib.rs | diffx-core |
version | 0.5.21 |
created_at | 2025-07-04 12:31:31.023999+00 |
updated_at | 2025-08-05 19:20:56.159638+00 |
description | Core library for diffx - blazing fast semantic diff engine for structured data. Zero-copy parsing, streaming support, memory-efficient algorithms |
homepage | https://github.com/kako-jun/diffx |
repository | https://github.com/kako-jun/diffx |
id | 1737967 |
size | 110,644 |
🚀 Semantic diff for structured data - Focus on what matters, not formatting
English README | Japanese README | Chinese README
A next-generation diff tool that understands the structure and meaning of your data, not just text changes. Perfect for JSON, YAML, TOML, XML, INI, and CSV files.
# Traditional diff shows formatting noise (key order, trailing commas)
$ diff config_v1.json config_v2.json
< {
< "name": "myapp",
< "version": "1.0"
< }
> {
> "version": "1.1",
> "name": "myapp"
> }
# diffx shows only semantic changes
$ diffx config_v1.json config_v2.json
~ version: "1.0" -> "1.1"
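The comparison above can be sketched in a few lines of Python: parse both documents first, then compare values key by key, so key order and whitespace never enter the comparison. This is an illustrative sketch using only the standard library, not diffx's actual implementation; `semantic_changes` is a hypothetical helper that handles flat objects only.

```python
import json


def semantic_changes(old_text: str, new_text: str) -> list[str]:
    """Compare two flat JSON objects by key, ignoring key order and layout."""
    old, new = json.loads(old_text), json.loads(new_text)
    lines = []
    for key in sorted(old.keys() | new.keys()):
        if key not in old:
            lines.append(f"+ {key}: {json.dumps(new[key])}")
        elif key not in new:
            lines.append(f"- {key}: {json.dumps(old[key])}")
        elif old[key] != new[key]:
            lines.append(f"~ {key}: {json.dumps(old[key])} -> {json.dumps(new[key])}")
    return lines


# Same data as the example above: keys reordered, layout changed, one value edited.
v1 = '{"name": "myapp", "version": "1.0"}'
v2 = '{\n  "version": "1.1",\n  "name": "myapp"\n}'
print("\n".join(semantic_changes(v1, v2)))  # ~ version: "1.0" -> "1.1"
```

Because the comparison runs on parsed values rather than on text, reformatting a file produces no output at all.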
Real benchmark results on AMD Ryzen 5 PRO 4650U:
# Test files: ~600 bytes JSON with nested config
$ time diff large_test1.json large_test2.json # Shows 15+ lines of noise
$ time diffx large_test1.json large_test2.json # Shows 3 semantic changes
# Results:
Traditional diff: ~0.002s (but with formatting noise)
diffx: ~0.005s (clean semantic output)
Why CLI matters for the AI era: As AI tools become essential in development workflows, having structured, machine-readable diff output becomes crucial. diffx provides clean, parseable results that AI can understand and reason about, making it perfect for automated code review, configuration management, and intelligent deployment pipelines.
Traditional diff tools show you formatting noise. diffx shows you what actually changed.
diffx outputs differences in the diffx format by default - a semantic diff representation designed specifically for structured data. The diffx format provides the richest expression of structural differences and can be complemented with machine-readable formats for integration:
- diffx Format (Default): + (addition), - (deletion), ~ (change), ! (type change) symbols with full path context (e.g., database.connection.host).
- JSON Format: Differences detected by diffx are output as a JSON array.
- YAML Format: Machine-readable format. Used for CI/CD and integration with other programs, similar to JSON. Differences detected by diffx are output as a YAML array.
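To make the four symbols and the path context concrete, here is a minimal Python sketch of a recursive comparison that classifies each difference and labels it with a dotted path. It is an illustration only: the function names, the exact text rendering of a type change, and the record shape used for the JSON array are assumptions, not diffx's real output.

```python
import json


def diff(old, new, path=""):
    """Yield (symbol, path, old_value, new_value) tuples for nested data."""
    if isinstance(old, dict) and isinstance(new, dict):
        for key in sorted(old.keys() | new.keys()):
            child = f"{path}.{key}" if path else key
            if key not in old:
                yield ("+", child, None, new[key])
            elif key not in new:
                yield ("-", child, old[key], None)
            else:
                yield from diff(old[key], new[key], child)
    elif type(old) is not type(new):
        yield ("!", path, old, new)  # same path, different type
    elif old != new:
        yield ("~", path, old, new)  # same type, different value


def render(change):
    """Format one change as a single human-readable line."""
    symbol, path, old, new = change
    if symbol == "+":
        return f"+ {path}: {json.dumps(new)}"
    if symbol == "-":
        return f"- {path}: {json.dumps(old)}"
    return f"{symbol} {path}: {json.dumps(old)} -> {json.dumps(new)}"


old = {"database": {"connection": {"host": "localhost", "port": 5432}}, "debug": True}
new = {"database": {"connection": {"host": "db.internal", "port": "5432"}}, "cache": {"ttl": 60}}

changes = list(diff(old, new))
lines = [render(c) for c in changes]
print("\n".join(lines))

# Hypothetical machine-readable form: the same changes as a JSON array.
records = [{"symbol": s, "path": p, "old": o, "new": n} for s, p, o, n in changes]
print(json.dumps(records, indent=2))
```

The text form is meant for reading; the array form carries the same information in a shape that other programs can filter.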
graph TB
subgraph Core["diffx-core"]
B[Format Parsers]
C[Semantic Diff Engine]
D[Output Formatters]
B --> C --> D
end
E[CLI Tool] --> Core
F[NPM Package] --> E
G[Python Package] --> E
H[JSON] --> B
I[YAML] --> B
J[TOML] --> B
K[XML] --> B
L[INI] --> B
M[CSV] --> B
D --> N[CLI Display]
D --> O[JSON Output]
D --> P[YAML Output]
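The three stages in the diagram (format parsers, diff engine, output formatters) can be illustrated with a small Python pipeline. This is a sketch of the layering only, under the assumption that every parser produces plain nested data; the function names, the record fields, and the top-level-only diff are hypothetical and do not reflect the diffx-core API.

```python
import configparser
import json


# Stage 1: format parsers, each turning text into plain Python data.
def parse_ini(text):
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}


PARSERS = {"json": json.loads, "ini": parse_ini}


# Stage 2: a deliberately tiny diff engine (top-level keys only).
def diff_engine(old, new):
    changes = []
    for key in sorted(old.keys() | new.keys()):
        if key not in old:
            changes.append({"kind": "added", "path": key, "new": new[key]})
        elif key not in new:
            changes.append({"kind": "removed", "path": key, "old": old[key]})
        elif old[key] != new[key]:
            changes.append({"kind": "modified", "path": key,
                            "old": old[key], "new": new[key]})
    return changes


# Stage 3: output formatters working on the same change records.
def format_text(changes):
    lines = []
    for change in changes:
        if change["kind"] == "added":
            lines.append(f"+ {change['path']}: {json.dumps(change['new'])}")
        elif change["kind"] == "removed":
            lines.append(f"- {change['path']}: {json.dumps(change['old'])}")
        else:
            lines.append(f"~ {change['path']}: {json.dumps(change['old'])}"
                         f" -> {json.dumps(change['new'])}")
    return "\n".join(lines)


def format_json(changes):
    return json.dumps(changes, indent=2)


FORMATTERS = {"text": format_text, "json": format_json}


def run(old_text, new_text, input_format, output_format="text"):
    """Parse both inputs, diff the parsed data, then format the result."""
    parse = PARSERS[input_format]
    changes = diff_engine(parse(old_text), parse(new_text))
    return FORMATTERS[output_format](changes)


print(run('{"a": 1, "b": 2}', '{"b": 3, "a": 1}', "json"))
```

Keeping the diff engine independent of both the input format and the output format is what lets one core serve several parsers and several formatters.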
diffx/
├── diffx-core/ # Diff extraction library (Crate)
├── diffx-cli/ # CLI wrapper
├── tests/ # All test-related files
│ ├── fixtures/ # Test input data
│ ├── integration/ # CLI integration tests
│ ├── unit/ # Core library unit tests
│ └── output/ # Test intermediate files
├── docs/ # Documentation and specifications
└── ...
- serde_json, serde_yml, toml, configparser, quick-xml, csv parsers
- clap (CLI argument parsing)
- colored (CLI output coloring)
- similar (Unified Format output)

Compare diff reports to track how changes evolve over time:
graph LR
A[config_v1.json] --> D1[diffx]
B[config_v2.json] --> D1
D1 --> R1[diff_report_v1.json]
B --> D2[diffx]
C[config_v3.json] --> D2
D2 --> R2[diff_report_v2.json]
R1 --> D3[diffx]
R2 --> D3
D3 --> M[Meta-Diff Report]
$ diffx config_v1.json config_v2.json --output json > report1.json
$ diffx config_v2.json config_v3.json --output json > report2.json
$ diffx report1.json report2.json # Compare the changes themselves!
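Conceptually, the third command treats the two reports as ordinary data and asks which change records differ between them. The Python sketch below shows that idea on in-memory reports. The record shape (an object keyed by the change kind, with a list payload) is an assumption made for illustration, and `meta_diff` is a hypothetical helper, not part of diffx.

```python
import json


def canonical(record):
    """Serialize a record with sorted keys so equal records compare equal."""
    return json.dumps(record, sort_keys=True)


def meta_diff(report1, report2):
    """Return (records only in report1, records only in report2)."""
    seen1 = {canonical(r) for r in report1}
    seen2 = {canonical(r) for r in report2}
    only_first = [r for r in report1 if canonical(r) not in seen2]
    only_second = [r for r in report2 if canonical(r) not in seen1]
    return only_first, only_second


# Assumed report contents for v1 -> v2 and v2 -> v3.
report1 = json.loads('[{"Modified": ["version", "1.0", "1.1"]}]')
report2 = json.loads(
    '[{"Modified": ["version", "1.1", "1.2"]}, {"Added": ["debug", true]}]'
)

gone, new = meta_diff(report1, report2)
for record in gone:
    print("-", json.dumps(record))
for record in new:
    print("+", json.dumps(record))
```

A meta-diff of this kind answers questions such as "which settings started changing in the second release that were stable in the first".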
# Rust (recommended - native performance)
cargo install diffx
# Node.js ecosystem (⚡ offline-ready with all platform binaries)
npm install diffx-js
# Python ecosystem (🆕 self-contained wheel with embedded binary)
pip install diffx-python
# Or download pre-built binaries from GitHub Releases
For detailed usage and examples, see the documentation.
# Compare JSON files
diffx file1.json file2.json
# Compare with different output formats
diffx config.yaml config_new.yaml --output json
diffx data.toml data_updated.toml --output yaml
# Advanced filtering options
diffx large.json large_v2.json --ignore-keys-regex "^timestamp$|^_.*"
diffx users.json users_v2.json --array-id-key "id"
diffx metrics.json metrics_v2.json --epsilon 0.001
# High-demand practical options
diffx config.yaml config_new.yaml --ignore-case # Ignore case differences
diffx api.json api_formatted.json --ignore-whitespace # Ignore whitespace changes
diffx large.json large_v2.json --output json # JSON output for automation
diffx file1.json file2.json --quiet && echo "Files identical" # Script automation
diffx dir1/ dir2/ --brief # Quick directory change check (automatic recursive)
# Performance optimization for large files
diffx huge_dataset.json huge_dataset_v2.json
# Directory comparison (automatic recursive detection)
diffx config_dir1/ config_dir2/
# Meta-chaining for change tracking
diffx config_v1.json config_v2.json --output json > diff1.json
diffx config_v2.json config_v3.json --output json > diff2.json
diffx diff1.json diff2.json # Compare the changes themselves!
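The three filtering options shown above (`--ignore-keys-regex`, `--array-id-key`, `--epsilon`) can be pictured with a small Python sketch. The semantics here are assumptions inferred from the option names: keys matching the pattern are skipped, array elements are paired by an identifier field rather than by position, and numbers closer than the tolerance count as equal. The path notation `users[id=2]` is also hypothetical.

```python
import re


def is_number(value):
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def diff(old, new, path="", *, ignore_keys=None, id_key=None, epsilon=0.0):
    """Yield (symbol, path) pairs; a sketch of the ideas, not diffx's code."""
    opts = dict(ignore_keys=ignore_keys, id_key=id_key, epsilon=epsilon)
    if isinstance(old, dict) and isinstance(new, dict):
        for key in sorted(old.keys() | new.keys()):
            if ignore_keys and re.search(ignore_keys, key):
                continue  # key matches the ignore pattern
            child = f"{path}.{key}" if path else key
            if key not in old:
                yield ("+", child)
            elif key not in new:
                yield ("-", child)
            else:
                yield from diff(old[key], new[key], child, **opts)
    elif (id_key and isinstance(old, list) and isinstance(new, list)
          and all(isinstance(x, dict) and id_key in x for x in old + new)):
        # Pair array elements by their identifier instead of by position.
        old_by_id = {x[id_key]: x for x in old}
        new_by_id = {x[id_key]: x for x in new}
        for ident in sorted(old_by_id.keys() | new_by_id.keys()):
            child = f"{path}[{id_key}={ident}]"
            if ident not in old_by_id:
                yield ("+", child)
            elif ident not in new_by_id:
                yield ("-", child)
            else:
                yield from diff(old_by_id[ident], new_by_id[ident], child, **opts)
    elif is_number(old) and is_number(new):
        if abs(old - new) > epsilon:  # differences within the tolerance are ignored
            yield ("~", path)
    elif old != new:
        yield ("~", path)


old = {"timestamp": "2025-01-01", "_cache": 1, "cpu": 0.5001,
       "users": [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}]}
new = {"timestamp": "2025-02-01", "_cache": 2, "cpu": 0.5004,
       "users": [{"id": 2, "name": "Bobby"}, {"id": 1, "name": "Ann"},
                 {"id": 3, "name": "Cy"}]}

filtered = list(diff(old, new, ignore_keys=r"^timestamp$|^_.*",
                     id_key="id", epsilon=0.001))
print(filtered)
```

Without the options, the same inputs report the timestamp, the cache counter, the tiny float drift, and the whole reordered array as changes; with them, only the renamed user and the new user remain.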
CI/CD Pipeline:
- name: Check configuration changes
  run: |
    diffx config/prod.yaml config/staging.yaml --output json > changes.json
    # Process changes.json for deployment validation
- name: Quick file change detection
  run: |
    if ! diffx config/current.json config/new.json --quiet; then
      echo "Configuration changed, triggering deployment"
    fi
- name: Compare with ignore options for cleaner diffs
  run: |
    diffx api_old.json api_new.json --ignore-case --ignore-whitespace --output json > api_changes.json
    # Focus on semantic changes, ignore formatting
- name: Compare large datasets efficiently
  run: |
    diffx large_prod_data.json large_staging_data.json --output json > data_changes.json
    # Optimized processing for large files in CI
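As one possible way to process a saved report for deployment validation, the Python sketch below scans the change records and flags any that touch protected settings. Everything specific here is an assumption for illustration: the record shape (an object keyed by the change kind whose payload is a list starting with the path), the `PROTECTED` pattern, and the helper name.

```python
import json
import re

# Hypothetical policy: paths that must not change without manual review.
PROTECTED = re.compile(r"^(database|secrets)(\.|$)")


def protected_changes(report):
    """Return 'Kind: path' strings for records that touch protected settings.

    Assumes each record is an object with a single key naming the change kind
    (such as "Added") whose value is a list that starts with the path.
    """
    hits = []
    for record in report:
        for kind, payload in record.items():
            path = payload[0]
            if PROTECTED.search(path):
                hits.append(f"{kind}: {path}")
    return hits


# Stand-in for the contents of changes.json.
sample = json.loads(
    '[{"Modified": ["database.host", "db1", "db2"]},'
    ' {"Added": ["feature.dark_mode", true]}]'
)
hits = protected_changes(sample)
print(hits)
```

In a pipeline step, the same function could be fed the parsed contents of `changes.json`, with the step exiting nonzero whenever the returned list is not empty.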
Git Hook:
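#!/bin/bash
# pre-commit hook
# Assumes diffx reads file paths only, so the previous revision is written to a temp file first
tmpdir=$(mktemp -d)
git show HEAD~1:package.json > "$tmpdir/package.json"
if diffx "$tmpdir/package.json" package.json --output json | jq -e '.[] | select(.Added)' > /dev/null; then
  echo "New dependencies detected, running security audit..."
fi
rm -rf "$tmpdir"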
diffx is available across multiple ecosystems:
# Rust (native CLI)
cargo install diffx
# Node.js wrapper
npm install diffx-js
# Python wrapper
pip install diffx-python
All packages provide the same semantic diff capabilities.

- diffx-tui: A powerful viewer showcasing diffx capabilities with side-by-side data display
- diffx-web
- diffx-vscode

We welcome contributions! See CONTRIBUTING.md for guidelines.
MIT License - see LICENSE for details.