| Crates.io | data-doctor-cli |
| lib.rs | data-doctor-cli |
| version | 1.0.4 |
| created_at | 2025-12-08 15:27:01.508321+00 |
| updated_at | 2025-12-21 07:34:23.132085+00 |
| description | A powerful data validation and cleaning tool for JSON and CSV files |
| homepage | |
| repository | https://github.com/jeevanms003/data-doctor |
| max_upload_size | |
| id | 1973777 |
| size | 31,590 |
DataDoctor CLI is your command-line companion for maintaining data health. It brings the power of the DataDoctor engine directly to your terminal, allowing you to validate, analyze, and repair JSON and CSV files instantly.
If you have Rust installed, this is the easiest way:
cargo install data-doctor-cli
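This installs the `data-doctor` binary into Cargo's bin directory (typically `~/.cargo/bin`), which needs to be on your `PATH`.

To build and install from source instead: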
git clone https://github.com/jeevanms003/data-doctor.git
cd data-doctor
cargo install --path cli
DataDoctor provides three primary modes of operation, designed for different workflows:
validate (The Checkup)

Best for: CI/CD pipelines, pre-commit hooks, or quick file-integrity checks.
This command scans your file and reports issues without modifying anything. It returns a non-zero exit code if errors are found, making it perfect for automated scripts.
data-doctor validate users.csv
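Because `validate` signals problems through its exit code, a wrapper script can gate a pipeline on it. The Python sketch below is illustrative only: `failing_files` and its `run` parameter are hypothetical names, and it assumes that the `data-doctor` binary is on the `PATH` and that any non-zero exit code means validation failed.

```python
import subprocess
from typing import Callable, Iterable, List


def failing_files(paths: Iterable[str], run: Callable = subprocess.run) -> List[str]:
    """Return the paths for which `data-doctor validate` exits non-zero.

    `run` defaults to subprocess.run; it is a parameter (a hypothetical
    convenience, not part of DataDoctor) so the function can be exercised
    without the binary installed.
    """
    failed = []
    for path in paths:
        # Only the documented invocation is used: data-doctor validate <INPUT>
        result = run(["data-doctor", "validate", path])
        if result.returncode != 0:
            failed.append(path)
    return failed
```

In a CI job, calling `failing_files(["users.csv", "orders.csv"])` and exiting with a non-zero status when the returned list is non-empty would stop the pipeline on the first bad data set.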
fix (The Surgery)

Best for: Cleaning messy data dumps, fixing "broken" JSON from APIs.
This command actively repairs the file and saves the clean version to a new output path. It applies all available auto-fix strategies (e.g., adding missing quotes, padding columns).
data-doctor fix broken_data.json --out clean_data.json
doctor (The Full Treatment)

Best for: Interactive analysis and reporting.
This runs a validation pass, then an auto-fix pass, and generates a comprehensive report comparing the "before" and "after" states.
data-doctor doctor input.csv --out fixed.csv
validate

`data-doctor validate <INPUT> [OPTIONS]`

Options:

- `--format <json|csv>`: Force a specific file format (overrides extension detection).
- `--report-json`: Print a machine-readable JSON object instead of the human-readable report.
- `--schema <FILE>`: Validate against a custom schema definition.

fix

`data-doctor fix <INPUT> --out <OUTPUT> [OPTIONS]`

Options:

- `--out <FILE>`: (Required) Where to save the fixed file.
- `--format <json|csv>`: Force a specific file format.

doctor

`data-doctor doctor <INPUT> --out <OUTPUT> [OPTIONS]`

Combines validate and fix functionalities with detailed logging.
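When `validate` is driven from another program, it can help to assemble the argument list in one place. The following is a minimal sketch that uses only the flags listed for `validate` above; the function name `validate_args` is hypothetical, and it assumes options may follow the input path and take their values as separate arguments, as the synopsis and the `--out clean_data.json` example suggest.

```python
from typing import List, Optional


def validate_args(
    input_path: str,
    fmt: Optional[str] = None,
    report_json: bool = False,
    schema: Optional[str] = None,
) -> List[str]:
    """Build an argv list for `data-doctor validate` (illustrative helper)."""
    if fmt is not None and fmt not in ("json", "csv"):
        # The reference lists only json and csv as accepted formats.
        raise ValueError(f"unsupported format: {fmt!r}")

    args = ["data-doctor", "validate", input_path]
    if fmt is not None:
        args += ["--format", fmt]
    if report_json:
        args.append("--report-json")
    if schema is not None:
        args += ["--schema", schema]
    return args


print(validate_args("data.csv", fmt="csv", report_json=True))
```

The resulting list can be passed directly to `subprocess.run`, which avoids shell quoting problems with file names that contain spaces.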
| Issue | Example (Before) | Example (After) |
|---|---|---|
| Broken Structure | `[ { "a": 1 } }` | `[ { "a": 1 } ]` (Mismatched bracket fix) |
| Embedded Keys | `"desc": "val,"key": "v"` | `"desc": "val", "key": "v"` |
| Numeric Formats | `{"val": 0xFF, "oct": 0o77}` | `{"val": 255, "oct": 63}` |
| Invalid Booleans | `{"active": yes}` | `{"active": true}` |
| Leading Zeros | `{"id": 030}` | `{"id": 30}` |
| Trailing Commas | `{"a": 1,}` | `{"a": 1}` |
| Missing Commas | `{"a": 1 "b": 2}` | `{"a": 1, "b": 2}` |
| Unquoted Keys | `{name: "John"}` | `{"name": "John"}` |
| Single Quotes | `{'name': 'John'}` | `{"name": "John"}` |
| Unclosed Brackets | `[1, 2, 3` | `[1, 2, 3]` |
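To see why the "before" inputs in the JSON table need repair, they can be fed to a strict JSON parser. The sketch below uses Python's standard `json` module as a stand-in for any strict parser. It does not reproduce DataDoctor's repair logic, and `is_strict_json` is a hypothetical helper name. The "Embedded Keys" row is omitted because its examples are fragments rather than complete documents.

```python
import json


def is_strict_json(text: str) -> bool:
    """Return True if `text` parses as standard JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# (before, after) pairs copied from the table of JSON fixes.
EXAMPLES = [
    ('[ { "a": 1 } }', '[ { "a": 1 } ]'),
    ('{"val": 0xFF, "oct": 0o77}', '{"val": 255, "oct": 63}'),
    ('{"active": yes}', '{"active": true}'),
    ('{"id": 030}', '{"id": 30}'),
    ('{"a": 1,}', '{"a": 1}'),
    ('{"a": 1 "b": 2}', '{"a": 1, "b": 2}'),
    ('{name: "John"}', '{"name": "John"}'),
    ("{'name': 'John'}", '{"name": "John"}'),
    ('[1, 2, 3', '[1, 2, 3]'),
]

for before, after in EXAMPLES:
    # Each "before" is rejected and each "after" is accepted by the parser.
    print(f"{before!r}: {is_strict_json(before)}  ->  {after!r}: {is_strict_json(after)}")
```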
| Issue | Before | After |
|---|---|---|
| Padding Columns | `A,B,C`<br>`1,2` | `A,B,C`<br>`1,2,` (Empty added) |
| Trimming Cols | `A,B`<br>`1,2,3,4` | `A,B`<br>`1,2` (Extras removed) |
| Booleans | `Yes`, `No` | `true`, `false` |
| Whitespace | `"  Value  "` | `"Value"` |
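The padding and trimming rows of the CSV table both amount to making every record as wide as the header. The sketch below shows that idea with Python's standard `csv` module. It is an assumption about the general technique, not DataDoctor's actual implementation, and `normalize_width` is a hypothetical name.

```python
import csv
import io


def normalize_width(text: str) -> str:
    """Pad short rows with empty fields and drop extra fields,
    using the first row (the header) as the reference width."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return text

    width = len(rows[0])
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in rows:
        # Append enough empty fields, then cut back to the header width.
        writer.writerow((row + [""] * width)[:width])
    return out.getvalue()


print(normalize_width("A,B,C\n1,2\n"))      # short row gains an empty field
print(normalize_width("A,B\n1,2,3,4\n"))    # long row loses its extra fields
```

Silently dropping extra fields discards data, so a real tool would presumably report each trimmed row. This sketch leaves reporting out for brevity.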
For integration with other tools (like dashboards), use --report-json.
Command:
data-doctor validate data.csv --report-json
Output:
{
  "success": false,
  "total_records": 100,
  "invalid_records": 5,
  "issues": [
    {
      "severity": "Error",
      "code": "CSV_TYPE_MISMATCH",
      "message": "Invalid Integer value",
      "row": 42,
      "column": 2
    }
  ]
}
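A dashboard or script would typically reduce this report to a few numbers. The following Python sketch parses the sample output shown above and summarizes it. The field names are taken only from that sample, so the real report may contain additional or optional fields. `summarize_report` is a hypothetical helper name.

```python
import json
from collections import Counter

# The sample report from the documentation, used here as test input.
SAMPLE_REPORT = """
{
  "success": false,
  "total_records": 100,
  "invalid_records": 5,
  "issues": [
    {
      "severity": "Error",
      "code": "CSV_TYPE_MISMATCH",
      "message": "Invalid Integer value",
      "row": 42,
      "column": 2
    }
  ]
}
"""


def summarize_report(report_text: str) -> dict:
    """Condense a --report-json document into a small summary dict."""
    report = json.loads(report_text)
    issues = report.get("issues", [])
    total = report.get("total_records", 0)
    invalid = report.get("invalid_records", 0)
    return {
        "success": report.get("success", False),
        # Guard against division by zero for empty files.
        "invalid_ratio": invalid / total if total else 0.0,
        "issues_by_code": dict(Counter(i.get("code", "UNKNOWN") for i in issues)),
        "error_rows": sorted(
            {i["row"] for i in issues if i.get("severity") == "Error" and "row" in i}
        ),
    }


print(summarize_report(SAMPLE_REPORT))
```

In practice the input would come from the standard output of `data-doctor validate data.csv --report-json` rather than from a string constant.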
This project is licensed under the MIT License.