| Crates.io | f-ck |
| lib.rs | f-ck |
| version | 0.1.0 |
| created_at | 2025-09-20 00:04:24.653989+00 |
| updated_at | 2025-09-20 00:04:24.653989+00 |
| description | Fields combined with columnar keys |
| homepage | |
| repository | https://github.com/arlyon/f-ck |
| max_upload_size | |
| id | 1847272 |
| size | 145,022 |
Universal Columnar Data Merging Tool
f*ck is a Rust-based data merging engine for combining, cleaning, and transforming messy tabular data through a JSON query DSL and a visual interface.
f*ck stands for "fields combined with columnar keys" - the core concept of merging data fields across multiple sources using columnar key relationships with intelligent merge policies.
git clone https://github.com/arlyon/f-ck
cd f-ck
cargo build --release
# Preview results
./target/release/f-ck --query query.json --output result.csv --preview
# Write to file
./target/release/f-ck --query query.json --output result.csv
customers.csv
id,name,email
1,John Doe,john@example.com
2,Jane Smith,jane@example.com
3,Bob Johnson,bob@example.com
orders.csv
customer_id,order_total,product
1,99.99,Widget A
2,149.50,Widget B
1,25.00,Widget C
{
  "sources": [
    {
      "id": "customers",
      "path": "customers.csv",
      "format": "csv"
    },
    {
      "id": "orders",
      "path": "orders.csv",
      "format": "csv"
    }
  ],
  "destination_schema": [
    {"name": "customer_id", "data_type": "Int64"},
    {"name": "customer_name", "data_type": "String"},
    {"name": "email", "data_type": "String"},
    {"name": "total_spent", "data_type": "Float64"}
  ],
  "primary_keys": {
    "logic": "or",
    "keys": ["customer_id"]
  },
  "mappings": [
    {
      "destination_field": "customer_id",
      "policy": {"type": "firstMatch", "priority": ["customers"]},
      "source_fields": [
        {"id": "cust_id", "source_file_id": "customers", "column_name": "id"},
        {"id": "order_cust_id", "source_file_id": "orders", "column_name": "customer_id"}
      ]
    },
    {
      "destination_field": "customer_name",
      "policy": {"type": "firstMatch", "priority": ["customers"]},
      "source_fields": [
        {"id": "name", "source_file_id": "customers", "column_name": "name"}
      ]
    },
    {
      "destination_field": "email",
      "policy": {"type": "firstMatch", "priority": ["customers"]},
      "source_fields": [
        {"id": "email", "source_file_id": "customers", "column_name": "email"}
      ]
    },
    {
      "destination_field": "total_spent",
      "policy": {"type": "sum"},
      "source_fields": [
        {"id": "order_total", "source_file_id": "orders", "column_name": "order_total"}
      ]
    }
  ]
}
customer_id,customer_name,email,total_spent
1,John Doe,john@example.com,124.99
2,Jane Smith,jane@example.com,149.50
3,Bob Johnson,bob@example.com,0.0
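To make the merge semantics of this example concrete, here is a minimal Rust sketch that reproduces the result by hand. It is not f-ck's implementation: `MergedRow`, `empty_row`, and `merge` are hypothetical names, the inputs are passed as in-memory tuples rather than read from CSV, and the behavior (first non-null value for name and email, running sum for order totals, rows keyed on the customer id) is an assumption based on the query plan above.

```rust
use std::collections::BTreeMap;

/// One merged output row (hypothetical type, not part of f-ck's API).
#[derive(Debug, PartialEq)]
struct MergedRow {
    customer_id: i64,
    customer_name: Option<String>,
    email: Option<String>,
    total_spent: f64,
}

/// Start a row for a key that has not been seen yet.
fn empty_row(id: i64) -> MergedRow {
    MergedRow { customer_id: id, customer_name: None, email: None, total_spent: 0.0 }
}

/// Merge customers `(id, name, email)` and orders `(customer_id, order_total)`
/// on the customer id, mimicking "firstMatch" and "sum".
fn merge(customers: &[(i64, &str, &str)], orders: &[(i64, f64)]) -> Vec<MergedRow> {
    // A BTreeMap keeps the output rows ordered by key.
    let mut rows: BTreeMap<i64, MergedRow> = BTreeMap::new();

    for &(id, name, email) in customers {
        let row = rows.entry(id).or_insert_with(|| empty_row(id));
        // firstMatch: only fill a field that is still empty.
        if row.customer_name.is_none() {
            row.customer_name = Some(name.to_string());
        }
        if row.email.is_none() {
            row.email = Some(email.to_string());
        }
    }

    for &(id, total) in orders {
        let row = rows.entry(id).or_insert_with(|| empty_row(id));
        // sum: accumulate every order total for this key.
        row.total_spent += total;
    }

    rows.into_values().collect()
}
```

Called with the three customers and three orders listed above, `merge` returns three rows whose totals are 124.99, 149.50, and 0.0, matching the expected output. Customer 3 has no orders, so the sum stays at its starting value of 0.0.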
| Policy | Description | Use Case |
|---|---|---|
| FirstMatch | Take first non-null value | Contact info, names |
| Sum | Add all values | Order totals, quantities |
| Count | Count non-null entries | Number of transactions |
| Average | Mean of all values | Average order size |
| Min | Minimum value | Earliest date, lowest price |
| Max | Maximum value | Latest date, highest price |
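The policies in the table can be read as reductions over the candidate values for one destination field. The following is a minimal sketch for numeric values only, not the crate's actual types: the `Policy` enum and its `apply` method are hypothetical, nulls are modeled as `None`, and the choice that `Sum` over no values yields 0.0 (as in the example output for a customer without orders) while `Average`, `Min`, and `Max` yield null is an assumption.

```rust
/// Hypothetical numeric merge policies mirroring the table above.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Policy {
    FirstMatch,
    Sum,
    Count,
    Average,
    Min,
    Max,
}

impl Policy {
    /// Reduce the candidate values (None = null) to one merged value.
    fn apply(self, values: &[Option<f64>]) -> Option<f64> {
        // Every policy in this sketch ignores nulls.
        let present: Vec<f64> = values.iter().flatten().copied().collect();
        match self {
            Policy::FirstMatch => present.first().copied(),
            Policy::Sum => Some(present.iter().sum::<f64>()),
            Policy::Count => Some(present.len() as f64),
            Policy::Average => {
                if present.is_empty() {
                    None
                } else {
                    Some(present.iter().sum::<f64>() / present.len() as f64)
                }
            }
            Policy::Min => present.iter().copied().reduce(f64::min),
            Policy::Max => present.iter().copied().reduce(f64::max),
        }
    }
}
```

For example, `Policy::Count.apply(&[Some(99.99), None, Some(25.0)])` counts the two non-null entries, and `Policy::FirstMatch` on the same input picks 99.99. A real engine would also need to handle strings and dates, which this sketch leaves out.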
f-ck [OPTIONS]

OPTIONS:
    -q, --query <FILE>       JSON file containing the query plan [required]
    -o, --output <FILE>      Output file path [required]
    -f, --format <FORMAT>    Output format: csv, tsv, xlsx, sqlite [default: csv]
    -p, --preview            Preview results without writing to file
    -l, --limit <N>          Limit preview to N rows
    -h, --help               Print help information
    -V, --version            Print version information
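As an illustration of how the `--format` values listed above could be validated, here is a small standard-library sketch. The `OutputFormat` enum is hypothetical and not taken from the crate's source; accepting the names case-insensitively is an assumption, and the default variant simply mirrors the documented `csv` default.

```rust
use std::str::FromStr;

/// Hypothetical enum for the documented output formats.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
enum OutputFormat {
    #[default]
    Csv,
    Tsv,
    Xlsx,
    Sqlite,
}

impl FromStr for OutputFormat {
    type Err = String;

    /// Parse a format name, ignoring ASCII case (an assumption of this sketch).
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.to_ascii_lowercase().as_str() {
            "csv" => Ok(Self::Csv),
            "tsv" => Ok(Self::Tsv),
            "xlsx" => Ok(Self::Xlsx),
            "sqlite" => Ok(Self::Sqlite),
            other => Err(format!("unsupported output format: {other}")),
        }
    }
}
```

With this in place, `"xlsx".parse::<OutputFormat>()` returns `Ok(OutputFormat::Xlsx)`, and an unknown name such as `"parquet"` returns an error instead of silently falling back to the default.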
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│  Data Sources   │    │    Query DSL    │    │     Output      │
│                 │    │                 │    │                 │
│ • CSV/TSV       │───▶│ • Field Maps    │───▶│ • CSV/TSV       │
│ • XLSX          │    │ • Join Logic    │    │ • XLSX          │
│ • SQLite        │    │ • Merge Policy  │    │ • SQLite        │
└─────────────────┘    └─────────────────┘    └─────────────────┘
git checkout -b feature/amazing-feature
git commit -m 'Add amazing feature'
git push origin feature/amazing-feature
# Build and test
cargo build
cargo test
# Run with sample data
cargo run -- --query test_data/test_query.json --output result.csv --preview
# Check WASM compatibility (currently limited)
cargo check --target wasm32-unknown-unknown --lib
This project is licensed under the MIT License - see the LICENSE file for details.
The name represents both the frustration of working with messy data and the satisfaction of finally getting it clean. f*ck is about taking control of your data and making it work for you.
"f*ck around and find out... how clean your data can be."