| Crates.io | nail-parquet |
| lib.rs | nail-parquet |
| version | 1.6.6 |
| created_at | 2025-06-12 17:12:23.055205+00 |
| updated_at | 2025-11-13 10:39:12.726037+00 |
| description | Lightning-fast CLI for data analysis: explore, filter, transform Parquet/CSV/Excel files with SQL-powered operations |
| homepage | |
| repository | https://github.com/Vitruves/nail-parquet |
| max_upload_size | |
| id | 1710010 |
| size | 874,816 |
nail is a high-performance command-line tool for analyzing, transforming, and exploring data files at blazing speed. Built with Rust, Apache Arrow, and DataFusion, it handles Parquet, CSV, JSON, and Excel files with ease—perfect for data engineers, analysts, and scientists who need quick insights without loading heavy tools.
🚀 Why nail? Process gigabyte-scale datasets in seconds • SQL-powered operations • Zero configuration • Works offline • Single binary
nail is the perfect tool for tasks like opening .xlsx files without Excel or LibreOffice.

Installation:

# From crates.io
cargo install nail-parquet
or
# From source
git clone https://github.com/Vitruves/nail-parquet
cd nail-parquet
cargo build --release
sudo cp target/release/nail /usr/local/bin/
# Verify installation
nail --help
All commands support these global flags:
-v, --verbose - Enable verbose output with timing and progress information
-j, --jobs N - Number of parallel jobs (default: half of available CPU cores)
-o, --output FILE - Output file path (prints to console if not specified)
-f, --format FORMAT - Output format: json, csv, parquet, text (auto-detect by default)
-h, --help - Display command help

nail describe

Show comprehensive global file overview with metadata, dimensions, column types, and data quality metrics.
# Display file overview with colored output
nail describe data.parquet
# Include verbose logging
nail describe data.parquet --verbose
Output includes file metadata, dimensions, column types, and data quality metrics.

nail head

Display the first N rows of a dataset.
# Basic usage
nail head data.parquet
# Display first 10 rows
nail head data.parquet -n 10
# Save to JSON file
nail head data.parquet -n 5 -o sample.json
# Verbose output with timing
nail head data.parquet -n 3 --verbose
Options:
-n, --number N - Number of rows to display (default: 5)

nail tail

Display the last N rows of a dataset.
# Display last 5 rows
nail tail data.parquet
# Display last 20 rows with verbose logging
nail tail data.parquet -n 20 --verbose
# Save last 10 rows to CSV
nail tail data.parquet -n 10 -o tail_sample.csv
Options:
-n, --number N - Number of rows to display (default: 5)

nail preview

Randomly sample and display N rows from the dataset. Supports both static display and an interactive browsing mode.
# Random preview of 5 rows
nail preview data.parquet
# Reproducible random sample with seed
nail preview data.parquet -n 10 --random 42
# Interactive mode for browsing records one by one
nail preview data.parquet --interactive
Options:
-n, --number N - Number of rows to display (default: 5)
--random SEED - Random seed for reproducible results
-I, --interactive - Interactive mode with scrolling (use arrow keys, q to quit)

Interactive Mode Controls:

←/→ or h/l - Navigate between records (previous/next)
↑/↓ or k/j - Navigate between fields within a record
q, Esc - Quit interactive mode
Ctrl+C - Force quit

nail headers

List column names, optionally filtered by regex patterns.
# List all column headers
nail headers data.parquet
# Filter headers with regex
nail headers data.parquet --filter "^price.*"
# Save headers to file
nail headers data.parquet -o columns.txt
# JSON format output
nail headers data.parquet -f json
Options:
--filter REGEX - Filter headers with regex pattern

nail schema

Display detailed schema information, including column types and nullability.
# Display schema
nail schema data.parquet
# Save schema to JSON
nail schema data.parquet -o schema.json
# Verbose schema analysis
nail schema data.parquet --verbose
nail metadata

Display detailed Parquet file metadata, including schema, row groups, column chunks, compression, encoding, and statistics.
# Display basic metadata
nail metadata data.parquet
# Show all available metadata
nail metadata data.parquet --all
# Show detailed schema and row group information
nail metadata data.parquet --schema --row-groups --detailed
# Save all metadata to JSON
nail metadata data.parquet --all -o metadata.json
Options:
--schema - Show detailed schema information
--row-groups - Show row group information
--column-chunks - Show column chunk information
--compression - Show compression information
--encoding - Show encoding information
--statistics - Show statistics information
--all - Show all available metadata
--detailed - Show metadata in detailed format

nail size

Analyze file and memory usage with detailed size breakdowns.
# Basic size analysis
nail size data.parquet
# Show per-column size breakdown
nail size data.parquet --columns
# Show per-row analysis
nail size data.parquet --rows
# Show all size metrics
nail size data.parquet --columns --rows
# Raw bits output (no human-friendly formatting)
nail size data.parquet --bits
# Save size analysis to file
nail size data.parquet --columns --rows -o size_report.txt
Options:
--columns - Show per-column sizes
--rows - Show per-row analysis
--bits - Show raw bits without human-friendly conversion

nail search

Search for specific values across columns with flexible matching options.
# Basic search across all columns
nail search data.parquet --value "John"
# Search in specific columns
nail search data.parquet --value "error" -c "status,message,log"
# Case-insensitive search
nail search data.parquet --value "ACTIVE" --ignore-case
# Exact match only (no partial matches)
nail search data.parquet --value "complete" --exact
# Return row numbers only
nail search data.parquet --value "Bob" --rows
# Save search results
nail search data.parquet --value "error" -o search_results.json
Options:
--value VALUE - Value to search for (required)
-c, --columns PATTERN - Comma-separated column names to search in
--rows - Return matching row numbers only
--ignore-case - Case-insensitive search
--exact - Exact match only (no partial matches)

nail outliers

Detect and optionally remove outliers from numeric columns using various methods.
# Detect outliers using IQR method for specific column
nail outliers data.parquet -c "price" --method iqr
# Detect outliers using Z-score with a custom threshold, showing values
nail outliers data.parquet -c "revenue" --method z-score --z-score-threshold 2.5 --show-values
# Remove outliers using Modified Z-score method from multiple columns
nail outliers data.parquet -c "age,income" --method modified-z-score --remove -o cleaned_data.parquet
# Detect outliers using Isolation Forest method (simplified)
nail outliers data.parquet -c "score" --method isolation-forest
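For context, the iqr and z-score methods implement the standard textbook rules: a value is flagged when it falls outside [Q1 - k*IQR, Q3 + k*IQR], or when its Z-score exceeds a threshold. A minimal Python sketch of that logic (illustrative only, not nail's actual implementation):

```python
import statistics

def iqr_outliers(values, multiplier=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR]; nail's default k is 1.5."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartiles
    iqr = q3 - q1
    lo, hi = q1 - multiplier * iqr, q3 + multiplier * iqr
    return [v for v in values if v < lo or v > hi]

def zscore_outliers(values, threshold=3.0):
    """Flag values more than `threshold` standard deviations from the mean."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [v for v in values if abs(v - mean) / sd > threshold]

data = [10, 11, 12, 11, 10, 12, 11, 95]  # 95 is an obvious outlier
print(iqr_outliers(data))  # → [95]
```

A single extreme value inflates the mean and standard deviation, which is why the modified Z-score (computed from the median) is generally more robust than the plain Z-score.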
Options:
-c, --columns PATTERN - Comma-separated column names or regex patterns for outlier detection
--method METHOD - Outlier detection method: iqr, z-score, modified-z-score, isolation-forest (default: iqr)
--iqr-multiplier VALUE - IQR multiplier for outlier detection (default: 1.5)
--z-score-threshold VALUE - Z-score threshold for outlier detection (default: 3.0)
--show-values - Show outlier values instead of just flagging them
--include-row-numbers - Include row numbers in output
--remove - Remove outliers from dataset and save cleaned data

nail stats

Compute statistical summaries for numeric and categorical columns with flexible percentile calculations and type filtering.
# Basic statistics (mean, Q25, Q50, Q75, unique count)
nail stats data.parquet
# Exhaustive statistics
nail stats data.parquet --stats-type exhaustive
# Statistics for specific columns
nail stats data.parquet -c "price,volume,quantity"
# Statistics with regex column selection
nail stats data.parquet -c "^(price|vol).*" --stats-type exhaustive
# Custom percentiles
nail stats data.parquet -c "revenue" --percentiles "0.1,0.5,0.9,0.95,0.99"
# Numeric columns only
nail stats data.parquet --numeric-only
# Categorical columns only
nail stats data.parquet --categorical-only
# Save statistics to file
nail stats data.parquet --stats-type basic -o stats.json
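The --percentiles flag accepts arbitrary quantiles. The common linear-interpolation definition can be sketched as follows (nail's exact interpolation method may differ):

```python
def percentile(sorted_vals, p):
    """Linear-interpolation percentile over an already-sorted list."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    idx = p * (len(sorted_vals) - 1)   # fractional position in the sorted data
    lo = int(idx)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = idx - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac

vals = sorted([15, 20, 35, 40, 50])
print([percentile(vals, p) for p in (0.25, 0.5, 0.75)])  # → [20.0, 35.0, 40.0]
```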
Options:
-c, --columns PATTERN - Comma-separated column names or regex patterns
-t, --stats-type TYPE - Statistics type: basic, exhaustive, hypothesis (default: basic)
-p, --percentiles VALUES - Custom percentiles (comma-separated, e.g., '0.1,0.5,0.9')
--numeric-only - Include only numeric columns
--categorical-only - Include only categorical (string) columns

Statistics Types:

basic - mean, Q25/Q50/Q75 quartiles, and unique count
exhaustive - an extended set of summary statistics
hypothesis - hypothesis-testing statistics
nail correlations

Compute correlation matrices between numeric columns with optional statistical significance testing.
# Basic Pearson correlation
nail correlations data.parquet
# Specific correlation types
nail correlations data.parquet --type kendall
nail correlations data.parquet --type spearman
# Correlations for specific columns
nail correlations data.parquet -c "price,volume,quantity"
# Output as correlation matrix format
nail correlations data.parquet --matrix
# Include statistical significance tests (p-values for fisher, t-test, chi-sqr)
nail correlations data.parquet --tests fisher_exact,t_test
# Comprehensive correlation analysis with significance tests
nail correlations data.parquet --tests fisher_exact -o correlations.json
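The three correlation types are standard: Pearson measures linear association, Spearman applies Pearson to the ranks of the values, and Kendall counts concordant pairs. A sketch of the first two (simplified, without tie handling; not nail's implementation):

```python
import statistics

def pearson(x, y):
    """Pearson correlation: covariance normalised by both standard deviations."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman correlation: Pearson applied to ranks (ties would need averaging)."""
    def ranks(vals):
        order = sorted(range(len(vals)), key=vals.__getitem__)
        r = [0.0] * len(vals)
        for rank, i in enumerate(order):
            r[i] = float(rank)
        return r
    return pearson(ranks(x), ranks(y))

print(round(pearson([1, 2, 3, 4], [2, 4, 6, 8]), 4))  # → 1.0
```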
Options:
-c, --columns PATTERN - Comma-separated column names or regex patterns
-t, --type TYPE - Correlation type: pearson, kendall, spearman (default: pearson)
--matrix - Output as correlation matrix format
--tests - Include statistical significance tests (fisher_exact, chi_sqr, t_test)
--digits N - Number of decimal places for correlation values (default: 4)

nail frequency

Compute frequency tables for categorical columns showing value counts, distributions, and percentages.
# Basic frequency table for a single column
nail frequency data.parquet -c "category"
# Multiple columns frequency analysis
nail frequency data.parquet -c "category,status,region"
# Save frequency table to file
nail frequency data.parquet -c "product_type" -o frequency_table.csv
# Verbose output with progress information
nail frequency data.parquet -c "category,status" --verbose
Options:
-c, --columns PATTERN - Comma-separated column names to analyze (required)

Output: shows frequency counts with percentages for each value, helping identify data distribution patterns.

nail select

Select specific columns and/or rows from the dataset.
# Select specific columns
nail select data.parquet -c "id,name,price"
# Select columns with regex
nail select data.parquet -c "^(id|price).*"
# Select specific rows
nail select data.parquet -r "1,5,10-20"
# Select both columns and rows
nail select data.parquet -c "id,price" -r "1-100"
# Save selection to new file
nail select data.parquet -c "id,name" -o subset.parquet
Options:
-c, --columns PATTERN - Column names or regex patterns (comma-separated)
-r, --rows SPEC - Row numbers or ranges (e.g., "1,3,5-10")

nail drop

Remove specific columns and/or rows from the dataset.
# Drop specific columns
nail drop data.parquet -c "temp_col,debug_info"
# Drop columns matching pattern
nail drop data.parquet -c "^temp_.*"
# Drop specific rows
nail drop data.parquet -r "1,5,100-200"
# Drop both columns and rows
nail drop data.parquet -c "temp_col" -r "1-10"
Options:
-c, --columns PATTERN - Column names or regex patterns to drop
-r, --rows SPEC - Row numbers or ranges to drop

nail filter

Filter data based on column conditions or row characteristics.
# Filter by column conditions
nail filter data.parquet -c "price>100,volume<1000"
# Multiple conditions with different operators
nail filter data.parquet -c "age>=18,status=active,score!=0"
# String matching and numeric comparisons
nail filter data.parquet -c "name!=test,salary<=50000,active=true"
# Filter to numeric columns only
nail filter data.parquet --rows numeric-only
# Remove rows with NaN values
nail filter data.parquet --rows no-nan
# Remove rows with zeros
nail filter data.parquet --rows no-zeros
# String columns only
nail filter data.parquet --rows char-only
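A condition string such as "age>=18,status=active" is a comma-separated list of column/operator/value triples. The parsing and filtering idea can be sketched as follows (illustrative only, with hypothetical helper names; not nail's internals):

```python
import operator

# Supported operators, longest symbols first so ">=" is not parsed as ">".
OPS = [(">=", operator.ge), ("<=", operator.le), ("!=", operator.ne),
       ("=", operator.eq), (">", operator.gt), ("<", operator.lt)]

def parse_condition(cond):
    """Split a condition like 'age>=18' into (column, compare-function, value)."""
    for sym, fn in OPS:
        if sym in cond:
            col, raw = cond.split(sym, 1)
            try:
                value = float(raw)
            except ValueError:
                value = raw  # fall back to string comparison
            return col, fn, value
    raise ValueError(f"no operator in {cond!r}")

def keep(row, conditions):
    """Return True when a row (a dict) satisfies every comma-separated condition."""
    for cond in conditions.split(","):
        col, fn, value = parse_condition(cond)
        cell = row[col]
        if isinstance(value, float):
            cell = float(cell)
        if not fn(cell, value):
            return False
    return True

print(keep({"age": "30", "status": "active"}, "age>=18,status=active"))  # → True
```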
Options:
-c, --columns CONDITIONS - Column filter conditions (comma-separated). Supported operators:
= (equals), != (not equals)
> (greater than), >= (greater or equal)
< (less than), <= (less or equal)

Examples: age>25, status=active, price<=100

--rows FILTER - Row filter type: no-nan, numeric-only, char-only, no-zeros

nail fill

Fill missing values using various strategies.
# Fill with specific value
nail fill data.parquet --method value --value 0
# Fill specific columns with value
nail fill data.parquet -c "price,quantity" --method value --value -1
# Fill with mean (for numeric columns)
nail fill data.parquet --method mean
# Fill with median
nail fill data.parquet --method median -c "price,volume"
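Mean and median fill are the usual strategies: compute the statistic over the non-missing values, then substitute it for every missing entry. A minimal sketch (assuming None marks a missing value; not nail's implementation):

```python
import statistics

def fill_missing(column, method="mean"):
    """Replace None entries using the named strategy (mean or median here)."""
    present = [v for v in column if v is not None]
    if method == "mean":
        fill = statistics.fmean(present)
    elif method == "median":
        fill = statistics.median(present)
    else:
        raise ValueError(f"unknown method {method!r}")
    return [fill if v is None else v for v in column]

print(fill_missing([1.0, None, 3.0], "mean"))  # → [1.0, 2.0, 3.0]
```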
Options:
--method METHOD - Fill method: value, mean, median, mode, forward, backward (default: value)
--value VALUE - Fill value (required for 'value' method)
-c, --columns PATTERN - Comma-separated column names to fill

nail rename

Rename one or more columns.
# Rename a single column
nail rename data.parquet --column "old_name=new_name" -o renamed.parquet
# Rename multiple columns
nail rename data.parquet --column "id=user_id,val=value" -o renamed.parquet
Options:
-c, --column SPECS - Column rename specs (before=after), comma-separated

nail create

Create new columns with expressions based on existing columns.
# Create a single new column
nail create data.parquet --column "total=price*quantity" -o enhanced.parquet
# Create multiple columns
nail create data.parquet --column "total=price*quantity,margin=(price-cost)" -o enhanced.parquet
# Filter rows while creating columns
nail create data.parquet --column "category_score=score*2" --row-filter "score>50" -o filtered_enhanced.parquet
Options:
-c, --column SPECS - Column creation specifications (name=expression), comma-separated
-r, --row-filter FILTER - Row filter expression to apply before creating columns

nail dedup

Remove duplicate rows or columns from the dataset.
# Remove duplicate rows (all columns considered)
nail dedup data.parquet --row-wise
# Remove duplicate rows based on specific columns
nail dedup data.parquet --row-wise -c "id,email"
# Keep last occurrence instead of first
nail dedup data.parquet --row-wise --keep last
# Remove duplicate columns (by name)
nail dedup data.parquet --col-wise
# Save deduplicated data
nail dedup data.parquet --row-wise -o clean_data.parquet
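Keep-first versus keep-last deduplication differ only in which duplicate survives. A sketch of the idea over rows represented as dicts (illustrative, not nail's implementation):

```python
def dedup(rows, key_cols, keep="first"):
    """Drop rows whose key columns repeat, keeping the first or last occurrence."""
    seen = {}
    ordered = rows if keep == "first" else reversed(rows)
    for row in ordered:
        key = tuple(row[c] for c in key_cols)
        if key not in seen:       # first time this key is seen wins
            seen[key] = row
    out = list(seen.values())
    # Reverse back so keep="last" still returns rows in original order.
    return out if keep == "first" else out[::-1]

rows = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}, {"id": 1, "v": "c"}]
print(dedup(rows, ["id"], keep="last"))
```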
Options:
--row-wise - Remove duplicate rows (conflicts with --col-wise)
--col-wise - Remove duplicate columns (conflicts with --row-wise)
-c, --columns PATTERN - Columns to consider for row-wise deduplication
--keep STRATEGY - Keep 'first' or 'last' occurrence (default: first)

nail sample

Sample data using various strategies.
# Random sampling
nail sample data.parquet -n 1000
# Reproducible random sampling
nail sample data.parquet -n 500 --method random --random 42
# Stratified sampling
nail sample data.parquet -n 1000 --method stratified --stratify-by category
# First N rows
nail sample data.parquet -n 100 --method first
# Last N rows
nail sample data.parquet -n 100 --method last
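Stratified sampling allocates the sample across groups in proportion to each group's share of the dataset, so rare categories are not lost. A sketch (illustrative, not nail's implementation):

```python
import random
from collections import defaultdict

def stratified_sample(rows, n, by, seed=None):
    """Sample ~n rows while preserving each group's share of the dataset."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for row in rows:
        groups[row[by]].append(row)
    out = []
    for members in groups.values():
        # Proportional allocation; rounding can make the total differ from n slightly.
        take = round(n * len(members) / len(rows))
        out.extend(rng.sample(members, min(take, len(members))))
    return out

rows = [{"cat": "a"}] * 80 + [{"cat": "b"}] * 20
sample = stratified_sample(rows, 10, "cat", seed=42)
print(sum(1 for r in sample if r["cat"] == "a"))  # → 8
```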
Options:
-n, --number N - Number of samples (default: 10)
--method METHOD - Sampling method: random, stratified, first, last (default: random)
--stratify-by COLUMN - Column name for stratified sampling
--random SEED - Random seed for reproducible results

nail shuffle

Randomly shuffle the order of rows in the dataset.
# Random shuffle
nail shuffle data.parquet
# Reproducible shuffle (Note: DataFusion's RANDOM() may not be deterministic)
nail shuffle data.parquet --random 42
# Shuffle and save to new file
nail shuffle data.parquet -o shuffled.parquet --verbose
Options:
--random SEED - Random seed for reproducible results

nail sort

Sort data by one or more columns with flexible sorting strategies and null handling.
# Sort by all columns (auto-detect data types)
nail sort data.parquet
# Sort by specific columns
nail sort data.parquet -c "price,date"
# Sort with specific strategies
nail sort data.parquet -c "date,amount,name" -s "date,numeric,alphabetic"
# Descending sort
nail sort data.parquet -c "revenue" -d true
# Multiple columns with mixed directions
nail sort data.parquet -c "category,price" -d "false,true"
# Handle nulls differently
nail sort data.parquet -c "score" --nulls first
nail sort data.parquet -c "rating" --nulls skip
# Case-insensitive alphabetic sorting
nail sort data.parquet -c "name,category" -s "alphabetic" --case-insensitive
# Sort dates with custom format
nail sort data.parquet -c "date" -s "date" --date-format "mm-dd-yyyy"
# Sort time values with custom format
nail sort data.parquet -c "timestamp" -s "hour" --hour-format "hh:mm:ss"
Options:
-c, --column COLUMNS - Columns to sort by (comma-separated or 'all') (default: all)
-s, --strategy STRATEGIES - Sort strategy per column: numeric, date, alphabetic, alphabetic-numeric, numeric-alphabetic, hour, auto (comma-separated)
-d, --descending FLAGS - Sort in descending order (comma-separated true/false per column)
--nulls HANDLING - Null value handling: first, last, skip (default: last)
--date-format FORMAT - Date format pattern (e.g., 'mm-dd-yyyy', 'dd/mm/yyyy', 'yyyy-mm-dd')
--hour-format FORMAT - Time format pattern (e.g., 'hh:mm:ss', 'mm:ss')
--case-insensitive - Case-insensitive alphabetic sorting

Sort Strategies: numeric, date, alphabetic, alphabetic-numeric, numeric-alphabetic, hour, and auto (auto-detects the data type per column).
nail id

Add a unique ID column to the dataset.
# Add simple numeric ID column
nail id data.parquet --create
# Add ID with custom name and prefix
nail id data.parquet --create --id-col-name record_id --prefix "REC-"
# Save with new ID column
nail id data.parquet --create -o data_with_ids.parquet
Options:
--create - Create new ID column
--prefix PREFIX - Prefix for ID values (default: "id")
--id-col-name NAME - ID column name (default: "id")

nail diff

Compare two datasets and show differences using key-based or row-based comparison.
# Compare files using key columns
nail diff old_data.parquet --compare new_data.parquet --keys "id,timestamp"
# Show only changed rows
nail diff v1.parquet --compare v2.parquet --keys "id" --changes-only
# Show rows only in left file
nail diff current.parquet --compare archive.parquet --keys "record_id" --left-only
# Show rows only in right file
nail diff baseline.parquet --compare updated.parquet --keys "user_id" --right-only
# Row-by-row positional comparison (no keys)
nail diff file1.csv --compare file2.csv
# Save diff results
nail diff old.parquet --compare new.parquet --keys "id" -o differences.parquet
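Key-based comparison works by indexing both files on the key columns and classifying each key as added, removed, or modified. A sketch of that classification over rows represented as dicts (illustrative, not nail's implementation):

```python
def diff(left, right, keys):
    """Classify rows by key: ADDED (right only), REMOVED (left only), MODIFIED."""
    lmap = {tuple(r[k] for k in keys): r for r in left}
    rmap = {tuple(r[k] for k in keys): r for r in right}
    out = []
    for key, row in rmap.items():
        if key not in lmap:
            out.append(("ADDED", row))
        elif row != lmap[key]:
            out.append(("MODIFIED", row))
    for key, row in lmap.items():
        if key not in rmap:
            out.append(("REMOVED", row))
    return out

old = [{"id": 1, "v": 10}, {"id": 2, "v": 20}]
new = [{"id": 1, "v": 10}, {"id": 2, "v": 25}, {"id": 3, "v": 30}]
print(diff(old, new, ["id"]))
```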
Options:
-c, --compare FILE - Second file to compare with (required)
-k, --keys COLUMNS - Columns to use as primary key for comparison (comma-separated)
--changes-only - Show only rows that differ
--left-only - Show only rows in left file
--right-only - Show only rows in right file

Output: a diff_status column indicates ADDED, REMOVED, or MODIFIED.

nail merge

Join two datasets horizontally based on a common key column.
# Inner join (default)
nail merge left.parquet --right right.parquet --key id -o merged.parquet
# Left join - keep all records from left table
nail merge customers.parquet --right orders.parquet --left-join --key customer_id -o customer_orders.parquet
# Right join - keep all records from right table
nail merge orders.parquet --right customers.parquet --right-join --key customer_id -o order_customers.parquet
# Merge on columns with different names
nail merge table1.parquet --right table2.parquet --key-mapping "table1_id=table2_user_id"
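These joins follow the classic hash-join pattern: build an index on the right table's key, then probe it with each left row. A sketch of inner and left joins over rows represented as dicts (illustrative, not nail's implementation):

```python
def merge(left, right, key, how="inner"):
    """Hash join: index the right table by key, then probe with left rows."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    out = []
    for lrow in left:
        matches = index.get(lrow[key], [])
        if matches:
            for rrow in matches:
                out.append({**lrow, **rrow})
        elif how == "left":
            # A left join keeps unmatched left rows; right columns become None.
            nulls = {k: None for k in right[0] if k != key}
            out.append({**lrow, **nulls})
    return out

customers = [{"cid": 1, "name": "Ann"}, {"cid": 2, "name": "Bo"}]
orders = [{"cid": 1, "total": 9.5}]
print(merge(customers, orders, "cid", how="left"))
```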
Options:
--right FILE - Right table file to merge with (required)
--key COLUMN - Join key column name (if same in both tables)
--key-mapping MAPPING - Join key mapping for different column names (left_col=right_col)
--left-join - Perform left join
--right-join - Perform right join

nail append

Append multiple datasets vertically (union operation).
# Append files with matching schemas
nail append base.parquet --files "file1.parquet,file2.parquet" -o combined.parquet
# Append with verbose logging
nail append base.parquet --files "jan.parquet,feb.parquet,mar.parquet" --verbose -o q1_data.parquet
# Force append with schema differences (fills missing columns with nulls)
nail append base.parquet --files "different_schema.parquet" --ignore-schema -o combined.parquet
Options:
--files FILES - Comma-separated list of files to append (required)
--ignore-schema - Ignore schema mismatches and force append

nail split

Split dataset into multiple files based on ratios or stratification.
# Split by ratio
nail split data.parquet --ratio "0.7,0.3" --names "train,test" --output-dir splits/
# Stratified split
nail split data.parquet --ratio "0.8,0.2" --stratified-by category --output-dir splits/
# Reproducible split
nail split data.parquet --ratio "0.6,0.2,0.2" --random 42 --output-dir splits/
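A ratio split shuffles the rows once and then cuts at the cumulative ratios; giving the last part the remainder avoids losing rows to rounding. A sketch (illustrative, not nail's implementation):

```python
import random

def split(rows, ratios, seed=None):
    """Shuffle, then cut the rows into len(ratios) parts by cumulative ratio."""
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1.0")
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    parts, start = [], 0
    for i, r in enumerate(ratios):
        # The last part takes the remainder so no row is lost to rounding.
        end = len(rows) if i == len(ratios) - 1 else start + round(r * len(rows))
        parts.append(shuffled[start:end])
        start = end
    return parts

train, test = split(list(range(10)), [0.7, 0.3], seed=42)
print(len(train), len(test))  # → 7 3
```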
Options:
--ratio RATIOS - Comma-separated split ratios (must sum to 1.0 or 100.0)
--names NAMES - Comma-separated output file names
--output-dir DIR - Output directory for split files
--stratified-by COLUMN - Column for stratified splitting
--random SEED - Random seed for reproducible splits

nail optimize

Optimize Parquet files by applying compression, sorting, and encoding techniques.
# Basic optimization with default compression (snappy)
nail optimize input.parquet -o optimized.parquet
# Optimize with specific compression type and level
nail optimize input.parquet -o optimized.parquet --compression zstd --compression-level 5
# Sort data while optimizing
nail optimize input.parquet -o optimized.parquet --sort-by "timestamp,id"
# Enable dictionary encoding for better compression
nail optimize input.parquet -o optimized.parquet --dictionary
# Comprehensive optimization and validation
nail optimize input.parquet -o optimized.parquet --compression zstd --sort-by "date,category" --dictionary --validate --verbose
Options:
--compression TYPE - Compression type: snappy, gzip, zstd, brotli (default: snappy)
--compression-level LEVEL - Compression level (1-9, default: 6)
--sort-by COLUMNS - Comma-separated columns to sort by
--dictionary - Enable dictionary encoding
--no-dictionary - Disable dictionary encoding
--validate - Validate optimized file after creation

nail binning

Bin continuous variables into categorical ranges for analysis. Note: the current implementation supports one column at a time.
# Equal-width binning with 5 bins
nail binning data.parquet -c "age" -b 5 -o binned.parquet
# Custom bins with specific edges
nail binning data.parquet -c "score" -b "0,50,80,90,100" --method custom -o binned.parquet
# Add bin labels
nail binning data.parquet -c "temperature" -b 3 --labels "Cold,Warm,Hot" -o binned.parquet
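Equal-width binning divides the [min, max] range into equally sized intervals and assigns each value to the interval containing it. A sketch (illustrative, not nail's implementation):

```python
def equal_width_edges(values, bins):
    """Bin edges splitting [min, max] into `bins` equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    return [lo + i * width for i in range(bins + 1)]

def assign_bin(value, edges):
    """Index of the half-open bin [edge_i, edge_i+1) containing value;
    the top edge is folded into the last bin."""
    for i in range(len(edges) - 1):
        if edges[i] <= value < edges[i + 1]:
            return i
    return len(edges) - 2  # value equals the maximum

edges = equal_width_edges([0, 25, 50, 75, 100], 4)
print(edges)          # → [0.0, 25.0, 50.0, 75.0, 100.0]
print(assign_bin(60, edges))  # → 2
```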
Options:
-c, --columns COLUMN - Column to bin (required)
-b, --bins BINS - Number of bins or custom edges (e.g., "5" or "0,10,50,100") (default: 10)
--method METHOD - Binning method: equal-width, custom (default: equal-width)
--labels LABELS - Comma-separated bin labels
--suffix SUFFIX - Suffix for new binned column (default: "_binned")
--drop-original - Drop original column after binning

nail pivot

Create pivot tables for data aggregation and cross-tabulation. Note: the current implementation is a simplified group-by aggregation and does not create a wide-format pivot table.
# Basic pivot table with sum aggregation
nail pivot data.parquet -i "region" -c "product" -l "sales" -o pivot.parquet
# Pivot with different aggregation functions
nail pivot data.parquet -i "category" -c "product" -l "revenue" --agg mean -o avg_pivot.parquet
Options:
-i, --index COLUMNS - Row index columns (comma-separated) (required)
-c, --columns COLUMNS - Column pivot columns (comma-separated) (required)
-l, --values COLUMNS - Value columns to aggregate (comma-separated)
--agg FUNC - Aggregation function: sum, mean, count, min, max (default: sum)
--fill VALUE - Fill missing values (default: "0")

nail convert

Convert between different file formats.
# Convert Parquet to CSV
nail convert data.parquet -o data.csv
# Convert CSV to Parquet
nail convert data.csv -o data.parquet
# Convert Parquet to Excel
nail convert data.parquet -o data.xlsx
# Verbose conversion with progress
nail convert large_dataset.csv -o large_dataset.parquet --verbose
Supported Formats: Parquet, CSV, JSON, and Excel (.xlsx)

nail count

Count the number of rows in a dataset.
# Basic row count
nail count data.parquet
# Count with verbose output
nail count data.parquet --verbose
nail update

Check for newer versions of the nail tool.
# Check for updates
nail update
# Check with verbose output
nail update --verbose
Common workflows:

# Quick dataset overview with comprehensive file info
nail describe sales_data.parquet
# Traditional exploration
nail schema sales_data.parquet
nail size sales_data.parquet --columns --rows
nail head sales_data.parquet -n 10
nail stats sales_data.parquet --stats-type basic
# Advanced statistics with custom percentiles
nail stats sales_data.parquet -c "revenue,profit" --percentiles "0.25,0.5,0.75,0.90,0.95,0.99"
# Column inspection
nail headers sales_data.parquet --filter "price"
nail correlations sales_data.parquet -c "price,quantity,discount" --tests t_test
# Frequency analysis for categorical data
nail frequency sales_data.parquet -c "category,region,status"
# Comprehensive data quality overview
nail describe data.parquet
# Compare datasets to find differences
nail diff yesterday.parquet --compare today.parquet --keys "id" --changes-only
# Search for problematic values
nail search data.parquet --value "error" --ignore-case
nail search data.parquet --value "null" -c "critical_fields" --rows
# Find and remove duplicates
nail dedup data.parquet --row-wise -c "id" --verbose -o unique_data.parquet
# Analyze data size and memory usage
nail size data.parquet --columns --bits
# Check for specific patterns in text fields
nail search data.parquet --value "@gmail.com" -c "email" --exact
# Frequency analysis to identify data quality issues
nail frequency data.parquet -c "status" --verbose
# Create new calculated columns
nail create sales_data.parquet --column "total=price*quantity,profit=(price-cost)" -o enhanced_sales.parquet
# Rename columns for clarity
nail rename enhanced_sales.parquet --column "total=total_sale,profit=net_profit" -o final_sales.parquet
# Create complex derived metrics
nail create customer_data.parquet --column "lifetime_value=orders*avg_order" -o customer_metrics.parquet
# Filter and enhance in one step
nail create large_dataset.parquet --column "score=performance*weight" --row-filter "active=true" -o active_scored.parquet
# 1. Optimize raw data files for better performance
nail optimize raw_data.parquet -o optimized_data.parquet --compression zstd --sort-by "timestamp,customer_id" --dictionary --verbose
# 2. Create analytical features with binning
nail binning optimized_data.parquet -c "age" -b "18,25,35,50,65" --method custom --labels "18-24,25-34,35-49,50-64,65+" -o aged_data.parquet
# 3. Group data for analysis (using pivot's group-by functionality)
nail pivot aged_data.parquet -i "age_binned" -c "category" -l "revenue" --agg sum -o age_revenue_summary.parquet
# 4. Add derived metrics
nail create age_revenue_summary.parquet --column "avg_revenue=sum_revenue/count_revenue" -o enhanced_summary.parquet
# 5. Statistical analysis of optimized data
nail stats enhanced_summary.parquet --stats-type exhaustive -o summary_stats.json
Performance tips:

Use -c with regex patterns to operate only on relevant columns.
Use -j to control parallel processing based on your system.
Use --verbose to monitor performance and progress on large datasets.

nail provides detailed error messages for common issues.
MIT License - see LICENSE file for details.
For issues and questions, open an issue on the GitHub repository: https://github.com/Vitruves/nail-parquet