Crates.io | csv-guillotine |
lib.rs | csv-guillotine |
version | 0.3.5 |
source | src |
created_at | 2019-04-29 06:21:01.703732 |
updated_at | 2019-05-27 19:38:39.732687 |
description | CSV's often have metadata at top before data headers. This removes it. |
homepage | https://github.com/forbesmyester/csv-guillotine |
repository | https://github.com/forbesmyester/csv-guillotine |
id | 130890 |
size | 22,637 |
Many banks, stockbrokers and other large institutions let you download your account history as a CSV file. This is to be applauded, but they often include an extra metadata header at the top of the file explaining what it is. The CSV file may look something like the following:
Account:,****07493
£4536.24
£4536.24
Transaction type,Description,Paid out,Paid in,Balance
Bank credit YOUR EMPLOYER,Bank credit YOUR EMPLOYER,,£2016.12,£4536.24
Direct debit CREDIT PLASTIC,CREDIT PLASTIC,£402.98,,£520.12
For many users this is fine as it can still be loaded into a spreadsheet application.
For my use case, I need to download many of these files, which together make up one large data set. These extra metadata headers are quite an issue because I can no longer use xsv to parse the files directly.
This library is a form of buffer which removes this metadata header. It does this by counting the fields in each of a given number of leading rows and removing the lines that come before the row where the maximum field count is first reached.
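The heuristic described above can be sketched roughly as follows. This is an illustration only, not the crate's actual implementation: the function name strip_metadata is hypothetical, the whole input is held in memory rather than streamed, and fields are counted with a naive split that ignores quoted separators.

```rust
/// Hypothetical helper: returns `input` with every line before the first
/// widest row removed. Only the first `consider` lines are inspected.
fn strip_metadata(input: &str, separator: char, consider: usize) -> String {
    let lines: Vec<&str> = input.lines().collect();
    let mut best_index = 0;
    let mut best_count = 0;
    for (index, line) in lines.iter().take(consider).enumerate() {
        // Naive field count: separators inside quoted fields are not handled.
        let count = line.split(separator).count();
        // Strictly greater, so the first row reaching the maximum wins.
        if count > best_count {
            best_count = count;
            best_index = index;
        }
    }
    // Keep everything from the widest row onwards.
    lines[best_index..].join("\n")
}

fn main() {
    let input = "Account:,****07493\n\
                 £4536.24\n\
                 £4536.24\n\
                 Transaction type,Description,Paid out,Paid in,Balance\n\
                 Direct debit CREDIT PLASTIC,CREDIT PLASTIC,£402.98,,£520.12\n";
    println!("{}", strip_metadata(input, ',', 20));
}
```

With the sample statement shown earlier, the three metadata lines have at most two fields while the header row has five, so output would start at the "Transaction type" line.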
To compile, install Rust via rustup, check out this repository and run:
cargo install --path .
This can be used like the following:
cat with_metadata_headers.csv | csv-guillotine --separator=',' --consider=20 > csv_header_and_data_only.csv
or
csv-guillotine -i with_metadata_headers.csv -o csv_header_and_data_only.csv
See csv-guillotine --help for full usage instructions.
Errors will be printed to STDERR and their existence can be detected via the exit status.
NOTE: This software makes no attempt to actually validate your CSV.
This library exposes a Blade struct, which is constructed with a Read as well as a separator character (expressed as a u8) and a line limit. The Blade can itself be used as a Read to get the actual data out.
Example below:
extern crate csv_guillotine;

use std::io::{BufRead, BufReader};
use csv_guillotine::Blade;

fn main() {
    let stdin = std::io::stdin();
    // Separator is b',' (byte 44); consider the first 20 lines.
    let blade = Blade::new(stdin, b',', 20);
    let mut buf_reader = BufReader::new(blade);
    loop {
        let mut buffer = String::new();
        match buf_reader.read_line(&mut buffer) {
            // Zero bytes read means end of input.
            Ok(0) => break,
            Ok(_) => print!("{}", buffer),
            Err(e) => {
                eprintln!("ERROR: {}", e);
                // Stop reading after an error.
                break;
            }
        }
    }
}