| Crates.io | csv-diff |
| lib.rs | csv-diff |
| version | 0.1.2 |
| created_at | 2021-12-30 19:38:34.827996+00 |
| updated_at | 2025-12-23 20:14:38.400116+00 |
| description | Compare two CSVs - with ludicrous speed 🚀. |
| homepage | https://gitlab.com/janriemer/csv-diff |
| repository | https://gitlab.com/janriemer/csv-diff |
| max_upload_size | |
| id | 505475 |
| size | 343,284 |
csv-diff is powering qsv's diff command! If you need a CLI tool for comparing CSVs or you want to do all kinds of crazy operations on your CSV data, you should definitely check out qsv!
use std::io::Cursor;
use csv_diff::{csv_diff::CsvByteDiffLocal, csv::Csv};
use csv_diff::diff_row::{ByteRecordLineInfo, DiffByteRecord};
use std::collections::HashSet;
use std::iter::FromIterator;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // some csv data with a header, where the first column is a unique id
    let csv_data_left = "id,name,kind\n\
                         1,lemon,fruit\n\
                         2,strawberry,fruit";
    let csv_data_right = "id,name,kind\n\
                          1,lemon,fruit\n\
                          2,strawberry,nut";

    let csv_byte_diff = CsvByteDiffLocal::new()?;

    let mut diff_byte_records = csv_byte_diff.diff(
        Csv::with_reader_seek(csv_data_left.as_bytes()),
        Csv::with_reader_seek(csv_data_right.as_bytes()),
    )?;

    // sort the differences by line number to get a deterministic order
    diff_byte_records.sort_by_line();

    let diff_byte_rows = diff_byte_records.as_slice();

    assert_eq!(
        diff_byte_rows,
        &[DiffByteRecord::Modify {
            delete: ByteRecordLineInfo::new(
                csv::ByteRecord::from(vec!["2", "strawberry", "fruit"]),
                3
            ),
            add: ByteRecordLineInfo::new(
                csv::ByteRecord::from(vec!["2", "strawberry", "nut"]),
                3
            ),
            field_indices: vec![2]
        }]
    );
    Ok(())
}
In your Cargo.toml file add the following lines under [dependencies]:
csv-diff = "0.1"
This will use a rayon thread-pool by default. You can opt out of it and, for example, use threads without a thread-pool by disabling the default features and enabling the crossbeam-threads feature:
csv-diff = { version = "0.1", default-features = false, features = ["crossbeam-threads"] }
This crate should be used on CSV data that has some sort of primary key for uniquely identifying a record. It is not a general line-by-line diffing crate. For example, you can dump a database table in CSV format from both your test and production systems and compare the two dumps to find differences.
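To illustrate what diffing by primary key means, as opposed to diffing line by line, here is a minimal conceptual sketch that uses only the standard library. It is not how csv-diff is implemented. The names KeyedDiff, index_by_key and keyed_diff are hypothetical and exist only for this example, and the sketch assumes a header row, a unique key in the first column, and no quoted fields.

```rust
use std::collections::HashMap;

/// One difference between two keyed CSV inputs (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
enum KeyedDiff {
    Add(Vec<String>),
    Delete(Vec<String>),
    Modify {
        delete: Vec<String>,
        add: Vec<String>,
        field_indices: Vec<usize>,
    },
}

/// Index the data rows of a simple CSV by the value of their first column.
/// Assumes a header on the first line and no quoted or escaped fields.
fn index_by_key(csv: &str) -> HashMap<String, Vec<String>> {
    csv.lines()
        .skip(1) // skip the header row
        .filter(|line| !line.is_empty())
        .map(|line| {
            let fields: Vec<String> = line.split(',').map(str::to_owned).collect();
            (fields[0].clone(), fields)
        })
        .collect()
}

/// Compare two CSVs record by record, matching records on their key
/// rather than on their position in the file.
fn keyed_diff(left: &str, right: &str) -> Vec<KeyedDiff> {
    let left_rows = index_by_key(left);
    let mut right_rows = index_by_key(right);
    let mut diffs = Vec::new();

    for (key, left_fields) in left_rows {
        match right_rows.remove(&key) {
            // key only exists on the left side
            None => diffs.push(KeyedDiff::Delete(left_fields)),
            // key exists on both sides, but at least one field differs
            Some(right_fields) if right_fields != left_fields => {
                let field_indices = (0..left_fields.len().max(right_fields.len()))
                    .filter(|&i| left_fields.get(i) != right_fields.get(i))
                    .collect();
                diffs.push(KeyedDiff::Modify {
                    delete: left_fields,
                    add: right_fields,
                    field_indices,
                });
            }
            // identical records produce no difference
            Some(_) => {}
        }
    }
    // whatever is left on the right side has no counterpart on the left
    diffs.extend(right_rows.into_values().map(KeyedDiff::Add));
    diffs
}

fn main() {
    let left = "id,name,kind\n1,lemon,fruit\n2,strawberry,fruit";
    // same records as in the example above, but in a different row order
    let right = "id,name,kind\n2,strawberry,nut\n1,lemon,fruit";
    for diff in keyed_diff(left, right) {
        println!("{diff:?}");
    }
}
```

Although the rows appear in a different order in the two inputs, only the record with key 2 is reported, as a modification of the field at index 2. A line-by-line diff would flag both rows as changed.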
Because this crate is still in its infancy, there are some caveats, which we might resolve in the near future:
[…] StringRecord::from_byte_record, provided by the csv crate, to try converting them into UTF-8 encoded records.
You can run benchmarks with the following command:
cargo bench
This crate is implemented in 100% Safe Rust, which is ensured by using #![forbid(unsafe_code)].
The Minimum Supported Rust Version for this crate is 1.88. An increase of MSRV will be indicated by a minor change (according to SemVer).
This crate is inspired by the CLI tool csvdiff by Aswin Karthik, which is written in Go. Definitely check it out. It is a great tool.
Additionally, this crate would not exist without the awesome Rust community and these fantastic crates 🦀: