| Crates.io | biblib |
| lib.rs | biblib |
| version | 0.3.0 |
| created_at | 2025-01-25 10:39:19.861905+00 |
| updated_at | 2025-08-16 19:10:47.181542+00 |
| description | Parse, manage, and deduplicate academic citations |
| homepage | |
| repository | https://github.com/AliAzlanDev/biblib |
| max_upload_size | |
| id | 1530494 |
| size | 321,797 |
A Rust library for parsing, managing, and deduplicating academic citations. biblib reads several citation formats (RIS, PubMed/MEDLINE, EndNote XML, and CSV), finds duplicate records, and exposes a common set of metadata fields across formats.
Note: This crate is under active development. There may be some breaking changes between versions. Please check the changelog before upgrading.
Supported formats:
RIS (Research Information Systems)
PubMed/MEDLINE
EndNote XML
CSV with Custom Mappings
This crate can be minimized like so:
cargo add biblib --no-default-features --features lite
Add this to your Cargo.toml:
[dependencies]
biblib = "0.3.0" # All features enabled by default
Or select specific features:
[dependencies]
biblib = { version = "0.3.0", default-features = false, features = ["csv", "ris"] }
Available features:
csv - CSV format support
pubmed - PubMed/MEDLINE format support
xml - EndNote XML support (requires quick-xml)
ris - RIS format support
dedupe - Citation deduplication (requires rayon and strsim)
All features are enabled by default. Set default-features = false to select specific ones.
use biblib::{CitationParser, RisParser};
// Parse RIS format
let input = r#"TY  - JOUR
TI  - Example Article
AU  - Smith, John
ER  -"#;
let parser = RisParser::new();
let citations = parser.parse(input).unwrap();
println!("Title: {}", citations[0].title);
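To make the input format above more concrete, here is a minimal, std-only sketch of how a RIS line can be split into a tag and a value. It assumes the conventional RIS layout (a two-character tag, two spaces, a hyphen, then the value). The functions `split_ris_line` and `first_record` are hypothetical helpers written for illustration; they are not part of biblib and do not describe how `RisParser` works internally.

```rust
/// Splits one RIS line into (tag, value).
/// Assumption: the tag is two ASCII alphanumerics followed by spaces and a hyphen.
fn split_ris_line(line: &str) -> Option<(&str, &str)> {
    // Split at the first hyphen only, so hyphens inside the value survive.
    let (tag, rest) = line.split_once('-')?;
    let tag = tag.trim();
    if tag.len() != 2 || !tag.chars().all(|c| c.is_ascii_alphanumeric()) {
        return None;
    }
    Some((tag, rest.trim()))
}

/// Collects (tag, value) pairs for the first record, stopping at the "ER" tag.
fn first_record(input: &str) -> Vec<(String, String)> {
    let mut fields = Vec::new();
    for line in input.lines() {
        match split_ris_line(line) {
            Some(("ER", _)) => break, // end of record
            Some((tag, value)) => fields.push((tag.to_string(), value.to_string())),
            None => continue, // blank or untagged line; ignored in this sketch
        }
    }
    fields
}
```

A real parser also has to handle multi-line values, repeated tags such as `AU`, and multiple records per file, which this sketch leaves out.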
use biblib::dedupe::{Deduplicator, DeduplicatorConfig};
// Configure deduplication
let config = DeduplicatorConfig {
    group_by_year: true,
    run_in_parallel: true,
};
let deduplicator = Deduplicator::with_config(config);
let duplicate_groups = deduplicator.find_duplicates(&citations).unwrap();
for group in duplicate_groups {
    println!("Original: {}", group.unique.title);
    for duplicate in group.duplicates {
        println!("  Duplicate: {}", duplicate.title);
    }
}
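The idea behind year-based grouping can be illustrated with a small, std-only sketch: records are only compared with other records from the same year, which keeps the number of comparisons down. Everything here is an assumption made for illustration: the `Record` type, `normalize_title`, and `find_duplicate_groups` are hypothetical, and the sketch uses exact matching on a normalized title rather than the similarity scoring that the `dedupe` feature's strsim dependency suggests biblib uses.

```rust
use std::collections::HashMap;

/// Hypothetical minimal record; not biblib's citation type.
#[derive(Debug, Clone, PartialEq)]
struct Record {
    title: String,
    year: Option<i32>,
}

/// Lowercases the title and keeps only letters and digits, so differences
/// in case, spacing, and punctuation do not prevent a match.
fn normalize_title(title: &str) -> String {
    title
        .chars()
        .filter(|c| c.is_alphanumeric())
        .flat_map(|c| c.to_lowercase())
        .collect()
}

/// Returns groups of record indices that share a year and a normalized title.
/// Within each group, the first index plays the role of the "original".
fn find_duplicate_groups(records: &[Record]) -> Vec<Vec<usize>> {
    // Keying on (year, title) means records from different years never match.
    let mut buckets: HashMap<(Option<i32>, String), Vec<usize>> = HashMap::new();
    for (i, r) in records.iter().enumerate() {
        buckets
            .entry((r.year, normalize_title(&r.title)))
            .or_default()
            .push(i);
    }
    let mut groups: Vec<Vec<usize>> = buckets
        .into_values()
        .filter(|g| g.len() > 1)
        .collect();
    groups.sort(); // HashMap order is arbitrary; sort for stable output
    groups
}
```

The trade-off of grouping by year is that two copies of the same paper with different or missing years are never compared, so this kind of option is a speed-versus-recall choice.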
use biblib::csv::{CsvParser, CsvConfig};
let mut config = CsvConfig::new();
config.set_header_mapping("title", vec!["Article Name".to_string()])
    .set_delimiter(b';');
let parser = CsvParser::with_config(config);
let citations = parser.parse("Article Name;Author;Year\nExample Paper;Smith J;2023").unwrap();
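As a rough picture of what a header mapping does, the std-only sketch below resolves canonical field names (such as `title`) to column positions by looking for any of their configured aliases in the header row. The function `resolve_columns` and its signature are hypothetical and written only for this example; they are not biblib's API, and the naive `split` used here does not handle quoted fields the way a real CSV reader does.

```rust
use std::collections::HashMap;

/// Maps each canonical field name to the index of the first header
/// column that matches one of its aliases (ASCII case-insensitive).
fn resolve_columns(
    header_line: &str,
    delimiter: char,
    aliases: &HashMap<&str, Vec<&str>>,
) -> HashMap<String, usize> {
    let headers: Vec<&str> = header_line.split(delimiter).map(str::trim).collect();
    let mut resolved = HashMap::new();
    for (field, names) in aliases {
        let pos = headers
            .iter()
            .position(|h| names.iter().any(|n| n.eq_ignore_ascii_case(h)));
        if let Some(idx) = pos {
            resolved.insert(field.to_string(), idx);
        }
        // Fields with no matching column are simply left out of the result.
    }
    resolved
}
```

With the header row from the example above, an alias list of `["Article Name"]` for `title` resolves to column 0, and a field with no matching alias is absent from the result.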
| Field | Description | RIS | PubMed | EndNote XML | CSV |
|---|---|---|---|---|---|
| Title | Work title | ✓ | ✓ | ✓ | ✓ |
| Authors | Author names and affiliations | ✓ | ✓ | ✓ | ✓ |
| Journal | Journal name and abbreviation | ✓ | ✓ | ✓ | ✓ |
| Year | Publication year | ✓ | ✓ | ✓ | ✓ |
| Volume | Journal volume | ✓ | ✓ | ✓ | ✓ |
| Issue | Journal issue | ✓ | ✓ | ✓ | ✓ |
| Pages | Page range | ✓ | ✓ | ✓ | ✓ |
| DOI | Digital Object Identifier | ✓ | ✓ | ✓ | ✓ |
| PMID | PubMed ID | ✓ | ✓ | - | ✓ |
| PMC ID | PubMed Central ID | ✓ | ✓ | ✓ | ✓ |
| Abstract | Abstract text | ✓ | ✓ | ✓ | ✓ |
| Keywords | Keywords/tags | ✓ | ✓ | ✓ | ✓ |
| Language | Publication language | ✓ | ✓ | ✓ | ✓ |
| Publisher | Publisher information | ✓ | - | ✓ | ✓ |
| URLs | Related URLs | ✓ | - | ✓ | ✓ |
| ISSN | International Standard Serial Number | ✓ | ✓ | ✓ | ✓ |
| MeSH Terms | Medical Subject Headings | - | ✓ | - | - |
use biblib::dedupe::{Deduplicator, DeduplicatorConfig};
// Fine-tune deduplication settings
let config = DeduplicatorConfig {
    group_by_year: true,   // Enable year-based grouping
    run_in_parallel: true, // Enable parallel processing
};
let deduplicator = Deduplicator::with_config(config);
use biblib::{CitationParser, RisParser, ParseError};
let result = RisParser::new().parse("invalid input");
match result {
    Ok(citations) => println!("Parsed {} citations", citations.len()),
    Err(parse_err) => eprintln!("Parse error: {}", parse_err),
}
All parser implementations are thread-safe and can be shared between threads. The deduplication engine supports parallel processing through the run_in_parallel configuration option.
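The sharing pattern this implies can be sketched with the standard library alone: one parser instance is wrapped in an `Arc` and used from several worker threads. `LineCounter` is a hypothetical stand-in for a stateless parser (it only counts non-empty lines) and `parse_all` is an illustrative helper; neither is a biblib type, and the sketch assumes the real parsers are `Send + Sync` as the statement above suggests.

```rust
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for a stateless parser; not a biblib type.
struct LineCounter;

impl LineCounter {
    /// "Parses" the input by counting its non-empty lines.
    fn parse(&self, input: &str) -> usize {
        input.lines().filter(|l| !l.trim().is_empty()).count()
    }
}

/// Parses each input on its own thread, sharing one parser instance.
/// Results come back in the same order as the inputs.
fn parse_all(inputs: Vec<String>) -> Vec<usize> {
    let parser = Arc::new(LineCounter);
    let handles: Vec<_> = inputs
        .into_iter()
        .map(|input| {
            let parser = Arc::clone(&parser); // cheap pointer clone, same parser
            thread::spawn(move || parser.parse(&input))
        })
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().expect("worker thread panicked"))
        .collect()
}
```

Spawning one thread per input is fine for a handful of files; for large batches a thread pool is usually the better fit.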
We welcome contributions! Please feel free to submit pull requests. For major changes, please open an issue first to discuss what you would like to change.
Make sure to update tests as appropriate and follow the existing code style.
This project is licensed under either of
at your option.