| Crates.io | qsar |
| lib.rs | qsar |
| version | 0.0.2 |
| created_at | 2025-11-27 18:14:12.884157+00 |
| updated_at | 2025-11-27 19:32:47.677882+00 |
| description | qsar: a small Rust library for computing molecular descriptors and integrating with Linfa for QSAR modeling. |
| homepage | https://github.com/sawsimeon/qsar |
| repository | https://github.com/sawsimeon/qsar |
| id | 1954200 |
| size | 70,493 |
qsar is a lightweight Rust library for computing common molecular descriptors and integrating descriptor data with Linfa for basic QSAR modeling. The initial release focuses on an exact molecular weight calculator implemented in pure Rust (no native Open Babel / RDKit dependencies), convenient ndarray conversion utilities, and a minimal Linfa example.
This project is developed and maintained by Saw Simeon, author of multiple highly cited works in QSAR, cheminformatics, and structure-based virtual screening.
Inactive-enriched machine-learning models exploiting patent data improve structure-based virtual screening for PDL1 dimerizers
P Gomez-Sacristan, S Simeon, VK Tran-Nguyen, S Patil, PJ Ballester
Journal of Advanced Research 67, 185–196 (2025)
A practical guide to machine-learning scoring for structure-based virtual screening
VK Tran-Nguyen, M Junaid, S Simeon, PJ Ballester
Nature Protocols 18 (11), 3460–3511 (2023) – 68 citations
Structure-based virtual screening for PDL1 dimerizers: evaluating generic scoring functions
VK Tran-Nguyen, S Simeon, M Junaid, PJ Ballester
Current Research in Structural Biology 4, 206–210 (2022)
Characterizing the relationship between the chemical structures of drugs and their activities on primary cultures of pediatric solid tumors
S Simeon, G Ghislat, PJ Ballester
Current Medicinal Chemistry 28 (38), 7830–7839 (2021)
Towards reproducible computational drug discovery
N Schaduangrat, S Lampa, S Simeon, MP Gleeson, O Spjuth et al.
Journal of Cheminformatics 12 (1), 9 (2020) – 207 citations
Full list available on Saw Simeon's Google Scholar profile.
Goals
Features
Quickstart
Add to your Cargo.toml:
[dependencies]
qsar = "0.0.2"
Compute molecular weight:
use qsar::descriptors::molecular_weight;
fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mw = molecular_weight("CCO")?; // ethanol
    println!("Ethanol exact molecular weight: {}", mw);
    Ok(())
}
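For intuition about what a pure-Rust weight calculation involves, here is a minimal std-only sketch. It is not the qsar implementation: the function name `linear_smiles_mass`, the three-element atom table (C, N, O), and the use of monoisotopic masses are all assumptions made for illustration. It handles only unbranched, acyclic chains with optional `=` and `#` bonds, and fills in implicit hydrogens from default valences.

```rust
/// Monoisotopic mass of hydrogen (assumed mass convention for this sketch).
const HYDROGEN_MASS: f64 = 1.007825032;

/// Monoisotopic mass and default valence for a few organic-subset atoms.
fn atom_data(symbol: char) -> Option<(f64, u32)> {
    match symbol {
        'C' => Some((12.0, 4)),
        'N' => Some((14.003074005, 3)),
        'O' => Some((15.994914620, 2)),
        _ => None,
    }
}

/// Hypothetical helper, not part of the qsar API: mass of an unbranched,
/// acyclic SMILES chain such as "CCO", "C=C" or "C#N".
pub fn linear_smiles_mass(smiles: &str) -> Result<f64, String> {
    // (mass, default valence, bond order already used) per atom, in chain order.
    let mut atoms: Vec<(f64, u32, u32)> = Vec::new();
    // Explicit bond symbol waiting for the next atom.
    let mut pending_bond: Option<u32> = None;

    for ch in smiles.chars() {
        match ch {
            '=' | '#' => {
                if atoms.is_empty() || pending_bond.is_some() {
                    return Err(format!("misplaced bond symbol '{ch}'"));
                }
                pending_bond = Some(if ch == '=' { 2 } else { 3 });
            }
            _ => {
                let (mass, valence) = atom_data(ch)
                    .ok_or_else(|| format!("unsupported character '{ch}'"))?;
                // The first atom has no bond to a predecessor; later atoms
                // default to a single bond unless a symbol preceded them.
                let order = if atoms.is_empty() {
                    0
                } else {
                    pending_bond.take().unwrap_or(1)
                };
                if let Some(prev) = atoms.last_mut() {
                    prev.2 += order;
                }
                atoms.push((mass, valence, order));
            }
        }
    }

    if pending_bond.is_some() {
        return Err("bond symbol at end of input".to_string());
    }
    if atoms.is_empty() {
        return Err("empty SMILES string".to_string());
    }

    let mut total = 0.0;
    for (mass, valence, used) in atoms {
        // Implicit hydrogens fill whatever valence the chain bonds left open.
        let implicit_h = valence
            .checked_sub(used)
            .ok_or_else(|| "default valence exceeded".to_string())?;
        total += mass + f64::from(implicit_h) * HYDROGEN_MASS;
    }
    Ok(total)
}
```

Under these assumptions, `linear_smiles_mass("CCO")` returns roughly 46.0419 (two carbons, one oxygen, six implicit hydrogens), while a branched input such as `CC(C)O` is rejected with an error.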
Convert descriptor vectors and train a linear model:
use qsar::models::{to_ndarrays, train_and_predict_example};
fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Example: run the built-in Linfa example
    let pred = train_and_predict_example()?;
    println!("Prediction for sample [5.0, 6.0]: {}", pred[0]);

    // Example: convert your own data
    let descriptors = vec![vec![1.0, 2.0], vec![3.0, 4.0]];
    let targets = vec![3.0, 7.0];
    let (x, y) = to_ndarrays(descriptors, targets)?;
    println!("Features shape: {:?}", x.dim());
    Ok(())
}
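A conversion like this has to check that the descriptor rows are rectangular and that there is one target per row before a matrix can be built. The std-only sketch below illustrates that check. `flatten_rows` is a hypothetical helper, not a qsar function, and how `to_ndarrays` validates its input internally is not documented here.

```rust
/// Hypothetical helper, not part of the qsar API: validates shapes and
/// flattens descriptor rows into one row-major buffer.
/// Returns ((n_rows, n_cols), flat_values).
pub fn flatten_rows(
    rows: &[Vec<f64>],
    targets: &[f64],
) -> Result<((usize, usize), Vec<f64>), String> {
    // Every sample needs exactly one target value.
    if rows.len() != targets.len() {
        return Err(format!(
            "{} descriptor rows but {} targets",
            rows.len(),
            targets.len()
        ));
    }

    // The first row fixes the expected number of descriptor columns.
    let n_cols = rows.first().map_or(0, Vec::len);
    let mut flat = Vec::with_capacity(rows.len() * n_cols);

    for (i, row) in rows.iter().enumerate() {
        if row.len() != n_cols {
            return Err(format!(
                "row {i} has {} values, expected {n_cols}",
                row.len()
            ));
        }
        flat.extend_from_slice(row);
    }

    Ok(((rows.len(), n_cols), flat))
}
```

The returned shape and buffer are in the form that ndarray's `Array2::from_shape_vec((n_rows, n_cols), flat)` accepts, so a ragged or mismatched dataset fails early with a readable message instead of at model-fitting time.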
Loading descriptors from CSV
use qsar::data_io::read_csv_descriptors;
let (descriptors, targets) = read_csv_descriptors("data/my_dataset.csv", &["mol_wt", "logp"], "pIC50")?;
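To show the column-selection idea behind a call like this, here is a std-only sketch that picks feature and target columns by header name. It is an illustration under stated assumptions, not the qsar reader: `parse_descriptor_csv` is a hypothetical name, it takes the CSV text rather than a file path, and it assumes plain comma-separated numeric fields with no quoting.

```rust
/// Hypothetical helper, not part of the qsar API: extracts the named feature
/// columns and the target column from CSV text with a header row.
pub fn parse_descriptor_csv(
    text: &str,
    feature_cols: &[&str],
    target_col: &str,
) -> Result<(Vec<Vec<f64>>, Vec<f64>), String> {
    // Skip blank lines so a trailing newline does not produce an empty row.
    let mut lines = text.lines().filter(|l| !l.trim().is_empty());

    let header: Vec<&str> = lines
        .next()
        .ok_or("empty input")?
        .split(',')
        .map(str::trim)
        .collect();

    // Resolve a column name to its position in the header.
    let find = |name: &str| {
        header
            .iter()
            .position(|h| *h == name)
            .ok_or_else(|| format!("missing column '{name}'"))
    };
    let feature_idx = feature_cols
        .iter()
        .map(|&c| find(c))
        .collect::<Result<Vec<_>, _>>()?;
    let target_idx = find(target_col)?;

    let mut descriptors = Vec::new();
    let mut targets = Vec::new();
    for (n, line) in lines.enumerate() {
        let fields: Vec<&str> = line.split(',').map(str::trim).collect();
        // Parse one numeric field, reporting the 1-based data row on failure.
        let get = |i: usize| -> Result<f64, String> {
            fields
                .get(i)
                .ok_or_else(|| format!("data row {} is too short", n + 1))?
                .parse::<f64>()
                .map_err(|e| format!("data row {}: {e}", n + 1))
        };
        let row = feature_idx
            .iter()
            .map(|&i| get(i))
            .collect::<Result<Vec<_>, _>>()?;
        descriptors.push(row);
        targets.push(get(target_idx)?);
    }
    Ok((descriptors, targets))
}
```

Columns that are not requested (for example a SMILES column) are ignored, and a missing column name or a non-numeric value yields an error rather than a silently shortened dataset.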
Limitations and roadmap
=, #, bracketed atoms with H counts). It does not yet fully support rings, branches, or stereochemistry. This is deliberate to keep the first release small and portable.
Contributing
Contributions are welcome. Please open issues and pull requests on the repository. The project follows standard Rust contribution practices: format with cargo fmt, lint with cargo clippy, and run tests with cargo test.
Recommended local validation
License
This project is dual-licensed under MIT OR Apache-2.0. See the LICENSE file for details.