Crates.io | rustyms |
lib.rs | rustyms |
version | 0.9.0-alpha.3 |
source | src |
created_at | 2023-06-07 11:27:22.378986 |
updated_at | 2024-11-22 11:32:20.372293 |
description | A library to handle proteomic mass spectrometry data and match peptides to spectra. |
homepage | |
repository | https://github.com/snijderlab/rustyms |
max_upload_size | |
id | 884672 |
size | 115,800,927 |
Handle mass spectrometry data in Rust. This crate is set up to handle very complex peptides with
loads of ambiguity and complexity. It pivots around the `CompoundPeptidoform`, `Peptidoform` and
`LinearPeptide`, which encode the ProForma specification. Additionally, this crate enables the
reading of mgf, doing spectrum annotation (BU/MD/TD), finding isobaric sequences, doing alignments
of peptides, accessing the IMGT germline database, and reading identified peptide files.
fn main() -> Result<(), rustyms::error::CustomError> {
    use rustyms::{*, system::{usize::Charge, e}};
    let raw_file_path = "data/annotated_example.mgf";
    // Open example raw data (this is the built in mgf reader, look into mzdata for more advanced raw file readers)
    let spectrum = rawfile::mgf::open(raw_file_path)?;
    // Parse the given ProForma definition
    let peptide = CompoundPeptidoform::pro_forma("[Gln->pyro-Glu]-QVQEVSERTHGGNFD", None)?;
    // Generate theoretical fragments for this peptide given EThcD fragmentation
    let model = Model::ethcd();
    let fragments = peptide.generate_theoretical_fragments(Charge::new::<e>(2), &model);
    // Annotate the raw data with the theoretical fragments
    let annotated = spectrum[0].annotate(peptide, &fragments, &model, MassMode::Monoisotopic);
    // Calculate a peak false discovery rate for this annotation
    let (fdr, _) = annotated.fdr(&fragments, &model, MassMode::Monoisotopic);
    // This is the incorrect sequence for this spectrum so the peak FDR will indicate this
    dbg!(&fdr, fdr.peaks_sigma(), fdr.peaks_fdr(), fdr.peaks_score());
    assert!(fdr.peaks_sigma() > 2.0);
    Ok(())
}
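To make the ProForma string in the example above more concrete, the following is a minimal, standalone sketch of the mass arithmetic it implies: the sum of the residue masses, plus water for the termini, plus the shift of the N-terminal `Gln->pyro-Glu` modification (loss of ammonia), converted to an m/z value at charge 2. It does not use rustyms and is not how the crate computes masses internally. The helper names `residue_mass`, `peptide_mass`, and `mz` are hypothetical, and the table only covers the residues that occur in this sequence.

```rust
// Standard monoisotopic masses in Da (rounded to six decimals).
const WATER: f64 = 18.010565;
const PROTON: f64 = 1.007276;
// Gln->pyro-Glu on an N-terminal Q corresponds to the loss of NH3.
const GLN_TO_PYRO_GLU: f64 = -17.026549;

/// Monoisotopic residue mass for the amino acids used in the example.
/// Hypothetical helper: returns None for any residue not in this small table.
fn residue_mass(aa: char) -> Option<f64> {
    match aa {
        'G' => Some(57.021464),
        'S' => Some(87.032028),
        'V' => Some(99.068414),
        'T' => Some(101.047679),
        'N' => Some(114.042927),
        'D' => Some(115.026943),
        'Q' => Some(128.058578),
        'E' => Some(129.042593),
        'H' => Some(137.058912),
        'F' => Some(147.068414),
        'R' => Some(156.101111),
        _ => None,
    }
}

/// Neutral monoisotopic mass: residues + water + a single modification shift.
fn peptide_mass(seq: &str, modification_shift: f64) -> Option<f64> {
    let residues: Option<f64> = seq.chars().map(residue_mass).sum();
    residues.map(|m| m + WATER + modification_shift)
}

/// m/z of a species carrying `charge` protons.
fn mz(mass: f64, charge: u32) -> f64 {
    (mass + f64::from(charge) * PROTON) / f64::from(charge)
}

fn main() {
    let seq = "QVQEVSERTHGGNFD";
    let mass = peptide_mass(seq, GLN_TO_PYRO_GLU).expect("unknown residue");
    println!("monoisotopic mass: {mass:.4} Da");
    println!("m/z at charge 2: {:.4}", mz(mass, 2));
}
```

The crate itself handles far more than this sketch (ambiguous modifications, cross-links, multiple mass modes), so treat the code only as an illustration of what the notation encodes.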
fn main() -> Result<(), rustyms::error::CustomError> {
    use rustyms::{*, align::*};
    // Check how this peptide compares to a similar peptide (using the feature `align`)
    let first_peptide = LinearPeptide::pro_forma("IVQEVT", None)?.into_simple_linear().unwrap();
    let second_peptide = LinearPeptide::pro_forma("LVQVET", None)?.into_simple_linear().unwrap();
    // Align the two peptides using mass based alignment
    // IVQEVT A
    // LVQVET B
    // ─  ╶╴
    let alignment = align::<4, SimpleLinear, SimpleLinear>(
        &first_peptide,
        &second_peptide,
        AlignScoring::default(),
        AlignType::GLOBAL,
    );
    dbg!(&alignment);
    // Calculate some more statistics on this alignment
    let stats = alignment.stats();
    assert_eq!(stats.mass_similar, 6); // 6 out of the 6 positions are mass similar
    Ok(())
}
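Why all six positions count as mass similar can be illustrated without the crate. The sketch below assumes only standard monoisotopic residue masses: I and L are isomers with identical mass, and the swapped pair EV/VE has the same summed mass, so both differences vanish when stretches are compared by mass rather than by letter. The helpers `residue_mass`, `stretch_mass`, and `mass_similar` are hypothetical and far simpler than the scoring rustyms applies.

```rust
/// Monoisotopic residue masses (Da) for the residues in IVQEVT / LVQVET.
/// Hypothetical helper: returns None for residues outside this small table.
fn residue_mass(aa: char) -> Option<f64> {
    match aa {
        'I' | 'L' => Some(113.084064), // isomers, identical mass
        'V' => Some(99.068414),
        'Q' => Some(128.058578),
        'E' => Some(129.042593),
        'T' => Some(101.047679),
        _ => None,
    }
}

/// Summed residue mass of a stretch of sequence; None if any residue is unknown.
fn stretch_mass(seq: &str) -> Option<f64> {
    seq.chars().map(residue_mass).sum()
}

/// Two stretches are treated as mass similar when their masses agree within `tolerance` Da.
fn mass_similar(a: &str, b: &str, tolerance: f64) -> bool {
    match (stretch_mass(a), stretch_mass(b)) {
        (Some(x), Some(y)) => (x - y).abs() <= tolerance,
        _ => false,
    }
}

fn main() {
    let tolerance = 1e-6;
    // Position 1: I against L.
    println!("I  ~ L : {}", mass_similar("I", "L", tolerance));
    // Positions 4 and 5: the swapped pair EV against VE.
    println!("EV ~ VE: {}", mass_similar("EV", "VE", tolerance));
    // A single E against a single V differs by roughly 30 Da.
    println!("E  ~ V : {}", mass_similar("E", "V", tolerance));
}
```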
Rustyms ties together multiple smaller modules into one cohesive structure. It has multiple features which allow you to slim it down if needed (all are enabled by default).
- `align` - gives access to mass based alignment of peptides.
- `identification` - gives access to methods reading many different identified peptide formats.
- `imgt` - enables access to the IMGT database of antibodies germline sequences, with annotations.
- `isotopes` - gives access to generation of an averagine model for isotopes, also enables two additional dependencies.
- `rand` - allows the generation of random peptides.
- `rayon` - enables parallel iterators using rayon, mostly for `imgt` but also in consecutive align.
- `mzdata` - enables integration with mzdata which has more advanced raw file support.
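As a sketch of slimming the crate down, a dependency entry along the following lines turns off the default features and re-enables only the ones needed. The version is the one listed for this release. The selection of `align` and `identification` is only an illustration, and it is an assumption that this particular combination builds without the other features.

```toml
[dependencies]
# Disable the default feature set, then opt back in to selected features.
rustyms = { version = "0.9.0-alpha.3", default-features = false, features = ["align", "identification"] }
```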