epimetheus-methylome

Crates.io: epimetheus-methylome
lib.rs: epimetheus-methylome
Version: 1.1.0
Created: 2026-01-08 10:22:19.110977+00
Updated: 2026-01-08 10:22:19.110977+00
Description: A library for DNA motif representation and methylation analysis, supporting IUPAC codes and Nanopore methylation data (6mA, 5mC, 4mC)
Repository: https://github.com/SebastianDall/epimetheus
ID: 2029919
Size: 71,384
Author: Sebastian Dall (SebastianDall)

epimetheus_methylome

A Rust library for DNA motif representation and methylation analysis, with support for IUPAC nucleotide codes and Nanopore methylation data.

Features

  • DNA Motif Handling: Create, manipulate, and search for DNA motifs with modification sites
  • IUPAC Support: Full support for IUPAC nucleotide codes (including ambiguous bases like R, Y, N, etc.)
  • Methylation Types: Support for common DNA modifications:
    • 6mA (N6-methyladenine)
    • 5mC (5-methylcytosine)
    • 4mC (4-methylcytosine)
  • Motif Operations: Reverse complement, regex conversion, motif clustering/hierarchy
  • Nanopore Integration: Parse methylation data from FASTQ records with MM/ML tags

Installation

Add this to your Cargo.toml:

[dependencies]
epimetheus-methylome = "1.1.0"

Usage

Creating and Working with Motifs

use epimetheus_methylome::{Motif, ModType};

// Create a GATC motif with a 6mA modification at position 1 (0-based, the A)
let motif = Motif::new("GATC", "a", 1).unwrap();
assert_eq!(motif.mod_type, ModType::SixMA);

// Get reverse complement
let rev_comp = motif.reverse_complement();
assert_eq!(rev_comp.sequence_to_string(), "GATC");

// Convert to regex pattern
let regex = motif.to_regex();
assert_eq!(regex, "GATC");
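
To make the reverse-complement step concrete, here is a minimal standalone sketch of how a reverse complement over IUPAC codes can be computed. It does not use the crate; the function names `iupac_complement` and `reverse_complement` are hypothetical, and the crate's own implementation may differ (for example in how it tracks the modification position).

```rust
/// Complement of a single uppercase IUPAC nucleotide code.
/// Returns `None` for characters outside the IUPAC DNA alphabet.
/// (Hypothetical helper, not part of the crate's API.)
fn iupac_complement(base: char) -> Option<char> {
    let complement = match base {
        'A' => 'T',
        'T' => 'A',
        'C' => 'G',
        'G' => 'C',
        'R' => 'Y', // A/G <-> C/T
        'Y' => 'R',
        'K' => 'M', // G/T <-> A/C
        'M' => 'K',
        'B' => 'V', // C/G/T <-> A/C/G
        'V' => 'B',
        'D' => 'H', // A/G/T <-> A/C/T
        'H' => 'D',
        // S (C/G), W (A/T) and N are their own complements.
        'S' => 'S',
        'W' => 'W',
        'N' => 'N',
        _ => return None,
    };
    Some(complement)
}

/// Reverse the sequence and complement every code.
/// Returns `None` if any character is not an IUPAC code.
fn reverse_complement(seq: &str) -> Option<String> {
    seq.chars().rev().map(iupac_complement).collect()
}
```

Under this sketch, palindromic motifs such as `GATC` and `RGATCY` map to themselves, which is consistent with the assertion in the example above.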

IUPAC Nucleotide Codes

use epimetheus_methylome::{IupacBase, Motif};

// Create motif with ambiguous bases
let motif = Motif::new("RGATCY", "m", 4).unwrap();
// R = A or G, Y = C or T

// Convert to regex
let regex = motif.to_regex();
assert_eq!(regex, "[AG]GATC[CT]");

// Get all possible DNA sequences
let sequences = motif.possible_dna_sequences();
// Returns all combinations: AGATCC, AGATCT, GGATCC, GGATCT
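
The two conversions above can be illustrated without the crate. The sketch below is an assumption about how such conversions can work, not the crate's implementation: `iupac_bases`, `iupac_to_regex`, and `expand_iupac` are hypothetical names, and the sketch writes `N` as `[ACGT]`, which the crate may render differently.

```rust
/// Concrete bases represented by one uppercase IUPAC code.
/// (Hypothetical helper, not part of the crate's API.)
fn iupac_bases(code: char) -> Option<&'static str> {
    Some(match code {
        'A' => "A",
        'C' => "C",
        'G' => "G",
        'T' => "T",
        'R' => "AG",
        'Y' => "CT",
        'S' => "CG",
        'W' => "AT",
        'K' => "GT",
        'M' => "AC",
        'B' => "CGT",
        'D' => "AGT",
        'H' => "ACT",
        'V' => "ACG",
        'N' => "ACGT",
        _ => return None,
    })
}

/// Build a regex string: unambiguous codes stay literal,
/// ambiguous codes become a character class.
fn iupac_to_regex(motif: &str) -> Option<String> {
    let mut pattern = String::new();
    for code in motif.chars() {
        let bases = iupac_bases(code)?;
        if bases.len() == 1 {
            pattern.push_str(bases);
        } else {
            pattern.push('[');
            pattern.push_str(bases);
            pattern.push(']');
        }
    }
    Some(pattern)
}

/// Enumerate every concrete DNA sequence the motif can match
/// (a Cartesian product over the per-position base sets).
fn expand_iupac(motif: &str) -> Option<Vec<String>> {
    let mut sequences = vec![String::new()];
    for code in motif.chars() {
        let bases = iupac_bases(code)?;
        let mut next = Vec::with_capacity(sequences.len() * bases.len());
        for prefix in &sequences {
            for base in bases.chars() {
                let mut extended = prefix.clone();
                extended.push(base);
                next.push(extended);
            }
        }
        sequences = next;
    }
    Some(sequences)
}
```

Note that the number of expanded sequences grows multiplicatively with each ambiguous position, so expansion is only practical for short motifs with few ambiguous codes.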

Methylation Types

use epimetheus_methylome::ModType;

// Parse from pileup codes
let mod_type = "a".parse::<ModType>().unwrap();
assert_eq!(mod_type, ModType::SixMA);

// Get pileup code
assert_eq!(ModType::FiveMC.to_pileup_code(), "m");
assert_eq!(ModType::FourMC.to_pileup_code(), "21839");
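
For readers who want to see how such a code mapping can be structured, here is a small standalone sketch. The `ModKind` enum and its methods are hypothetical stand-ins and are not the crate's `ModType`; only the three codes (`a`, `m`, `21839`) are taken from the example above.

```rust
use std::str::FromStr;

/// Hypothetical stand-in for a modification-type enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ModKind {
    SixMA,
    FiveMC,
    FourMC,
}

impl ModKind {
    /// Code used for this modification in pileup output.
    fn pileup_code(self) -> &'static str {
        match self {
            ModKind::SixMA => "a",
            ModKind::FiveMC => "m",
            ModKind::FourMC => "21839",
        }
    }

    /// Unmodified base that carries the modification.
    fn canonical_base(self) -> char {
        match self {
            ModKind::SixMA => 'A',
            ModKind::FiveMC | ModKind::FourMC => 'C',
        }
    }
}

impl FromStr for ModKind {
    type Err = String;

    /// Parse a pileup code; unknown codes yield an error message.
    fn from_str(code: &str) -> Result<Self, Self::Err> {
        match code {
            "a" => Ok(ModKind::SixMA),
            "m" => Ok(ModKind::FiveMC),
            "21839" => Ok(ModKind::FourMC),
            other => Err(format!("unsupported modification code: {other}")),
        }
    }
}
```

Implementing `FromStr` is what allows the `"a".parse::<…>()` form used above, and round-tripping through `pileup_code` gives a cheap consistency check.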

Finding Motifs in Sequences

use epimetheus_methylome::{Motif, find_motif_indices_in_sequence};
use epimetheus_methylome::sequence::Sequence;

let sequence = Sequence::from_str("GGATCTCCATGATC").unwrap();
let motif = Motif::new("GATC", "m", 3).unwrap();

let indices = find_motif_indices_in_sequence(&sequence, &motif);
assert_eq!(indices, vec![4, 13]); // 0-based positions of the modified base in each match
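
The returned indices are the match start plus the modification offset within the motif. The following standalone sketch shows one simple way to compute them with a sliding window; `iupac_matches` and `find_mod_positions` are hypothetical names, and the crate may use a different (and faster) search strategy.

```rust
/// True if the concrete base `base` is allowed by the IUPAC code `code`.
/// (Hypothetical helper, not part of the crate's API.)
fn iupac_matches(code: u8, base: u8) -> bool {
    let allowed: &str = match code {
        b'A' => "A",
        b'C' => "C",
        b'G' => "G",
        b'T' => "T",
        b'R' => "AG",
        b'Y' => "CT",
        b'S' => "CG",
        b'W' => "AT",
        b'K' => "GT",
        b'M' => "AC",
        b'B' => "CGT",
        b'D' => "AGT",
        b'H' => "ACT",
        b'V' => "ACG",
        b'N' => "ACGT",
        _ => return false,
    };
    allowed.as_bytes().contains(&base)
}

/// Slide the motif over the sequence and, for every window that matches,
/// report the 0-based position of the modified base (start + mod_pos).
fn find_mod_positions(seq: &str, motif: &str, mod_pos: usize) -> Vec<usize> {
    let (seq, motif) = (seq.as_bytes(), motif.as_bytes());
    // Guard against an empty motif (windows(0) would panic)
    // and an out-of-range modification offset.
    if motif.is_empty() || mod_pos >= motif.len() {
        return Vec::new();
    }
    seq.windows(motif.len())
        .enumerate()
        .filter(|(_, window)| {
            window
                .iter()
                .zip(motif)
                .all(|(&base, &code)| iupac_matches(code, base))
        })
        .map(|(start, _)| start + mod_pos)
        .collect()
}
```

With the inputs from the example above, `GATC` matches at starts 1 and 10, so an offset of 3 gives positions 4 and 13.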

Motif Hierarchy

use epimetheus_methylome::Motif;

// Check parent-child relationships
let parent = Motif::new("GATC", "a", 1).unwrap();
let child = Motif::new("RGATCY", "a", 2).unwrap();

assert!(parent.is_child_motif(&child));
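
The README does not spell out the exact rule behind the parent-child check, so the sketch below rests on an assumption: a child is taken to be an equally or more specific motif that contains the parent, with the modified base aligned. The names `iupac_mask` and `is_extension_of` are hypothetical, and the crate's definition may differ.

```rust
/// Bitmask of the bases an IUPAC code allows (bit 0 = A, 1 = C, 2 = G, 3 = T).
/// (Hypothetical helper, not part of the crate's API.)
fn iupac_mask(code: char) -> Option<u8> {
    Some(match code {
        'A' => 0b0001,
        'C' => 0b0010,
        'G' => 0b0100,
        'T' => 0b1000,
        'R' => 0b0101,
        'Y' => 0b1010,
        'S' => 0b0110,
        'W' => 0b1001,
        'K' => 0b1100,
        'M' => 0b0011,
        'B' => 0b1110,
        'D' => 0b1101,
        'H' => 0b1011,
        'V' => 0b0111,
        'N' => 0b1111,
        _ => return None,
    })
}

/// Assumed rule: `child` extends `parent` if, once the modified bases are
/// aligned, the parent fits inside the child and every overlapping child
/// position allows only bases the parent also allows.
/// Each motif is given as (sequence, 0-based modification position).
fn is_extension_of(parent: (&str, usize), child: (&str, usize)) -> bool {
    let parent_masks: Option<Vec<u8>> = parent.0.chars().map(iupac_mask).collect();
    let child_masks: Option<Vec<u8>> = child.0.chars().map(iupac_mask).collect();
    let (Some(p), Some(c)) = (parent_masks, child_masks) else {
        return false; // non-IUPAC character in either motif
    };
    if p.is_empty() || child.1 < parent.1 {
        return false;
    }
    // Offset of the parent inside the child that aligns the modified bases.
    let offset = child.1 - parent.1;
    if offset + p.len() > c.len() {
        return false;
    }
    p.iter()
        .zip(&c[offset..])
        .all(|(&parent_mask, &child_mask)| (child_mask & !parent_mask) == 0)
}
```

Under this assumed rule, `("GATC", 1)` and `("RGATCY", 2)` are related because the modified A lines up at offset 1, whereas `("RGATCY", 1)` would not be.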

Working with Nanopore Data

use epimetheus_methylome::read::Read;
use noodles_fastq as fastq;

// Parse FASTQ record with MM/ML tags
let record = fastq::Record::new(
    fastq::record::Definition::new("read_id", "MM:Z:A+a.,0,1; ML:B:C,204,255"),
    "GGGCGGATCAGATC",
    "JIGSSGIH=GLLML"
);

let read = Read::from_fastq_record(record).unwrap();
let modifications = read.get_modifications();
// Access methylation calls at specific positions
// first and third A should have the methylation scores 204 and 255, respectively
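
To show why `A+a.,0,1` refers to the first and third A, here is a standalone sketch of decoding MM skip counts against a sequence. It follows the SAM tags convention that each MM value is the number of unmodified occurrences of the base to skip before the next call. The functions `decode_mm` and `ml_to_probability` are hypothetical, the tag is assumed to be already split into numbers, and only the forward-strand (`+`) case is handled.

```rust
/// Pair each MM skip count with its ML score and resolve it to a
/// 0-based position in `seq`. Returns `None` if the two lists differ in
/// length or the sequence has too few occurrences of `base`.
/// (Hypothetical helper, not part of the crate's API.)
fn decode_mm(seq: &str, base: char, skips: &[usize], ml: &[u8]) -> Option<Vec<(usize, u8)>> {
    if skips.len() != ml.len() {
        return None;
    }
    // Positions of every occurrence of the canonical base, in order.
    let mut base_positions = seq
        .char_indices()
        .filter(|&(_, c)| c == base)
        .map(|(index, _)| index);

    let mut calls = Vec::with_capacity(skips.len());
    for (&skip, &score) in skips.iter().zip(ml) {
        // `nth(skip)` discards `skip` occurrences and returns the next one.
        let position = base_positions.nth(skip)?;
        calls.push((position, score));
    }
    Some(calls)
}

/// Midpoint of the probability bin an ML byte stands for; the SAM tags
/// specification maps byte N to the range N/256 ..= (N+1)/256.
fn ml_to_probability(score: u8) -> f64 {
    (f64::from(score) + 0.5) / 256.0
}
```

For the record above, the A bases sit at positions 6, 9, and 11, so skips `0, 1` resolve to positions 6 and 11 with scores 204 and 255 (roughly 0.80 and 1.00 as probabilities).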

API Documentation

For detailed API documentation, see docs.rs/epimetheus_methylome.

Use Cases

This library is designed for:

  • Bioinformatics pipelines processing Nanopore methylation data
  • Restriction enzyme motif analysis
  • Methylation pattern detection and quantification
  • Motif clustering and hierarchical analysis

License

MIT

Contributing

This crate is part of the Epimetheus project for analyzing methylation patterns from pileup files.
