| Crates.io | modtector |
| lib.rs | modtector |
| version | 0.9.6 |
| created_at | 2025-09-27 19:33:40.676003+00 |
| updated_at | 2025-09-27 19:33:40.676003+00 |
| description | A high-performance modification detection tool in Rust |
| homepage | https://github.com/TongZhou2017/modtector |
| repository | https://github.com/TongZhou2017/modtector |
| max_upload_size | |
| id | 1857562 |
| size | 520,905 |
ModTector is a high-performance RNA modification detection tool developed in Rust, supporting multi-signal analysis, data normalization, reactivity calculation, and accuracy assessment. Based on pileup analysis, it can detect RNA modification sites from BAM-format sequencing data and generate rich visualization results.
The following diagram shows the complete ModTector workflow from raw data to evaluation results:

ModTector provides a complete workflow from raw BAM data to final evaluation results:
# Install the Rust toolchain with rustup (Linux/macOS shell)
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source ~/.cargo/env
Download and run the rustup installer from https://rustup.rs/
# Install the released crate from crates.io
cargo install modtector
# Or build from source
git clone https://github.com/TongZhou2017/ModTector.git
cd ModTector
cargo build --release
ModTector requires the following system libraries:
# Debian/Ubuntu (apt)
sudo apt-get update
sudo apt-get install build-essential pkg-config libssl-dev libhts-dev
# RHEL/CentOS (yum)
sudo yum groupinstall "Development Tools"
sudo yum install pkgconfig openssl-devel htslib-devel
# macOS (Xcode command line tools and Homebrew)
xcode-select --install
brew install htslib
ModTector uses the following Rust crates:
[dependencies]
rust-htslib = "0.47.0" # BAM file processing
bio = "2.0.1" # Bioinformatics utilities
plotters = { version = "0.3", features = ["svg_backend", "bitmap_backend"] } # Plotting and visualization
csv = "1.3" # CSV file handling
rayon = "1.8" # Parallel processing
clap = { version = "4.5", features = ["derive"] } # Command-line argument parsing
chrono = "0.4" # Date and time handling
rand = "0.8" # Random number generation
quick-xml = "0.36" # XML/SVG parsing
regex = "1.11" # Regular expressions
# Generate pileup data from BAM files
modtector count -b sample.bam -f reference.fa -o output.csv
# Calculate reactivity scores
modtector reactivity -M mod_sample.csv -U unmod_sample.csv -O reactivity.csv
# Normalize reactivity signals
modtector norm -i reactivity.csv -o normalized_reactivity.csv -m winsor90
# Compare samples and identify differences
modtector compare -m mod_sample.csv -u unmod_sample.csv -o comparison.csv
# Generate visualizations
modtector plot -M mod_sample.csv -U unmod_sample.csv -o plots/ -r normalized_reactivity.csv
# Generate RNA structure SVG plots
modtector plot -o svg_output/ --svg-template rna_structure.svg --reactivity normalized_reactivity.csv
# Evaluate accuracy
modtector evaluate -r normalized_reactivity.csv -s structure.dp -o results/ -g gene_id
Generate pileup data from BAM files by counting stop and mutation signals.
modtector count [OPTIONS] --bam <BAM> --fasta <FASTA> --output <OUTPUT>
Parameters:
-b, --bam: Input BAM file path (must be sorted and indexed)
-f, --fasta: Reference FASTA file path
-o, --output: Output CSV file path
-t, --threads: Number of parallel threads (default: 1)
Input File Requirements:
samtools sort input.bam -o sorted.bam
samtools index sorted.bam
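Before running `modtector count`, it can help to confirm that an index file actually sits next to the BAM. The Python sketch below is a hypothetical pre-flight check, not part of ModTector: `find_bam_index` is an illustrative name, and the check only looks for the index file names that `samtools index` and similar tools commonly write. It does not verify that the BAM is coordinate-sorted.

```python
from pathlib import Path


def find_bam_index(bam_path):
    """Return the index file found next to a BAM, or None if there is none.

    Looks for `x.bam.bai` (the samtools default), `x.bam.csi`
    (samtools index -c), and `x.bai` (written by some other tools).
    """
    bam = Path(bam_path)
    candidates = [
        bam.with_name(bam.name + ".bai"),
        bam.with_name(bam.name + ".csi"),
        bam.with_suffix(".bai"),
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None
```

Calling `find_bam_index("sorted.bam")` returns a `Path` when an index is present and `None` otherwise, so a wrapper script can stop early with a clear message instead of failing inside the pileup step.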
Example:
modtector count \
-b data/mod_sample.bam \
-f data/reference.fa \
-o result/mod.csv \
-t 8
Output Format (CSV):
ChrID,Strand,Position,RefBase,StopCount,MutationCount,Depth,InsertionCount,DeletionCount,BaseA,BaseC,BaseG,BaseT
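As a minimal sketch of how this output could be consumed downstream, the Python snippet below reads the columns listed above and derives per-position stop and mutation rates. Dividing counts by `Depth` is an assumption made for illustration, not a documented ModTector formula. The names `read_count_rows` and `signal_rates` and the sample rows are hypothetical, and the sketch assumes the file starts with the header row shown above.

```python
import csv
import io

# Column names copied from the header line above.
COUNT_COLUMNS = [
    "ChrID", "Strand", "Position", "RefBase", "StopCount", "MutationCount",
    "Depth", "InsertionCount", "DeletionCount", "BaseA", "BaseC", "BaseG", "BaseT",
]


def read_count_rows(handle):
    """Yield one dict per position with numeric columns converted."""
    for row in csv.DictReader(handle):
        parsed = dict(row)
        parsed["Position"] = int(row["Position"])
        for name in COUNT_COLUMNS[4:]:
            parsed[name] = float(row[name])
        yield parsed


def signal_rates(row):
    """Assumed definition: count / depth, or 0.0 when depth is zero."""
    depth = row["Depth"]
    if depth == 0:
        return 0.0, 0.0
    return row["StopCount"] / depth, row["MutationCount"] / depth


# Made-up rows, for illustration only.
example = io.StringIO(
    ",".join(COUNT_COLUMNS) + "\n"
    "geneA,+,1,A,5,2,100,0,1,98,1,1,0\n"
    "geneA,+,2,C,0,0,0,0,0,0,0,0,0\n"
)
rows = list(read_count_rows(example))
rates = [signal_rates(row) for row in rows]
```

Guarding the zero-depth case keeps uncovered positions from raising a division error.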
Normalize and filter signals to remove noise and outliers.
modtector norm [OPTIONS] --input <INPUT> --output <OUTPUT>
Parameters:
-i, --input: Input CSV file
-o, --output: Output normalized CSV file
-m, --method: Normalization method (winsor90, percentile28, boxplot)
--bases: Target bases for analysis (e.g., AC for A and C bases)
--coverage-threshold: Coverage threshold (default: 0.2)
--depth-threshold: Depth threshold (default: 50)
Example:
modtector norm \
-i result/mod.csv \
-o result/mod_norm.csv \
-m winsor90 \
--bases AC \
--coverage-threshold 0.2 \
--depth-threshold 50
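The method name `winsor90` suggests 90% Winsorization. The Python sketch below shows one common form of that idea, assuming it means clamping values to the 5th to 95th percentile range and rescaling the result to [0, 1]. ModTector's exact rule (percentile interpolation, scaling, per-base handling) is not stated here and may differ. `percentile` and `winsorize_90` are hypothetical helper names.

```python
def percentile(sorted_values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a sorted list."""
    if not sorted_values:
        raise ValueError("no values")
    k = (len(sorted_values) - 1) * q / 100.0
    lower = int(k)
    upper = min(lower + 1, len(sorted_values) - 1)
    step = sorted_values[upper] - sorted_values[lower]
    return sorted_values[lower] + step * (k - lower)


def winsorize_90(values):
    """Clamp to the 5th-95th percentile range, then rescale to [0, 1]."""
    ordered = sorted(values)
    low = percentile(ordered, 5)
    high = percentile(ordered, 95)
    if high == low:
        # No spread left after clamping: report a flat signal.
        return [0.0 for _ in values]
    return [(min(max(v, low), high) - low) / (high - low) for v in values]


# Made-up signal: 0, 1, ..., 20.
signal = [float(v) for v in range(21)]
scaled = winsorize_90(signal)
```

Clamping before scaling keeps a few extreme positions from compressing the rest of the signal toward zero.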
Calculate reactivity scores by comparing modified and unmodified samples.
modtector reactivity [OPTIONS] --mod-csv <MOD_CSV> --unmod-csv <UNMOD_CSV> --output <OUTPUT>
Parameters:
-M, --mod-csv: Modified sample CSV
-U, --unmod-csv: Unmodified sample CSV
-O, --output: Output reactivity file
-s, --stop-method: Stop signal method (current, ding)
-m, --mutation-method: Mutation signal method (current, ding)
-t, --threads: Number of parallel threads (default: 8)
--ding-pseudocount: Pseudocount parameter for Ding method (default: 1.0)
Calculation Methods:
Example:
# Using default method
modtector reactivity \
-M result/mod.csv \
-U result/unmod.csv \
-O result/reactivity.csv \
-t 24
# Using Ding method
modtector reactivity \
-M result/mod.csv \
-U result/unmod.csv \
-O result/reactivity.csv \
-t 24 \
-s ding \
-m ding \
--ding-pseudocount 1.0
Output Format (CSV):
ChrID,Strand,Position,RefBase,Reactivity
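To give a feel for how a pseudocount enters a reactivity calculation, the Python sketch below implements a log-normalized difference in the style usually attributed to Ding et al. (2014): each sample's counts are log-transformed with a pseudocount, divided by the sample's mean log value, and the unmodified signal is subtracted with negative results clipped to zero. Whether ModTector's `ding` option computes exactly this is an assumption. The function names are hypothetical and the counts are made up.

```python
import math


def log_normalized(counts, pseudocount=1.0):
    """ln(count + pseudocount), divided by the mean of those log values."""
    if pseudocount <= 0:
        raise ValueError("pseudocount must be positive")
    if not counts:
        raise ValueError("no counts")
    logs = [math.log(c + pseudocount) for c in counts]
    mean = sum(logs) / len(logs)
    if mean == 0:
        return [0.0 for _ in logs]
    return [v / mean for v in logs]


def ding_style_reactivity(mod_counts, unmod_counts, pseudocount=1.0):
    """max(0, normalized modified signal - normalized unmodified signal)."""
    if len(mod_counts) != len(unmod_counts):
        raise ValueError("samples must cover the same positions")
    mod = log_normalized(mod_counts, pseudocount)
    unmod = log_normalized(unmod_counts, pseudocount)
    return [max(0.0, m - u) for m, u in zip(mod, unmod)]


# Made-up stop counts for three positions.
reactivity = ding_style_reactivity([0, 9, 0], [0, 0, 0])
```

A larger pseudocount damps the influence of low-count positions, which is the usual reason for exposing it as a parameter.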
Compare modified and unmodified samples to identify differential modification sites.
modtector compare [OPTIONS] --mod-csv <MOD_CSV> --unmod-csv <UNMOD_CSV> --output <OUTPUT>
Parameters:
-m, --mod-csv: Modified sample CSV
-u, --unmod-csv: Unmodified sample CSV
-o, --output: Output comparison CSV file
--min-depth: Minimum depth threshold (default: 10)
--min-fold: Minimum fold change threshold (default: 2.0)
Example:
modtector compare \
-m result/mod.csv \
-u result/unmod.csv \
-o result/comparison.csv \
--min-depth 10 \
--min-fold 2.0
Output Format (CSV):
ChrID,Strand,Position,RefBase,ModStop,ModMutation,ModDepth,UnmodStop,UnmodMutation,UnmodDepth,StopFoldChange,MutationFoldChange,ModificationScore
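The sketch below illustrates how depth and fold-change thresholds like `--min-depth` and `--min-fold` could be applied to one row of this table. It assumes fold change is the ratio of per-position rates (count divided by depth) between the modified and unmodified samples, and that a site passes if either signal reaches the threshold. Both are assumptions for illustration. The `ModificationScore` column is not reproduced because its formula is not given here. All function names are hypothetical.

```python
def rate(count, depth):
    """Per-position signal rate; 0.0 when there is no coverage."""
    return count / depth if depth > 0 else 0.0


def fold_change(mod_rate, unmod_rate):
    """Modified rate over unmodified rate (assumed definition)."""
    if unmod_rate == 0.0:
        return float("inf") if mod_rate > 0.0 else 1.0
    return mod_rate / unmod_rate


def is_candidate(site, min_depth=10, min_fold=2.0):
    """True when both samples are deep enough and either signal is enriched."""
    if site["ModDepth"] < min_depth or site["UnmodDepth"] < min_depth:
        return False
    stop_fc = fold_change(
        rate(site["ModStop"], site["ModDepth"]),
        rate(site["UnmodStop"], site["UnmodDepth"]),
    )
    mutation_fc = fold_change(
        rate(site["ModMutation"], site["ModDepth"]),
        rate(site["UnmodMutation"], site["UnmodDepth"]),
    )
    return max(stop_fc, mutation_fc) >= min_fold


# Made-up site: stop rate 0.20 vs 0.05, mutation rate unchanged.
site = {
    "ModStop": 20, "ModMutation": 2, "ModDepth": 100,
    "UnmodStop": 5, "UnmodMutation": 2, "UnmodDepth": 100,
}
enriched = is_candidate(site)
```

Applying the depth filter first avoids reporting large fold changes that come only from a handful of reads.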
Generate visualization plots including signal distributions and RNA structure SVG plots.
modtector plot [OPTIONS] --mod-csv <MOD_CSV> --unmod-csv <UNMOD_CSV> --output <OUTPUT>
Parameters:
-M, --mod-csv: Modified sample CSV (optional for SVG-only mode)
-U, --unmod-csv: Unmodified sample CSV (optional for SVG-only mode)
-o, --output: Output directory
-r, --reactivity: Reactivity file (optional)
-t, --threads: Number of parallel threads (default: 8)
--coverage-threshold: Coverage threshold (default: 0.2)
--depth-threshold: Depth threshold (default: 50)
SVG Plotting Options:
--svg-template: SVG template file path
--svg-bases: Bases to plot (default: ATGC)
--svg-signal: Signal type to plot (stop, mutation, all)
--svg-strand: Strand to include (+, -, both)
--svg-ref: Reference sequence file for alignment
--svg-max-shift: Maximum shift for alignment (default: 50)
Example:
# Regular plotting
modtector plot \
-M result/mod.csv \
-U result/unmod.csv \
-o result/plots \
-r result/reactivity.csv \
-t 8
# SVG-only mode
modtector plot \
-o svg_output/ \
--svg-template rna_structure.svg \
--reactivity result/reactivity.csv \
--svg-signal all \
--svg-strand +
# Combined mode (regular + SVG)
modtector plot \
-M result/mod.csv \
-U result/unmod.csv \
-o result/plots \
--svg-template rna_structure.svg \
--reactivity result/reactivity.csv
Evaluate the accuracy of results using known secondary structure.
modtector evaluate [OPTIONS] --reactivity-file <REACTIVITY_FILE> --structure-file <STRUCTURE_FILE> --output-dir <OUTPUT_DIR>
Parameters:
-r, --reactivity-file: Reactivity CSV file (stop or mutation type)
-s, --structure-file: Secondary structure file (.dp format)
-o, --output-dir: Output directory
-g, --gene-id: Gene ID (optional, for base matching)
-S, --strand: Strand information (optional, default: +)
--signal-type: Signal type (stop or mutation)
Example:
# Evaluate stop signal accuracy
modtector evaluate \
--reactivity-file result/reactivity_stop.csv \
--structure-file data/structure.dp \
--output-dir result/eval \
--signal-type stop \
--gene-id gene_id
# Evaluate mutation signal accuracy
modtector evaluate \
--reactivity-file result/reactivity_mutation.csv \
--structure-file data/structure.dp \
--output-dir result/eval \
--signal-type mutation \
--gene-id gene_id
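Structure-based evaluation of this kind typically asks whether unpaired bases receive higher reactivity than paired ones. The Python sketch below computes an area under the ROC curve for that question. It assumes the structure is a dot-bracket string in which `.` marks an unpaired base and `(` or `)` marks a paired base, with one reactivity value per position. This is an illustration only: the metrics ModTector reports and how it parses `.dp` files are not specified here, and `auc_unpaired_vs_paired` is a hypothetical name.

```python
def auc_unpaired_vs_paired(reactivity, dot_bracket):
    """Probability that a random unpaired base outscores a random paired base.

    Ties count as half a win, which equals the area under the ROC curve.
    """
    if len(reactivity) != len(dot_bracket):
        raise ValueError("need one reactivity value per structure position")
    unpaired = [r for r, s in zip(reactivity, dot_bracket) if s == "."]
    paired = [r for r, s in zip(reactivity, dot_bracket) if s in "()"]
    if not unpaired or not paired:
        raise ValueError("need at least one paired and one unpaired base")
    wins = 0.0
    for u in unpaired:
        for p in paired:
            if u > p:
                wins += 1.0
            elif u == p:
                wins += 0.5
    return wins / (len(unpaired) * len(paired))


# Made-up hairpin: two unpaired loop bases flanked by paired stem bases.
structure = "((..))"
scores = [0.1, 0.2, 0.9, 0.8, 0.0, 0.1]
auc = auc_unpaired_vs_paired(scores, structure)
```

An AUC near 1.0 means reactivity separates unpaired from paired bases well, and a value near 0.5 means it carries no structural information. The pairwise loop is quadratic, which is fine for a single transcript but would be replaced by a rank-based computation on large inputs.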
Evaluation Metrics:
The following shows typical visualization results generated by ModTector:
Overall Signal Distribution Scatter Plot: Shows signal comparison between modified and unmodified groups at all positions

ROC Curve Evaluation Results: Accuracy assessment based on 16S rRNA secondary structure
Panels: Stop Signal ROC Curve, Mutation Signal ROC Curve
Signal Distribution Plots: Display stop and mutation signal distribution across gene positions
Panels: Stop Signal Distribution, Mutation Signal Distribution
Reactivity Plots: Bar charts showing reactivity intensity of modification sites
Panels: Stop Signal Reactivity, Mutation Signal Reactivity
The project provides complete automated testing scripts covering all functional modules:
./test_moddetector.sh
This script executes the following complete workflow:
After test completion, the following files and directories will be generated:
Raw Data Processing Results:
result/mod.csv, result/unmod.csv - Raw statistical results
result/compare.csv - Raw data comparison results
result/mod_reactivity_stop.csv, result/mod_reactivity_mutation.csv - Raw reactivity data
result/plot/ - Raw data visualization charts
result/eval/ - Raw data evaluation results
Normalized Data Processing Results:
result/mod_norm.csv, result/unmod_norm.csv - Normalized statistical results
result/compare_norm.csv - Normalized data comparison results
result/mod_norm_reactivity_stop.csv, result/mod_norm_reactivity_mutation.csv - Normalized reactivity data
result/plot_norm/ - Normalized data visualization charts
result/eval_norm/ - Normalized data evaluation results
This project is licensed under the MIT License.
Issues and Pull Requests are welcome to improve ModTector.
For questions or suggestions, please contact us through GitHub Issues.
If you use ModTector in your research, please cite:
[Add citation information when available]