| Crates.io | rust-featurecounts |
| lib.rs | rust-featurecounts |
| version | 0.1.0 |
| created_at | 2026-01-04 04:38:14.087113+00 |
| updated_at | 2026-01-04 04:38:14.087113+00 |
| description | A fast feature counting tool for prokaryotic RNA-seq analysis, compatible with featureCounts |
| homepage | https://github.com/necoli1822/rust-featurecounts |
| repository | https://github.com/necoli1822/rust-featurecounts |
| id | 2021301 |
| size | 107,853 |
A fast, memory-efficient feature counting tool for prokaryotic RNA-seq analysis, written in Rust. This is a clean-room reimplementation inspired by featureCounts from the Subread package.
Note: This tool is designed specifically for bacterial and archaeal genomes. It has not been tested with eukaryotic genomes and may not handle complex features such as alternative splicing.
Vibe Coding: This project was developed through AI-assisted programming (vibe coding) using Claude.
Tested on E. coli K-12 MG1655 RNA-seq data (913,011 fragments, 4,506 genes):
| Metric | rust-featurecounts | Notes |
|---|---|---|
| Processing time | ~300 ms (4 threads) | Region-based parallel processing |
| Processing time | ~140 ms (30 threads) | Auto-scaling with -T 0 |
| Throughput | ~3 million fragments/sec | With multi-threading |
| Memory Usage | ~50 MB | For typical bacterial genome |
| Binary Size | ~4 MB | Statically linked, no runtime dependencies |
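As a rough consistency check on the table above, throughput can be derived from the fragment count and the quoted run times. The sketch below uses only the figures given in this section; the function name is illustrative and the timings are approximate.

```python
def fragments_per_second(fragments: int, elapsed_ms: float) -> float:
    """Return throughput in fragments per second for a run lasting elapsed_ms."""
    return fragments / (elapsed_ms / 1000.0)


FRAGMENTS = 913_011  # E. coli K-12 MG1655 test data quoted above

# (threads, approximate run time in milliseconds) taken from the table
for threads, elapsed_ms in [(4, 300), (30, 140)]:
    rate = fragments_per_second(FRAGMENTS, elapsed_ms)
    print(f"{threads} threads: ~{rate / 1e6:.1f} million fragments/sec")
```

With these timings, the 4-thread run works out to about 3.0 million fragments/sec, which matches the quoted throughput, and the 30-thread run to about 6.5 million fragments/sec.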
With -T 0, the thread count is determined automatically based on current system load.
rust-featurecounts is designed to minimise memory usage.
Ensure you have Rust installed (https://rustup.rs/), then:
git clone https://github.com/necoli1822/rust-featurecounts.git
cd rust-featurecounts
cargo build --release
The binary will be available at target/release/rust-featurecounts.
Alternatively, install from crates.io:
cargo install rust-featurecounts
rust-featurecounts [options] -a <annotation> -o <output> input.bam [input2.bam ...]
| Option | Description |
|---|---|
| `-a <file>` | Annotation file (GFF/GTF/GenBank format) |
| `-o <file>` | Output file for count matrix |

| Option | Description | Default |
|---|---|---|
| `-t <string>` | Feature type to count (exon, gene, CDS, auto) | exon |
| `-g <string>` | Attribute for feature ID (gene_id, locus_tag, etc.) | gene_id |
| `-s <int>` | Strandedness: 0=unstranded, 1=forward, 2=reverse | 0 |
| `-Q <int>` | Minimum mapping quality | 0 |
| `-p` | Paired-end mode (counts fragments) | false |
| `-T <int>` | Number of threads (0=auto based on system load) | 1 |
| `-m <string>` | Counting mode: intersection, union | intersection |
| `-h, --help` | Show help message | |
| `-v, --version` | Show version | |
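To make the three `-s` values concrete, here is a minimal sketch of a strand-compatibility check. It follows the convention commonly used by featureCounts-style counters (for paired-end data, the second mate's strand is flipped before comparison). This README does not show the tool's internal logic, so the function name, parameters, and the mate-flipping rule are assumptions for illustration only.

```python
def strand_compatible(read_is_reverse: bool, feature_strand: str,
                      mode: int, is_second_mate: bool = False) -> bool:
    """Decide whether a read may be assigned to a feature under -s <mode>.

    mode 0: unstranded, every read is compatible.
    mode 1: forward, the fragment strand must equal the feature strand.
    mode 2: reverse, the fragment strand must be opposite to the feature strand.
    """
    if mode == 0:
        return True
    if mode not in (1, 2):
        raise ValueError("mode must be 0, 1 or 2")
    # Assumption: the second mate reports the opposite strand of the fragment,
    # so its orientation is flipped before comparing.
    fragment_is_reverse = read_is_reverse != is_second_mate
    fragment_strand = "-" if fragment_is_reverse else "+"
    same_strand = fragment_strand == feature_strand
    return same_strand if mode == 1 else not same_strand


# A forward-mapped single-end read over a "+" gene:
print(strand_compatible(False, "+", mode=1))  # counted with -s 1
print(strand_compatible(False, "+", mode=2))  # not counted with -s 2
```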
Basic usage with GFF annotation:
rust-featurecounts -a annotation.gff -o counts.tsv aligned.bam
Strand-specific counting with multiple samples:
rust-featurecounts \
-a annotation.gff \
-o counts.tsv \
-t gene \
-g locus_tag \
-s 2 \
-p \
-T 0 \
sample1.bam sample2.bam sample3.bam
Using GenBank annotation (automatically detected):
rust-featurecounts -a genome.gbk -o counts.tsv -t gene -g locus_tag aligned.bam
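When the tool is driven from a pipeline script, the command lines above can be assembled programmatically. The helper below is a hypothetical convenience wrapper, not part of rust-featurecounts; it uses only the flags documented in the options tables and assumes the binary is on the `PATH`.

```python
import shlex


def build_command(annotation, output, bam_files, feature_type=None,
                  id_attribute=None, strandedness=None, paired=False,
                  threads=None):
    """Assemble an argument list using only the documented options."""
    cmd = ["rust-featurecounts", "-a", annotation, "-o", output]
    if feature_type is not None:
        cmd += ["-t", feature_type]
    if id_attribute is not None:
        cmd += ["-g", id_attribute]
    if strandedness is not None:
        if strandedness not in (0, 1, 2):
            raise ValueError("strandedness must be 0, 1 or 2")
        cmd += ["-s", str(strandedness)]
    if paired:
        cmd.append("-p")
    if threads is not None:
        cmd += ["-T", str(threads)]
    return cmd + list(bam_files)


# Rebuild the strand-specific, multi-sample example shown above.
cmd = build_command("annotation.gff", "counts.tsv",
                    ["sample1.bam", "sample2.bam", "sample3.bam"],
                    feature_type="gene", id_attribute="locus_tag",
                    strandedness=2, paired=True, threads=0)
print(shlex.join(cmd))
# To execute it: import subprocess; subprocess.run(cmd, check=True)
```

Passing a list to `subprocess.run` rather than a shell string avoids quoting problems with file names that contain spaces.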
The output is a tab-separated file compatible with featureCounts and downstream analysis tools:
Geneid Chr Start End Strand Length sample1.bam sample2.bam
gene001 NC_000913.3 190 255 + 66 22 18
gene002 NC_000913.3 337 2799 + 2463 453 512
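For downstream analysis, the count matrix can be read with the Python standard library alone. This is a minimal sketch that assumes the column layout shown above (six annotation columns followed by one column per BAM file). Skipping lines that start with `#` is an assumption borrowed from featureCounts output, which begins with a comment line; this README does not say whether rust-featurecounts writes one.

```python
import csv
import io


def read_counts(handle):
    """Return (sample_names, {gene_id: {sample_name: count}}) from a count matrix."""
    # Assumption: comment lines, if any, start with '#'.
    rows = csv.reader((line for line in handle if not line.startswith("#")),
                      delimiter="\t")
    header = next(rows)
    samples = header[6:]  # columns after Geneid, Chr, Start, End, Strand, Length
    counts = {}
    for row in rows:
        if not row:
            continue
        counts[row[0]] = {name: int(value) for name, value in zip(samples, row[6:])}
    return samples, counts


# The example rows from above, written with explicit tab separators.
EXAMPLE = (
    "Geneid\tChr\tStart\tEnd\tStrand\tLength\tsample1.bam\tsample2.bam\n"
    "gene001\tNC_000913.3\t190\t255\t+\t66\t22\t18\n"
    "gene002\tNC_000913.3\t337\t2799\t+\t2463\t453\t512\n"
)

samples, counts = read_counts(io.StringIO(EXAMPLE))
print(samples)
print(counts["gene002"])
```

For a real file, pass an open handle instead, for example `read_counts(open("counts.tsv", newline=""))`.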
A summary file (.summary) is also generated with statistics:
rust-featurecounts Summary
==========================
Overall Statistics:
Total reads processed: 919574
Total fragments: 913011
Assigned: 707488
Unassigned reads:
No feature: 33544
Ambiguous: 171979
Low quality: 0
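The summary lends itself to a quick sanity check. In the example above, the assigned and unassigned categories add up to the total fragment count (707,488 + 33,544 + 171,979 + 0 = 913,011). The sketch below parses the `key: value` lines and verifies that; the parsing rule is inferred from this one example, so treat the exact layout, and the assumption that the categories always sum to the fragment total, as unconfirmed.

```python
def parse_summary(text):
    """Collect every 'label: integer' line of a summary into a dict."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        value = value.strip()
        # Section headers such as 'Overall Statistics:' have no number and are skipped.
        if sep and value.isdigit():
            stats[key.strip()] = int(value)
    return stats


SUMMARY = """rust-featurecounts Summary
==========================
Overall Statistics:
Total reads processed: 919574
Total fragments: 913011
Assigned: 707488
Unassigned reads:
No feature: 33544
Ambiguous: 171979
Low quality: 0
"""

stats = parse_summary(SUMMARY)
categories = ("Assigned", "No feature", "Ambiguous", "Low quality")
accounted = sum(stats[name] for name in categories)
assigned_rate = stats["Assigned"] / stats["Total fragments"]

print(accounted == stats["Total fragments"])
print(f"Assigned: {assigned_rate:.1%}")
```

For this run, about 77.5% of fragments were assigned to a feature.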
Standard GFF3 and GTF formats are supported. Features are extracted based on the -t option (default: exon).
GenBank flat file format (.gbk, .gb, .genbank, .gbff) is automatically detected and parsed. The VERSION field is used for chromosome/sequence matching with BAM references.
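Because the VERSION field must match the BAM reference names, it can be worth checking the two before a run. The sketch below extracts VERSION identifiers from GenBank text and reports BAM references that have no match. It is an independent illustration, not the tool's parser; the function names are hypothetical, the header excerpt is a minimal illustrative record, and the BAM reference names are supplied as a plain list (for example, copied from the `@SQ` lines of the BAM header).

```python
def genbank_versions(lines):
    """Return the identifier from every VERSION line, in file order."""
    versions = []
    for line in lines:
        # In GenBank flat files, field keywords start in the first column.
        if line.startswith("VERSION"):
            fields = line.split()
            if len(fields) >= 2:
                versions.append(fields[1])
    return versions


def unmatched_references(bam_references, versions):
    """Return BAM reference names that no GenBank VERSION identifier matches."""
    known = set(versions)
    return [name for name in bam_references if name not in known]


# Minimal illustrative header excerpt; real records contain more fields.
EXAMPLE_HEADER = """LOCUS       NC_000913
ACCESSION   NC_000913
VERSION     NC_000913.3
//
"""

versions = genbank_versions(EXAMPLE_HEADER.splitlines())
print(versions)
print(unmatched_references(["NC_000913.3"], versions))  # nothing unmatched
print(unmatched_references(["NC_000913"], versions))    # version suffix missing
```

A common cause of zero counts is a reference name without the version suffix, as in the last call.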
Supported qualifiers for gene identification:
- locus_tag (recommended for bacterial genomes)
- gene
- protein_id

This is a clean-room Rust reimplementation, not a port of the original C code.
This project is licensed under the MIT Licence. See the LICENCE file for details.
Note: This is an independent reimplementation and is not affiliated with or derived from the original Subread/featureCounts project (GPL-3.0).
If you use rust-featurecounts in your research, please cite:
Kim S. (2025). rust-featurecounts: A fast feature counting tool for prokaryotic RNA-seq analysis. https://github.com/necoli1822/rust-featurecounts
For the original featureCounts algorithm, please also cite:
Liao Y, Smyth GK and Shi W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics, 30(7):923-30, 2014.
Contributions are welcome! Please feel free to submit a pull request on GitHub.