| Crates.io | orphos-cli |
| lib.rs | orphos-cli |
| version | 0.1.0 |
| created_at | 2025-11-07 22:10:38.27199+00 |
| updated_at | 2025-11-07 22:10:38.27199+00 |
| description | Command-line interface for Orphos, a tool for finding protein-coding genes in microbial genomes. |
| homepage | |
| repository | https://github.com/FullHuman/orphos |
| max_upload_size | |
| id | 1922246 |
| size | 96,171 |
Command-line interface for Orphos, a fast, parallel Rust implementation of Prodigal for finding protein-coding genes in microbial genomes.
cargo install orphos-cli
git clone https://github.com/FullHuman/orphos.git
cd orphos
cargo install --path orphos-cli
brew tap FullHuman/orphos
brew install orphos
conda install -c bioconda orphos
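Whichever install route is used, it can help to confirm that the binary is reachable before running an analysis. The sketch below relies only on the `--version` flag documented in the options table; the function name `check_orphos` is hypothetical, and the hint about `~/.cargo/bin` assumes the default location used by `cargo install`.

```shell
# Print the installed Orphos version, or fail with a hint if it is not on PATH.
# check_orphos is an illustrative helper name, not part of Orphos.
check_orphos() {
  if command -v orphos >/dev/null 2>&1; then
    orphos --version
  else
    echo "orphos not found on PATH; for cargo installs, check that ~/.cargo/bin is on PATH" >&2
    return 1
  fi
}
```

Calling `check_orphos` at the top of a pipeline script makes a missing installation fail early with a readable message.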
# Analyze a genome and output GenBank format
orphos -i genome.fasta -o genes.gbk
# Analyze with GFF3 output
orphos -i genome.fasta -f gff -o genes.gff
# Metagenomic mode for short contigs
orphos -i metagenome.fasta -p meta -o genes.gff
# Complete circular genome (closed ends)
orphos -i plasmid.fasta -c -o plasmid.gbk
# Input from stdin
cat genome.fasta | orphos -o genes.gbk
# Output to stdout
orphos -i genome.fasta > genes.gbk
# Pipe both
cat genome.fasta | orphos > genes.gbk
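Because input defaults to stdin and output to stdout, a compressed genome can be streamed through Orphos without creating a temporary file. This is a minimal sketch built only from the flags shown above; `predict_from_gz` is a hypothetical helper name, and it assumes `gzip` is available.

```shell
# Stream a gzip-compressed FASTA file through orphos and write GFF3 output.
# $1 = input .fasta.gz, $2 = output GFF3 path
# predict_from_gz is an illustrative helper name, not part of Orphos.
predict_from_gz() {
  gzip -dc "$1" | orphos -f gff > "$2"
}
```

Usage: `predict_from_gz genome.fasta.gz genes.gff`. In plain POSIX sh the pipeline's exit status is that of `orphos`, so a failed decompression may go unnoticed unless the output is checked.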
| Option | Short | Long | Description |
|---|---|---|---|
| Input file | -i | --input | Input FASTA file (default: stdin) |
| Output file | -o | --output | Output file (default: stdout) |
| Option | Short | Long | Default | Description |
|---|---|---|---|---|
| Format | -f | --format | gbk | Output format: gbk, gff, sco, gca |
| Option | Short | Long | Default | Description |
|---|---|---|---|---|
| Mode | -p | --mode | single | Analysis mode: single or meta |
| Closed ends | -c | --closed | false | No genes off edges (for complete genomes) |
| Mask N's | -m | --mask | false | Mask runs of N's |
| Translation table | -g | --translation-table | auto | Translation table (1-25) |
| Training file | -t | --training | - | Use pre-trained parameters |
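The `-p` option in the table above selects between `single` and `meta`. As a rough default for isolate assemblies, the choice can be made from the total sequence length, using the roughly 100 kb guideline this README gives for single mode. The helper `pick_mode` is hypothetical and the threshold is an assumption taken from that guideline. It cannot tell a single genome from a mixed metagenomic assembly, so choose `-p meta` explicitly for metagenomes.

```shell
# Print "single" if the FASTA file holds at least 100,000 bases in total,
# otherwise print "meta". Header lines (starting with ">") are ignored.
# pick_mode is an illustrative helper name; the threshold is an assumption.
pick_mode() {
  awk '!/^>/ { gsub(/[ \t\r]/, ""); total += length($0) }
       END { print (total >= 100000 ? "single" : "meta") }' "$1"
}
```

Usage: `orphos -i contigs.fasta -p "$(pick_mode contigs.fasta)" -f gff -o genes.gff`.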
| Option | Short | Long | Description |
|---|---|---|---|
| Quiet | -q | --quiet | Suppress progress messages |
| Help | -h | --help | Display help information |
| Version | -V | --version | Display version information |
Rich annotation format with gene features, translations, and metadata.
orphos -i genome.fasta -f gbk -o genes.gbk
General Feature Format version 3, widely used in genomics pipelines.
orphos -i genome.fasta -f gff -o genes.gff
Tab-delimited gene coordinates for easy parsing.
orphos -i genome.fasta -f sco -o genes.sco
Compact coordinate format.
orphos -i genome.fasta -f gca -o genes.gca
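GFF3 is tab-delimited with the feature type in the third column, which makes quick sanity checks easy. The sketch below counts predicted genes in a GFF3 file. It assumes Orphos labels its predictions as `CDS` features, as Prodigal does; `count_cds` is a hypothetical helper name.

```shell
# Count CDS features in a GFF3 file.
# Lines starting with "#" are directives or comments and are skipped.
# count_cds is an illustrative helper name; the "CDS" type is an assumption.
count_cds() {
  awk -F '\t' '!/^#/ && $3 == "CDS" { n++ } END { print n + 0 }' "$1"
}
```

Usage: `count_cds genes.gff` prints a single integer, or 0 when no CDS lines are present.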
Use for complete or near-complete genomes (> 100 kb). Orphos trains on the input genome itself to optimize gene prediction accuracy.
orphos -i complete_genome.fasta -o genes.gbk
Best for:
Use for short contigs or mixed metagenomic assemblies. Uses pre-trained parameters instead of training on the input.
orphos -i metagenome_contigs.fasta -p meta -o genes.gff
Best for:
For complete circular genomes (chromosomes, plasmids), use the -c flag to prevent genes from being called off the edges:
orphos -i circular_plasmid.fasta -c -o plasmid.gbk
Specify a custom genetic code (translation table):
# Use translation table 4 (Mycoplasma/Spiroplasma)
orphos -i mycoplasma.fasta -g 4 -o genes.gbk
# Use translation table 11 (Bacterial, Archaeal, and Plant Plastid)
orphos -i bacteria.fasta -g 11 -o genes.gbk
Mask runs of N's in low-quality sequences:
orphos -i draft_assembly.fasta -m -o genes.gff
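To judge whether masking is relevant for a given assembly, it can be useful to count how many N bases it contains first. This is a minimal sketch using only `awk`; `count_n_bases` is a hypothetical helper name and is not part of Orphos.

```shell
# Print the number of N/n bases in a FASTA file, ignoring header lines.
# gsub returns the number of replacements it made on each line.
# count_n_bases is an illustrative helper name.
count_n_bases() {
  awk '!/^>/ { total += gsub(/[Nn]/, "") } END { print total + 0 }' "$1"
}
```

Usage: `count_n_bases draft_assembly.fasta`. A result of 0 means `-m` has nothing to mask.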
Process multiple genomes:
for genome in genomes/*.fasta; do
  base=$(basename "$genome" .fasta)
  orphos -i "$genome" -f gff -o "results/${base}.gff"
done
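The loop above can be wrapped in a function that creates the output directory, skips an empty input directory, and reports genomes that fail instead of stopping silently. This is a sketch under those assumptions; `batch_predict` is a hypothetical name, and only the flags documented in this README are used.

```shell
# Run orphos on every .fasta file in a directory, one GFF3 file per genome.
# $1 = input directory, $2 = output directory
# batch_predict is an illustrative helper name, not part of Orphos.
batch_predict() {
  mkdir -p "$2" || return 1
  failed=0
  for genome in "$1"/*.fasta; do
    [ -e "$genome" ] || continue   # the glob matched nothing
    base=$(basename "$genome" .fasta)
    if ! orphos -i "$genome" -f gff -o "$2/$base.gff" -q; then
      echo "orphos failed on $genome" >&2
      failed=$((failed + 1))
    fi
  done
  # Succeed only if every genome was processed without error.
  [ "$failed" -eq 0 ]
}
```

Usage: `batch_predict genomes results`. The function returns a non-zero status if any genome failed, so it can gate later pipeline steps.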
Integrate with other bioinformatics tools:
# Find genes and write them as GFF3 for downstream tools
orphos -i genome.fasta -f gff -o genes.gff
# ... then use genes.gff with other tools
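# Combine with annotation pipelines
orphos -i assembly.fasta -p meta -f gff -o genes.gff
# Prokka documents --proteins as taking a FASTA or GenBank file, so verify that it accepts genes.gff
prokka --proteins genes.gff --outdir annotation assembly.fasta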
Orphos supports NCBI translation tables 1-25 (excluding 7, 8, 17-20). Common tables:
| Table | Name | Organisms |
|---|---|---|
| 1 | Standard | Most eukaryotes |
| 4 | Mycoplasma/Spiroplasma | Mycoplasma, Spiroplasma |
| 11 | Bacterial, Archaeal, Plant Plastid | Most bacteria and archaea (default) |
| 25 | Candidate Division SR1, Gracilibacteria | Certain bacteria |
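When the table number comes from a config file or a script argument, it can be checked against the supported range before invoking Orphos. The sketch below encodes the range stated above (1-25, excluding 7, 8, and 17-20); `is_supported_table` is a hypothetical helper name.

```shell
# Succeed if $1 is a translation table number listed as supported:
# 1-25, excluding 7, 8 and 17-20.
# is_supported_table is an illustrative helper name, not part of Orphos.
is_supported_table() {
  case "$1" in
    7|8|17|18|19|20) return 1 ;;
    [1-9]|1[0-9]|2[0-5]) return 0 ;;
    *) return 1 ;;
  esac
}
```

Usage: `is_supported_table "$table" && orphos -i genome.fasta -g "$table" -o genes.gbk`.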
We welcome contributions! Please see the main repository for contribution guidelines.
This project is licensed under the GPL-3.0 License - see the LICENSE file for details.
If you use Orphos in your research, please cite:
# TODO: Add citation information
This project is inspired by the original Prodigal by Doug Hyatt. We thank the authors for their groundbreaking work in prokaryotic gene prediction.