| Crates.io | seqtk-rs |
| lib.rs | seqtk-rs |
| version | 0.2.0 |
| created_at | 2025-07-01 19:41:49.750668+00 |
| updated_at | 2025-07-22 01:00:54.610281+00 |
| description | This is a sequence processing tool written in Rust for manipulating FASTA/FASTQ files. Pure rust version of seqtk. |
| homepage | https://github.com/yenyen1/seqtk-rs |
| repository | https://github.com/yenyen1/seqtk-rs |
| max_upload_size | |
| id | 1733626 |
| size | 122,239 |
This is a sequence processing tool written in Rust for manipulating FASTA/FASTQ files. I built this tool out of my passion for Rust. Its functionality and subcommand names are similar to those in seqtk, but I’ve made some changes based on my own design logic.
cargo install seqtk-rs
seqtk_rs -h
seq Common transformation of FASTA/Q
sample Random sampling with a given seed and fraction
size Report the stats of sequence length
(Output: #seq, #bases, avg_size, min_size, med_size, max_size, N50)
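As an illustration of the N50 column, here is a minimal sketch using the common definition: the length L such that sequences of length L or longer together contain at least half of all bases. The function name `n50` is hypothetical and is not taken from the seqtk-rs source; the tool's own implementation may differ in details such as tie handling.

```rust
/// N50 under the common definition: sort lengths in descending order and
/// return the length at which the running total first reaches half of
/// all bases. Hypothetical helper, not part of the seqtk-rs API.
fn n50(lengths: &[u64]) -> Option<u64> {
    if lengths.is_empty() {
        return None;
    }
    let mut sorted = lengths.to_vec();
    sorted.sort_unstable_by(|a, b| b.cmp(a)); // longest first
    let total: u64 = sorted.iter().sum();
    let mut running = 0u64;
    for &len in &sorted {
        running += len;
        // Compare 2 * running against total to avoid integer division.
        if running * 2 >= total {
            return Some(len);
        }
    }
    None
}
```

For lengths 2, 3, 4, 5 and 6 (20 bases in total), the running sum over 6 and 5 reaches 11, which is more than half of the total, so this sketch returns 5.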
fqchk Report stats for sequence and quality by position
(Output: POS, #bases, %A, %C, %G, %T, %N, avgQ, errQ, ...)
avgQ: (Q₁ + Q₂ + ... + Qₙ) / N
errQ: -10 * log₁₀((P₁ + P₂ + ... + Pₙ) / N), where Pᵢ = 10^(-Qᵢ / 10) is the error probability for quality score Qᵢ
Notice: Some tools treat quality scores less than 3 (Q < 3) as 3 to avoid instability in downstream metrics. For example, Q = 0 yields an error probability P = 1.0, Q = 1 gives P ≈ 0.794, and Q = 2 gives P ≈ 0.631. These low Q-scores can heavily skew error rate calculations (e.g., errQ), which is why they are often floored to 3. However, this adjustment can lead to results that are inconsistent with the original definition. Therefore, this tool preserves the original quality scores as-is.
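The two per-position metrics can be sketched as follows. This is a minimal illustration of the formulas, not the tool's actual code: the names `phred_to_prob` and `avg_and_err_q` are hypothetical, and the input is assumed to be already-decoded Phred scores (for example, the FASTQ quality byte minus the usual offset of 33). No flooring of low scores is applied, matching the behaviour described in the notice.

```rust
/// Convert a Phred quality score to an error probability: P = 10^(-Q/10).
/// Hypothetical helper, not part of the seqtk-rs API.
fn phred_to_prob(q: u8) -> f64 {
    10f64.powf(-(q as f64) / 10.0)
}

/// Return (avgQ, errQ) for the decoded quality scores seen at one position.
/// avgQ is the arithmetic mean of the scores; errQ is the Phred score of
/// the mean error probability. Returns None for an empty slice.
fn avg_and_err_q(quals: &[u8]) -> Option<(f64, f64)> {
    if quals.is_empty() {
        return None;
    }
    let n = quals.len() as f64;
    let avg_q = quals.iter().map(|&q| q as f64).sum::<f64>() / n;
    let mean_p = quals.iter().map(|&q| phred_to_prob(q)).sum::<f64>() / n;
    let err_q = -10.0 * mean_p.log10();
    Some((avg_q, err_q))
}
```

With scores 10 and 30, avgQ is 20 but errQ is only about 12.97, because the mean error probability (0.1 + 0.001) / 2 is dominated by the low-quality base. This is why a handful of Q < 3 bases can pull errQ down sharply.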
comp Report the nucleotide composition of FASTA/Q
(Output: #A, #C, #G, #T, #2, #3, #4, #CG, #GC)
CG or GC: Number of CG/GC on the template strand
qctrim Trim low-quality bases from FASTQ data based on a quality threshold Q
trimAdapter Trim adapters from a FASTQ file