seqtk-rs

Crates.io: seqtk-rs
lib.rs: seqtk-rs
version: 0.2.0
created_at: 2025-07-01 19:41:49.750668+00
updated_at: 2025-07-22 01:00:54.610281+00
description: This is a sequence processing tool written in Rust for manipulating FASTA/FASTQ files. Pure Rust version of seqtk.
homepage: https://github.com/yenyen1/seqtk-rs
repository: https://github.com/yenyen1/seqtk-rs
id: 1733626
size: 122,239
Yenyen (yenyen1)


seqtk-rs is a sequence-processing tool written in Rust for manipulating FASTA/FASTQ files. I built it out of my passion for Rust. Its functionality and subcommand names are similar to those of seqtk, with some changes based on my own design choices.

Installation

cargo install seqtk-rs
seqtk_rs -h

Current Features

  • seq Common transformations of FASTA/Q

  • sample Random sampling with a given seed and fraction

  • size Report sequence-length statistics

    (Output: #seq, #bases, avg_size, min_size, med_size, max_size, N50)

  • fqchk Report stats for sequence and quality by position

    (Output: POS, #bases, %A, %C, %G, %T, %N, avgQ, errQ, ...)

    • avgQ: Average quality score, (Q₁ + Q₂ + ... + Qₙ) / n
    • errQ: Estimated error rate on the Phred scale, -10 * log₁₀((P₁ + P₂ + ... + Pₙ) / n), where Pᵢ = 10^(-Qᵢ / 10) is the error probability of the i-th base

    Note: Some tools treat quality scores below 3 (Q < 3) as 3 to avoid instability in downstream metrics. For example, Q = 0 yields an error probability P = 1.0, Q = 1 gives P ≈ 0.794, and Q = 2 gives P ≈ 0.631. Such low Q-scores can heavily skew error-rate calculations (e.g., errQ), which is why they are often floored to 3. However, this adjustment produces results that are inconsistent with the original definition, so this tool keeps the original quality scores as-is.

  • comp Report the nucleotide composition of FASTA/Q

    (Output: #A, #C, #G, #T, #2, #3, #4, #CG, #GC)

    • CG or GC: Number of CG/GC dinucleotides on the template strand

  • qctrim Trim low-quality bases from FASTQ data based on a quality threshold Q
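
The avgQ and errQ definitions given for fqchk can be sketched in Rust as follows. This is a minimal illustration under stated assumptions, not the crate's actual code: the function names `phred_to_prob`, `avg_q`, and `err_q` and the `floor` parameter are hypothetical, introduced only to show how flooring Q < 3 changes errQ.

```rust
/// Convert a Phred quality score to an error probability: P = 10^(-Q / 10).
fn phred_to_prob(q: u8) -> f64 {
    10f64.powf(-(q as f64) / 10.0)
}

/// avgQ: arithmetic mean of the quality scores observed at one position.
/// Returns None when no bases were observed.
fn avg_q(quals: &[u8]) -> Option<f64> {
    if quals.is_empty() {
        return None;
    }
    let sum: f64 = quals.iter().map(|&q| q as f64).sum();
    Some(sum / quals.len() as f64)
}

/// errQ: mean error probability, converted back to the Phred scale.
/// `floor` is a hypothetical parameter (an assumption for this sketch):
/// 0 keeps the scores as-is, 3 emulates tools that raise Q < 3 to 3.
fn err_q(quals: &[u8], floor: u8) -> Option<f64> {
    if quals.is_empty() {
        return None;
    }
    let mean_p = quals
        .iter()
        .map(|&q| phred_to_prob(q.max(floor)))
        .sum::<f64>()
        / quals.len() as f64;
    Some(-10.0 * mean_p.log10())
}
```

For a position with qualities 0 and 40, `avg_q` returns 20, while `err_q` returns about 3.0 with `floor = 0` and about 6.0 with `floor = 3`, so flooring roughly doubles the reported errQ in that case.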

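The med_size and N50 columns reported by size can be computed from the list of sequence lengths. The sketch below assumes the common N50 definition (the largest length L such that sequences of length at least L cover at least half of all bases) and averages the two middle values when the count is even. The source does not state which conventions seqtk-rs follows, and the function names `n50` and `median` are hypothetical.

```rust
/// N50 under the common definition: sort lengths in descending order and
/// return the length at which the running total reaches half of all bases.
/// Returns None for an empty input.
fn n50(lengths: &[u64]) -> Option<u64> {
    if lengths.is_empty() {
        return None;
    }
    let mut sorted = lengths.to_vec();
    sorted.sort_unstable_by(|a, b| b.cmp(a)); // longest first
    let total: u64 = sorted.iter().sum();
    let mut running = 0u64;
    for &len in &sorted {
        running += len;
        // Compare 2 * running with total to avoid fractional halves.
        if running * 2 >= total {
            return Some(len);
        }
    }
    sorted.last().copied()
}

/// Median length. For an even count this sketch averages the two middle
/// values, which is an assumption rather than the tool's documented behavior.
fn median(lengths: &[u64]) -> Option<f64> {
    if lengths.is_empty() {
        return None;
    }
    let mut sorted = lengths.to_vec();
    sorted.sort_unstable();
    let mid = sorted.len() / 2;
    if sorted.len() % 2 == 1 {
        Some(sorted[mid] as f64)
    } else {
        Some((sorted[mid - 1] + sorted[mid]) as f64 / 2.0)
    }
}
```

For lengths 2, 3, 4, 5, and 6 (20 bases in total), the two longest sequences already cover 11 bases, so `n50` returns 5 and `median` returns 4.
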
TODO

  • trimAdapter Trim adapters from FASTQ files

Acknowledgements
