fastq_rs

Crates.io: fastq_rs
lib.rs: fastq_rs
version: 0.0.1
created_at: 2025-09-18 22:09:51.306937+00
updated_at: 2025-09-18 22:09:51.306937+00
description: Multi purpose fastq toolkit.
repository: https://github.com/OscarAspelin95/fastq_rs
id: 1845508
size: 185,854
author: Oscar Aspelin (OscarAspelin95)


fastq_rs

A FASTQ toolkit, aiming to be an alternative to seqkit.

Requirements

  • Linux OS (Ubuntu 24.04.2)
  • Rust >= 1.88.0

Installation

Clone the repository or download the source code. Enter the fastq_rs directory and run:

cargo build --release

The generated binary is available in target/release/fastq_rs.

Plots

To enable plotting, compile with the plot feature.

cargo build --release --features plot

Plots are then generated automatically by the subcommands that support them.

Usage

Run with:
fastq_rs <subcommand> <args>

Subcommands

🔴 Not implemented yet (but planning to).
🟡 Implemented but not tested/fully featured.
🟢 Beta-mode available!

fastq_rs stats

🟢 Calculate basic stats.

fastq_rs stats --fastq <reads.fastq.gz> <optional_args>

Note - if no file is provided, fastq_rs will read from stdin (plain FASTQ).

Optional arguments:

-o/--outfile [stats.json] - Output file.

fastq_rs sanitize

🟢 Attempt to sanitize malformed reads.

fastq_rs sanitize --fastq <reads.fastq.gz> <optional_args>

Optional arguments:

-o/--outfile [stdout] - Output file.

fastq_rs head

🟢 Output the first n reads.

fastq_rs head --fastq <reads.fastq.gz> <optional_args>

Optional arguments:

-n/--num-seqs [5] - Number of reads to output.

-o/--outfile [stdout] - Output file.

fastq_rs grep

🟢 Search and filter read ids by regular expressions.

fastq_rs grep --fastq <reads.fastq.gz> --pattern <regex_string> <optional_args>

Optional arguments:

-o/--outfile [stdout] - Output file.

fastq_rs concat

🟢 Sanitize and concatenate FASTQ files. This is probably slower than plain shell commands such as zcat *.fastq.gz | pigz -f > concat.fastq.gz, but has the advantage of skipping malformed records.

fastq_rs concat --fastqs <reads.fastq.gz> <...> <optional_args>

Optional arguments:

-o/--outfile [stdout] - Output file.
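
The README does not say which checks decide that a record is malformed. As a minimal sketch, assume a record is well formed when its header starts with `@`, its separator line starts with `+`, and the sequence and quality lines have equal, non-zero length. The function names `is_well_formed` and `concat_valid` are hypothetical and are not taken from the fastq_rs source.

```rust
/// Assumed validity rules (not confirmed by the fastq_rs source):
/// '@' header, '+' separator, non-empty sequence, matching quality length.
fn is_well_formed(record: &[&str]) -> bool {
    record.len() == 4
        && record[0].starts_with('@')
        && record[2].starts_with('+')
        && !record[1].is_empty()
        && record[1].len() == record[3].len()
}

/// Concatenate several plain-text FASTQ inputs, keeping only well-formed records.
/// Records are read as fixed groups of four lines, so this sketch cannot
/// resynchronise after a missing line the way a real parser would.
fn concat_valid(inputs: &[&str]) -> String {
    let mut out = String::new();
    for input in inputs {
        let lines: Vec<&str> = input.lines().collect();
        for record in lines.chunks(4) {
            if is_well_formed(record) {
                for line in record {
                    out.push_str(line);
                    out.push('\n');
                }
            }
        }
    }
    out
}

fn main() {
    let a = "@r1\nACGT\n+\nIIII\n";
    // r2 has a quality string shorter than its sequence and is skipped.
    let b = "@r2\nACGT\n+\nII\n@r3\nGG\n+\nII\n";
    print!("{}", concat_valid(&[a, b]));
}
```

Decompression is left out here; the inputs are treated as already-decoded text.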

fastq_rs sort

🟢 Sort reads based on provided metric.

fastq_rs sort --fastq <reads.fastq.gz> <optional_args>

Optional arguments:


-b/--by [length] - {length, gc_content, mean_error, minimizers}

-r/--reverse [false] - Sort in descending order.

-w/--window-size [10] - Minimizer window size (number of consecutive kmers).

-k/--kmer-size [15] - Minimizer kmer size.

--max-read-error [0.05] - Minimizer max allowed read error. Reads with higher error rates will generate zero valid minimizers.

--max-minimizer-error [0.05] - Minimizer error probability cutoff.

-o/--outfile [stdout] - Output file.
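
To illustrate how `--kmer-size` and `--window-size` interact, here is a minimal sketch of window minimizers: each window covers `w` consecutive k-mers, and the smallest k-mer in the window is kept. It assumes plain lexicographic ordering and ignores base qualities, so the `--max-read-error` and `--max-minimizer-error` cutoffs are not modelled. How fastq_rs actually orders or hashes k-mers is not stated in this README, and the function name `minimizers` is hypothetical.

```rust
/// Return (position, k-mer) minimizers of `seq`.
/// Each window spans `w` consecutive k-mers; the lexicographically smallest
/// k-mer in a window is its minimizer (leftmost on ties).
fn minimizers(seq: &[u8], k: usize, w: usize) -> Vec<(usize, &[u8])> {
    let mut out: Vec<(usize, &[u8])> = Vec::new();
    if k == 0 || w == 0 || seq.len() < k + w - 1 {
        return out; // too short to hold even one full window
    }
    let num_kmers = seq.len() - k + 1;
    for start in 0..=(num_kmers - w) {
        // Scan the k-mers at positions start..start + w for the smallest one.
        let mut best = start;
        for pos in (start + 1)..(start + w) {
            if &seq[pos..pos + k] < &seq[best..best + k] {
                best = pos;
            }
        }
        // Neighbouring windows often share a minimizer; record it once.
        if out.last().map(|&(p, _)| p) != Some(best) {
            out.push((best, &seq[best..best + k]));
        }
    }
    out
}

fn main() {
    for (pos, kmer) in minimizers(b"ACGTTGCAACGT", 3, 4) {
        println!("{pos}\t{}", String::from_utf8_lossy(kmer));
    }
}
```

With the documented defaults (`-k 15`, `-w 10`), a read needs at least 24 bases to produce a minimizer under this scheme.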

fastq_rs fq2-fa

🟢 Convert FASTQ to FASTA.

fastq_rs fq2-fa --fastq <reads.fastq.gz> <optional_args>

Optional arguments:

-o/--outfile [stdout] - Output file.
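
Conceptually, the conversion keeps the header and sequence lines and drops the separator and quality lines. The sketch below shows that idea on plain text, assuming strict four-line records; `fastq_to_fasta` is a hypothetical name, not the function used inside fastq_rs.

```rust
/// Convert four-line FASTQ records into two-line FASTA records.
fn fastq_to_fasta(fastq: &str) -> String {
    let lines: Vec<&str> = fastq.lines().collect();
    let mut fasta = String::new();
    // chunks_exact silently ignores a trailing, incomplete record.
    for record in lines.chunks_exact(4) {
        // Swap the leading '@' for '>' and keep only the sequence line.
        let id = record[0].strip_prefix('@').unwrap_or(record[0]);
        fasta.push('>');
        fasta.push_str(id);
        fasta.push('\n');
        fasta.push_str(record[1]);
        fasta.push('\n');
    }
    fasta
}

fn main() {
    print!("{}", fastq_to_fasta("@r1 sample=1\nACGT\n+\nIIII\n"));
}
```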

fastq_rs fq2-tab

🟢 Convert FASTQ to a .tsv file with information about each read. If compiled with the plot feature, this also generates a read scatter plot and a box plot.

fastq_rs fq2-tab --fastq <reads.fastq.gz> <optional_args>

Optional arguments:

-o/--outfile [stdout] - Output file.

fastq_rs filter

🟢 Filter reads.

fastq_rs filter --fastq <reads.fastq.gz> <optional_args>

Optional arguments:

--min-len [0] - Minimum allowed read length.

--max-len [usize::MAX] - Maximum allowed read length.

--min-err [0.0] - Minimum allowed mean read error.

--max-err [1.0] - Maximum allowed mean read error.

--min-softmasked [0] - Minimum allowed number of softmasked bases.

--max-softmasked [usize::MAX] - Maximum allowed number of softmasked bases.

--min-ambiguous [0] - Minimum allowed number of ambiguous bases.

--max-ambiguous [usize::MAX] - Maximum allowed number of ambiguous bases.

-o/--outfile [stdout] - Output file.
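
The thresholds above can be read as inclusive ranges applied to per-read metrics. The sketch below makes three assumptions that this README does not confirm: qualities are Phred+33 encoded, mean read error is the average of the per-base error probabilities 10^(-Q/10), softmasked bases are lowercase letters, and ambiguous bases are anything other than A, C, G or T. The `Limits` struct and all function names are hypothetical.

```rust
/// Mean per-base error probability, assuming Phred+33 encoding.
fn mean_error(qual: &[u8]) -> f64 {
    if qual.is_empty() {
        return 0.0;
    }
    let total: f64 = qual
        .iter()
        .map(|&q| 10f64.powf(-f64::from(q.saturating_sub(33)) / 10.0))
        .sum();
    total / qual.len() as f64
}

/// Softmasked bases are assumed to be lowercase letters.
fn count_softmasked(seq: &[u8]) -> usize {
    seq.iter().filter(|b| b.is_ascii_lowercase()).count()
}

/// Ambiguous bases are assumed to be anything other than A, C, G or T.
fn count_ambiguous(seq: &[u8]) -> usize {
    seq.iter()
        .filter(|b| !matches!(b.to_ascii_uppercase(), b'A' | b'C' | b'G' | b'T'))
        .count()
}

/// Thresholds mirroring the documented options and their defaults.
struct Limits {
    min_len: usize,
    max_len: usize,
    min_err: f64,
    max_err: f64,
    min_softmasked: usize,
    max_softmasked: usize,
    min_ambiguous: usize,
    max_ambiguous: usize,
}

impl Default for Limits {
    fn default() -> Self {
        Limits {
            min_len: 0,
            max_len: usize::MAX,
            min_err: 0.0,
            max_err: 1.0,
            min_softmasked: 0,
            max_softmasked: usize::MAX,
            min_ambiguous: 0,
            max_ambiguous: usize::MAX,
        }
    }
}

/// A read passes when every metric lies inside its inclusive range.
fn passes(seq: &[u8], qual: &[u8], lim: &Limits) -> bool {
    let err = mean_error(qual);
    (lim.min_len..=lim.max_len).contains(&seq.len())
        && err >= lim.min_err
        && err <= lim.max_err
        && (lim.min_softmasked..=lim.max_softmasked).contains(&count_softmasked(seq))
        && (lim.min_ambiguous..=lim.max_ambiguous).contains(&count_ambiguous(seq))
}

fn main() {
    let strict = Limits { max_err: 0.01, ..Limits::default() };
    println!("Q40 read kept: {}", passes(b"ACGT", b"IIII", &strict));
    println!("Q10 read kept: {}", passes(b"ACGT", b"++++", &strict));
}
```

With the default limits every read passes, which matches the documented defaults being the widest possible ranges.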

fastq_rs sample

🟢 Sample reads by fraction or number of reads.

fastq_rs sample --fastq <reads.fastq.gz> <optional_args>

Optional arguments:

-b/--by [1.0] - How to sample. Values <= 1.0 are interpreted as a fraction of reads. Values > 1.0 are interpreted as a (whole) number of reads.

-o/--outfile [stdout] - Output file.
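
The dual meaning of `--by` can be sketched as a small parsing step. The 1.0 boundary comes from the option description; rejecting negative values and non-whole values above 1.0 is an assumption, as are the names `SampleBy`, `parse_by` and `target_reads`. The random selection itself is not shown.

```rust
/// How a `--by` value is interpreted.
#[derive(Debug, PartialEq)]
enum SampleBy {
    Fraction(f64),
    Count(usize),
}

/// Values <= 1.0 select a fraction of reads; values > 1.0 select a read count.
/// Error handling here is assumed, not taken from fastq_rs.
fn parse_by(value: f64) -> Result<SampleBy, String> {
    if !value.is_finite() || value < 0.0 {
        return Err(format!("invalid --by value: {value}"));
    }
    if value <= 1.0 {
        Ok(SampleBy::Fraction(value))
    } else if value.fract() == 0.0 {
        Ok(SampleBy::Count(value as usize))
    } else {
        Err(format!("--by values above 1.0 must be whole numbers, got {value}"))
    }
}

/// Number of reads to keep out of `total` reads.
fn target_reads(by: &SampleBy, total: usize) -> usize {
    match by {
        SampleBy::Fraction(f) => (total as f64 * f).round() as usize,
        // Never ask for more reads than the input holds.
        SampleBy::Count(n) => (*n).min(total),
    }
}

fn main() {
    for value in [0.25, 1.0, 100.0, 2.5] {
        match parse_by(value) {
            Ok(by) => println!("{value} -> {:?}, keeps {} of 1000", by, target_reads(&by, 1000)),
            Err(e) => println!("{value} -> error: {e}"),
        }
    }
}
```

Note that under this rule a value of exactly 1.0 means "all reads", not "one read".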

fastq_rs trim

🟢 Trim reads through fuzzy search with ambiguous nucleotide support.

fastq_rs trim --fastq <reads.fastq.gz> <optional_args>

Optional arguments:


--min-len [0] - Minimum length a trimmed read must have to be written to the output.

--trim-start [0] - Force trim this number of bases at the start of all reads.

--trim-end [0] - Force trim this number of bases at the end of all reads.

--barcode-forward [none] - Barcode(s) to trim at the start of the read. Must be provided in 5' -> 3' direction.

--barcode-reverse [none] - Barcode(s) to trim at the end of the read. Must be provided in 5' -> 3' direction.

--max-mismatches [2] - Allow this many mismatches between the barcode and the read.

--barcode-margin [10] - Allow the barcode to be located at most this number of bases from the start/end of the read.

-o/--outfile [stdout] - Output file.
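
To show how `--max-mismatches` and `--barcode-margin` could work together for a forward barcode, here is a minimal sketch. It assumes substitution-only (Hamming) matching and treats IUPAC codes in the barcode as sets of allowed bases. Whether fastq_rs also tolerates insertions and deletions is not stated here, and `iupac_matches`, `mismatches` and `find_forward_barcode` are hypothetical names.

```rust
/// True if read base `base` is allowed by the IUPAC code `code` in the barcode.
/// An 'N' in the read only matches an 'N'-free code if listed, so it counts
/// as a mismatch against every code here.
fn iupac_matches(code: u8, base: u8) -> bool {
    let allowed: &str = match code.to_ascii_uppercase() {
        b'A' => "A", b'C' => "C", b'G' => "G", b'T' => "T",
        b'R' => "AG", b'Y' => "CT", b'S' => "CG", b'W' => "AT",
        b'K' => "GT", b'M' => "AC", b'B' => "CGT", b'D' => "AGT",
        b'H' => "ACT", b'V' => "ACG", b'N' => "ACGT",
        _ => "",
    };
    allowed.as_bytes().contains(&base.to_ascii_uppercase())
}

/// Count positions where the read window is not allowed by the barcode.
fn mismatches(barcode: &[u8], window: &[u8]) -> usize {
    barcode
        .iter()
        .zip(window)
        .filter(|&(&code, &base)| !iupac_matches(code, base))
        .count()
}

/// Look for the barcode starting at most `margin` bases from the read start.
/// Returns the index just past the best match (fewest mismatches, then leftmost).
fn find_forward_barcode(
    read: &[u8],
    barcode: &[u8],
    max_mismatches: usize,
    margin: usize,
) -> Option<usize> {
    if barcode.is_empty() || read.len() < barcode.len() {
        return None;
    }
    let last_offset = margin.min(read.len() - barcode.len());
    (0..=last_offset)
        .map(|off| (mismatches(barcode, &read[off..off + barcode.len()]), off))
        .filter(|&(mm, _)| mm <= max_mismatches)
        .min()
        .map(|(_, off)| off + barcode.len())
}

fn main() {
    let read = b"GGACGTTTTT";
    // 'R' in the barcode accepts either A or G in the read.
    match find_forward_barcode(read, b"ACRT", 2, 10) {
        Some(end) => println!("trimmed: {}", String::from_utf8_lossy(&read[end..])),
        None => println!("no barcode found, read left untouched"),
    }
}
```

A reverse barcode could be handled the same way by searching within the margin from the read end instead.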