Crates.io | fastq_rs |
lib.rs | fastq_rs |
version | 0.0.1 |
created_at | 2025-09-18 22:09:51.306937+00 |
updated_at | 2025-09-18 22:09:51.306937+00 |
description | Multi purpose fastq toolkit. |
homepage | |
repository | https://github.com/OscarAspelin95/fastq_rs |
max_upload_size | |
id | 1845508 |
size | 185,854 |
FASTQ toolkit, aiming to be an alternative to seqkit.
Clone the repository or download the source code. Enter the fastq_rs directory and run:
cargo build --release
The generated binary is available in target/release/fastq_rs.
To enable plotting, compile with the plot feature.
cargo build --release --features plot
Plots will be generated automatically for applicable subcommands.
Run with:
fastq_rs <subcommand> <args>
🔴 Not implemented yet (but planning to).
🟡 Implemented but not tested/fully featured.
🟢 Beta-mode available!
stats
🟢 Calculate basic stats.
fastq_rs stats --fastq <reads.fastq.gz> <optional_args>
Note: if no file is provided, fastq_rs will read from stdin (plain FASTQ).
Optional arguments:
-o/--outfile [stats.json] - Output file.
sanitize
🟢 Attempt to sanitize malformatted reads.
fastq_rs sanitize --fastq <reads.fastq.gz> <optional_args>
Optional arguments:
-o/--outfile [stdout] - Output file.
head
🟢 Output the first n reads.
fastq_rs head --fastq <reads.fastq.gz> <optional_args>
Optional arguments:
-n/--num-seqs [5] - Number of reads to output.
-o/--outfile [stdout] - Output file.
grep
🟢 Search and filter read ids by regular expressions.
fastq_rs grep --fastq <reads.fastq.gz> --pattern <regex_string> <optional_args>
Optional arguments:
-o/--outfile [stdout] - Output file.
concat
🟢 Sanitize and concatenate FASTQ files. This is probably slower than plain shell commands such as zcat *.fastq.gz | pigz -f > concat.fastq.gz, but has the advantage of skipping malformatted records.
fastq_rs concat --fastqs <reads.fastq.gz> <...> <optional_args>
Optional arguments:
-o/--outfile [stdout] - Output file.
sort
🟢 Sort reads based on provided metric.
fastq_rs sort --fastq <reads.fastq.gz> <optional_args>
Optional arguments:
-b/--by [length] - {length, gc_content, mean_error, minimizers}
-r/--reverse [false] - Sort in descending order.
-w/--window-size [10] - Minimizer window size (number of consecutive kmers).
-k/--kmer-size [15] - Minimizer kmer size.
--max-read-error [0.05] - Minimizer max allowed read error. Reads with higher error rates will generate zero valid minimizers.
--max-minimizer-error [0.05] - Minimizer error probability cutoff.
-o/--outfile [stdout] - Output file.
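To make the --window-size and --kmer-size options concrete, here is a minimal sketch of minimizer selection. It assumes plain lexicographic ordering of kmers and ignores base qualities, so it does not model --max-read-error or --max-minimizer-error. The function name minimizers is hypothetical, and fastq_rs itself may use hashing, canonical kmers, or a different tie-breaking rule.

```rust
/// Collect (position, kmer) minimizers: for every window of `w` consecutive
/// kmers of length `k`, keep the lexicographically smallest kmer
/// (the leftmost one on ties). Illustrative only, not the fastq_rs code.
fn minimizers(seq: &[u8], k: usize, w: usize) -> Vec<(usize, &[u8])> {
    let mut out: Vec<(usize, &[u8])> = Vec::new();
    // A window of w kmers spans k + w - 1 bases.
    if k == 0 || w == 0 || seq.len() < k + w - 1 {
        return out;
    }
    let kmers: Vec<&[u8]> = seq.windows(k).collect();
    for (start, window) in kmers.windows(w).enumerate() {
        // Smallest kmer in this window; min_by_key returns the first minimum.
        let (offset, kmer) = window
            .iter()
            .enumerate()
            .min_by_key(|&(_, kmer)| *kmer)
            .unwrap(); // safe: the window is non-empty because w > 0
        let pos = start + offset;
        // Neighbouring windows often share a minimizer; record it once.
        if out.last().map(|&(p, _)| p) != Some(pos) {
            out.push((pos, *kmer));
        }
    }
    out
}
```

With seq = ACGTAC, k = 2 and w = 3, the kmers are AC, CG, GT, TA, AC, and the three windows select AC at position 0, CG at position 1 and AC at position 4. Reads that share many minimizers are likely to be similar, which is presumably why minimizers are offered as a sort metric.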
fq2-fa
🟢 Convert FASTQ to FASTA.
fastq_rs fq2-fa --fastq <reads.fastq.gz> <optional_args>
Optional arguments:
-o/--outfile [stdout] - Output file.
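As an illustration of what the conversion involves, the sketch below turns plain-text, four-line FASTQ records into FASTA by keeping the header and sequence lines and dropping the separator and quality lines. It assumes uncompressed, non-wrapped input, and the function name fastq_to_fasta is hypothetical rather than part of the fastq_rs API.

```rust
/// Convert four-line FASTQ records into FASTA.
/// Records whose header does not start with '@' are skipped, and a
/// trailing incomplete record is ignored. Illustrative sketch only.
fn fastq_to_fasta(fastq: &str) -> String {
    let lines: Vec<&str> = fastq.lines().collect();
    let mut fasta = String::new();
    for record in lines.chunks_exact(4) {
        // record = [header, sequence, separator, quality]
        if let Some(id) = record[0].strip_prefix('@') {
            fasta.push('>');
            fasta.push_str(id);
            fasta.push('\n');
            fasta.push_str(record[1]);
            fasta.push('\n');
        }
    }
    fasta
}
```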
fq2-tab
🟢 Convert FASTQ to a .tsv file with information about each read. If compiled with the plot feature, it will also generate a read scatter plot and boxplot.
fastq_rs fq2-tab --fastq <reads.fastq.gz> <optional_args>
Optional arguments:
-o/--outfile [stdout] - Output file.
filter
🟢 Filter reads.
fastq_rs filter --fastq <reads.fastq.gz> <optional_args>
Optional arguments:
--min-len [0] - Minimum allowed read length.
--max-len [usize::MAX] - Maximum allowed read length.
--min-err [0.0] - Minimum allowed mean read error.
--max-err [1.0] - Maximum allowed mean read error.
--min-softmasked [0] - Minimum allowed num softmasked bases.
--max-softmasked [usize::MAX] - Maximum allowed num softmasked bases.
--min-ambiguous [0] - Minimum allowed num ambiguous bases.
--max-ambiguous [usize::MAX] - Maximum allowed num ambiguous bases.
-o/--outfile [stdout] - Output file.
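The filter options combine into a single keep/discard decision per read, which the following sketch illustrates. It rests on several assumptions that the README does not confirm: qualities are Phred+33 encoded, mean read error is the arithmetic mean of per-base error probabilities, softmasked bases are lowercase letters, and ambiguous bases are anything other than A, C, G or T. FilterOpts, mean_error and keep are hypothetical names.

```rust
/// Hypothetical mirror of the documented filter options and their defaults.
struct FilterOpts {
    min_len: usize,
    max_len: usize,
    min_err: f64,
    max_err: f64,
    min_softmasked: usize,
    max_softmasked: usize,
    min_ambiguous: usize,
    max_ambiguous: usize,
}

impl Default for FilterOpts {
    fn default() -> Self {
        FilterOpts {
            min_len: 0,
            max_len: usize::MAX,
            min_err: 0.0,
            max_err: 1.0,
            min_softmasked: 0,
            max_softmasked: usize::MAX,
            min_ambiguous: 0,
            max_ambiguous: usize::MAX,
        }
    }
}

/// Mean per-base error probability, assuming Phred+33 qualities.
/// Returns 0.0 for an empty quality string (a choice made for this sketch).
fn mean_error(qual: &[u8]) -> f64 {
    if qual.is_empty() {
        return 0.0;
    }
    let total: f64 = qual
        .iter()
        .map(|&q| 10f64.powf(-f64::from(q.saturating_sub(33)) / 10.0))
        .sum();
    total / qual.len() as f64
}

/// True if the read passes every bound in `o`.
fn keep(seq: &[u8], qual: &[u8], o: &FilterOpts) -> bool {
    // Assumption: softmasked means lowercase, ambiguous means non-ACGT.
    let softmasked = seq.iter().filter(|b| b.is_ascii_lowercase()).count();
    let ambiguous = seq
        .iter()
        .filter(|b| !matches!(b.to_ascii_uppercase(), b'A' | b'C' | b'G' | b'T'))
        .count();
    (o.min_len..=o.max_len).contains(&seq.len())
        && (o.min_err..=o.max_err).contains(&mean_error(qual))
        && (o.min_softmasked..=o.max_softmasked).contains(&softmasked)
        && (o.min_ambiguous..=o.max_ambiguous).contains(&ambiguous)
}
```

With the default options every read passes, which matches the documented defaults spanning the full range of each metric.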
sample
🟢 Sample reads by fraction or number of reads.
fastq_rs sample --fastq <reads.fastq.gz> <optional_args>
Optional arguments:
-b/--by [1.0] - How to sample. Inputs <= 1.0 signify sampling by fraction. Inputs > 1.0 signify sampling by (whole) number of reads.
-o/--outfile [stdout] - Output file.
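Because a single number selects between two sampling modes, a short sketch of how such a value could be interpreted may help. The names SampleBy, parse_by and target_reads are hypothetical. Rejecting negative or non-finite inputs, truncating counts to whole numbers, rounding fractional targets, and capping the count at the number of available reads are all assumptions of this sketch, not documented fastq_rs behavior.

```rust
/// How a `--by` value is interpreted in this sketch.
#[derive(Debug, PartialEq)]
enum SampleBy {
    /// Keep this fraction of the reads (0.0 to 1.0 inclusive).
    Fraction(f64),
    /// Keep this many reads.
    Count(usize),
}

/// Values <= 1.0 are fractions, values > 1.0 are whole read counts.
fn parse_by(by: f64) -> Option<SampleBy> {
    if !by.is_finite() || by < 0.0 {
        None // assumption: negative or non-finite input is an error
    } else if by <= 1.0 {
        Some(SampleBy::Fraction(by))
    } else {
        Some(SampleBy::Count(by as usize)) // truncates any fractional part
    }
}

/// Number of reads to emit given `total` input reads.
fn target_reads(by: &SampleBy, total: usize) -> usize {
    match *by {
        SampleBy::Fraction(f) => ((total as f64) * f).round() as usize,
        SampleBy::Count(n) => n.min(total),
    }
}
```

One consequence of the documented rule is that a value of exactly 1.0 means all reads rather than a single read.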
trim
🟢 Trim reads through fuzzy search with ambiguous nucleotide support.
fastq_rs trim --fastq <reads.fastq.gz> <optional_args>
Optional arguments:
--min-len [0] - Minimum read length for trimmed read to be outputted.
--trim-start [0] - Force trim this number of bases at the start of all reads.
--trim-end [0] - Force trim this number of bases at the end of all reads.
--barcode-forward [none] - Barcode(s) to trim at the start of the read. Must be provided in 5' -> 3' direction.
--barcode-reverse [none] - Barcode(s) to trim at the end of the read. Must be provided in 5' -> 3' direction.
--max-mismatches [2] - Allow this many mismatches between the barcode and the read.
--barcode-margin [10] - Allow the barcode to be located at most this number of bases from the start/end of the read.
-o/--outfile [stdout] - Output file.
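The interplay of --max-mismatches, --barcode-margin and ambiguous nucleotides can be sketched as follows for a forward barcode. This is an illustration under stated assumptions, not the fastq_rs implementation: it counts substitutions only (no insertions or deletions), treats the barcode as an IUPAC pattern matched against concrete read bases, and returns the first acceptable hit rather than the best one. The names iupac_matches and find_forward_barcode are hypothetical.

```rust
/// True if the IUPAC code `code` (from the barcode) permits the read base `base`.
fn iupac_matches(code: u8, base: u8) -> bool {
    let allowed: &str = match code.to_ascii_uppercase() {
        b'A' => "A",
        b'C' => "C",
        b'G' => "G",
        b'T' | b'U' => "T",
        b'R' => "AG",
        b'Y' => "CT",
        b'S' => "CG",
        b'W' => "AT",
        b'K' => "GT",
        b'M' => "AC",
        b'B' => "CGT",
        b'D' => "AGT",
        b'H' => "ACT",
        b'V' => "ACG",
        b'N' => "ACGT",
        _ => "",
    };
    allowed.as_bytes().contains(&base.to_ascii_uppercase())
}

/// Search for `barcode` starting at most `margin` bases from the read start,
/// allowing up to `max_mismatches` substitutions. Returns the index just
/// past the barcode, i.e. where the trimmed read would begin.
fn find_forward_barcode(
    read: &[u8],
    barcode: &[u8],
    max_mismatches: usize,
    margin: usize,
) -> Option<usize> {
    if barcode.is_empty() || read.len() < barcode.len() {
        return None;
    }
    let last_start = margin.min(read.len() - barcode.len());
    (0..=last_start).find_map(|start| {
        let mismatches = barcode
            .iter()
            .zip(&read[start..start + barcode.len()])
            .filter(|&(&code, &base)| !iupac_matches(code, base))
            .count();
        (mismatches <= max_mismatches).then(|| start + barcode.len())
    })
}
```

For the read GGACGTTTTT and the barcode ACGN with no mismatches allowed, the barcode is found at offset 2 and the function returns 6, so the trimmed read would be TTTT. A reverse barcode could be handled the same way by scanning from the end of the read.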