| Crates.io | nucleaze |
| lib.rs | nucleaze |
| version | 1.4.2 |
| created_at | 2025-12-13 06:52:56.290088+00 |
| updated_at | 2026-01-25 08:03:23.675651+00 |
| description | Read filtering using k-mers. |
| homepage | |
| repository | https://github.com/jackdougle/nucleaze |
| max_upload_size | |
| id | 1982669 |
| size | 180,270 |
Nucleaze is a high-performance Rust tool for filtering DNA/RNA reads based on a set of reference k-mers, inspired by BBDuk by Brian Bushnell. It provides fast, memory-efficient read processing for both paired and unpaired FASTA/FASTQ files, with paired reads supplied either as multiple files or in interleaved format.
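The core idea of k-mer based filtering can be sketched in a few lines. This is a minimal illustration, not Nucleaze's actual implementation: the function names are hypothetical, k-mers are stored as plain byte vectors rather than a compact encoding, reverse complements are ignored, and the hit threshold is assumed to mean "minimum number of shared k-mers" in the way BBDuk-style tools usually define it.

```rust
use std::collections::HashSet;

/// Collect every k-length window of the reference sequences into a set.
/// (Hypothetical helper; a real tool would likely use a packed encoding.)
fn build_kmer_set(references: &[&str], k: usize) -> HashSet<Vec<u8>> {
    let mut kmers = HashSet::new();
    for reference in references {
        let bytes = reference.as_bytes();
        // `windows(0)` panics, and sequences shorter than k hold no k-mers.
        if k == 0 || bytes.len() < k {
            continue;
        }
        for window in bytes.windows(k) {
            kmers.insert(window.to_ascii_uppercase());
        }
    }
    kmers
}

/// Count how many k-mers of `read` occur in the reference set.
fn count_hits(read: &str, kmers: &HashSet<Vec<u8>>, k: usize) -> usize {
    let bytes = read.as_bytes();
    if k == 0 || bytes.len() < k {
        return 0;
    }
    bytes
        .windows(k)
        .filter(|w| kmers.contains(&w.to_ascii_uppercase()))
        .count()
}

/// Assumption: a read is "matched" once it shares at least `min_hits`
/// k-mers with the references.
fn is_match(read: &str, kmers: &HashSet<Vec<u8>>, k: usize, min_hits: usize) -> bool {
    count_hits(read, kmers, k) >= min_hits
}
```

Calling `build_kmer_set` once and then `is_match` per read mirrors the index-then-filter shape described above; matched and unmatched reads would then be routed to separate outputs.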
K-mer based read filtering:
- Hit threshold set with --minhits <int>

Piping:
- Reads from stdin by default (or pass --in - explicitly)
- Use --outm/--outu/--outm2/--outu2 stdout.fa/stdout.fq to pipe results to stdout

Paired reads support:
- Interleaved input with --interinput

Multithreading with Rayon:
- Thread count set with the --threads argument

Memory Limit:
- Set with --maxmem <String> (e.g., 5G for 5 gigabytes, 500M for 500 megabytes)

Automatic Reference Indexing:
- When --binref <file> is provided, the index is built from the references provided with --ref <file>
- The index is used when --binref <file> is included

If using UNIX, install Rust by running this command and following the ensuing instructions:
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
If using Windows, download the correct installer from Rustup.
brew install nucleaze
Nucleaze requires at least one of --ref or --binref to be provided. Input defaults to stdin if --in is not specified.
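The stdin fallback described here is a common pattern in Rust command-line tools. The sketch below shows one way it could be done using only the standard library; `open_input` and `count_fastq_records` are hypothetical names, treating `-` as stdin follows the `--in -` convention mentioned earlier, and the record counter assumes the common four-lines-per-record FASTQ layout.

```rust
use std::fs::File;
use std::io::{self, BufRead, BufReader};

/// Open the read source: a file path, or stdin when the path is absent or "-".
/// (Hypothetical helper, not taken from the Nucleaze source.)
fn open_input(path: Option<&str>) -> io::Result<Box<dyn BufRead>> {
    match path {
        None | Some("-") => Ok(Box::new(BufReader::new(io::stdin()))),
        Some(p) => Ok(Box::new(BufReader::new(File::open(p)?))),
    }
}

/// Count FASTQ records, assuming four lines per record
/// (header, sequence, separator, qualities).
fn count_fastq_records(reader: impl BufRead) -> io::Result<usize> {
    let mut lines = 0usize;
    for line in reader.lines() {
        line?; // propagate I/O errors instead of silently skipping lines
        lines += 1;
    }
    Ok(lines / 4)
}
```

Returning a boxed `BufRead` lets the rest of the program stay agnostic about whether reads come from a file or a pipe.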
See more parameter documentation at ./src/main.rs
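As an illustration of how a size string such as the `5G` or `500M` values accepted by --maxmem might be turned into a byte count, here is a small sketch. The function name is hypothetical, and the choice of 1024-based multipliers is an assumption; the actual parsing rules are in the tool's source.

```rust
/// Parse a size such as "5G" or "500M" into bytes.
/// Assumption: K, M and G are 1024-based multiples; the real tool may differ.
fn parse_mem_size(text: &str) -> Option<u64> {
    let text = text.trim();
    // The suffix letters are single-byte ASCII, so slicing off one byte is safe.
    let (digits, multiplier) = match text.chars().last()? {
        'K' | 'k' => (&text[..text.len() - 1], 1u64 << 10),
        'M' | 'm' => (&text[..text.len() - 1], 1u64 << 20),
        'G' | 'g' => (&text[..text.len() - 1], 1u64 << 30),
        _ => (text, 1),
    };
    // Reject non-numeric input and values that overflow u64.
    digits.parse::<u64>().ok()?.checked_mul(multiplier)
}
```

Returning `Option<u64>` keeps malformed values such as a bare `G` from being silently treated as zero.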
./nucleaze --in reads.fq --ref refs.fa --outm matched.fq --outu unmatched.fq --k 21
This command:
- Builds the 21-mer index from the refs.fa sequences
- Splits reads.fq into chunks of size 10,000
- Writes matched reads to matched.fq and unmatched reads to unmatched.fq

This project is licensed under the MIT License; see LICENSE for details. There is plenty of room for improvement, so new additions or suggestions are welcome!