| Crates.io | seqdupes |
| lib.rs | seqdupes |
| version | 0.2.0 |
| created_at | 2022-11-10 19:01:50.232808+00 |
| updated_at | 2024-06-24 18:44:22.01403+00 |
| description | Compress sequence duplicates |
| homepage | https://stevenweaver.org |
| repository | https://github.com/stevenweaver/seqdupes |
| max_upload_size | |
| id | 712298 |
| size | 67,863 |
Removes duplicates from FASTA files. Supports filtering based on sequence content or header information.
Download the source code and, from the repository root, run:
cargo install --path .
Run seqdupes to process FASTA files. You can specify whether to filter by sequence or by header.
seqdupes -f path/to/sequence.fasta -j path/to/output.json > no_dupes.fas
If you prefer to filter duplicates based on headers rather than sequences, use the --by-header flag.
seqdupes -f path/to/sequence.fasta -j path/to/output.json --by-header > no_dupes.fas
| Parameter | Default | Description |
|---|---|---|
| -f, --fasta | - | Path to the input FASTA file. |
| -j, --json | - | Path of the JSON file that lists the duplicates. |
| -b, --by-header | - | Filter duplicates by header instead of by sequence (optional flag). |
The tool outputs a FASTA file with duplicates removed to stdout and a JSON file containing details of the duplicates to the specified path.
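The difference between the two filtering modes can be pictured with a short sketch. This is not the seqdupes source: `Record`, `parse_fasta`, and `dedupe` are hypothetical names, keeping the first occurrence of each key is an assumption, and the returned grouping is only an illustration that makes no claim about the layout of the JSON file the tool writes.

```rust
use std::collections::HashMap;

/// One FASTA record: the header line (without '>') and its sequence.
/// Hypothetical type, used only for this illustration.
#[derive(Debug, Clone, PartialEq)]
struct Record {
    header: String,
    sequence: String,
}

/// Parse FASTA text into records, joining wrapped sequence lines.
fn parse_fasta(text: &str) -> Vec<Record> {
    let mut records: Vec<Record> = Vec::new();
    for line in text.lines() {
        let line = line.trim();
        if line.is_empty() {
            continue;
        }
        if let Some(header) = line.strip_prefix('>') {
            records.push(Record {
                header: header.to_string(),
                sequence: String::new(),
            });
        } else if let Some(last) = records.last_mut() {
            last.sequence.push_str(line);
        }
    }
    records
}

/// Keep the first record seen for each key (assumption: first occurrence wins).
/// The key is the header when `by_header` is true, otherwise the sequence.
/// Returns the kept records plus a map from each kept header to the headers
/// of the later records that were dropped as its duplicates.
fn dedupe(records: &[Record], by_header: bool) -> (Vec<Record>, HashMap<String, Vec<String>>) {
    let mut first_seen: HashMap<&str, usize> = HashMap::new(); // key -> index into `kept`
    let mut kept: Vec<Record> = Vec::new();
    let mut duplicates: HashMap<String, Vec<String>> = HashMap::new();

    for rec in records {
        let key: &str = if by_header {
            rec.header.as_str()
        } else {
            rec.sequence.as_str()
        };
        match first_seen.get(key).copied() {
            Some(idx) => {
                // Already seen: record this header under the kept record's header.
                duplicates
                    .entry(kept[idx].header.clone())
                    .or_default()
                    .push(rec.header.clone());
            }
            None => {
                first_seen.insert(key, kept.len());
                kept.push(rec.clone());
            }
        }
    }
    (kept, duplicates)
}
```

In this sketch, `dedupe(&records, false)` corresponds to the default sequence-based mode and `dedupe(&records, true)` to `--by-header`. The two modes can keep different records from the same input: two records with identical sequences but different headers collapse only in sequence mode, while two records sharing a header but carrying different sequences collapse only in header mode.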