seqdupes

Version: 0.2.0
Created: 2022-11-10 19:01:50.232808
Updated: 2024-06-24 18:44:22.01403
Description: Compress sequence duplicates
Homepage: https://stevenweaver.org
Repository: https://github.com/stevenweaver/seqdupes
ID: 712298
Size: 67,863
Author: Steven Weaver (stevenweaver)


Removes duplicates from FASTA files. Supports filtering based on sequence content or header information.

Installation

Source

Download the source code and, from the repository root, run:

cargo install --path .

Usage

Run seqdupes to process FASTA files. You can specify whether to filter by sequence or by header.

Filtering by Sequence (default)

seqdupes -f path/to/sequence.fasta -j path/to/output.json > no_dupes.fas

Filtering by Header

If you prefer to filter duplicates based on headers rather than sequences, use the --by-header flag.

seqdupes -f path/to/sequence.fasta -j path/to/output.json --by-header > no_dupes.fas

Arguments

Parameter        Default  Description
-f, --fasta      -        The path to the FASTA file to use.
-j, --json       -        The output path for the JSON file listing duplicates.
-b, --by-header  -        Filter duplicates by header instead of by sequence (optional).

The tool outputs a FASTA file with duplicates removed to stdout and a JSON file containing details of the duplicates to the specified path.
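To make the two filtering modes concrete, here is a minimal sketch of the idea in Rust. It is not the crate's actual implementation: `Record`, `parse_fasta`, and `dedupe` are hypothetical names, and keeping the first occurrence of each key is an assumption. The README does not specify the layout of the JSON file, so the sketch only returns the duplicate groups and leaves serialization out.

```rust
use std::collections::HashMap;

/// One FASTA record: the header line (without '>') and the joined sequence.
/// Hypothetical type for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub struct Record {
    pub header: String,
    pub sequence: String,
}

/// Parse FASTA text into records, joining multi-line sequences.
pub fn parse_fasta(text: &str) -> Vec<Record> {
    let mut records: Vec<Record> = Vec::new();
    for line in text.lines() {
        let line = line.trim();
        if let Some(header) = line.strip_prefix('>') {
            records.push(Record {
                header: header.to_string(),
                sequence: String::new(),
            });
        } else if let Some(last) = records.last_mut() {
            // Sequence lines before the first header are ignored.
            last.sequence.push_str(line);
        }
    }
    records
}

/// Keep the first record seen for each key (sequence by default, header if
/// `by_header` is true). Returns the kept records plus, for every kept record
/// that had duplicates, its header and the headers of the dropped records.
pub fn dedupe(
    records: &[Record],
    by_header: bool,
) -> (Vec<Record>, Vec<(String, Vec<String>)>) {
    let mut first_seen: HashMap<&str, usize> = HashMap::new();
    let mut kept: Vec<Record> = Vec::new();
    // Index-aligned with `kept` until the final `retain`.
    let mut groups: Vec<(String, Vec<String>)> = Vec::new();

    for rec in records {
        let key: &str = if by_header {
            rec.header.as_str()
        } else {
            rec.sequence.as_str()
        };
        let existing = first_seen.get(key).copied();
        match existing {
            Some(idx) => groups[idx].1.push(rec.header.clone()),
            None => {
                first_seen.insert(key, kept.len());
                kept.push(rec.clone());
                groups.push((rec.header.clone(), Vec::new()));
            }
        }
    }

    // Report only records that actually had duplicates.
    groups.retain(|(_, dupes)| !dupes.is_empty());
    (kept, groups)
}
```

Calling `dedupe(&records, false)` corresponds to the default sequence mode, and `dedupe(&records, true)` to `--by-header`. In this sketch the kept records would be written to stdout as FASTA and the groups to the JSON path.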

