# Background

[FORGe](https://github.com/langmead-lab/FORGe) [[Pritt2018](https://doi.org/10.1101/311720)] is a model and a software tool for prioritising and filtering the variants to be included in a pangenome reference. It scores each variant's "expected positive and negative impacts on alignment accuracy and computational overhead" based on population frequency, graph repetitiveness, and/or variant proximity. Variants are then ranked by these scores, and a fraction of them is used to augment the reference genome.

The FORGe implementation is, by design, compatible with HISAT2 or Bowtie workflows and cannot be integrated out of the box into other graph construction workflows, such as PGGB or vg. It also requires the input file describing the variants to be in 1ksnp format, which is neither as widespread nor as straightforward as VCF, so an extra step is needed to convert VCF to 1ksnp. The final ranking file generated by `rank.py` is not in a standard format either, such as a sorted or filtered VCF file. This is where `forgers` comes into play: it provides the necessary logic to incorporate the FORGe model into broader workflows.

# Introduction

This tool, named `forgers` (short for `forge-rs`), aims to apply the FORGe model to input VCF files and to support VCF manipulation operations based on the FORGe ranking. One of the design decisions for `forgers` is to work seamlessly with tools such as `bcftools`, enabling the user to pipe the VCF output of these tools to `forgers`, or vice versa, to create a more complex variant filtration pipeline.

# Usage

Currently, `forgers` supports two subcommands: `filter` and `resolve`.

## Filter

Filter and/or annotate VCF records based on FORGe ranking.

    USAGE:
        forgers filter [FLAGS] [OPTIONS] [input]

    FLAGS:
        -a, --annotate    Annotate the filtered records with FORGe rank
        -g, --gzip        Gzip output, detected by file extension by default
        -h, --help        Prints help information
        -V, --version     Prints version information
        -v, --verbose     Enable verbose mode

    OPTIONS:
        -f, --forge-rank    FORGe rank file [default: ordered.txt]
        -k, --info-key      Annotate key for INFO field [default: FORGE]
        -o, --output        Output file, stdout if not specified [default: -]
        -t, --top           Top fraction of records to keep, keeps all by default [default: 1.0]

    ARGS:
        <input>    Input VCF file, stdin if not specified [default: -]

## Resolve

Resolve overlapping variants based on FORGe ranking; i.e. when the variants in a cluster conflict, remove the cluster and replace it with the one variant with the higher ranking. It considers phasing information, when available, to determine whether two overlapping variants co-occur in any sample.

    USAGE:
        forgers resolve [FLAGS] [OPTIONS] [input]

    FLAGS:
        -g, --gzip       Gzip output, detected by file extension by default
        -h, --help       Prints help information
        -V, --version    Prints version information
        -v, --verbose    Enable verbose mode

    OPTIONS:
        -f, --forge-rank    FORGe rank file [default: ordered.txt]
        -o, --output        Output file, stdout if not specified [default: -]

    ARGS:
        <input>    Input VCF file, stdin if not specified [default: -]
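
To make the idea behind `resolve` more concrete, here is a minimal Python sketch of rank-based overlap resolution. It is not the `forgers` implementation. The `Variant` type and the `resolve_overlaps` function are hypothetical names introduced for illustration, and the sketch assumes that a lower rank number means a better-ranked variant. As a further simplification, it ignores phasing and treats every group of transitively overlapping variants as one conflicting cluster, whereas `forgers` also checks whether overlapping variants co-occur in any sample.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Variant:
    """Minimal stand-in for a VCF record (hypothetical, not a forgers type)."""
    chrom: str
    pos: int   # 1-based start position, as in the VCF POS column
    ref: str   # reference allele; its length defines the reference span
    alt: str
    rank: int  # assumption: index in the FORGe ranking, 0 = best

    @property
    def end(self) -> int:
        # Last reference base covered by this variant (inclusive).
        return self.pos + len(self.ref) - 1


def resolve_overlaps(variants: List[Variant]) -> List[Variant]:
    """Keep only the best-ranked variant of each cluster of overlapping variants."""
    kept: List[Variant] = []
    cluster: List[Variant] = []
    cluster_end = 0
    for v in sorted(variants, key=lambda v: (v.chrom, v.pos)):
        same_chrom = bool(cluster) and v.chrom == cluster[0].chrom
        if same_chrom and v.pos <= cluster_end:
            # v starts inside the span covered so far: extend the cluster.
            cluster.append(v)
            cluster_end = max(cluster_end, v.end)
        else:
            # v starts a new cluster: flush the previous one first.
            if cluster:
                kept.append(min(cluster, key=lambda c: c.rank))
            cluster = [v]
            cluster_end = v.end
    if cluster:
        kept.append(min(cluster, key=lambda c: c.rank))
    return kept


records = [
    Variant("chr1", 100, "ACGT", "A", rank=2),  # deletion spanning 100-103
    Variant("chr1", 102, "G", "T", rank=0),     # SNV inside the deletion
    Variant("chr1", 103, "T", "C", rank=1),     # SNV inside the deletion
    Variant("chr1", 200, "C", "G", rank=3),     # isolated SNV
]
resolved = resolve_overlaps(records)
print([v.pos for v in resolved])  # [102, 200]
```

Note that the two SNVs at positions 102 and 103 do not overlap each other; they end up in the same cluster only because both overlap the deletion. Keeping a single variant per cluster is the simplest policy and is the one sketched here. A phasing-aware resolver could retain more variants.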