| Crates.io | check_build |
| lib.rs | check_build |
| version | 0.1.1 |
| created_at | 2025-02-27 23:06:54.736298+00 |
| updated_at | 2025-08-18 20:23:57.218899+00 |
| description | A tool to verify a VCF file against hg19 and hg38 references using a streaming, low-memory approach. |
| repository | https://github.com/SauersML/check_build |
| id | 1572261 |
| size | 66,010 |
check_build is a command-line tool written in Rust that verifies a plain-text (non-gzipped) VCF file against two reference genomes (hg19 and hg38) using a streaming, low-memory approach. It compares the REF alleles in your VCF with the corresponding bases in the reference FASTA files while only loading one contig at a time.
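The one-contig-at-a-time comparison described above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual source: `ContigReader`, `contig_name`, and `ref_matches` are hypothetical names, and the case-insensitive comparison and the treatment of out-of-bounds positions as mismatches are assumptions.

```rust
use std::io::{self, BufRead};

/// Hypothetical streaming FASTA reader that yields one contig per call.
struct ContigReader<R: BufRead> {
    lines: io::Lines<R>,
    /// Header of the next contig, if it was already consumed.
    next_name: Option<String>,
}

/// Take the first whitespace-delimited token of a FASTA header as the name.
fn contig_name(header: &str) -> String {
    header.split_whitespace().next().unwrap_or("").to_string()
}

impl<R: BufRead> ContigReader<R> {
    fn new(reader: R) -> Self {
        ContigReader { lines: reader.lines(), next_name: None }
    }

    /// Returns the next (name, sequence) pair, or None at end of input.
    fn next_contig(&mut self) -> io::Result<Option<(String, Vec<u8>)>> {
        // Use the header saved by the previous call, or scan for the next one.
        let name = loop {
            if let Some(saved) = self.next_name.take() {
                break saved;
            }
            match self.lines.next() {
                None => return Ok(None),
                Some(line) => {
                    let line = line?;
                    if let Some(header) = line.strip_prefix('>') {
                        break contig_name(header);
                    }
                }
            }
        };
        // Accumulate sequence lines until the next header or end of input.
        let mut seq = Vec::new();
        while let Some(line) = self.lines.next() {
            let line = line?;
            if let Some(header) = line.strip_prefix('>') {
                self.next_name = Some(contig_name(header));
                break;
            }
            seq.extend_from_slice(line.trim_end().as_bytes());
        }
        Ok(Some((name, seq)))
    }
}

/// Does REF match the contig at the 1-based VCF position?
/// Assumptions: out-of-bounds counts as a mismatch, and case is ignored
/// because reference FASTA files may contain lowercase (soft-masked) bases.
fn ref_matches(contig: &[u8], pos: usize, reference: &str) -> bool {
    if pos == 0 {
        return false;
    }
    let start = pos - 1;
    let end = match start.checked_add(reference.len()) {
        Some(end) => end,
        None => return false,
    };
    match contig.get(start..end) {
        Some(bases) => bases.eq_ignore_ascii_case(reference.as_bytes()),
        None => false,
    }
}
```

If the caller drops each returned sequence before requesting the next one, only a single contig is held in memory at any time, which is the property the low-memory approach relies on.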
The tool is built with clap for argument parsing, indicatif for progress indication, reqwest for HTTP downloads, and tempfile for managing temporary files.
Install via Cargo:
cargo install check_build
Alternatively, clone the repository and build from source:
git clone https://github.com/SauersML/check_build.git
cd check_build
cargo build --release
check_build expects a plain-text VCF file (non-gzipped). To run the tool, simply execute:
check_build genome.vcf
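Because the input is plain, tab-separated text, the fields needed for verification (CHROM, POS, and REF, the first, second, and fourth VCF columns) can be read with simple string splitting. The sketch below shows one way to do that; `parse_vcf_line` is a hypothetical helper, not a function taken from the tool.

```rust
/// Hypothetical helper: extract (CHROM, POS, REF) from one VCF line.
/// Returns None for header lines (starting with '#') and for records
/// that are too short or have a non-numeric POS.
fn parse_vcf_line(line: &str) -> Option<(&str, usize, &str)> {
    if line.starts_with('#') {
        return None;
    }
    let mut fields = line.split('\t');
    let chrom = fields.next()?;
    let pos: usize = fields.next()?.parse().ok()?;
    let _id = fields.next()?; // the ID column is not needed for the check
    let reference = fields.next()?;
    Some((chrom, pos, reference))
}
```

Returning borrowed string slices avoids allocating per record, which suits a line-by-line streaming pass over a large file.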
During execution, the tool will:
Look for hg19.fa and hg38.fa files in the working directory. If they are not found, it downloads them automatically.
Example output:
Verification Summary:
- hg19 => 4357415 lines, 3298348 mismatches
- hg38 => 4728611 lines, 0 mismatches
This indicates that the VCF records match hg38 perfectly. If a large number of records mismatch (or are out of bounds) on hg19 but match hg38, this suggests the VCF is aligned to hg38.
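The same reasoning can be expressed as a small calculation over the summary counts. The sketch below is illustrative only: `Tally` and `likely_build` are hypothetical names, and the 1% cutoff used in the usage note is an arbitrary assumption rather than a threshold taken from the tool.

```rust
/// One line of the verification summary (hypothetical representation).
struct Tally {
    name: &'static str,
    lines: u64,
    mismatches: u64,
}

impl Tally {
    /// Fraction of checked lines whose REF allele did not match.
    fn mismatch_rate(&self) -> f64 {
        if self.lines == 0 {
            1.0 // nothing was checked, so treat the reference as unsupported
        } else {
            self.mismatches as f64 / self.lines as f64
        }
    }
}

/// Pick the reference with the lowest mismatch rate, provided that rate
/// does not exceed `max_rate` (an assumed, caller-chosen threshold).
fn likely_build(tallies: &[Tally], max_rate: f64) -> Option<&'static str> {
    tallies
        .iter()
        .filter(|t| t.lines > 0)
        .map(|t| (t.name, t.mismatch_rate()))
        .filter(|(_, rate)| *rate <= max_rate)
        .min_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(name, _)| name)
}
```

With the counts from the example summary, hg19 has a mismatch rate of about 76% and hg38 has 0%, so calling `likely_build` with a `max_rate` of 0.01 selects hg38. If no reference falls under the threshold, the function returns `None` rather than guessing.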