Crates.io | maf2bed |
lib.rs | maf2bed |
version | 0.3.0 |
source | src |
created_at | 2023-11-02 02:50:27.810806 |
updated_at | 2023-12-21 17:07:25.180804 |
description | Converts multiple alignment format (MAF) files to a BED format for tabixing. Used with jbrowse-plugin-mafviewer |
id | 1022164 |
size | 4,482 |
Converts multiple alignment format (MAF) files to a BED-style format that can be bgzipped and indexed with tabix. Used with the JBrowse mafviewer plugin: https://github.com/cmdcolin/jbrowse-plugin-mafviewer
cargo install maf2bed
maf2bed --help
Make sure to pass the assembly name used as the reference for the BED file as the first argument to maf2bed.
# non-parallel compression/decompression
zcat file.maf.gz | maf2bed assembly_name | bgzip > file.bed.gz
# parallel compression/decompression
pigz -dc file.maf.gz | maf2bed assembly_name | bgzip -@8 > file.bed.gz
tabix file.bed.gz
In some cases the BED output may also need to be sorted with sort -k1,1 -k2,2n before running bgzip/tabix, because tabix expects position-sorted input.
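The optional sort step can be sketched as below. This is only an illustration on made-up data: `sort_bed` is a hypothetical helper name, the three sample lines are invented, and `LC_ALL=C` is an assumption added to keep the ordering independent of locale. Real maf2bed output has more columns, but the sort keys are the same first two.

```shell
# Hypothetical helper: order BED-like lines by chromosome name (column 1),
# then numerically by start coordinate (column 2).
sort_bed() {
  LC_ALL=C sort -k1,1 -k2,2n
}

# Invented three-line sample, deliberately out of order.
printf 'chr2\t50\t60\nchr1\t200\t210\nchr1\t5\t15\n' | sort_bed
# prints (tab-separated):
# chr1  5    15
# chr1  200  210
# chr2  50   60

# In the real pipeline the helper would sit between maf2bed and bgzip:
# zcat file.maf.gz | maf2bed assembly_name | sort_bed | bgzip > file.bed.gz
# tabix file.bed.gz
```

The `n` on the second key matters: a plain text comparison would place 200 before 5. Note that sort has to read all of its input before writing anything, so this step costs more memory or temporary disk space than the purely streaming pipeline.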
Converted from Perl to Rust, mostly as a coding exercise, gaining a modest speedup along the way: https://twitter.com/cmdcolin/status/1719608993310486883
I wanted to try the bigMaf (bigBed-based) format ecosystem with large MAF files, but bedToBigBed doesn't support streaming or reading compressed files(?), so it requires reading big files from disk and into memory. In contrast, the MAF tabix approach implemented here works on compressed, streamed data, which allows much lower memory usage and disk space.