exon-test

Crates.io: exon-test
lib.rs: exon-test
version: 0.3.4-beta.9
source: src
created_at: 2023-10-27 16:14:23.052674
updated_at: 2023-10-27 16:21:28.710462
description: Exon test crate
homepage: https://www.wheretrue.dev/docs/exon/
repository: https://github.com/wheretrue/exon
max_upload_size
id: 1016178
size: 6,240
Trent Hauck (tshauck)


Exon

Exon is an analysis toolkit for life-science applications. It features:

  • Support for many file formats from bioinformatics, proteomics, and others
  • Local filesystem and object storage support
  • Arrow FFI primitives for multi-language support
  • SQL-based access to bioinformatics data, with general DML and some DDL support

Exon was recently extracted from a larger library, so please be patient while we finish cleaning up after the split. If you have a comment or question in the meantime, please file an issue.

Installation

Exon is available via crates.io. To add it to your project as a dependency, run:

cargo add exon

Usage

Exon is designed to be used as a library. For example, to read a FASTA file:

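use datafusion::error::Result;
use datafusion::prelude::*;
use exon::context::ExonSessionExt;

// Assumes the tokio runtime is available as a dependency; any async runtime works.
#[tokio::main]
async fn main() -> Result<()> {
    // Create a DataFusion session context with the Exon extensions enabled.
    let ctx = SessionContext::new_exon();

    // Read the FASTA file into a DataFrame and print its rows.
    let df = ctx.read_fasta("test-data/datasources/fasta/test.fasta", None).await?;
    df.show().await?;
    Ok(())
}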

Please see the Rust docs for more information.

File Formats

Format      Compression(s)   Inferred Extension(s)
BAM         -                .bam
BCF         -                .bcf
BED         gz, zstd         .bed
FASTA       gz, zstd         .fasta, .fa, .fna
FASTQ       gz, zstd         .fastq, .fq
GENBANK     gz, zstd         .gbk, .genbank, .gb
GFF         gz, zstd         .gff
GTF         gz, zstd         .gtf
HMMDOMTAB   gz, zstd         .hmmdomtab
MZML        gz, zstd         .mzml [1]
SAM         -                .sam
VCF         gz [2]           .vcf
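
To make the table above concrete, here is a minimal, standard-library-only sketch of how a file name could be mapped to a format and compression. It is not Exon's implementation: the `Format` and `Compression` enums and the `infer_format` and `supports` functions are hypothetical names, and the `.gz` and `.zst` file suffixes are assumptions about how the "gz" and "zstd" columns appear in file names.

```rust
/// File formats from the table above (hypothetical enum, not Exon's own type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Format {
    Bam, Bcf, Bed, Fasta, Fastq, Genbank, Gff, Gtf, Hmmdomtab, Mzml, Sam, Vcf,
}

/// Compression inferred from the trailing suffix (assumed to be `.gz` or `.zst`).
/// For VCF, `Gzip` stands for bgzip, per the table's footnote.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Compression {
    None,
    Gzip,
    Zstd,
}

/// Returns true if the table lists this compression for the format.
/// Uncompressed input is assumed to be accepted for every format.
fn supports(format: Format, compression: Compression) -> bool {
    match (format, compression) {
        (_, Compression::None) => true,
        // BAM, BCF, and SAM list no compression in the table.
        (Format::Bam | Format::Bcf | Format::Sam, _) => false,
        // VCF lists only gz (bgzip).
        (Format::Vcf, Compression::Zstd) => false,
        _ => true,
    }
}

/// Infers (format, compression) from a file name, or None if the
/// combination does not appear in the table.
pub fn infer_format(path: &str) -> Option<(Format, Compression)> {
    // Lowercase first so that mixed-case extensions such as `.mzML` match.
    let lower = path.to_ascii_lowercase();

    // Peel off a compression suffix, if any.
    let (stem, compression) = if let Some(s) = lower.strip_suffix(".gz") {
        (s, Compression::Gzip)
    } else if let Some(s) = lower.strip_suffix(".zst") {
        (s, Compression::Zstd)
    } else {
        (lower.as_str(), Compression::None)
    };

    // Match the remaining extension against the table.
    let ext = stem.rsplit_once('.')?.1;
    let format = match ext {
        "bam" => Format::Bam,
        "bcf" => Format::Bcf,
        "bed" => Format::Bed,
        "fasta" | "fa" | "fna" => Format::Fasta,
        "fastq" | "fq" => Format::Fastq,
        "gbk" | "genbank" | "gb" => Format::Genbank,
        "gff" => Format::Gff,
        "gtf" => Format::Gtf,
        "hmmdomtab" => Format::Hmmdomtab,
        "mzml" => Format::Mzml,
        "sam" => Format::Sam,
        "vcf" => Format::Vcf,
        _ => return None,
    };

    if supports(format, compression) {
        Some((format, compression))
    } else {
        None
    }
}
```

Under these assumptions, `infer_format("sample.fa.gz")` yields a FASTA format with gzip compression, while `infer_format("reads.bam.gz")` yields `None` because the table lists no compression for BAM.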

Related Projects

Settings

Exon uses the following settings:

Setting                  Default   Description
exon.vcf_parse_info      true      Parse VCF INFO fields. If false, INFO fields are returned as a single string.
exon.vcf_parse_formats   true      Parse VCF FORMAT fields. If false, FORMAT fields are returned as a single string.

You can update the settings by running:

SET <setting> = <value>;

For example, to disable parsing of VCF INFO fields:

SET exon.vcf_parse_info = false;
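
When settings are toggled from Rust code rather than typed by hand, it can help to build the `SET` statement from a typed value so the setting name cannot be misspelled. The sketch below uses only the standard library; `ExonSetting` and `set_statement` are hypothetical names introduced for illustration and are not part of Exon's API. The setting keys and defaults come from the table above.

```rust
/// The two settings documented above (hypothetical enum, not Exon's own type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ExonSetting {
    VcfParseInfo,
    VcfParseFormats,
}

impl ExonSetting {
    /// The setting name as it appears in the settings table.
    pub fn key(self) -> &'static str {
        match self {
            ExonSetting::VcfParseInfo => "exon.vcf_parse_info",
            ExonSetting::VcfParseFormats => "exon.vcf_parse_formats",
        }
    }

    /// The documented default; both settings default to true.
    pub fn default_value(self) -> bool {
        true
    }
}

/// Builds a `SET <setting> = <value>;` statement for the given setting.
pub fn set_statement(setting: ExonSetting, value: bool) -> String {
    format!("SET {} = {};", setting.key(), value)
}
```

The resulting string would then be run through whichever SQL entry point you already use, for example DataFusion's `SessionContext::sql`.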

Benchmarks

Please see the benchmarks README for more information.

Footnotes

  1. The mixed-case extension .mzML also works.

  2. Uses bgzip, not gzip.

