| Crates.io | bitnuc-mismatch |
| lib.rs | bitnuc-mismatch |
| version | 0.1.2 |
| created_at | 2025-01-21 19:19:08.499335+00 |
| updated_at | 2025-04-16 23:00:30.164896+00 |
| description | Create unambiguous one-off mismatch hash tables from bitnuc scalars. |
| homepage | |
| repository | https://github.com/noamteyssier/bitnuc-mismatch |
| max_upload_size | |
| id | 1525275 |
| size | 43,277 |
Create unambiguous one-off mismatch hash tables from bitnuc scalars.
This library adapts my work in disambiseq to operate in 2-bit space.
Note that it only handles sequences represented as bitnuc scalars.
Because a scalar packs two bits per nucleotide into a single 64-bit integer (u64), sequences are limited to a maximum length of 32 nucleotides.
Future work will include support for longer sequences.
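To make the 32-nucleotide limit concrete, here is a minimal sketch of 2-bit packing using only the standard library. The function name `pack_2bit`, the base-to-bits mapping, and the bit order (first base in the lowest two bits) are assumptions chosen for illustration; they are not taken from bitnuc, whose actual layout may differ.

```rust
/// Pack an ASCII nucleotide sequence into a u64, two bits per base.
/// Assumed encoding for this sketch: A=0b00, C=0b01, G=0b10, T=0b11,
/// with the first base stored in the lowest two bits.
fn pack_2bit(seq: &[u8]) -> Option<u64> {
    // 64 bits / 2 bits per base = at most 32 bases per scalar.
    if seq.len() > 32 {
        return None;
    }
    let mut scalar = 0u64;
    for (i, &base) in seq.iter().enumerate() {
        let bits: u64 = match base {
            b'A' | b'a' => 0b00,
            b'C' | b'c' => 0b01,
            b'G' | b'g' => 0b10,
            b'T' | b't' => 0b11,
            // Anything else (for example N) has no 2-bit representation.
            _ => return None,
        };
        scalar |= bits << (2 * i);
    }
    Some(scalar)
}
```

Under this assumed layout, `pack_2bit(b"ACTG")` yields `0b10_11_01_00`, and any input longer than 32 bases returns `None`.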
This library generates unambiguous one-off mismatches for bitnuc scalars. It provides one function that generates all possible one-off mismatches of a given scalar, and another that builds a mismatch table mapping each mismatch to its parent sequence. It is designed for generating mismatches for a set of parent sequences while avoiding ambiguous ones: mismatches that lie within one-off distance of more than one parent sequence.
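As a rough illustration of what generating all one-off mismatches can look like in 2-bit space, the sketch below enumerates every single-base substitution of a packed sequence. The function name `one_off_mismatches` is hypothetical, the bit layout (first base in the lowest two bits) is an assumption, and "one-off" is taken here to mean a single substitution; the crate's own function may differ in name, signature, and return type.

```rust
/// Generate every single-substitution variant of a 2-bit packed sequence.
/// `scalar` is assumed to hold `len` bases, two bits each, with the first
/// base in the lowest two bits.
fn one_off_mismatches(scalar: u64, len: usize) -> Vec<u64> {
    assert!(len <= 32, "a u64 holds at most 32 bases");
    // Each position has three alternative bases.
    let mut out = Vec::with_capacity(3 * len);
    for pos in 0..len {
        let shift = 2 * pos;
        let original = (scalar >> shift) & 0b11;
        // Clear the two bits at this position once, then write each substitute.
        let cleared = scalar & !(0b11u64 << shift);
        for alt in 0..4u64 {
            if alt != original {
                out.push(cleared | (alt << shift));
            }
        }
    }
    out
}
```

A sequence of length n always has exactly 3n such variants, so a 4-base parent produces 12 mismatches.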
This builds on the bitnuc library, which provides functions for converting nucleotide sequences to bitnuc scalars.
use bitnuc_mismatch::build_mismatch_table;
// Define a set of parent sequences
let parent_sequences = vec![
    b"ACTG",
    b"ACCG",
];
// Convert the parent sequences to bitnuc scalars
let parent_scalars: Vec<u64> = parent_sequences
    .into_iter()
    .map(|seq| bitnuc::as_2bit(seq).unwrap())
    .collect();
// Build a mismatch table
let mismatch_table = build_mismatch_table(&parent_scalars, 4).unwrap();
// Test some expected mismatches
let gctg = bitnuc::as_2bit(b"GCTG").unwrap();
assert_eq!(mismatch_table.get(&gctg), Some(&parent_scalars[0]));
// Validate that unexpected mismatches are not present
let acgg = bitnuc::as_2bit(b"ACGG").unwrap();
assert!(mismatch_table.get(&acgg).is_none());
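The two assertions above follow from the ambiguity rule: GCTG is one substitution away from ACTG only, while ACGG is one substitution away from both ACTG and ACCG. The sketch below shows one way such a table could be built with the standard library alone. It is not the crate's implementation: `unambiguous_table` is a hypothetical name, the bit layout (first base in the lowest two bits) is assumed, and skipping mismatches that coincide with a parent is a design choice made for this sketch.

```rust
use std::collections::{HashMap, HashSet};

/// Map each unambiguous single-substitution mismatch to its only parent.
/// Sketch assumptions: parents are 2-bit packed with `len` bases each, and
/// a mismatch that is itself a parent is never inserted as a key.
fn unambiguous_table(parents: &[u64], len: usize) -> HashMap<u64, u64> {
    assert!(len <= 32, "a u64 holds at most 32 bases");
    // Deduplicate so a repeated parent does not mark its own mismatches ambiguous.
    let parent_set: HashSet<u64> = parents.iter().copied().collect();
    let mut table = HashMap::new();
    let mut ambiguous = HashSet::new();
    for &parent in &parent_set {
        for pos in 0..len {
            // XOR with a nonzero 2-bit value turns the base into each of the other three.
            for delta in 1..=3u64 {
                let mismatch = parent ^ (delta << (2 * pos));
                if parent_set.contains(&mismatch) || ambiguous.contains(&mismatch) {
                    continue;
                }
                if table.remove(&mismatch).is_some() {
                    // A second parent reaches this mismatch: drop it for good.
                    ambiguous.insert(mismatch);
                } else {
                    table.insert(mismatch, parent);
                }
            }
        }
    }
    table
}
```

With the two parents from the example, each parent keeps its 9 mismatches at positions other than the third base, and the variants at the third base are either the other parent or shared by both, so they are left out.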