Crates.io | sigalign |
lib.rs | sigalign |
version | 0.4.0 |
source | src |
created_at | 2021-11-01 05:24:48.120337 |
updated_at | 2024-07-19 05:59:57.75439 |
description | A Similarity-Guided Alignment Algorithm |
homepage | |
repository | https://github.com/baku4/sigalign/ |
max_upload_size | |
id | 475023 |
size | 49,737 |
SigAlign
SigAlign is a library for biological sequence alignment, the process of matching two sequences to identify similarity, which is a crucial step in analyzing sequence data in bioinformatics and computational biology. If you are new to sequence alignment, a quick overview on Wikipedia will be helpful.
SigAlign is a non-heuristic algorithm that outputs alignments satisfying two cutoffs: a minimum alignment length and a maximum penalty per length.
In SigAlign, the penalty is calculated based on a gap-affine scheme, which imposes different penalties on mismatches, gap openings, and gap extensions.
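As a rough illustration of the gap-affine penalty and the two cutoffs, the sketch below scores a hand-written list of alignment operations. It does not use the sigalign crate: the `Op` and `Penalties` types and all three functions are hypothetical. It also assumes that a gap of length n costs `gap_open + n * gap_extend` and that both cutoffs are inclusive; consult the SigAlign documentation for the exact conventions.

```rust
/// Hypothetical penalty set for a gap-affine scheme (not a sigalign type).
struct Penalties {
    mismatch: u32,
    gap_open: u32,
    gap_extend: u32,
}

/// A run of identical alignment operations and its length.
enum Op {
    Match(u32),
    Mismatch(u32),
    Gap(u32), // insertion or deletion run
}

/// Total penalty: matches are free, each mismatch costs `mismatch`, and each
/// gap run of length n costs `gap_open + n * gap_extend` (assumed convention).
fn penalty(ops: &[Op], p: &Penalties) -> u32 {
    ops.iter()
        .map(|op| match op {
            Op::Match(_) => 0,
            Op::Mismatch(n) => n * p.mismatch,
            Op::Gap(n) => p.gap_open + n * p.gap_extend,
        })
        .sum()
}

/// Alignment length: the number of alignment columns.
fn length(ops: &[Op]) -> u32 {
    ops.iter()
        .map(|op| match op {
            Op::Match(n) | Op::Mismatch(n) | Op::Gap(n) => *n,
        })
        .sum()
}

/// Check the minimum-length and maximum-penalty-per-length cutoffs.
fn satisfies_cutoffs(ops: &[Op], p: &Penalties, min_length: u32, max_ppl: f64) -> bool {
    let len = length(ops);
    if len == 0 || len < min_length {
        return false;
    }
    let penalty_per_length = penalty(ops, p) as f64 / len as f64;
    penalty_per_length <= max_ppl
}

fn main() {
    // Same parameter values as the usage examples in this README.
    let p = Penalties { mismatch: 4, gap_open: 6, gap_extend: 2 };
    let ops = [
        Op::Match(40),
        Op::Mismatch(1),
        Op::Match(30),
        Op::Gap(2),
        Op::Match(27),
    ];
    println!("penalty = {}", penalty(&ops, &p));
    println!("length  = {}", length(&ops));
    println!("passes  = {}", satisfies_cutoffs(&ops, &p, 50, 0.2));
}
```

Here the single mismatch contributes 4 and the two-base gap contributes 6 + 2 × 2 = 10, so the penalty is 14 over 100 columns, which is within a 0.2 penalty-per-length limit.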
SigAlign is designed to be:
SigAlign is not intended to:
For Rust developers
SigAlign is available on crates.io: https://crates.io/crates/sigalign/
use sigalign::{
    Aligner,
    algorithms::Local,
    ReferenceBuilder,
};

// (1) Build `Reference`
let fasta =
    br#">target_1
ACACAGATCGCAAACTCACAATTGTATTTCTTTGCCACCTGGGCATATACTTTTTGCGCCCCCTCATTTA
>target_2
TCTGGGGCCATTGTATTTCTTTGCCAGCTGGGGCATATACTTTTTCCGCCCCCTCATTTACGCTCATCAC"#;
let reference = ReferenceBuilder::new()
    .set_uppercase(true) // Ignore case
    .ignore_base(b'N') // 'N' is never matched
    .add_fasta(&fasta[..]).unwrap() // Add sequences from FASTA
    .add_target(
        "target_3",
        b"AAAAAAAAAAA",
    ) // Add sequence manually
    .build().unwrap();

// (2) Initialize `Aligner`
let algorithm = Local::new(
    4,   // Mismatch penalty
    6,   // Gap-open penalty
    2,   // Gap-extend penalty
    50,  // Minimum length
    0.2, // Maximum penalty per length
).unwrap();
let mut aligner = Aligner::new(algorithm);

// (3) Align query to reference
let query = b"CAAACTCACAATTGTATTTCTTTGCCAGCTGGGCATATACTTTTTCCGCCCCCTCATTTAACTTCTTGGA";
let result = aligner.align(query, &reference);
println!("{:#?}", result);
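To make the FASTA input in step (1) more concrete, the following std-only sketch splits FASTA bytes into (label, sequence) records and uppercases the sequences, similar in spirit to what `set_uppercase(true)` requests. It only illustrates the input format and is not SigAlign's parser; `parse_fasta` is a hypothetical helper, and using the full header line as the label is an assumption.

```rust
/// Hypothetical helper (not part of sigalign): split FASTA bytes into
/// (label, uppercased sequence) records.
fn parse_fasta(fasta: &[u8]) -> Vec<(String, Vec<u8>)> {
    let mut records: Vec<(String, Vec<u8>)> = Vec::new();
    for line in fasta.split(|&b| b == b'\n') {
        // Tolerate Windows line endings and skip blank lines.
        let line = line.strip_suffix(b"\r").unwrap_or(line);
        if line.is_empty() {
            continue;
        }
        if let Some(header) = line.strip_prefix(b">") {
            // A header line starts a new record.
            let label = String::from_utf8_lossy(header).trim().to_string();
            records.push((label, Vec::new()));
        } else if let Some((_, seq)) = records.last_mut() {
            // Sequence lines are appended to the current record, uppercased.
            seq.extend(line.iter().map(|b| b.to_ascii_uppercase()));
        }
        // Sequence lines before the first header are ignored.
    }
    records
}

fn main() {
    let fasta = b">target_1\nacacAGATCG\nCAAACT\n>target_2\nTCTGGGG\n";
    for (label, seq) in parse_fasta(fasta) {
        println!("{label}: {}", String::from_utf8_lossy(&seq));
    }
}
```

Multi-line sequences are concatenated into one record, which is why the two target sequences in the example above can be wrapped across lines without changing the reference.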
For Python developers
Use pip to install the package: pip install sigalign
from sigalign import Reference, Aligner

# (1) Construct `Reference`
reference = Reference.from_fasta_file("./YOUR_REFERENCE.fa")

# (2) Initialize `Aligner`
aligner = Aligner(4, 6, 2, 50, 0.2)

# (3) Execute Alignment
query = "CAAACTCACAATTGTATTTCTTTGCCAGCTGGGCATATACTTTTTCCGCCCCCTCATTTAACTTCTTGGA"
results = aligner.align_query(reference, query)

# (4) Display Results
for target_result in results:
    print(f"# Target index: {target_result.index}")
    for idx, alignment in enumerate(target_result.alignments):
        print(f" - Result: {idx+1}")
        print(f" - Penalty: {alignment.penalty}")
        print(f" - Length: {alignment.length}")
        print(f" - Query position: {alignment.query_position}")
        print(f" - Target position: {alignment.target_position}")
For Web developers
SigAlign is not currently distributed through npm; plans for web support are in the pipeline.
A WASM wrapper is covered in the example directory. Below is a TypeScript example showcasing SigAlign's application via this WASM wrapper:
import init, { Reference, Aligner, type AlignmentResult } from '../wasm/sigalign_demo_wasm';

async function run() {
    await init();

    // (1) Construct `Reference`
    const fasta: string = `>target_1
ACACAGATCGCAAACTCACAATTGTATTTCTTTGCCACCTGGGCATATACTTTTTGCGCCCCCTCATTTA
>target_2
TCTGGGGCCATTGTATTTCTTTGCCAGCTGGGGCATATACTTTTTCCGCCCCCTCATTTACGCTCATCAC`;
    const reference: Reference = await Reference.build(fasta);

    // (2) Initialize `Aligner`
    const aligner: Aligner = new Aligner(
        4,   // Mismatch penalty
        6,   // Gap-open penalty
        2,   // Gap-extend penalty
        50,  // Minimum aligned length
        0.2, // Maximum penalty per length
    );

    // (3) Execute Alignment
    const query: string = "CAAACTCACAATTGTATTTCTTTGCCAGCTGGGCATATACTTTTTCCGCCCCCTCATTTAACTTCTTGGA";
    const result: AlignmentResult = await aligner.alignment(query, reference);

    // (4) Parse and Display Results
    const parsedJsonObj = JSON.parse(result.to_json());
    console.log(parsedJsonObj);
}

run();
SigAlign is released under the MIT License.
Bahk, K., & Sung, J. (2024). SigAlign: an alignment algorithm guided by explicit similarity criteria. Nucleic Acids Research, gkae607. https://doi.org/10.1093/nar/gkae607