| Crates.io | onecode |
| lib.rs | onecode |
| version | 0.1.0 |
| created_at | 2025-11-05 20:56:38.051294+00 |
| updated_at | 2025-11-05 20:56:38.051294+00 |
| description | Rust bindings for ONEcode - a data representation format for genomic data |
| homepage | |
| repository | https://github.com/pangenome/onecode-rs |
| max_upload_size | |
| id | 1918623 |
| size | 1,175,353 |
Rust bindings for ONEcode, a simple and efficient data representation format for genomic data.
ONEcode is a data representation framework designed primarily for genomic data, providing both human-readable ASCII and compressed binary file versions with strongly typed data.
This library provides safe, idiomatic Rust bindings to the ONEcode C library.
This library uses bindgen to generate Rust bindings from C headers, which requires clang/libclang:
Ubuntu/Debian:
sudo apt-get install llvm-dev libclang-dev clang
Fedora/RHEL:
sudo dnf install clang-devel llvm-devel
macOS:
xcode-select --install # Usually already installed
Arch Linux:
sudo pacman -S clang
For more details, see the bindgen requirements documentation.
Add this to your Cargo.toml:
[dependencies]
onecode = { git = "https://github.com/pangenome/onecode-rs" }
use onecode::OneFile;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut file = OneFile::open_read("data.1seq", None, None, 1)?;

    // Read through the file, one line at a time
    loop {
        let line_type = file.read_line();
        if line_type == '\0' {
            break; // End of file
        }
        match line_type {
            'S' => {
                // Access DNA sequence data
                println!("Sequence line");
            },
            'I' => {
                // Access the identifier (field 0, read here as an integer)
                println!("ID: {}", file.int(0));
            },
            _ => {}
        }
    }
    Ok(())
}
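The read loop above follows a simple convention: keep calling `read_line()` until it returns `'\0'`. The sketch below isolates that pattern so it can be run without any input file. The function `count_line_types` is a hypothetical helper, not part of the crate, and the closure is a stand-in for `file.read_line()`.

```rust
use std::collections::BTreeMap;

/// Tally how often each line type occurs.
/// `next_line_type` stands in for a call such as `file.read_line()`;
/// it must return '\0' once the input is exhausted.
fn count_line_types(mut next_line_type: impl FnMut() -> char) -> BTreeMap<char, usize> {
    let mut counts = BTreeMap::new();
    loop {
        let line_type = next_line_type();
        if line_type == '\0' {
            break; // end of input
        }
        *counts.entry(line_type).or_insert(0) += 1;
    }
    counts
}

fn main() {
    // Simulated stream of line types instead of a real ONE file.
    let mut simulated = vec!['S', 'I', 'S', 'I', 'S'].into_iter();
    let counts = count_line_types(|| simulated.next().unwrap_or('\0'));
    for (line_type, n) in &counts {
        println!("{line_type}: {n}");
    }
}
```

With a real file, the closure would presumably become `|| file.read_line()`, assuming `read_line` returns a `char` as the example above suggests.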
use onecode::{OneFile, OneSchema};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create a schema
    let schema_text = "P 3 tst\nO T 1 3 INT\n";
    let schema = OneSchema::from_text(schema_text)?;

    // Open file for writing
    let mut writer = OneFile::open_write_new(
        "output.1tst",
        &schema,
        "tst",
        false, // ASCII format
        1      // single-threaded
    )?;

    // Add provenance
    writer.add_provenance("myprogram", "1.0", "example command")?;

    // Write data
    writer.set_int(0, 42);
    writer.write_line('T', 0, None);

    // File is automatically closed on drop
    Ok(())
}
use onecode::OneSchema;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Define schema inline
    let schema_text = r#"
P 3 seq
O S 1 3 DNA
D I 1 3 INT
"#;
    let schema = OneSchema::from_text(schema_text)?;

    // Use schema for file operations
    Ok(())
}
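To make the schema text above easier to read, here is a minimal, standalone sketch of how its lines appear to be structured. It is based on an assumed reading of the examples in this README: the first token is the record kind (`P` for the primary file type, `O` or `D` for a line type), and every string is preceded by its length, so `3 DNA` is the three-character string `DNA`. The names `SchemaDecl`, `read_string`, `parse_decl`, and `parse_schema` are hypothetical and are not part of the crate; `OneSchema::from_text` remains the way to build a real schema.

```rust
/// One declaration from a schema text (hypothetical type, not part of the onecode crate).
#[derive(Debug, PartialEq)]
enum SchemaDecl {
    /// `P <len> <name>`: primary file type.
    Primary(String),
    /// `O` or `D`, then `<char> <count>` and `count` length-prefixed field types.
    Line { kind: char, line_type: char, fields: Vec<String> },
}

/// Read a length-prefixed string such as `3 DNA` from a token stream.
/// Strings containing whitespace are not handled by this sketch.
fn read_string<'a>(tokens: &mut impl Iterator<Item = &'a str>) -> Option<String> {
    let len: usize = tokens.next()?.parse().ok()?;
    let text = tokens.next()?;
    (text.len() == len).then(|| text.to_string())
}

/// Parse a single schema line; returns None for anything this sketch does not recognise.
fn parse_decl(line: &str) -> Option<SchemaDecl> {
    let mut tokens = line.split_whitespace();
    let kind = tokens.next()?;
    match kind {
        "P" => Some(SchemaDecl::Primary(read_string(&mut tokens)?)),
        "O" | "D" => {
            let line_type = tokens.next()?.chars().next()?;
            let count: usize = tokens.next()?.parse().ok()?;
            let mut fields = Vec::new();
            for _ in 0..count {
                fields.push(read_string(&mut tokens)?);
            }
            Some(SchemaDecl::Line { kind: kind.chars().next()?, line_type, fields })
        }
        _ => None,
    }
}

/// Parse every non-empty line; fails as a whole if any line fails.
fn parse_schema(text: &str) -> Option<Vec<SchemaDecl>> {
    text.lines()
        .map(str::trim)
        .filter(|l| !l.is_empty())
        .map(parse_decl)
        .collect()
}

fn main() {
    let schema_text = "P 3 seq\nO S 1 3 DNA\nD I 1 3 INT\n";
    match parse_schema(schema_text) {
        Some(decls) => {
            for decl in &decls {
                println!("{decl:?}");
            }
        }
        None => println!("schema text not understood by this sketch"),
    }
}
```

Under this reading, the schema above declares a file type `seq` with one line type `S` holding a single `DNA` field and one line type `I` holding a single `INT` field.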
use onecode::OneFile;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let file = OneFile::open_read("data.1seq", None, None, 1)?;

    // Get statistics for a line type
    let (count, max_length, total_length) = file.stats('S')?;
    println!("Sequences: {}, Max length: {}, Total: {}",
             count, max_length, total_length);
    Ok(())
}
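The three values returned by `stats` are enough to derive simple summaries such as the mean sequence length. The sketch below shows one way to do that, guarding against a line type that never occurs. The helper `mean_length` is hypothetical, the integer type is assumed to be `i64`, and the numbers in `main` are made up rather than read from a file.

```rust
/// Mean length derived from a `(count, max_length, total_length)` tuple.
/// The i64 type is an assumption about what `stats` returns.
fn mean_length(count: i64, total_length: i64) -> Option<f64> {
    if count > 0 {
        Some(total_length as f64 / count as f64)
    } else {
        None // no lines of this type, so the mean is undefined
    }
}

fn main() {
    // Made-up values standing in for the result of `file.stats('S')`.
    let (count, max_length, total_length) = (4_i64, 120_i64, 300_i64);
    match mean_length(count, total_length) {
        Some(mean) => println!("mean {mean:.1}, max {max_length}"),
        None => println!("no sequence lines"),
    }
}
```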
Alignment files can contain embedded genome database (GDB) information, mapping sequence IDs to names:
use onecode::OneFile;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut file = OneFile::open_read("alignments.1aln", None, None, 1)?;

    // Get all sequence names (efficient for multiple lookups)
    let seq_names = file.get_all_sequence_names();
    println!("Found {} sequences", seq_names.len());

    // Read alignments and resolve sequence names
    loop {
        let line_type = file.read_line();
        if line_type == '\0' { break; }
        if line_type == 'A' {
            let query_id = file.int(0);
            let target_id = file.int(3);
            if let (Some(query_name), Some(target_name)) =
                (seq_names.get(&query_id), seq_names.get(&target_id)) {
                println!("Alignment: {} vs {}", query_name, target_name);
            }
        }
    }
    Ok(())
}
Or look up individual names on-demand:
use onecode::OneFile;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut file = OneFile::open_read("alignments.1aln", None, None, 1)?;

    // Get a specific sequence name by ID
    if let Some(name) = file.get_sequence_name(5) {
        println!("Sequence 5: {}", name);
    }
    Ok(())
}
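When many alignments are processed, fetching the whole ID-to-name table once and resolving each pair against it avoids repeated lookups through the file handle. The sketch below shows that resolution step in isolation. The function `resolve_pair` is a hypothetical helper, the table is assumed to behave like a `HashMap<i64, String>`, and the names in `main` are invented for illustration.

```rust
use std::collections::HashMap;

/// Resolve both IDs of an alignment, or return None if either is unknown.
fn resolve_pair<'a>(
    names: &'a HashMap<i64, String>,
    query_id: i64,
    target_id: i64,
) -> Option<(&'a str, &'a str)> {
    let query = names.get(&query_id)?;
    let target = names.get(&target_id)?;
    Some((query.as_str(), target.as_str()))
}

fn main() {
    // Invented table standing in for the map built from the embedded GDB.
    let names: HashMap<i64, String> = HashMap::from([
        (0, "chr1".to_string()),
        (1, "chr2".to_string()),
    ]);

    match resolve_pair(&names, 0, 1) {
        Some((q, t)) => println!("Alignment: {q} vs {t}"),
        None => println!("unknown sequence ID"),
    }
}
```

Returning `Option` keeps the caller in charge of what to do with an alignment whose sequence ID is missing from the table, rather than silently skipping it.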
Full API documentation is available via cargo doc:
cargo doc --open
Key types:
OneFile - Main file handle for reading/writing ONE files
OneSchema - Schema definition and validation
OneError - Error types
OneType - Field type enumeration
The library uses bindgen to automatically generate bindings from the C headers and cc to compile the C library.
cargo build --release
All tests pass with full concurrent execution:
cargo test
Test suite includes:
✅ Fully thread-safe! The library supports concurrent operations without any restrictions.
The upstream ONEcode C library has been updated with thread-local storage for all global state, making it safe for concurrent use from multiple threads. All operations including schema creation, file reading, and error handling work correctly under concurrent load.
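One conservative way to take advantage of this is to give every thread its own file, so that no handle is ever shared. The sketch below shows that pattern with the standard library only. The function `process` is a placeholder for per-file work (in real use, this is where a `OneFile` would be opened inside the thread), and `process_all` is a hypothetical helper. Nothing here assumes that `OneFile` itself can be moved between threads.

```rust
use std::thread;

/// Placeholder for per-file work such as opening a file and counting lines.
/// It only returns the path length so the sketch runs without input files.
fn process(path: &str) -> usize {
    path.len()
}

/// Run `process` on each path in its own thread and collect results in input order.
fn process_all(paths: Vec<String>) -> Vec<(String, usize)> {
    let handles: Vec<_> = paths
        .into_iter()
        .map(|path| {
            thread::spawn(move || {
                let result = process(&path);
                (path, result)
            })
        })
        .collect();

    handles
        .into_iter()
        .map(|handle| handle.join().expect("worker thread panicked"))
        .collect()
}

fn main() {
    let paths = vec!["a.1seq".to_string(), "b.1seq".to_string()];
    for (path, result) in process_all(paths) {
        println!("{path}: {result}");
    }
}
```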
The library is organized into several modules:
ffi - Raw FFI bindings generated by bindgen
error - Rust error types and Result wrapper
types - Rust-friendly type definitions
file - Safe OneFile wrapper with RAII resource management
schema - OneSchema management and validation
The C library is included as a git subtree in the ONEcode/ directory and compiled automatically during the build process.
To update the ONEcode subtree:
git subtree pull --prefix ONEcode https://github.com/thegenemyers/ONEcode.git main --squash
This Rust wrapper is licensed under MIT OR Apache-2.0.
The ONEcode C library has its own license - see ONEcode/ for details.
Contributions are welcome! Please ensure tests pass before submitting PRs:
cargo test
cargo clippy
cargo fmt
ONEcode was developed by Gene Myers and Richard Durbin. This Rust wrapper builds on their excellent work to provide safe, idiomatic Rust bindings.