| Crates.io | sakurs-core |
| lib.rs | sakurs-core |
| version | 0.1.1 |
| created_at | 2025-07-27 14:01:27.131576+00 |
| updated_at | 2025-07-27 15:29:09.451502+00 |
| description | High-performance sentence boundary detection using Delta-Stack Monoid algorithm |
| homepage | https://github.com/sog4be/sakurs |
| repository | https://github.com/sog4be/sakurs |
| id | 1770105 |
| size | 515,477 |
High-performance sentence boundary detection library using the Delta-Stack Monoid algorithm.
⚠️ API Stability Warning: This crate is in pre-release (v0.1.0). APIs may change significantly before v1.0.0. We recommend pinning to exact versions:
sakurs-core = "=0.1.0"
use sakurs_core::api::{SentenceProcessor, Input};
// Create processor with default configuration
let processor = SentenceProcessor::with_language("en")?;
// Process text
let text = "Hello world. This is a test.";
let output = processor.process(Input::from_text(text))?;
// Use the boundaries
for boundary in &output.boundaries {
    println!("Sentence ends at byte offset: {}", boundary.offset);
}
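The boundaries above are reported as byte offsets, so turning them into sentence strings is a matter of slicing the input. The following is a minimal sketch that does not depend on the crate: `split_at_offsets` is a hypothetical helper, and it assumes each offset marks the byte position just past the end of a sentence, in ascending order.

```rust
/// Split `text` into trimmed sentence slices, given the byte offsets at which
/// sentences end. Hypothetical helper, not part of sakurs-core.
/// Assumption: offsets are ascending and mark the position just past each sentence.
fn split_at_offsets<'a>(text: &'a str, offsets: &[usize]) -> Vec<&'a str> {
    let mut sentences = Vec::with_capacity(offsets.len() + 1);
    let mut start = 0;
    for &end in offsets {
        // `get` returns None (instead of panicking) for an out-of-range,
        // reversed, or non-UTF-8-boundary range; such offsets are skipped.
        if let Some(slice) = text.get(start..end) {
            let trimmed = slice.trim();
            if !trimmed.is_empty() {
                sentences.push(trimmed);
            }
            start = end;
        }
    }
    // Keep any trailing text that follows the last boundary.
    let tail = text[start..].trim();
    if !tail.is_empty() {
        sentences.push(tail);
    }
    sentences
}
```

With real output, the offsets would be collected from `output.boundaries` (for example by mapping each `boundary.offset` into a `Vec<usize>`) before calling the helper.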
use sakurs_core::api::{Config, Input, SentenceProcessor};
let config = Config::builder()
    .language("ja")?           // Japanese language rules
    .threads(Some(4))          // Use 4 threads
    .chunk_size_kb(Some(512))  // 512 KB chunks
    .build()?;
let processor = SentenceProcessor::with_config(config)?;
use sakurs_core::api::{Input, SentenceProcessor};
let processor = SentenceProcessor::new();
let output = processor.process(Input::from_file("document.txt"))?;
println!("Found {} sentences", output.boundaries.len());
println!("Processing took {:?}", output.metadata.processing_time);
use sakurs_core::api::{Config, Input, SentenceProcessor};
// Use streaming configuration for memory-efficient processing
let config = Config::streaming()
    .language("en")?
    .build()?;
let processor = SentenceProcessor::with_config(config)?;
let output = processor.process(Input::from_file("large_document.txt"))?;
Currently supported:

- English (`en`)
- Japanese (`ja`)

Language rules are configured via TOML files. See the main repository for documentation on adding new languages.
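This library implements the Delta-Stack Monoid algorithm, which represents parsing state as a monoid: partial states are merged with an associative operation. Because the grouping of merges does not affect the result, chunks of text can be analyzed independently and their states combined afterwards, which fits the multi-threaded and chunked configurations shown above.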
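To make the role of associativity concrete, here is a toy monoid that tracks parenthesis nesting across chunks. It is only an illustration of the general idea, not the crate's actual state representation: `DepthSummary`, `scan`, and `combine` are hypothetical names, and the real algorithm tracks considerably more than bracket depth.

```rust
/// Toy summary of a chunk's effect on parenthesis nesting depth.
/// Hypothetical type for illustration; not the crate's actual state.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct DepthSummary {
    /// Net change in depth across the chunk.
    delta: i64,
    /// Lowest depth reached inside the chunk, relative to its start (always <= 0).
    min_depth: i64,
}

impl DepthSummary {
    /// Identity element: the summary of an empty chunk.
    const EMPTY: Self = Self { delta: 0, min_depth: 0 };

    /// Summarize one chunk on its own, with no knowledge of its neighbours.
    fn scan(chunk: &str) -> Self {
        let mut s = Self::EMPTY;
        for c in chunk.chars() {
            match c {
                '(' => s.delta += 1,
                ')' => s.delta -= 1,
                _ => continue,
            }
            s.min_depth = s.min_depth.min(s.delta);
        }
        s
    }

    /// Associative combine: `self` covers the text immediately left of `right`.
    fn combine(self, right: Self) -> Self {
        Self {
            delta: self.delta + right.delta,
            // The right chunk's lowest point is shifted by the left chunk's delta.
            min_depth: self.min_depth.min(self.delta + right.min_depth),
        }
    }
}
```

Because `combine` is associative and `EMPTY` is its identity, scanning chunks separately and folding the summaries in any grouping gives the same result as scanning the whole text, which is the property a parallel reduction needs.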
For detailed algorithm documentation, see the main repository.
MIT License. See LICENSE for details.