| Crates.io | cicero-sophia |
| lib.rs | cicero-sophia |
| version | 0.6.5 |
| created_at | 2025-05-03 19:06:26.222924+00 |
| updated_at | 2025-09-21 11:05:51.919722+00 |
| description | High-performance NLU (natural language understanding) engine built in Rust for speed, accuracy, and privacy. |
| homepage | https://cicero.sh/sophia/ |
| repository | https://github.com/cicero-ai/cicero |
| max_upload_size | |
| id | 1659080 |
| size | 258,636 |
High-performance NLU (natural language understanding) engine built in Rust for speed, accuracy, and privacy.
Core Capabilities
Performance
GitHub: https://github.com/cicero-ai/cicero/
Sophia uses a dual-license model: it is free and open source for individual use under the GPLv3 license, while commercial use requires a premium license. For full details, including an online demo, please visit: https://cicero.sh/sophia/.
Add cicero-sophia to your project by including it in your Cargo.toml:
[dependencies]
cicero-sophia = "0.6.5"
To use Sophia, you must obtain the vocabulary data store, which is available free of charge. Visit https://cicero.sh/, register for a free account, and download the vocabulary data store from the members' area.
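Since the examples in this README pass the data store location to `Sophia::new`, it can be useful to confirm the directory is present before initializing. The following is a minimal sketch using only the standard library. The helper name `check_datadir` is hypothetical and not part of the sophia API, and `./vocab_data` is simply the path used in the examples here.

```rust
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper (not part of sophia): returns the path if it
/// exists and is a directory, otherwise a NotFound error.
fn check_datadir(datadir: &str) -> io::Result<PathBuf> {
    let path = Path::new(datadir);
    if path.is_dir() {
        Ok(path.to_path_buf())
    } else {
        Err(io::Error::new(
            io::ErrorKind::NotFound,
            format!("vocabulary data store not found at '{}'", datadir),
        ))
    }
}

fn main() {
    // "./vocab_data" is an assumed location; use wherever you unpacked the download.
    match check_datadir("./vocab_data") {
        Ok(path) => println!("Using vocabulary data store at {}", path.display()),
        Err(err) => eprintln!("{}", err),
    }
}
```

Checking up front gives a clearer message than whatever error initialization might otherwise report for a missing directory.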
Example 1: Tokenizing Text
use sophia::{Sophia, Error};

fn main() -> Result<(), Error> {
    // Initialize Sophia
    let datadir = "./vocab_data";
    let sophia = Sophia::new(datadir, "en")?;

    // Tokenize the input text
    let output = sophia.tokenize("The quick brown fox jumps over the lazy dog")?;

    // Print individual tokens
    println!("Individual Tokens:");
    for token in output.iter() {
        println!(" Word: {} POS: {}", token.word, token.pos);
    }

    // Print MWEs
    println!("\nMulti-Word Entities (MWEs):");
    for token in output.mwe() {
        println!(" Word: {} POS: {}", token.word, token.pos);
    }

    Ok(())
}
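Once tokens are available, a common next step is grouping words by their part-of-speech tag. The sketch below shows one way to do that with the standard library alone. The `Token` struct is a hypothetical stand-in: only the field names `word` and `pos` come from the example above, while the `String` types and the sample tags are assumptions, not actual sophia output.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a sophia token. Only the field names are
/// taken from the README example; the String types are assumed.
struct Token {
    word: String,
    pos: String,
}

/// Groups words by POS tag, keeping input order within each tag.
fn group_by_pos(tokens: &[Token]) -> BTreeMap<&str, Vec<&str>> {
    let mut groups: BTreeMap<&str, Vec<&str>> = BTreeMap::new();
    for token in tokens {
        groups
            .entry(token.pos.as_str())
            .or_default()
            .push(token.word.as_str());
    }
    groups
}

fn main() {
    // Hand-written sample data with illustrative tags.
    let tokens = vec![
        Token { word: "quick".to_string(), pos: "JJ".to_string() },
        Token { word: "brown".to_string(), pos: "JJ".to_string() },
        Token { word: "fox".to_string(), pos: "NN".to_string() },
    ];

    for (pos, words) in group_by_pos(&tokens) {
        println!("{}: {:?}", pos, words);
    }
}
```

With real output, the loop over `output.iter()` from the example would feed the same grouping logic, adapted to whatever type `token.pos` actually has.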
Example 2: Interpreting Text
use sophia::{Sophia, Error};

fn main() -> Result<(), Error> {
    // Initialize Sophia
    let datadir = "./vocab_data";
    let sophia = Sophia::new(datadir, "en")?;

    // Interpret the input text
    let output = sophia.interpret("The quick brown fox jumps over the lazy dog")?;

    // Print phrases
    println!("Phrases:");
    for phrase in output.phrases.iter() {
        println!(" {:?}", phrase);
    }

    // Print individual tokens
    println!("\nIndividual Tokens:");
    for token in output.tokens.iter() {
        println!(" Word: {} POS: {}", token.word, token.pos);
    }

    Ok(())
}
Example 3: Retrieve an Individual Word / Token
use sophia::{Sophia, Error};

fn main() -> Result<(), Error> {
    // Initialize Sophia
    let datadir = "./vocab_data";
    let sophia = Sophia::new(datadir, "en")?;

    // Get word
    let token = sophia.get_word("future").unwrap();
    println!("Got word {}, id {}, pos {}", token.word, token.index, token.pos);

    // Get specific token
    let token = sophia.get_token(82251).unwrap();
    println!("Got word {}, id {}, pos {}", token.word, token.index, token.pos);

    Ok(())
}
Example 4: Retrieve Category
use sophia::{Sophia, Error};

fn main() -> Result<(), Error> {
    // Initialize Sophia
    let datadir = "./vocab_data";
    let sophia = Sophia::new(datadir, "en")?;

    // Get category
    let cat = sophia.get_category("verbs/action/travel/depart").unwrap();
    println!("name {}", cat.name);
    println!("fqn: {}", cat.fqn);
    println!("word ids: {:?}", cat.words);

    Ok(())
}
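The category name in the example, `verbs/action/travel/depart`, reads like a slash-delimited hierarchical path. Assuming that is how category names are structured, the sketch below splits such a string into its segments and derives the parent path using only the standard library. The helper names `category_segments` and `parent_category` are hypothetical and not part of the sophia API.

```rust
/// Splits a slash-delimited category name into its non-empty segments.
fn category_segments(fqn: &str) -> Vec<&str> {
    fqn.split('/').filter(|segment| !segment.is_empty()).collect()
}

/// Returns the parent path, or None when the name has no parent segment.
fn parent_category(fqn: &str) -> Option<&str> {
    fqn.trim_end_matches('/')
        .rsplit_once('/')
        .map(|(parent, _leaf)| parent)
}

fn main() {
    // Category name taken from the example above.
    let fqn = "verbs/action/travel/depart";
    println!("segments: {:?}", category_segments(fqn));
    println!("parent: {:?}", parent_category(fqn));
}
```

A helper like this could be used to walk up from a specific category toward broader ones, provided the parent names are also valid category names in the data store.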
For all inquiries, please complete the contact form at: https://cicero.sh/contact