| Crates.io | edgebert |
| lib.rs | edgebert |
| version | 0.3.4 |
| created_at | 2025-08-25 23:09:00.73877+00 |
| updated_at | 2025-09-12 10:22:28.82275+00 |
| description | Fast local text embeddings library for Rust and WASM for BERT inference on native and edge devices with no dependencies. |
| repository | https://github.com/olafurjohannssson/edgebert |
| size | 247,666 |
A pure Rust + WASM implementation for BERT inference with minimal dependencies
EdgeBERT is a lightweight Rust implementation of BERT inference focused on native, edge computing and WASM deployment. Run sentence-transformers models anywhere without Python or large ML runtimes.
All contributions are welcome; the project is still in its very early stages.
Use this if you need:
Don't use this if you need:
For server-side or desktop applications, you can use the library directly.
Cargo.toml
[dependencies]
edgebert = "0.3.4"
anyhow = "1.0"
main.rs
use anyhow::Result;
use edgebert::{Model, ModelType};
fn main() -> Result<()> {
    // Load the pretrained all-MiniLM-L6-v2 sentence-transformer model
    let model = Model::from_pretrained(ModelType::MiniLML6V2)?;
    let texts = vec!["Hello world", "How are you"];
    // The second argument normalizes the embeddings to unit length
    let embeddings = model.encode(texts.clone(), true)?;
    // Print the first 10 dimensions of each 384-dimensional embedding
    for (i, embedding) in embeddings.iter().enumerate() {
        let n = embedding.len().min(10);
        println!("Text: {} == {:?}...", texts[i], &embedding[0..n]);
    }
    Ok(())
}
Output:
Text: Hello world == [-0.034439795, 0.030909885, 0.0066969804, 0.02608013, -0.03936993, -0.16037229, 0.06694216, -0.0065279473, -0.0474657, 0.014813968]...
Text: How are you == [-0.031447295, 0.03784213, 0.0761843, 0.045665547, -0.0012263817, -0.07488511, 0.08155286, 0.010209872, -0.11220472, 0.04075747]...
The full example is under examples/basic.rs. To build and run:
cargo run --release --example basic
# Install wasm-pack
curl https://rustwasm.github.io/wasm-pack/installer/init.sh -sSf | sh
# Build
./scripts/wasm-build.sh
# Serve
cd examples && npx serve
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>EdgeBERT WASM Test</title>
</head>
<body>
<script type="module">
    import init, { WasmModel, WasmModelType } from './pkg/edgebert.js';

    async function run() {
        await init();
        const model = await WasmModel.from_type(WasmModelType.MiniLML6V2);
        const texts = ["Hello world", "How are you"];
        const embeddings = model.encode(texts, true);
        console.log("First 10 values:", embeddings.slice(0, 10));
    }

    run().catch(console.error);
</script>
</body>
</html>
Output:
First 10 values: Float32Array(10) [-0.034439802169799805, 0.03090989589691162, 0.006696964148432016, 0.02608015574514866, -0.03936990723013878, -0.16037224233150482, 0.06694218516349792, -0.006527911406010389, -0.04746570065617561, 0.014813981018960476, buffer: ArrayBuffer(40), byteLength: 40, byteOffset: 0, length: 10, Symbol(Symbol.toStringTag): 'Float32Array']
The full example is under examples/basic.html. To build, run scripts/wasm-build.sh, then change into examples/ and start a local server; npx serve can serve WASM.
See examples/worker.html and examples/worker.js for how to combine web workers with WebAssembly. The library handles both contexts: when window is defined (as in basic.html) and when it is not (web workers). After compiling the WASM build with scripts/wasm-build.sh, the package should be inside examples/pkg; run npx serve and open localhost:3000/worker. A rough sketch of the worker side is shown below.
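The sketch assumes the same WasmModel API used in basic.html and that the page creates the worker as a module, e.g. new Worker('./worker.js', { type: 'module' }); the real examples/worker.js may differ:

// worker.js (sketch) — lazily initialize the model, then answer encode requests
import init, { WasmModel, WasmModelType } from './pkg/edgebert.js';

let model = null;

self.onmessage = async (event) => {
    if (!model) {
        await init(); // initialize the WASM module inside the worker
        model = await WasmModel.from_type(WasmModelType.MiniLML6V2);
    }
    // event.data is assumed to be an array of strings to embed
    const embeddings = model.encode(event.data, true);
    self.postMessage(embeddings);
};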
Once the model has loaded, clicking "Generate embeddings" produces:
Encoding texts: ["Hello world","How are you?"]
Embeddings shape: [2, 384]
'Hello world' vs 'How are you?': 0.305
First embedding norm: 1.000000
First 10 values: [-0.0344, 0.0309, 0.0067, 0.0261, -0.0394, -0.1604, 0.0669, -0.0065, -0.0475, 0.0148]
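Because the embeddings are L2-normalized (note the norm of 1.000000 above), the cosine similarity of two rows is just their dot product. A minimal sketch, assuming embeddings is the flat Float32Array returned by encode, laid out row-major with shape [2, 384]:

const dim = 384;

// Dot product of rows i and j of a row-major [n, dim] matrix;
// for unit-length rows this equals the cosine similarity.
function cosine(flat, i, j) {
    let dot = 0;
    for (let k = 0; k < dim; k++) {
        dot += flat[i * dim + k] * flat[j * dim + k];
    }
    return dot;
}

console.log(cosine(embeddings, 0, 1).toFixed(3)); // ≈ 0.305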
A small PyTorch example that encodes the same texts and computes their similarity, for comparison:
from sentence_transformers import SentenceTransformer
import torch
import torch.nn.functional as F
texts = ["Hello world", "How are you"]
model = SentenceTransformer('all-MiniLM-L6-v2')
embeddings = model.encode(texts, convert_to_tensor=True)
for i, emb in enumerate(embeddings):
    values = [f"{v:.4f}" for v in emb[:10].tolist()]
    print(f"Text: {texts[i]} == {values}...")
    print()
cos_sim = F.cosine_similarity(embeddings[0], embeddings[1], dim=0)
print(f"Cosine similarity ('{texts[0]}' vs '{texts[1]}'):", cos_sim.item())
Output from Python:
Text: Hello world == ['-0.0345', '0.0310', '0.0067', '0.0261', '-0.0394', '-0.1603', '0.0669', '-0.0064', '-0.0475', '0.0148']...
Text: How are you == ['-0.0314', '0.0378', '0.0763', '0.0457', '-0.0012', '-0.0748', '0.0816', '0.0102', '-0.1122', '0.0407']...
Cosine similarity ('Hello world' vs 'How are you'): 0.3624
The equivalent in EdgeBERT:
use anyhow::Result;
use edgebert::{Model, ModelType};
fn main() -> Result<()> {
    let model = Model::from_pretrained(ModelType::MiniLML6V2)?;
    let texts = vec!["Hello world", "How are you"];
    // `true` normalizes the embeddings to unit length
    let embeddings = model.encode(texts.clone(), true)?;
    for (i, embedding) in embeddings.iter().enumerate() {
        let n = embedding.len().min(10);
        println!("Text: {} == {:?}...", texts[i], &embedding[0..n]);
    }
    // Full cosine similarity formula; with unit-normalized embeddings the
    // norms are ~1.0, so this is effectively just the dot product.
    let dot: f32 = embeddings[0].iter().zip(&embeddings[1]).map(|(a, b)| a * b).sum();
    let norm_a: f32 = embeddings[0].iter().map(|v| v * v).sum::<f32>().sqrt();
    let norm_b: f32 = embeddings[1].iter().map(|v| v * v).sum::<f32>().sqrt();
    let cos_sim = dot / (norm_a * norm_b);
    println!("\nCosine similarity ('{}' vs '{}'): {:.4}", texts[0], texts[1], cos_sim);
    Ok(())
}
Output:
Text: Hello world == [-0.034439795, 0.030909885, 0.0066969804, 0.02608013, -0.03936993, -0.16037229, 0.06694216, -0.0065279473, -0.0474657, 0.014813968]...
Text: How are you == [-0.031447295, 0.03784213, 0.0761843, 0.045665547, -0.0012263817, -0.07488511, 0.08155286, 0.010209872, -0.11220472, 0.04075747]...
Cosine similarity ('Hello world' vs 'How are you'): 0.3623
The cosine similarity is 0.3623 in Rust versus 0.3624 in Python, a relative difference of about 0.03%; the tiny discrepancy comes from floating point rounding differences between the two implementations.