| Crates.io | yvdb |
| lib.rs | yvdb |
| version | 0.1.1 |
| created_at | 2025-12-13 16:35:44.617591+00 |
| updated_at | 2025-12-30 04:03:46.671215+00 |
| description | Small educational in-memory vector database with REST API, append-only durability, brute-force search, and adaptive heartbeat feat for RAG hallucination mitigation |
| homepage | |
| repository | https://github.com/thegreatbey/yvdb/tree/feature/heartbeat |
| max_upload_size | |
| id | 1983144 |
| size | 114,360 |
Small, educational vector database: single-node, in-memory store with append-only durability and brute-force top-k search.
# Windows PowerShell
# from repo root
$env:RUST_LOG="info"; cargo run
# Change bind address/port if needed (e.g., different port or LAN bind)
$env:YVDB_BIND_ADDR="127.0.0.1:8080"; cargo run
# For LAN access (will trigger Windows firewall prompt)
# $env:YVDB_BIND_ADDR="0.0.0.0:8080"; cargo run
Server listens on 127.0.0.1:8080 by default (configurable via YVDB_BIND_ADDR).
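As an illustration of the default-plus-override behavior above, here is a minimal sketch of how a bind address could be resolved. The helper name `resolve_bind_addr` and the constant `DEFAULT_BIND_ADDR` are hypothetical and are not taken from the yvdb source; only the variable name YVDB_BIND_ADDR and its default come from this README.

```rust
use std::net::{AddrParseError, SocketAddr};

/// Documented default for YVDB_BIND_ADDR.
pub const DEFAULT_BIND_ADDR: &str = "127.0.0.1:8080";

/// Hypothetical helper: parse the configured address, or fall back to the
/// documented default when the variable is unset.
pub fn resolve_bind_addr(raw: Option<&str>) -> Result<SocketAddr, AddrParseError> {
    raw.unwrap_or(DEFAULT_BIND_ADDR).parse()
}
```

A caller would typically pass `std::env::var("YVDB_BIND_ADDR").ok().as_deref()`; a malformed value then surfaces as a parse error at startup instead of silently falling back to the default.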
/collections/{name}/upsert
Request
{
"dimension": 3,
"metric": "cosine",
"records": [
{"id": "a", "vector": [1,0,0], "metadata": {"tag": "alpha"}}
]
}
Response
{"upserted": 1}
/collections/{name}/query
Request
{
"vector": [0.9, 0.1, 0],
"k": 2,
"filter": {"key": "tag", "equals": "alpha"}
}
Range filter (numeric) example
{
"vector": [0.9, 0.1, 0],
"k": 10,
"filter": {"key": "price", "range": {"min": 9.0, "max": 15.0}}
}
Return L2 distance in results
{
"vector": [0.9, 0.1, 0],
"k": 3,
"return_distance": true
}
When the collection metric is L2, responses will include an optional distance field per result. For cosine metric, distance is omitted.
Response
{"results": [{"id":"a", "score":0.99, "metadata": {"tag":"alpha"}}]}
Adaptive heartbeat (min_score)
A query may set min_score; when left out, the server uses YVDB_DEFAULT_MIN_SCORE (default 0.0).
When the first pass returns no results, the server lowers the cutoff by YVDB_RELAX_DELTA (default 0.2) but never drops below YVDB_RELAX_FLOOR (default 0.5). This second pass is the relaxation. It only happens when the lowered cutoff is below the cutoff originally in effect (the requested min_score or the default).
When relaxation kicks in, the response includes a warning string and every result echoes the applied_min_score plus a relaxed flag. distance is still included for L2 when return_distance is true.
/healthz surfaces a running success rate along with the current relaxation floor and default min_score so you can monitor whether the heartbeat is engaging.
Errors (uniform shape)
{"code":"bad_request","message":"vector dimension mismatch"}
Codes used: bad_request, not_found, internal.
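The min_score relaxation described earlier can be sketched roughly as follows. This is an illustrative reading of the documented rules, not the yvdb implementation: the `Heartbeat` and `Filtered` types are hypothetical, the cutoff is assumed to be inclusive (`score >= cutoff`), and the `relaxed` flag is assumed to be set whenever a second pass ran, even if it also found nothing.

```rust
/// Heartbeat settings; the defaults mirror the documented environment variables.
#[derive(Debug, Clone, Copy)]
pub struct Heartbeat {
    pub default_min_score: f32, // YVDB_DEFAULT_MIN_SCORE
    pub relax_delta: f32,       // YVDB_RELAX_DELTA
    pub relax_floor: f32,       // YVDB_RELAX_FLOOR
}

impl Default for Heartbeat {
    fn default() -> Self {
        Heartbeat { default_min_score: 0.0, relax_delta: 0.2, relax_floor: 0.5 }
    }
}

/// Hypothetical result shape: surviving scores plus the cutoff that was applied.
#[derive(Debug, PartialEq)]
pub struct Filtered {
    pub scores: Vec<f32>,
    pub applied_min_score: f32,
    pub relaxed: bool,
}

impl Heartbeat {
    /// Cutoff for a second pass, or None when lowering by the delta and
    /// clamping to the floor would not actually loosen the original cutoff.
    pub fn relaxed_cutoff(&self, requested: f32) -> Option<f32> {
        let lowered = (requested - self.relax_delta).max(self.relax_floor);
        if lowered < requested { Some(lowered) } else { None }
    }

    /// Filter scores with the requested (or default) cutoff, relaxing once
    /// if the first pass keeps nothing.
    pub fn apply(&self, scores: &[f32], min_score: Option<f32>) -> Filtered {
        let requested = min_score.unwrap_or(self.default_min_score);
        // Assumption: a score equal to the cutoff is kept.
        let keep = |cutoff: f32| -> Vec<f32> {
            scores.iter().copied().filter(|s| *s >= cutoff).collect()
        };
        let first = keep(requested);
        if !first.is_empty() {
            return Filtered { scores: first, applied_min_score: requested, relaxed: false };
        }
        match self.relaxed_cutoff(requested) {
            Some(cutoff) => Filtered { scores: keep(cutoff), applied_min_score: cutoff, relaxed: true },
            None => Filtered { scores: first, applied_min_score: requested, relaxed: false },
        }
    }
}
```

With the documented defaults, a query that omits min_score never relaxes under this reading: lowering 0.0 by 0.2 and clamping to the 0.5 floor yields 0.5, which is stricter than the original cutoff. A query with min_score 0.9 would relax to roughly 0.7, and one with 0.6 would relax to the floor, 0.5.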
/collections/{name}/stats
Response
{"collection":"demo","count":1,"dimension":3,"metric":"cosine"}
/collections/{name}/records/{id}
Response
{"deleted": true}
Idempotent: returns {"deleted": false} if the record was already absent.
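The idempotent delete contract maps naturally onto a map removal. The sketch below assumes records live in a `HashMap` keyed by id, which is a guess at the storage layout. `delete_record` is a hypothetical name.

```rust
use std::collections::HashMap;

/// Hypothetical helper: remove a record by id and report whether it existed.
/// The returned bool corresponds to the "deleted" field of the response.
pub fn delete_record(records: &mut HashMap<String, Vec<f32>>, id: &str) -> bool {
    records.remove(id).is_some()
}
```

Calling it twice with the same id returns true and then false, matching the {"deleted": true} and {"deleted": false} responses.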
When posting JSON from PowerShell, prefer sending a file to avoid quoting/encoding issues:
$json = @'
{
"records": [
{"id": "a", "vector": [1.0, 0.0, 0.0]},
{"id": "b", "vector": [0.0, 1.0, 0.0]}
],
"dimension": 3,
"metric": "cosine"
}
'@
# Write UTF-8 without BOM so the server reads clean JSON bytes
[System.IO.File]::WriteAllText("body.json", $json, (New-Object System.Text.UTF8Encoding($false)))
$abs = (Resolve-Path .\body.json).Path
# Using curl.exe and a file body (recommended)
curl.exe -X POST -H "Content-Type: application/json" --data-binary "@$abs" http://127.0.0.1:8080/collections/demo/upsert
# Or PowerShell's native client
Invoke-WebRequest -Uri http://127.0.0.1:8080/collections/demo/upsert -Method Post -ContentType "application/json" -InFile $abs
The write-ahead log is data/wal.log (JSON Lines). On startup, the server replays the log.
Snapshots are written to data/snapshots/snapshot-<unix>.json to speed up restart.
Each collection has its own dimension and metric.
YVDB_MAX_REQUEST_BYTES (default 1,048,576)
YVDB_REQUEST_TIMEOUT_MS (default 2000)
YVDB_SNAPSHOT_ON_SHUTDOWN (default false)
Metrics: cosine, l2.
Scoring semantics
score is always larger-is-better.
cosine: standard cosine similarity in [-1, 1].
l2: score = -distance, so closer vectors have higher scores.
k behavior
k == 0 is rejected with 400.
k > count is allowed; results size is min(k, count).
Environment variables
YVDB_DATA_DIR (default: data): folder for WAL/snapshots.
YVDB_MAX_DIMENSION (default: 4096): upper bound for collection dimension.
YVDB_MAX_BATCH (default: 1024): max records per upsert request.
YVDB_MAX_K (default: 1000): max k per query.
YVDB_SNAPSHOT_INTERVAL_SECS (default: 30): snapshot frequency.
YVDB_WAL_ROTATE_MAX_BYTES (default: 0): rotate WAL when size exceeds this (0 disables).
YVDB_SNAPSHOT_RETENTION (default: 3): number of snapshots to keep on disk.
YVDB_BIND_ADDR (default: 127.0.0.1:8080): server bind address (use 0.0.0.0:8080 for LAN).
YVDB_DEFAULT_MIN_SCORE (default: 0.0): similarity floor used when a query omits min_score.
YVDB_RELAX_DELTA (default: 0.2): how much to lower min_score when the first pass returns no results.
YVDB_RELAX_FLOOR (default: 0.5): lowest cutoff allowed during relaxation; acts as a safety net against over-relaxing.
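To make the scoring semantics and k behavior above concrete, here is a minimal brute-force top-k sketch. It is an illustration under stated assumptions, not the crate's code: the `Metric` enum and the `score` and `top_k` functions are hypothetical names, a zero-length vector is assumed to score 0.0 under cosine, and the k == 0 case is modeled as an `Err` (the server reports it as a 400).

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Metric {
    Cosine,
    L2,
}

/// Larger-is-better score for both metrics.
pub fn score(metric: Metric, a: &[f32], b: &[f32]) -> f32 {
    match metric {
        Metric::Cosine => {
            let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
            let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
            let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
            // Assumption: a zero vector scores 0.0 instead of dividing by zero.
            if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
        }
        // score = -distance, so closer vectors rank higher.
        Metric::L2 => -a
            .iter()
            .zip(b)
            .map(|(x, y)| (x - y) * (x - y))
            .sum::<f32>()
            .sqrt(),
    }
}

/// Score every record against the query and keep the best min(k, count).
pub fn top_k<'a>(
    metric: Metric,
    query: &[f32],
    records: &'a [(&'a str, Vec<f32>)],
    k: usize,
) -> Result<Vec<(&'a str, f32)>, String> {
    if k == 0 {
        return Err("k must be greater than 0".to_string());
    }
    if records.iter().any(|(_, v)| v.len() != query.len()) {
        return Err("vector dimension mismatch".to_string());
    }
    let mut scored: Vec<(&str, f32)> = records
        .iter()
        .map(|(id, v)| (*id, score(metric, query, v)))
        .collect();
    // Sort descending by score; total_cmp gives a total order over f32.
    scored.sort_by(|x, y| y.1.total_cmp(&x.1));
    // truncate is a no-op when k exceeds the record count.
    scored.truncate(k);
    Ok(scored)
}
```

Scanning and sorting every record costs O(n log n) per query, which is fine for a small educational store. A bounded heap would reduce the cost to O(n log k) if collections grow.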