Crates.io | ann_dataset |
lib.rs | ann_dataset |
version | 0.1.3 |
source | src |
created_at | 2024-04-18 20:01:01.768726 |
updated_at | 2024-06-05 13:48:55.420564 |
description | A lightweight research library for managing Approximate Nearest Neighbor search datasets. |
repository | https://github.com/sbruch/ann-dataset |
id | 1212850 |
size | 49,337 |
A lightweight research library for managing Approximate Nearest Neighbor search datasets.
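Its features include loading a dataset from an HDF5 file, accessing data points and query sets, retrieving ground truth for a given metric, and computing mean recall.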
Find out more on crates.io.
Reading an ANN dataset is straightforward. The snippet below gives a concise example, where `path_to_hdf5` is the path to the dataset's HDF5 file.
use ann_dataset::{AnnDataset, Hdf5File, InMemoryAnnDataset, Metric,
                  PointSet, QuerySet, GroundTruth};

// Load the dataset.
let dataset = InMemoryAnnDataset::<f32>::read(path_to_hdf5)
    .expect("Failed to read the dataset.");

// Get a reference to the data points.
let data_points: &PointSet<_> = dataset.get_data_points();

// Get the test query set.
let test: &QuerySet<_> = dataset.get_test_query_set()
    .expect("Failed to load test query set.");
let test_queries: &PointSet<_> = test.get_points();

// Get the ground truth (exact nearest neighbors) for inner product search.
let gt: &GroundTruth = test.get_ground_truth(&Metric::InnerProduct)
    .expect("Failed to load ground truth for InnerProduct search.");

// Compute recall, assuming `retrieved_set` is &[Vec<usize>],
// where the `i`-th entry is a list of ids of retrieved points
// for the `i`-th query.
let recall = gt.mean_recall(retrieved_set);
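To make the last step concrete, here is a minimal standalone sketch of what a mean recall computation typically looks like. It does not use `ann_dataset` and is not the crate's implementation: the function name `mean_recall_sketch` and its plain `&[Vec<usize>]` inputs are assumptions made for illustration. Recall for one query is taken to be the fraction of its ground-truth ids that appear in the retrieved list, and the mean is taken over all queries.

```rust
use std::collections::HashSet;

/// Illustrative mean recall over a batch of queries (hypothetical helper,
/// not part of `ann_dataset`).
///
/// `ground_truth[i]` holds the ids of the exact nearest neighbors of query `i`;
/// `retrieved[i]` holds the ids returned by the ANN index for that query.
fn mean_recall_sketch(ground_truth: &[Vec<usize>], retrieved: &[Vec<usize>]) -> f32 {
    assert_eq!(
        ground_truth.len(),
        retrieved.len(),
        "expected one retrieved list per query"
    );
    if ground_truth.is_empty() {
        return 0.0;
    }

    let total: f32 = ground_truth
        .iter()
        .zip(retrieved.iter())
        .map(|(gt, ret)| {
            // A query with no ground truth contributes zero recall.
            if gt.is_empty() {
                return 0.0;
            }
            // Count how many ground-truth ids were retrieved.
            let found: HashSet<usize> = ret.iter().copied().collect();
            let hits = gt.iter().filter(|id| found.contains(*id)).count();
            hits as f32 / gt.len() as f32
        })
        .sum();

    // Average the per-query recall values.
    total / ground_truth.len() as f32
}
```

For example, with ground truth `[[0, 1], [2, 3]]` and retrieved ids `[[1, 5], [2, 3]]`, the per-query recalls are 0.5 and 1.0, so the mean recall is 0.75.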