hallucination-detection

Crates.io	hallucination-detection
lib.rs	hallucination-detection
version	0.1.5
source	src
created_at	2024-12-06 01:00:04.409402
updated_at	2024-12-08 02:09:48.122791
description	Extremely fast Hallucination Detection for RAG using BERT NER, noun, and number analysis
homepage
repository	https://github.com/devflowinc/trieve
max_upload_size
id	1473786
size	112,899

DNS (densumesh)

hallucination-detection

A high-performance Rust library for detecting hallucinations in Large Language Model (LLM) outputs using BERT Named Entity Recognition (NER), proper noun analysis, and numerical comparisons.

Features

Fast and accurate hallucination detection for RAG (Retrieval-Augmented Generation) systems
Numerical comparison and validation
Unknown word detection using comprehensive English word dictionary
Configurable scoring weights and detection options
Async/await support with Tokio runtime
Optional ONNX support for improved performance
Optional BERT-based Named Entity Recognition for proper noun analysis
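To make the numerical comparison feature concrete, here is a minimal sketch of one way such a check could work: collect the numbers in the LLM output and measure how many of them appear in no reference text. The functions `extract_numbers` and `number_mismatch_score` are hypothetical names written for this illustration; they are not the crate's API and the crate's actual algorithm may differ.

```rust
use std::collections::HashSet;

/// Hypothetical helper: extracts numeric tokens from text, dropping
/// thousands separators so that "500,000" and "500000" compare equal.
/// Tokens that mix digits with other characters (e.g. "2024-12") are skipped.
fn extract_numbers(text: &str) -> HashSet<String> {
    text.split_whitespace()
        .map(|tok| {
            // Strip leading/trailing punctuation or letters, then remove commas.
            tok.trim_matches(|c: char| !c.is_ascii_digit())
                .replace(',', "")
        })
        .filter(|tok| {
            !tok.is_empty() && tok.chars().all(|c| c.is_ascii_digit() || c == '.')
        })
        .collect()
}

/// Hypothetical helper: share of numbers in `output` that appear in no
/// reference. 0.0 means every number is supported, 1.0 means none are.
/// An output without numbers scores 0.0.
fn number_mismatch_score(output: &str, references: &[&str]) -> f64 {
    let output_numbers = extract_numbers(output);
    if output_numbers.is_empty() {
        return 0.0;
    }
    // Pool the numbers found across all reference texts.
    let known: HashSet<String> = references
        .iter()
        .flat_map(|r| extract_numbers(r))
        .collect();
    let unsupported = output_numbers.difference(&known).count();
    unsupported as f64 / output_numbers.len() as f64
}
```

Comparing normalized strings keeps the sketch short; a fuller version would likely parse values so that "0.5" and "0.50" also match.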

Installation

Add this to your Cargo.toml:

[dependencies]
hallucination-detection = "^0.1.3"

If you want to use NER:

Download libtorch from https://pytorch.org/get-started/locally/. This package requires v2.4: if this version is no longer available on the "get started" page, the file should be accessible by modifying the target link, for example https://download.pytorch.org/libtorch/cpu/libtorch-cxx11-abi-shared-with-deps-2.4.0%2Bcpu.zip for a Linux version with CPU.
Extract the library to a location of your choice
Set the following environment variables:

Linux:

export LIBTORCH=/path/to/libtorch
export LD_LIBRARY_PATH=${LIBTORCH}/lib:$LD_LIBRARY_PATH

Windows (PowerShell):

$Env:LIBTORCH = "X:\path\to\libtorch"
$Env:Path += ";X:\path\to\libtorch\lib"

Then enable the ner feature in your Cargo.toml:

[dependencies]
hallucination-detection = { version = "^0.1.3", features = ["ner"] }

If you want to run the NER models through ONNX, either install ONNX Runtime for the ort crate to link against, or add ort to your dependencies with its download-binaries feature:

hallucination-detection = { version = "^0.1.3", features = ["ner", "onnx"] }
ort = { version = "...", features = [ "download-binaries" ] }

Quick Start

use hallucination_detection::{HallucinationDetector, HallucinationOptions};

#[tokio::main]
async fn main() {
    // Create a detector with the default options
    let detector = HallucinationDetector::new(HallucinationOptions::default())
        .expect("Failed to create detector");

    // Example texts
    let llm_output = String::from("Tesla sold 500,000 cars in Europe last quarter.");
    let references = vec![
        String::from("Tesla reported strong sales in European markets."),
        String::from("The company's global deliveries increased.")
    ];

    // Detect hallucinations
    let score = detector.detect_hallucinations(&llm_output, &references).await;

    println!("Hallucination Score: {:#?}", score);
}

Configuration

You can customize the detector's behavior using HallucinationOptions:

use hallucination_detection::{HallucinationDetector, HallucinationOptions, ScoreWeights};

let options = HallucinationOptions {
    weights: ScoreWeights {
        proper_noun_weight: 0.4,
        unknown_word_weight: 0.1,
        number_mismatch_weight: 0.5,
    },
    use_ner: true,
};

let detector = HallucinationDetector::new(options)
    .expect("Failed to create detector");
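To illustrate what the weights control, the sketch below combines three component scores into one total. It assumes that the total is a plain weighted sum of the components; the README does not state the formula, so treat this as an illustration and not as the crate's implementation. `ScoreWeights` here is a local stand-in that mirrors the documented field names, and `weighted_total` is a hypothetical function.

```rust
/// Local stand-in mirroring the documented fields of `ScoreWeights`.
/// It is redefined here so the sketch compiles without the crate.
#[derive(Debug, Clone, Copy)]
struct ScoreWeights {
    proper_noun_weight: f64,
    unknown_word_weight: f64,
    number_mismatch_weight: f64,
}

/// Hypothetical combination rule (an assumption, not confirmed by the
/// README): each component score is multiplied by its weight and summed.
fn weighted_total(
    proper_noun_score: f64,
    unknown_word_score: f64,
    number_mismatch_score: f64,
    weights: &ScoreWeights,
) -> f64 {
    proper_noun_score * weights.proper_noun_weight
        + unknown_word_score * weights.unknown_word_weight
        + number_mismatch_score * weights.number_mismatch_weight
}
```

Under this assumption, weights that sum to 1.0 (as 0.4, 0.1, and 0.5 do) keep the total within the same 0.0 to 1.0 range as the component scores, and raising number_mismatch_weight makes unsupported figures dominate the result.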

Output

The detector returns a HallucinationScore struct containing:

pub struct HallucinationScore {
    pub proper_noun_score: f64,
    pub unknown_word_score: f64,
    pub number_mismatch_score: f64,
    pub total_score: f64,
    pub detected_hallucinations: Vec<String>,
}

Scores range from 0.0 (no hallucination) to 1.0 (complete hallucination)
detected_hallucinations contains specific elements that were flagged
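One way to act on the result is to compare total_score with a threshold and surface the flagged elements for review. The sketch below uses a local stand-in struct with the same fields as the documented `HallucinationScore`; the `Verdict` enum, the `judge` function, and the threshold value are hypothetical and chosen only for illustration.

```rust
/// Local stand-in with the same fields as the documented
/// `HallucinationScore`, redefined so the sketch compiles on its own.
#[derive(Debug, Clone)]
struct HallucinationScore {
    proper_noun_score: f64,
    unknown_word_score: f64,
    number_mismatch_score: f64,
    total_score: f64,
    detected_hallucinations: Vec<String>,
}

/// Hypothetical policy outcome for an LLM answer.
#[derive(Debug, PartialEq)]
enum Verdict {
    /// The total score is at or below the threshold.
    Accept,
    /// The total score exceeds the threshold; carries the flagged elements.
    Review(Vec<String>),
}

/// Hypothetical policy: send the answer to review when `total_score`
/// is strictly greater than `threshold`.
fn judge(score: &HallucinationScore, threshold: f64) -> Verdict {
    if score.total_score > threshold {
        Verdict::Review(score.detected_hallucinations.clone())
    } else {
        Verdict::Accept
    }
}
```

A suitable threshold depends on the application and on the configured weights, so it is best tuned against a sample of known-good and known-bad answers.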

Performance Considerations

The NER model is loaded once and reused across predictions
The English word dictionary is cached locally for faster subsequent runs
Async operations allow for non-blocking execution
ONNX runtime provides optimized model inference
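The "load once, reuse" pattern in the first point can be sketched with the standard library's `OnceLock`. This is a generic illustration of the technique, not the crate's internals: `Model`, `load_model`, `shared_model`, and `load_count` are hypothetical names, and the counter exists only to show that the expensive load runs a single time.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Counts how many times the expensive load actually runs.
static LOAD_COUNT: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for a model that is expensive to load.
struct Model {
    labels: Vec<&'static str>,
}

/// Simulates the expensive load and records that it happened.
fn load_model() -> Model {
    LOAD_COUNT.fetch_add(1, Ordering::SeqCst);
    Model {
        labels: vec!["PER", "ORG", "LOC"],
    }
}

/// Returns the shared model, loading it on first use only.
/// `OnceLock` guarantees the initializer runs at most once,
/// even when several threads call this concurrently.
fn shared_model() -> &'static Model {
    static MODEL: OnceLock<Model> = OnceLock::new();
    MODEL.get_or_init(load_model)
}

/// Reports how many times `load_model` has run so far.
fn load_count() -> usize {
    LOAD_COUNT.load(Ordering::SeqCst)
}
```

Every caller receives a reference to the same instance, so the loading cost is paid on the first prediction and later calls only read the cached value.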

Feature Flags

ner: Enables BERT Named Entity Recognition (default: disabled)
onnx: Uses ONNX runtime for improved performance (default: disabled)

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Authors

Devflow Inc. humans@trieve.ai

Acknowledgments

Uses rust-bert for NER capabilities
English word list from dwyl/english-words

Commit count: 4235
