pv_leopard

Crates.io	pv_leopard
lib.rs	pv_leopard
version	2.0.3
source	src
created_at	2022-03-19 16:58:53.284588
updated_at	2024-09-16 18:39:36.275687
description	The Rust bindings for Picovoice's Leopard library
homepage	https://picovoice.ai/platform/leopard/
repository	https://github.com/Picovoice/leopard
max_upload_size	52428800
id	553236
size	51,679,312

Albert Ho (albho)


Leopard Binding for Rust

Leopard Speech-to-Text Engine

Made in Vancouver, Canada by Picovoice

Leopard is an on-device speech-to-text engine. Leopard is:

Private; All voice processing runs locally.
Accurate
Compact and Computationally-Efficient
Cross-Platform:
- Linux (x86_64), macOS (x86_64, arm64), Windows (x86_64)
- Android and iOS
- Chrome, Safari, Firefox, and Edge
- Raspberry Pi (3, 4, 5)

Compatibility

Rust 1.54+
Runs on Linux (x86_64), macOS (x86_64, arm64), Windows (x86_64), and Raspberry Pi (3, 4, 5).

Installation

First you will need Rust and Cargo installed on your system.

To add the Leopard library to your app, add pv_leopard to your app's Cargo.toml manifest:

[dependencies]
pv_leopard = "*"

If you prefer to clone the repo and use it locally, first run copy.sh. (NOTE: on Windows, Git Bash or another bash shell is required, or you will have to manually copy the libs into the project). Then you can reference the local binding location:

[dependencies]
pv_leopard = { path = "/path/to/rust/binding" }

AccessKey

Leopard requires a valid Picovoice AccessKey at initialization. The AccessKey acts as your credentials when using Leopard SDKs. You can get your AccessKey for free. Make sure to keep your AccessKey secret. Sign up or log in to Picovoice Console to get your AccessKey.
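One way to keep the AccessKey out of source code is to read it from an environment variable at startup. The following is a minimal sketch using only the standard library. The variable name PICOVOICE_ACCESS_KEY and both helper functions are assumptions made for illustration and are not part of the Leopard SDK.

```rust
use std::env;

/// Hypothetical variable name chosen for this sketch; the Leopard SDK does not define it.
const ACCESS_KEY_VAR: &str = "PICOVOICE_ACCESS_KEY";

/// Trims the raw value and rejects missing or empty keys.
fn parse_access_key(raw: Option<String>) -> Result<String, String> {
    match raw.map(|v| v.trim().to_string()) {
        Some(key) if !key.is_empty() => Ok(key),
        Some(_) => Err(format!("{} is set but empty", ACCESS_KEY_VAR)),
        None => Err(format!("{} is not set", ACCESS_KEY_VAR)),
    }
}

/// Reads the AccessKey from the environment instead of hardcoding it.
/// A value that is not valid Unicode is treated the same as a missing one.
fn load_access_key() -> Result<String, String> {
    parse_access_key(env::var(ACCESS_KEY_VAR).ok())
}

fn main() {
    match load_access_key() {
        // The key itself is deliberately never printed.
        Ok(_key) => println!("AccessKey loaded from {}", ACCESS_KEY_VAR),
        Err(msg) => eprintln!("{}", msg),
    }
}
```

The resulting string can then be passed wherever an AccessKey is expected, so the key never needs to be committed to version control.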

Usage

Create an instance of the engine and transcribe an audio file:

use leopard::{Leopard, LeopardBuilder};

fn main() {
    // AccessKey obtained from Picovoice Console (https://console.picovoice.ai/)
    let access_key = "${ACCESS_KEY}";

    // Build the engine with the default language model.
    let leopard: Leopard = LeopardBuilder::new()
        .access_key(access_key)
        .init()
        .expect("Unable to create Leopard");

    // Transcribe the file and print the transcript, or report the failure.
    match leopard.process_file("${AUDIO_FILE_PATH}") {
        Ok(leopard_transcript) => println!("{}", leopard_transcript.transcript),
        Err(_) => eprintln!("Unable to process the audio file"),
    }
}

Replace ${ACCESS_KEY} with your AccessKey obtained from Picovoice Console and ${AUDIO_FILE_PATH} with the path to an audio file.

The model file contains the parameters for the Leopard engine. You may create bespoke language models using Picovoice Console and then pass in the relevant file.

Language Model

The Leopard Rust SDK comes preloaded with a default English language model (.pv file). Default models for other supported languages can be found in lib/common.

Create custom language models using the Picovoice Console. Here you can train language models with custom vocabulary and boost words in the existing vocabulary.

Pass in the .pv file via the builder's .model_path() method:

let leopard: Leopard = LeopardBuilder::new()
    .access_key("${ACCESS_KEY}")
    .model_path("${MODEL_FILE_PATH}")
    .init()
    .expect("Unable to create Leopard");
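Before handing a custom model path to the builder, it can help to confirm that the path points at an existing .pv file, so that a typo surfaces as a clear message. This is a minimal sketch using only the standard library. The helper check_model_path is hypothetical and is not provided by the Leopard SDK.

```rust
use std::path::Path;

/// Hypothetical pre-flight check for a model path; not part of the Leopard SDK.
/// Verifies the .pv extension first, then that the file exists on disk.
fn check_model_path(path: &str) -> Result<&Path, String> {
    let p = Path::new(path);
    let has_pv_ext = p.extension().and_then(|e| e.to_str()) == Some("pv");
    if !has_pv_ext {
        return Err(format!("expected a .pv model file, got: {}", path));
    }
    if !p.is_file() {
        return Err(format!("model file not found: {}", path));
    }
    Ok(p)
}

fn main() {
    // The placeholder below mirrors the one used in this README.
    match check_model_path("${MODEL_FILE_PATH}") {
        Ok(p) => println!("Model file looks usable: {}", p.display()),
        Err(msg) => eprintln!("{}", msg),
    }
}
```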

Word Metadata

Along with the transcript, Leopard returns metadata for each transcribed word. Available metadata items are:

Start Time: Indicates when the word started in the transcribed audio. Value is in seconds.
End Time: Indicates when the word ended in the transcribed audio. Value is in seconds.
Confidence: Leopard's confidence that the transcribed word is accurate. It is a number within [0, 1].
Speaker Tag: If speaker diarization is enabled on initialization, the speaker tag is a non-negative integer identifying unique speakers, with 0 reserved for unknown speakers. If speaker diarization is not enabled, the value will always be -1.
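The metadata items above can be post-processed in ordinary Rust, for example to flag uncertain words or to group words by speaker. The sketch below uses a hypothetical WordMeta struct as a stand-in. Its field names and types are assumptions made for illustration and may differ from the types the crate actually returns.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for one word's metadata. Field names and types are
/// assumptions for this sketch, not the crate's actual definitions.
#[derive(Debug, Clone, PartialEq)]
struct WordMeta {
    word: String,
    start_sec: f32,   // when the word started, in seconds
    end_sec: f32,     // when the word ended, in seconds
    confidence: f32,  // within [0, 1]
    speaker_tag: i32, // -1: diarization disabled, 0: unknown speaker
}

/// Returns the words whose confidence falls below `threshold`.
fn low_confidence(words: &[WordMeta], threshold: f32) -> Vec<&WordMeta> {
    words.iter().filter(|w| w.confidence < threshold).collect()
}

/// Groups words by speaker tag, preserving word order within each speaker.
fn words_by_speaker(words: &[WordMeta]) -> BTreeMap<i32, Vec<&str>> {
    let mut groups: BTreeMap<i32, Vec<&str>> = BTreeMap::new();
    for w in words {
        groups.entry(w.speaker_tag).or_default().push(w.word.as_str());
    }
    groups
}

/// Made-up sample data for the sketch; not real engine output.
fn sample_words() -> Vec<WordMeta> {
    vec![
        WordMeta { word: "hello".into(), start_sec: 0.0, end_sec: 0.4, confidence: 0.95, speaker_tag: 1 },
        WordMeta { word: "there".into(), start_sec: 0.5, end_sec: 0.9, confidence: 0.42, speaker_tag: 1 },
        WordMeta { word: "hi".into(), start_sec: 1.2, end_sec: 1.4, confidence: 0.88, speaker_tag: 2 },
    ]
}

fn main() {
    let words = sample_words();
    for w in &words {
        println!(
            "{:.2}s - {:.2}s  {}  confidence={:.2}  speaker={}",
            w.start_sec, w.end_sec, w.word, w.confidence, w.speaker_tag
        );
    }
    for w in low_confidence(&words, 0.5) {
        println!("low confidence: {}", w.word);
    }
    for (tag, spoken) in words_by_speaker(&words) {
        println!("speaker {}: {}", tag, spoken.join(" "));
    }
}
```

A BTreeMap keeps the speaker tags in ascending order, so words tagged -1 (diarization disabled) or 0 (unknown speaker) are listed before the identified speakers.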

Demos

The Leopard Rust demo project is a Rust console app that allows for processing real-time audio (i.e. microphone) and files using Leopard.

Commit count: 233