| Crates.io | llm-latency-lens-core |
| lib.rs | llm-latency-lens-core |
| version | 0.1.0 |
| created_at | 2025-11-07 21:00:06.684636+00 |
| updated_at | 2025-11-07 21:00:06.684636+00 |
| description | Core types and timing engine for LLM Latency Lens |
| homepage | https://github.com/llm-devops/llm-latency-lens |
| repository | https://github.com/llm-devops/llm-latency-lens |
| max_upload_size | |
| id | 1922160 |
| size | 54,796 |
Core timing engine and types for LLM Latency Lens.
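This crate provides the foundational infrastructure for high-precision measurement of LLM API latency. It includes a timing engine, type-safe session and request identifiers, a provider trait, and high-precision timing built on the quanta crate.
Add this to your Cargo.toml: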
```toml
[dependencies]
llm-latency-lens-core = "0.1.0"
```
```rust
use llm_latency_lens_core::{TimingEngine, SessionId, RequestId};

// Create a timing engine
let engine = TimingEngine::new();
let session_id = SessionId::new();
let request_id = RequestId::new();

// Record timing events
engine.record_request_start(session_id, request_id);
// ... make LLM API call ...
engine.record_first_token(session_id, request_id);
// ... process tokens ...
engine.record_request_end(session_id, request_id);

// Get timing data (the `?` assumes an enclosing function that returns a Result)
let metrics = engine.get_request_metrics(session_id, request_id)?;
println!("TTFT: {:?}", metrics.ttft);
```
The main timing component that provides monotonic clock access and event recording:
```rust
pub struct TimingEngine {
    clock: Clock,
}
```
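To make the event-recording idea concrete, here is a minimal sketch of how such an engine could store timestamps and derive time to first token (TTFT). It is not the crate's implementation: `SketchEngine`, `Key`, `Marks`, and the `total` field are hypothetical names, plain integer pairs stand in for the ID types, and `std::time::Instant` stands in for the quanta clock so the sketch has no dependencies.

```rust
use std::collections::HashMap;
use std::sync::Mutex;
use std::time::{Duration, Instant};

/// Hypothetical key: (session, request) as plain integers for brevity.
pub type Key = (u64, u64);

/// Timestamps recorded for one request; each is set at most once per call.
#[derive(Default, Clone, Copy)]
struct Marks {
    start: Option<Instant>,
    first_token: Option<Instant>,
    end: Option<Instant>,
}

/// Hypothetical metrics type; only `ttft` appears in the README usage example.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct RequestMetrics {
    pub ttft: Duration,
    pub total: Duration,
}

/// Stand-in engine: a mutex-guarded map from request key to timestamps.
#[derive(Default)]
pub struct SketchEngine {
    marks: Mutex<HashMap<Key, Marks>>,
}

impl SketchEngine {
    pub fn new() -> Self {
        Self::default()
    }

    pub fn record_request_start(&self, key: Key) {
        self.marks.lock().unwrap().entry(key).or_default().start = Some(Instant::now());
    }

    pub fn record_first_token(&self, key: Key) {
        self.marks.lock().unwrap().entry(key).or_default().first_token = Some(Instant::now());
    }

    pub fn record_request_end(&self, key: Key) {
        self.marks.lock().unwrap().entry(key).or_default().end = Some(Instant::now());
    }

    /// Returns `None` until start, first token, and end have all been recorded.
    pub fn get_request_metrics(&self, key: Key) -> Option<RequestMetrics> {
        let marks = *self.marks.lock().unwrap().get(&key)?;
        let start = marks.start?;
        Some(RequestMetrics {
            ttft: marks.first_token?.duration_since(start),
            total: marks.end?.duration_since(start),
        })
    }
}
```

Because every timestamp comes from the same monotonic clock, `ttft` can never exceed `total` when events are recorded in order. Taking `&self` rather than `&mut self` mirrors the usage example, where the engine is not declared mutable.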
Type-safe identifiers for tracking sessions and individual requests:
```rust
pub struct SessionId(Uuid);
pub struct RequestId(Uuid);
```
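The benefit of wrapping each identifier in its own type is that the compiler rejects a session ID passed where a request ID is expected. The sketch below illustrates the pattern under stated assumptions: it uses a process-wide `u64` counter instead of a `Uuid` so that it needs no external crate, and `NEXT_ID` and `describe` are hypothetical helpers, not part of this crate.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Assumption: a simple counter replaces UUID generation in this sketch.
static NEXT_ID: AtomicU64 = AtomicU64::new(1);

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct SessionId(u64);

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct RequestId(u64);

impl SessionId {
    /// Returns a fresh identifier, distinct from every other one issued so far.
    pub fn new() -> Self {
        SessionId(NEXT_ID.fetch_add(1, Ordering::Relaxed))
    }
}

impl RequestId {
    /// Returns a fresh identifier, distinct from every other one issued so far.
    pub fn new() -> Self {
        RequestId(NEXT_ID.fetch_add(1, Ordering::Relaxed))
    }
}

/// Hypothetical helper: takes the two IDs in a fixed order. Swapping the
/// arguments at a call site fails to compile because the types differ.
pub fn describe(session: SessionId, request: RequestId) -> String {
    format!("session={} request={}", session.0, request.0)
}
```

Deriving `Copy`, `Eq`, and `Hash` lets the IDs be passed by value and used as map keys, which matches how the usage example hands the same `session_id` and `request_id` to several calls.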
Abstract interface that all LLM providers must implement:
```rust
#[async_trait]
pub trait Provider: Send + Sync {
    async fn send_request(&self, request: StreamingRequest)
        -> Result<StreamingResponse>;
    fn name(&self) -> &str;
}
```
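As a rough illustration of what implementing and timing a provider could look like, the sketch below uses a simplified synchronous stand-in for the trait so it runs without an async runtime or the `async_trait` crate. Everything in it is an assumption for illustration: `SyncProvider`, `EchoProvider`, `timed_call`, the fields of `StreamingRequest` and `StreamingResponse`, and the `String` error type are not taken from this crate.

```rust
use std::time::{Duration, Instant};

// Hypothetical request and response shapes; the real types may differ.
pub struct StreamingRequest {
    pub prompt: String,
}

pub struct StreamingResponse {
    pub tokens: Vec<String>,
}

/// Synchronous stand-in for the async `Provider` trait shown above.
pub trait SyncProvider: Send + Sync {
    fn send_request(&self, request: StreamingRequest) -> Result<StreamingResponse, String>;
    fn name(&self) -> &str;
}

/// Mock provider that echoes the prompt back as whitespace-separated tokens.
pub struct EchoProvider;

impl SyncProvider for EchoProvider {
    fn send_request(&self, request: StreamingRequest) -> Result<StreamingResponse, String> {
        if request.prompt.is_empty() {
            return Err("empty prompt".to_string());
        }
        let tokens = request
            .prompt
            .split_whitespace()
            .map(str::to_string)
            .collect();
        Ok(StreamingResponse { tokens })
    }

    fn name(&self) -> &str {
        "echo"
    }
}

/// Sends one request through any provider and reports the token count
/// together with the wall-clock time the call took.
pub fn timed_call(provider: &dyn SyncProvider, prompt: &str) -> Result<(usize, Duration), String> {
    let start = Instant::now();
    let response = provider.send_request(StreamingRequest {
        prompt: prompt.to_string(),
    })?;
    Ok((response.tokens.len(), start.elapsed()))
}
```

Because `timed_call` accepts `&dyn SyncProvider`, the same measurement code works for any backend that implements the trait, which is the point of keeping providers behind a common interface.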
The timing engine uses quanta for high-precision monotonic timing with minimal overhead.
Licensed under the Apache License, Version 2.0. See LICENSE file for details.
This crate is part of the LLM Latency Lens project. See the main repository for contribution guidelines.