consortium 0.0.1
Repository: https://github.com/KenAKAFrosty/consortium
Author: Ken aka Frosty (KenAKAFrosty)

README

Consortium

Evergreen best-of-best LLM generations at your fingertips

  • Provide a list of prompts. Completions are not needed, but these prompts should come from actual use cases of your desired downstream task.

  • Embeddings are created for each prompt. A collection is then built by repeatedly adding the prompt most DISsimilar to the average of the then-current collection, forming an even fence around the latent space of the possible prompts.

  • The collection will grow up to a target number of samples (which, naturally, should be drastically fewer than the total number of prompts), or up to a chosen cosine-similarity value that acts as a tripwire: when fetching the next most-dissimilar-from-group prompt, if its cosine similarity is above this value, no more are added and the collection is complete (a sketch of this loop follows this list).

  • Then, best-of-the-best LLM responses are generated against those inputs, through multi-model sampling followed by a multi-model judgement phase where the outputs are sorted source-blind according to each judgement model's preference, then scored in aggregate (see the scoring sketch after this list).

  • The final result is a highest-possible-quality data set of prompt -> completion pairs, across the most diverse possible prompts for your downstream task.

  • This consortium-style inference could also be used as a traditional inference client/SDK. Naturally, that will be categorically slower - and more expensive - but that may be fine for some use cases!
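As a concrete illustration of the selection loop described above, here is a minimal sketch in plain Rust. Every name in it (`cosine_similarity`, `centroid`, `select_diverse`, `similarity_tripwire`) is hypothetical; the crate is at 0.0.1 and its real API may look nothing like this.

```rust
// Hypothetical sketch of the diverse-prompt selection loop; not the crate's API.

fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b.iter()).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm_a * norm_b)
}

/// Element-wise mean of the embeddings already in the collection.
fn centroid(embeddings: &[Vec<f32>], indices: &[usize]) -> Vec<f32> {
    let mut mean = vec![0.0f32; embeddings[indices[0]].len()];
    for &i in indices {
        for (m, v) in mean.iter_mut().zip(&embeddings[i]) {
            *m += v;
        }
    }
    let n = indices.len() as f32;
    for m in &mut mean {
        *m /= n;
    }
    mean
}

/// Greedily add the prompt most dissimilar to the current collection's
/// average embedding, stopping at `target` samples or when even the most
/// dissimilar remaining prompt is above `similarity_tripwire`.
fn select_diverse(embeddings: &[Vec<f32>], target: usize, similarity_tripwire: f32) -> Vec<usize> {
    if embeddings.is_empty() || target == 0 {
        return Vec::new();
    }
    let mut selected = vec![0]; // seed the collection with the first prompt
    while selected.len() < target && selected.len() < embeddings.len() {
        let mean = centroid(embeddings, &selected);
        // Find the remaining prompt least similar to the collection's centroid.
        let (next, similarity) = (0..embeddings.len())
            .filter(|i| !selected.contains(i))
            .map(|i| (i, cosine_similarity(&embeddings[i], &mean)))
            .min_by(|a, b| a.1.total_cmp(&b.1))
            .expect("loop guard guarantees a remaining candidate");
        if similarity > similarity_tripwire {
            break; // tripwire hit: the fence around the latent space is complete
        }
        selected.push(next);
    }
    selected
}
```

The judgement phase's "scored in aggregate" step is not pinned down here; one plausible reading is a Borda-style count, where each judge's source-blind ranking awards points by position and the totals decide. A sketch under that assumption:

```rust
/// `rankings[j]` is judge j's preference order over candidate indices,
/// best first, produced without knowing which model wrote which output.
fn aggregate_scores(rankings: &[Vec<usize>], num_candidates: usize) -> Vec<usize> {
    let mut scores = vec![0usize; num_candidates];
    for ranking in rankings {
        for (position, &candidate) in ranking.iter().enumerate() {
            // First place earns the most points, last place the fewest.
            scores[candidate] += num_candidates - position;
        }
    }
    scores
}

/// Index of the highest-scoring candidate, if any.
fn best_candidate(scores: &[usize]) -> Option<usize> {
    scores
        .iter()
        .enumerate()
        .max_by_key(|&(_, &score)| score)
        .map(|(i, _)| i)
}
```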

Goals & Intents

  • High focus on performance. Latency and throughput are first-class targets to optimize.

  • Resilience and retry behavior baked in.

  • The first passes at this may take direct/concrete/non-generic approaches, but the long-term intent is to make each of the key steps generic (for example: getting the embeddings, the system prompt for the judgement stage, etc.).

  • Will also offer a TypeScript-friendly client with napi.

  • Should probably also consider a Python-friendly client with pyo3.

  • This crate will also naturally end up being a great source of API clients for all sorts of model inference providers, simply in service of the main purpose. As such, the other part of this crate's namesake is having a consortium of LLMs to choose from for whatever application needs you may have.

  • The user-facing API surface should leverage Rust's strengths and make invalid states unrepresentable (see the sketch below). It is not a design goal for the user-facing clients to use an OpenAPI spec.
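As a taste of what that last point could mean in practice, here is a hypothetical sketch; none of these types exist in the crate, and they only illustrate the design direction. A zero sample target and an out-of-range tripwire simply cannot be constructed, and a consortium cannot exist without at least one model.

```rust
use std::num::NonZeroUsize;

/// A cosine similarity lives in [-1.0, 1.0]; validating once in the
/// constructor means every later use can trust the value.
pub struct CosineSimilarity(f32);

impl CosineSimilarity {
    pub fn new(value: f32) -> Option<Self> {
        (-1.0..=1.0).contains(&value).then_some(Self(value))
    }
}

pub struct Model {
    pub provider: String,
    pub name: String,
}

/// A consortium without models is meaningless, so the type demands at
/// least one model up front instead of checking `is_empty()` at call
/// time, and a zero sample target is unrepresentable via `NonZeroUsize`.
pub struct Consortium {
    pub first_model: Model,
    pub other_models: Vec<Model>,
    pub target_samples: NonZeroUsize,
    pub similarity_tripwire: CosineSimilarity,
}
```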

License

Licensed under either of

  • Apache License, Version 2.0
  • MIT license

at your option.
