Crates.io | gbnf |
lib.rs | gbnf |
version | 0.1.7 |
source | src |
created_at | 2024-01-05 07:37:08.282474 |
updated_at | 2024-01-07 08:01:27.785739 |
description | A library for working with GBNF |
homepage | |
repository | |
max_upload_size | |
id | 1089381 |
size | 145,997 |
A library for working with llama.cpp GBNF files. A GBNF file describes the grammar that an AI's output must follow. This project aims to make it easier to constrain and guide GBNF-driven AIs such as llama.cpp using JSON schema (a way to define the shape of JSON data). The hope is to produce more useful outputs when this is combined with system prompting (ideally prompting that is itself aware of the JSON schema to some degree).
This library was primarily built for its sister project, epistemology, an LLM API.
cargo add gbnf
Currently this library can convert a limited but very useful subset of JSON schema:
Known issues:
Here's one of the most complex JSON schemas that can be handled right now:
{
  "$schema": "https://json-schema.org/draft/2019-09/schema",
  "type": "object",
  "properties": {
    "name": {
      "description": "name of a computer user",
      "type": "string"
    },
    "age": {
      "description": "age of a computer user",
      "type": "number"
    },
    "usesAI": {
      "description": "do they use AI",
      "type": "boolean"
    },
    "favoriteAnimal": {
      "description": "favorite animal",
      "enum": [
        "dog",
        "cat",
        "none"
      ]
    },
    "currentAIModel": {
      "oneOf": [
        {
          "type": "object",
          "properties": {
            "type": {
              "value": "hugging_face"
            },
            "name": {
              "description": "name of hugging face model",
              "type": "string"
            }
          }
        },
        {
          "type": "object",
          "properties": {
            "type": {
              "value": "openai"
            }
          }
        }
      ]
    },
    "favoriteColors": {
      "type": "array",
      "items": {
        "type": "string"
      }
    }
  }
}
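To give a feel for what constraining an enum such as favoriteAnimal involves, here is a minimal, self-contained sketch that builds a GBNF alternation rule by hand. It does not use this library and is not its implementation: the helper names gbnf_literal and enum_rule, the rule name, and the exact output layout are all assumptions made for illustration.

```rust
/// Wrap `text` in a double-quoted GBNF literal, escaping quotes and backslashes.
/// Hypothetical helper, not part of the gbnf crate.
fn gbnf_literal(text: &str) -> String {
    let mut out = String::from("\"");
    for ch in text.chars() {
        match ch {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            other => out.push(other),
        }
    }
    out.push('"');
    out
}

/// Build one GBNF rule that accepts exactly the given JSON string values.
/// Assumes the values need no JSON escaping themselves (no quotes, backslashes
/// or control characters). Hypothetical helper, not part of the gbnf crate.
fn enum_rule(rule_name: &str, values: &[&str]) -> String {
    let alternatives: Vec<String> = values
        .iter()
        // Each alternative is the value as it appears in JSON, quotes included.
        .map(|value| gbnf_literal(&format!("\"{}\"", value)))
        .collect();
    format!("{} ::= ({}) ws", rule_name, alternatives.join(" | "))
}
```

Calling enum_rule("favorite-animal", &["dog", "cat", "none"]) yields a single rule whose alternatives are the three quoted JSON strings, so a grammar-constrained model could only emit one of them for that property. The rule this library actually generates for an enum may look different.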
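// Assumes `Grammar` is exported at the root of the gbnf crate.
use gbnf::Grammar;

fn simple_json_schema_basic_object_example() {
    // A JSON schema with one property of each primitive type.
    let schema = r#"
    {
        "$id": "https://example.com/enumerated-values.schema.json",
        "$schema": "https://json-schema.org/draft/2020-12/schema",
        "title": "Enumerated Values",
        "type": "object",
        "properties": {
            "a": {
                "type": "boolean"
            },
            "b": {
                "type": "number"
            },
            "c": {
                "type": "string"
            }
        }
    }
    "#;
    // Parse the schema, then render the resulting grammar as GBNF text.
    let g = Grammar::from_json_schema(schema).unwrap();
    let s = g.to_string();
    // Compare the rendered grammar with the expected GBNF output.
    pretty_assertions::assert_eq!(
        s,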
r#"################################################
# DYNAMICALLY GENERATED JSON-SCHEMA GRAMMAR
# $id: https://example.com/enumerated-values.schema.json
# $schema: https://json-schema.org/draft/2020-12/schema
# title: Enumerated Values
################################################
symbol1-a-value ::= boolean ws
symbol2-b-value ::= number ws
symbol3-c-value ::= string ws
root ::= "{" ws
"a" ws ":" ws symbol1-a-value
"b" ws ":" ws symbol2-b-value "," ws
"c" ws ":" ws symbol3-c-value "," ws
"}" ws
###############################
# Primitive value type symbols
###############################
null ::= "null" ws
boolean ::= "true" | "false" ws
string ::= "\"" ([^"\\] | "\\" (["\\/bfnrt] | "u" [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F]))* "\"" ws
number ::= ("-"? ([0-9] | [1-9] [0-9]*)) ("." [0-9]+)? ([eE] [-+]? [0-9]+)? ws
ws ::= [ ]
"#
)
}
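The per-property value rules at the top of the expected grammar follow a visible naming pattern: symbol, a 1-based index, the property name, then -value. The sketch below reproduces only that pattern as a standalone function. It is inferred from the example output rather than taken from the library's source, and value_rules is a hypothetical name.

```rust
/// Produce one value rule per (property name, primitive type) pair, following
/// the `symbolN-name-value ::= type ws` pattern seen in the expected output.
/// Hypothetical helper, not part of the gbnf crate.
fn value_rules(properties: &[(&str, &str)]) -> Vec<String> {
    properties
        .iter()
        .enumerate()
        .map(|(index, (name, primitive))| {
            // Indices in the generated rule names start at 1, not 0.
            format!("symbol{}-{}-value ::= {} ws", index + 1, name, primitive)
        })
        .collect()
}
```

Passing the three properties from the schema above, ("a", "boolean"), ("b", "number") and ("c", "string"), gives the same three rule lines that open the expected grammar. The root rule and the primitive type definitions are left out of this sketch.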
I'm totally down for contributors; please add tests with your changes.
See the documentation.
Multiple MIT-licensed GBNF examples from the llama.cpp grammar examples are used in automated compliance tests, and the Python JSON schema converter provided general inspiration for this project. Thank you.