Crates.io | ragit-pdl |
lib.rs | ragit-pdl |
source | src |
created_at | 2024-12-30 14:18:17.549094 |
updated_at | 2025-02-01 12:19:38.334878 |
description | pdl parser for ragit |
id | 1499191 |
size | 0 |
Pdl is a special file format used by the ragit project to represent prompts. It allows you to write templated prompts, embed images, and force LLMs to follow an output schema.
Pdl is basically a readable format for LLM messages. For example,
<|user|>
Hi, what's your name?
<|assistant|>
I'm Llama.
<|user|>
How old are you?
is converted to
[
  {
    "role": "user",
    "content": "Hi, what's your name?"
  }, {
    "role": "assistant",
    "content": "I'm Llama."
  }, {
    "role": "user",
    "content": "How old are you?"
  }
]
Each turn must start with a turn-separator: <|user|>, <|assistant|>, <|system|> or <|schema|>. A turn-separator must be preceded and followed by a newline character. If any content comes before the first turn-separator, that's an error.
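The rules above can be sketched in a few lines of Python. This is only an illustration, not the ragit implementation: the function name `parse_turns` is made up, blank lines before the first turn-separator are tolerated as an assumption, and the <|schema|> turn is kept as an ordinary message for simplicity.

```python
import re

# A turn-separator has to sit alone on its line.
TURN_SEPARATOR = re.compile(r"<\|(user|assistant|system|schema)\|>")


def parse_turns(pdl: str) -> list[dict]:
    """Split a pdl string into a list of {"role", "content"} messages."""
    turns = []  # each entry: (role, list of content lines)
    for line in pdl.splitlines():
        match = TURN_SEPARATOR.fullmatch(line)
        if match:
            # A new turn starts here.
            turns.append((match.group(1), []))
        elif turns:
            # Ordinary content line: it belongs to the current turn.
            turns[-1][1].append(line)
        elif line.strip():
            # Assumption: only non-blank content before the first separator is an error.
            raise ValueError("content before the first turn-separator")
    return [
        {"role": role, "content": "\n".join(lines).strip()}
        for role, lines in turns
    ]


if __name__ == "__main__":
    sample = "<|user|>\nHi, what's your name?\n<|assistant|>\nI'm Llama.\n"
    for message in parse_turns(sample):
        print(message)
```

Because the pattern is matched against the whole line, a separator that appears in the middle of a sentence is treated as plain content in this sketch.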
<|schema|> is a special type of turn. I'll talk about it later.
You can write a pragmatic prompt with the tera template engine. When the pdl engine parses a pdl file, the file first goes through the tera engine. That means tera syntax is applied before any pdl syntax. You can create or remove a turn using tera syntax, or create a templated schema. You can also write comments with tera's comment syntax.
There's a special syntax in pdl that allows you to embed images.
TODO: write doc
You can force LLMs to output a json value with a schema. You can set the schema with a <|schema|> turn. If it's not given, the engine doesn't check anything. If it's given more than once, that's an error.
<|schema|>
{ name: str, age: int }
<|user|>
Tell me about you.
The above pdl forces LLMs to output a json like { "name": "Llama", "age": 4 }. It's not magic. It's just a prompt enhancement. So I recommend you to
The <|schema|> turn does not reach the LLM.
You can add constraints to a schema. For example, { name: str, age: int { min: 0, max: 100 } } forces the age value to be between 0 and 100 (both inclusive).
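To make the constraint concrete, here is a rough Python sketch of the kind of check that { name: str, age: int { min: 0, max: 100 } } implies for an already-parsed json value. The helpers `check_int` and `check_person` are hypothetical names for illustration only; the real engine parses the schema text itself, which this sketch does not attempt.

```python
def check_int(value, minimum=None, maximum=None):
    """Check an int constraint like `int { min: 0, max: 100 }` (both bounds inclusive)."""
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if not isinstance(value, int) or isinstance(value, bool):
        raise ValueError(f"expected an integer, got {value!r}")
    if minimum is not None and value < minimum:
        raise ValueError(f"{value} is smaller than min {minimum}")
    if maximum is not None and value > maximum:
        raise ValueError(f"{value} is greater than max {maximum}")
    return value


def check_person(value):
    """Hand-written equivalent of `{ name: str, age: int { min: 0, max: 100 } }`."""
    if not isinstance(value, dict):
        raise ValueError("expected a json object")
    if not isinstance(value.get("name"), str):
        raise ValueError("`name` must be a string")
    check_int(value.get("age"), minimum=0, maximum=100)
    return value


if __name__ == "__main__":
    print(check_person({"name": "Llama", "age": 4}))
```

With this sketch, an age of 0 or 100 passes and an age of 101 raises a ValueError, matching the "both inclusive" rule.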
Basically, the pdl engine first extracts a json-looking string from the LLM output, then parses it. For example, if the schema is a json object, the engine tries to match a pair of curly braces using a regular expression. If it fails to parse the json, that's an error.
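The extract-then-parse step can be illustrated with a small Python sketch. The exact pattern the engine uses is not documented here, so the greedy `\{.*\}` pattern and the function name `extract_json_object` are assumptions made for the example.

```python
import json
import re


def extract_json_object(llm_output: str):
    """Find a json-looking object in free-form LLM output and parse it."""
    # Assumption: take everything from the first `{` to the last `}`.
    match = re.search(r"\{.*\}", llm_output, flags=re.DOTALL)
    if match is None:
        raise ValueError("no json object found in the output")
    # json.loads raises json.JSONDecodeError (a ValueError subclass) on bad json.
    return json.loads(match.group(0))


if __name__ == "__main__":
    reply = 'Sure! Here it is:\n{ "name": "Llama", "age": 4 }\nHope it helps.'
    print(extract_json_object(reply))
```

A greedy pattern keeps nested objects intact, but it would also swallow the text between two separate objects in the same reply, in which case parsing fails and the error is reported.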
There are 2 cases where it doesn't parse json.
If the schema is str, it just treats the entire output as the string. It doesn't look for quotation marks, and it doesn't run the parser. You can also add constraints to str. For example, if the schema is str { min: 100 }, it makes sure that the length of the entire output is at least 100 characters.
If the schema is yesno, it makes sure that the LLM's output is either yes or no. You cannot mix it with other json schemas because yes and no are not valid json values. If yes/no is all you need, yesno is better than bool because LLMs are usually better at English than json.
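Both special cases skip the json parser entirely, which the following Python sketch tries to show. The names `check_str` and `check_yesno` are hypothetical, and a few details are my assumptions rather than documented behavior: length is counted with Python's `len`, and the yes/no comparison ignores case and surrounding whitespace.

```python
def check_str(llm_output: str, min_len=None, max_len=None) -> str:
    """Handle a `str` schema: the whole output is the value, no json parsing."""
    length = len(llm_output)  # assumption: length is counted in characters
    if min_len is not None and length < min_len:
        raise ValueError(f"output has {length} characters, expected at least {min_len}")
    if max_len is not None and length > max_len:
        raise ValueError(f"output has {length} characters, expected at most {max_len}")
    return llm_output


def check_yesno(llm_output: str) -> bool:
    """Handle a `yesno` schema: accept only yes or no."""
    # Assumption: case and surrounding whitespace are ignored.
    answer = llm_output.strip().lower()
    if answer == "yes":
        return True
    if answer == "no":
        return False
    raise ValueError(f"expected yes or no, got {llm_output!r}")


if __name__ == "__main__":
    print(check_yesno("Yes\n"))
    print(check_str("x" * 120, min_len=100) == "x" * 120)
```

Returning a boolean from `check_yesno` is a design choice of this sketch; it keeps the caller from comparing strings again.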