simplify_baml_macros

version: 0.2.0
created_at: 2025-10-17 17:04:45.093383+00
updated_at: 2025-12-28 09:31:33.316188+00
description: Procedural macros for simplify_baml - automatic IR generation from Rust types
repository: https://github.com/catethos/simplify_baml
id: 1887933
size: 37,762
owner: Caleb (catethos)

README

Simplified BAML Runtime

A minimal implementation of the BAML runtime that demonstrates the core concepts, reducing the original ~50K-line codebase to ~5K lines.

What is BAML?

BAML is a language for defining and calling LLM functions with structured outputs. The runtime:

  1. Converts type definitions into human-readable schemas
  2. Injects schemas into Jinja2 templates
  3. Calls LLM APIs
  4. Parses and validates LLM responses

Quick Start

Add both crates to Cargo.toml:

[dependencies]
simplify_baml = "0.1.0"
simplify_baml_macros = "0.2.0"  # provides the derive macros imported below
tokio = { version = "1.0", features = ["full"] }

Then in main.rs:

use simplify_baml::*;
use simplify_baml_macros::{BamlSchema, BamlClient};
use std::collections::HashMap;

// 1. Define types
#[derive(BamlSchema)]
#[baml(description = "Information about a person")]
struct Person {
    #[baml(description = "Full name")]
    name: String,
    #[baml(description = "Age in years")]
    age: i64,
}

// 2. Define function
#[baml_function(client = "openai")]
fn extract_person(
    #[baml(description = "Text containing person information")]
    text: String
) -> Person {
    "Extract person info from: {{ text }}"
}

// 3. Define client
#[derive(BamlClient)]
#[baml(provider = "OpenAI", model = "gpt-4o-mini")]
struct OpenAIClient;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let ir = BamlSchemaRegistry::new()
        .register::<Person>()
        .build_with_functions(vec![extract_person()]);

    let client = OpenAIClient::new(std::env::var("OPENAI_API_KEY")?);

    let runtime = RuntimeBuilder::new()
        .ir(ir)
        .client("openai", client)
        .build();

    let mut params = HashMap::new();
    params.insert("text".to_string(), BamlValue::String("John is 30 years old".to_string()));

    let result = runtime.execute("ExtractPerson", params).await?;
    println!("{:?}", result);
    Ok(())
}
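
On success this prints the parsed result as a BamlValue map, e.g. BamlValue::Map({ "name": String("John"), "age": Int(30) }), which is the final stage of the pipeline shown under How It Works below.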

Core Components

Component   File                    Purpose
IR          src/ir.rs               Type definitions (Class, Enum, Field, BamlValue)
Macros      simplify_baml_macros/   #[derive(BamlSchema)], #[baml_function], #[derive(BamlClient)]
Registry    src/registry.rs         Collects types and builds IR
Schema      src/schema.rs           Converts IR to human-readable schema strings
Renderer    src/renderer.rs         Jinja2 template rendering with schema injection
Parser      src/parser.rs           Lenient JSON parsing with type coercion
Runtime     src/runtime.rs          Orchestrates all components
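
To make the table concrete, below is a minimal sketch of how the IR in src/ir.rs and the schema formatting in src/schema.rs fit together. The FieldType variants and the Class shape match what this README shows elsewhere; everything else (struct layouts, function names) is an illustrative assumption, not the crate's actual code.

// Sketch of the IR types and the schema formatter; the real
// src/ir.rs and src/schema.rs definitions may differ.
enum FieldType {
    String,
    Int,
    Float,
    Bool,
    List(Box<FieldType>),
    Class(String),
    Enum(String),
}

struct Field {
    name: String,
    ty: FieldType,
    optional: bool,
}

struct Class {
    name: String,
    fields: Vec<Field>,
}

// Produce the human-readable schema string from the diagram below,
// e.g. "{ name: string, age: int }".
fn format_schema(class: &Class) -> String {
    let rendered: Vec<String> = class
        .fields
        .iter()
        .map(|f| {
            let suffix = if f.optional { " or null" } else { "" };
            format!("{}: {}{}", f.name, type_name(&f.ty), suffix)
        })
        .collect();
    format!("{{ {} }}", rendered.join(", "))
}

fn type_name(ty: &FieldType) -> String {
    match ty {
        FieldType::String => "string".to_string(),
        FieldType::Int => "int".to_string(),
        FieldType::Float => "float".to_string(),
        FieldType::Bool => "bool".to_string(),
        FieldType::List(inner) => format!("{}[]", type_name(inner)),
        FieldType::Class(name) | FieldType::Enum(name) => name.clone(),
    }
}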

Macro System

#[derive(BamlSchema)] — Type Definitions

#[derive(BamlSchema)]
#[baml(description = "A person entity")]
struct Person {
    #[baml(description = "Full name")]
    name: String,
    
    #[baml(rename = "yearOfBirth")]
    year_of_birth: i64,
    
    email: Option<String>,  // Automatically optional
}

#[derive(BamlSchema)]
enum Role { Engineer, Manager, Designer }

Type Mapping:

  • String → FieldType::String
  • i64, i32 → FieldType::Int
  • f64, f32 → FieldType::Float
  • bool → FieldType::Bool
  • Option<T> → optional field
  • Vec<T> → FieldType::List(T)
  • Custom types → FieldType::Class or FieldType::Enum
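
Conceptually, the derive macro walks a struct's fields and applies exactly this mapping. Using the illustrative IR types sketched under Core Components, the IR built for the Quick Start Person struct would look roughly like this (the macro's actual generated code will differ):

// Roughly what #[derive(BamlSchema)] contributes for Person,
// expressed with the sketch IR types from above; illustrative only.
fn person_class() -> Class {
    Class {
        name: "Person".to_string(),
        fields: vec![
            Field { name: "name".to_string(), ty: FieldType::String, optional: false },
            Field { name: "age".to_string(), ty: FieldType::Int, optional: false },
            // An email: Option<String> field would map to optional: true.
        ],
    }
}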

#[baml_function] — Function Definitions

#[baml_function(client = "openai")]
fn summarize(
    #[baml(description = "Text to summarize")]
    text: String,
    #[baml(description = "Max sentences")]
    max_sentences: i64
) -> Summary {
    r#"Summarize in {{ max_sentences }} sentences: {{ text }}"#
}
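
The function body is a Jinja2 template that the Renderer fills in with the call's arguments before the LLM request. Below is a minimal sketch of that rendering step, using the minijinja crate as a stand-in for whatever engine the runtime actually uses:

use minijinja::{context, Environment};

// Render the summarize template with concrete arguments, as the
// Renderer stage would before sending the prompt to the LLM.
// minijinja here is an assumption, not necessarily the crate's engine.
fn render_summarize() -> Result<String, minijinja::Error> {
    let env = Environment::new();
    env.render_str(
        "Summarize in {{ max_sentences }} sentences: {{ text }}",
        context! { max_sentences => 3, text => "BAML is a language for LLM functions." },
    )
}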

#[derive(BamlClient)] — Client Configuration

// OpenAI
#[derive(BamlClient)]
#[baml(provider = "OpenAI", model = "gpt-4o-mini")]
struct MyOpenAI;

// Anthropic via OpenRouter
#[derive(BamlClient)]
#[baml(provider = "Custom", base_url = "https://openrouter.ai/api/v1", model = "anthropic/claude-3.5-sonnet")]
struct ClaudeClient;

// Local endpoint
#[derive(BamlClient)]
#[baml(provider = "Custom", base_url = "http://localhost:8080/v1", model = "local-model")]
struct LocalClient;

Multi-Provider Support

Use OpenRouter for access to 100+ models with one API key:

#[derive(BamlClient)]
#[baml(provider = "Custom", base_url = "https://openrouter.ai/api/v1", model = "anthropic/claude-3.5-sonnet")]
struct ClaudeClient;

#[derive(BamlClient)]
#[baml(provider = "Custom", base_url = "https://openrouter.ai/api/v1", model = "openai/gpt-4-turbo")]
struct GPT4Client;

let runtime = RuntimeBuilder::new()
    .ir(ir)
    .client("claude", ClaudeClient::new(api_key.clone()))
    .client("gpt4", GPT4Client::new(api_key.clone()))
    .build();

Running Examples

export OPENAI_API_KEY="your-key"

cargo run --example with_macros           # Complete macro example
cargo run --example extract_person_macro  # Basic extraction
cargo run --example nested_macro          # Nested structures
cargo run --example openrouter_example    # Multi-provider (needs OPENROUTER_API_KEY)

Streaming

See STREAMING.md for streaming support with partial JSON parsing.
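
As a sketch of the core idea behind partial JSON parsing: close whatever brackets and quotes a truncated stream chunk leaves open, then attempt a normal parse. The helper below is illustrative only; the crate's actual streaming API is what STREAMING.md documents.

use serde_json::Value;

// Complete a truncated JSON chunk by closing open strings and
// brackets, then try a standard parse. Simplified: trailing commas
// and half-written keys still fail, returning None.
fn parse_partial(input: &str) -> Option<Value> {
    let mut stack = Vec::new();
    let mut in_string = false;
    let mut escaped = false;
    for c in input.chars() {
        match c {
            _ if escaped => escaped = false,
            '\\' if in_string => escaped = true,
            '"' => in_string = !in_string,
            '{' if !in_string => stack.push('}'),
            '[' if !in_string => stack.push(']'),
            '}' | ']' if !in_string => { stack.pop(); }
            _ => {}
        }
    }
    let mut completed = input.to_string();
    if in_string {
        completed.push('"');
    }
    while let Some(close) = stack.pop() {
        completed.push(close);
    }
    serde_json::from_str(&completed).ok()
}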

How It Works

┌─────────────────────────────────────────────────────────────┐
│ Rust Struct + #[derive(BamlSchema)]                         │
└────────────────────┬────────────────────────────────────────┘
                     │ Procedural Macro
                     ▼
┌─────────────────────────────────────────────────────────────┐
│ IR (Intermediate Representation)                             │
│ Class { name: "Person", fields: [...] }                      │
└────────────────────┬────────────────────────────────────────┘
                     │ SchemaFormatter
                     ▼
┌─────────────────────────────────────────────────────────────┐
│ Human-Readable Schema                                        │
│ "Answer in JSON: { name: string, age: int }"                 │
└────────────────────┬────────────────────────────────────────┘
                     │ PromptRenderer (Jinja2)
                     ▼
┌─────────────────────────────────────────────────────────────┐
│ Final Prompt → LLM → JSON Response                           │
└────────────────────┬────────────────────────────────────────┘
                     │ Parser (lenient, type coercion)
                     ▼
┌─────────────────────────────────────────────────────────────┐
│ BamlValue::Map({ "name": String("John"), "age": Int(30) })   │
└─────────────────────────────────────────────────────────────┘
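
The "lenient" part of the final step means near-miss values are coerced to the schema's types instead of rejected, e.g. the string "30" where an int is expected. A sketch of the idea, with coerce_int as a hypothetical helper rather than the crate's API:

use serde_json::Value;

// Illustrative lenient coercion: accept a real int, an integral
// float, or a numeric string where the schema expects an int.
// The crate's real parser lives in src/parser.rs.
fn coerce_int(value: &Value) -> Option<i64> {
    match value {
        Value::Number(n) => n
            .as_i64()
            .or_else(|| n.as_f64().filter(|f| f.fract() == 0.0).map(|f| f as i64)),
        Value::String(s) => s.trim().parse().ok(),
        _ => None,
    }
}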

What's Different from Full BAML?

Omitted: BAML language parser, CLI, WASM, advanced tracing, test runner, multi-language codegen, VS Code extension, complex retry policies.

Kept: Core IR types, schema formatting, Jinja2 rendering, lenient parsing, basic runtime.

Added: Native Rust macros for type-safe IR generation.

License

Educational implementation. For production, use the official BAML runtime.
