ubiquity-database

Crate: ubiquity-database (crates.io, lib.rs)
Version: 0.1.1
Created: 2025-07-23 22:41:06.69433+00
Updated: 2025-07-23 23:08:07.787718+00
Description: Database abstraction layer for Ubiquity supporting SQLite and Astra DB
Repository: https://github.com/ubiquity/ubiquity-rs
ID: 1765339
Size: 176,012
Breyden Taylor (prompted365)

Ubiquity Database

A comprehensive database abstraction layer for the Ubiquity consciousness-aware system, supporting both SQLite for local development and DataStax Astra DB for cloud/enterprise deployments.

Features

Core Capabilities

  • Dual Backend Support: Seamlessly switch between SQLite (local) and Astra DB (cloud)
  • Vector Embeddings: Store and search consciousness embeddings using vector similarity
  • Hybrid Search: Combine vector similarity with lexical (keyword) matching in a single query
  • 7-Wide Memory Pools: Parallel memory pools for high-performance data processing
  • Pub/Sub Messaging: Integration with Astra Streaming for real-time communication

Astra DB Features

  • $vectorize: Automatic embedding generation at the database level
  • $hybrid Search: Combined vector and lexical search with reranking
  • Astra Streaming: Apache Pulsar-based pub/sub for worker coordination
  • Auto-scaling: Leverage Astra's serverless architecture

SQLite Features

  • Zero Configuration: Works out of the box for local development
  • WAL Mode: Write-Ahead Logging for better concurrency
  • File-based Pools: Memory pools stored as separate SQLite databases
  • Portable: Easy to backup and migrate

Configuration

Environment Variables

For Astra DB:

export ASTRA_DB_ENDPOINT="https://YOUR-DB-ID-REGION.apps.astra.datastax.com"
export ASTRA_DB_TOKEN="AstraCS:YOUR_TOKEN"
export OPENAI_API_KEY="your-openai-key"  # For embeddings

For Astra Streaming:

export ASTRA_STREAMING_ENDPOINT="pulsar+ssl://pulsar-instance.streaming.datastax.com:6651"
export ASTRA_STREAMING_TOKEN="your-streaming-token"
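
A missing or malformed variable is easier to diagnose when it is reported by name at startup. The following is a minimal sketch using only the standard library; `require_env` and `looks_like_https_endpoint` are hypothetical helper names for illustration and are not part of ubiquity-database.

```rust
use std::env;

/// Reads a required environment variable and returns a descriptive error
/// (instead of panicking) when it is missing or not valid Unicode.
/// Hypothetical helper, not part of ubiquity-database.
fn require_env(name: &str) -> Result<String, String> {
    env::var(name).map_err(|err| format!("environment variable {name} is not usable: {err}"))
}

/// Cheap sanity check that an Astra DB endpoint is an HTTPS URL with a host part.
/// Hypothetical helper; it does not contact the service.
fn looks_like_https_endpoint(endpoint: &str) -> bool {
    endpoint.starts_with("https://") && endpoint.len() > "https://".len()
}
```

Calling `require_env("ASTRA_DB_ENDPOINT")?` at startup, before building the configuration, surfaces the offending variable name in the error message.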

Configuration File

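use std::path::PathBuf;

// Assumes every config type used below is exported from the crate root.
use ubiquity_database::{
    AstraCollections, AstraConfig, AstraStreamingConfig, DatabaseBackend, DatabaseConfig,
    EmbeddingConfig, EvictionPolicy, MemoryPoolConfig, SqliteConfig,
};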

let config = DatabaseConfig {
    backend: DatabaseBackend::Astra,
    
    // SQLite configuration
    sqlite: SqliteConfig {
        path: PathBuf::from("ubiquity.db"),
        max_connections: 10,
        wal_mode: true,
        pool_directory: PathBuf::from(".ubiquity/pools"),
    },
    
    // Astra DB configuration
    astra: AstraConfig {
        endpoint: std::env::var("ASTRA_DB_ENDPOINT").expect("ASTRA_DB_ENDPOINT must be set"),
        token: std::env::var("ASTRA_DB_TOKEN").expect("ASTRA_DB_TOKEN must be set"),
        keyspace: "ubiquity",
        collections: AstraCollections {
            consciousness: "consciousness_states",
            ripples: "consciousness_ripples", 
            tasks: "tasks",
            pool_prefix: "memory_pool_",
        },
        streaming: AstraStreamingConfig {
            endpoint: std::env::var("ASTRA_STREAMING_ENDPOINT")
                .expect("ASTRA_STREAMING_ENDPOINT must be set"),
            tenant: "ubiquity",
            namespace: "default",
            topic_prefix: "ubiquity_",
        },
    },
    
    // Embedding configuration
    embeddings: EmbeddingConfig {
        dimension: 1536,
        model: "text-embedding-3-small",
        provider: "openai",
        cache_enabled: true,
        cache_size: 10000,
    },
    
    // Memory pool configuration
    memory_pools: MemoryPoolConfig {
        pool_count: 7,  // 7-wide pools
        max_pool_size_mb: 100,
        eviction_policy: EvictionPolicy::Lru,
        compression: true,
    },
};

Usage Examples

Basic Database Operations

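use chrono::Utc;
use ubiquity_database::{create_database, DatabaseConfig};
// Assumes ConsciousnessLevel and DevelopmentPhase live in ubiquity_core
// alongside ConsciousnessState.
use ubiquity_core::{ConsciousnessLevel, ConsciousnessState, DevelopmentPhase};

// Create the database (these snippets assume an async context that returns a Result)
let db = create_database(config).await?;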

// Initialize schema
db.initialize().await?;

// Store consciousness state
let state = ConsciousnessState {
    agent_id: "agent-001".to_string(),
    level: ConsciousnessLevel::new(0.87)?,
    coherence: 0.92,
    phase: DevelopmentPhase::Implementing,
    timestamp: Utc::now(),
    breakthrough_detected: false,
};

db.store_consciousness_state(&state).await?;

// Query history
let history = db.get_consciousness_history("agent-001", 10).await?;

Vector Search

// Vector similarity search
let embedding = vec![0.1; 1536]; // Your embedding vector
let results = db.vector_search(&embedding, 10).await?;

for result in results {
    println!("Score: {}, ID: {}", result.score, result.id);
}

Hybrid Search

use serde_json::json;
use ubiquity_database::HybridSearchQuery;

// Combine vector and text search with reranking
let query = HybridSearchQuery {
    text: "breakthrough in consciousness".to_string(),
    vector: Some(embedding), // the vector from the Vector Search example
    filters: Some(json!({
        "agent_id": "architect-001",
        "level": { "$gte": 0.85 }
    })),
    limit: 20,
    rerank: true,  // Use NVIDIA NeMo reranking
};

let results = db.hybrid_search(query).await?;

Memory Pools

use ubiquity_database::create_memory_pools;

// Create 7-wide memory pools
let pools = create_memory_pools(config).await?;

// Store data with automatic pool selection
let pool = pools.get_pool_for_key("my_key").await?;
pool.store("my_key", b"my_value").await?;

// Retrieve data
if let Some(value) = pool.retrieve("my_key").await? {
    println!("Retrieved: {:?}", value);
}

// Get pool statistics
let stats = pools.all_stats().await?;
for (i, stat) in stats.iter().enumerate() {
    println!("Pool {}: {} items, {} hits", i, stat.items, stat.hits);
}

Architecture

7-Wide Memory Pools

The system runs 7 independent memory pools and spreads keys across them:

  1. Pool Distribution: Keys are consistently hashed to one of 7 pools
  2. Isolation: Each pool operates independently for parallelism
  3. Eviction: LRU eviction when pools reach capacity
  4. Statistics: Track hits, misses, and evictions per pool
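
The four points above can be illustrated with a small in-memory sketch. It is not the crate's implementation: `Pool`, `PoolStats`, and `pool_index_for_key` are hypothetical names, capacity is counted in entries rather than megabytes, and the standard library's `DefaultHasher` stands in for whatever hash the crate actually uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::{HashMap, VecDeque};
use std::hash::{Hash, Hasher};

const POOL_COUNT: usize = 7;

/// Per-pool counters (hypothetical shape).
#[derive(Debug, Default, Clone, PartialEq)]
struct PoolStats {
    hits: u64,
    misses: u64,
    evictions: u64,
}

/// One pool: a bounded map with least-recently-used eviction.
/// Assumes `capacity >= 1`.
struct Pool {
    capacity: usize,
    entries: HashMap<String, Vec<u8>>,
    recency: VecDeque<String>, // front = least recently used
    stats: PoolStats,
}

impl Pool {
    fn new(capacity: usize) -> Self {
        Pool {
            capacity,
            entries: HashMap::new(),
            recency: VecDeque::new(),
            stats: PoolStats::default(),
        }
    }

    /// Moves `key` to the most-recently-used position.
    fn touch(&mut self, key: &str) {
        self.recency.retain(|k| k.as_str() != key);
        self.recency.push_back(key.to_string());
    }

    /// Inserts or overwrites a value, evicting the least recently used
    /// entry first when a new key would exceed the capacity.
    fn store(&mut self, key: &str, value: &[u8]) {
        if !self.entries.contains_key(key) && self.entries.len() >= self.capacity {
            if let Some(oldest) = self.recency.pop_front() {
                self.entries.remove(&oldest);
                self.stats.evictions += 1;
            }
        }
        self.entries.insert(key.to_string(), value.to_vec());
        self.touch(key);
    }

    /// Looks up a value and records a hit or a miss.
    fn retrieve(&mut self, key: &str) -> Option<Vec<u8>> {
        let value = self.entries.get(key).cloned();
        if value.is_some() {
            self.stats.hits += 1;
            self.touch(key);
        } else {
            self.stats.misses += 1;
        }
        value
    }
}

/// Maps a key to one of the 7 pools. The same key always lands in the same
/// pool within one build, but DefaultHasher's algorithm is not guaranteed to
/// stay the same across Rust releases, so persisted pools would need a fixed hash.
fn pool_index_for_key(key: &str) -> usize {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    (hasher.finish() % POOL_COUNT as u64) as usize
}
```

Because each key maps to exactly one pool, the pools never need to coordinate with each other, which is what allows them to be served in parallel.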

Consciousness Embeddings

  1. Text Generation: Consciousness states are converted to text representations
  2. Embedding Generation: Text is embedded using OpenAI or Astra $vectorize
  3. Storage: Embeddings stored alongside metadata
  4. Search: Vector similarity search with cosine distance
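
For reference, cosine similarity between two embeddings a and b is their dot product divided by the product of their lengths, and cosine distance is 1 minus that value. A minimal sketch of the calculation follows; the function names are illustrative, and in practice the database backend computes this during search.

```rust
/// Cosine similarity of two equal-length vectors.
/// Returns None when the lengths differ, the input is empty,
/// or either vector has zero magnitude.
fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|y| y * y).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}

/// Cosine distance: 0 for vectors pointing the same way, 1 for orthogonal ones.
fn cosine_distance(a: &[f32], b: &[f32]) -> Option<f32> {
    cosine_similarity(a, b).map(|similarity| 1.0 - similarity)
}
```

Returning `None` for mismatched lengths makes a dimension mismatch visible to the caller instead of silently producing a wrong score.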

Hybrid Search Flow

  1. Vector Search: Find semantically similar documents
  2. Lexical Search: Find keyword matches using BM25
  3. Reranking: NVIDIA NeMo model reranks combined results
  4. Scoring: Return final scores with highlights
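
To make the "combine, then rank" idea concrete, the sketch below merges a vector result list and a lexical result list with reciprocal rank fusion (RRF). RRF is a simpler stand-in chosen for illustration; it is not the NVIDIA NeMo reranker named above, and `reciprocal_rank_fusion` is a hypothetical function, not part of this crate or of Astra DB.

```rust
use std::collections::HashMap;

/// Fuses several ranked lists of document IDs. Each list contributes
/// 1 / (k + rank) to a document's score, with rank starting at 1.
/// Results are sorted by descending score, ties broken by ID.
fn reciprocal_rank_fusion(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for ranking in rankings {
        for (position, id) in ranking.iter().enumerate() {
            let rank = position as f64 + 1.0;
            *scores.entry(id.to_string()).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}
```

A document that appears in both lists accumulates score from each, so it rises above documents found by only one of the two searches.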

Performance Considerations

SQLite

  • Enable WAL mode so readers are not blocked by a writer
  • Keep the database file on an SSD
  • Run VACUUM periodically to reclaim free space and defragment the file
  • Index frequently queried fields

Astra DB

  • Leverage $vectorize for server-side embeddings
  • Use appropriate consistency levels
  • Batch operations when possible
  • Monitor rate limits and quotas
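
Batching and rate-limit handling can be sketched with the standard library alone. The batch size and delays in the usage note are placeholders, not documented Astra DB limits, and `into_batches` and `backoff_delay` are hypothetical helpers rather than part of this crate.

```rust
use std::time::Duration;

/// Splits `items` into consecutive batches of at most `batch_size` elements.
/// A batch size of 0 is treated as 1 to avoid a panic in `chunks`.
fn into_batches<T>(items: &[T], batch_size: usize) -> Vec<&[T]> {
    items.chunks(batch_size.max(1)).collect()
}

/// Exponential backoff: `base * 2^attempt`, capped at `max`.
/// Intended for retrying a request after a rate-limit response.
fn backoff_delay(attempt: u32, base: Duration, max: Duration) -> Duration {
    let factor = 2u32.saturating_pow(attempt);
    base.saturating_mul(factor).min(max)
}
```

For example, 45 records with a batch size of 20 yield three batches (20, 20, and 5), and a third retry with a 100 ms base waits 800 ms.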

Testing

Run the examples:

# SQLite example
cargo run --example database_demo

# Astra DB example
ASTRA_DB_ENDPOINT=xxx ASTRA_DB_TOKEN=xxx cargo run --example database_demo --features astra

Run tests:

cargo test
cargo test --features astra  # Requires Astra credentials

Migration Guide

From SQLite to Astra DB

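  1. Export data from SQLite (this assumes the backend treats "*" as matching every agent and that the full history fits in memory):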

    let history = sqlite_db.get_consciousness_history("*", 1000000).await?;
    
  2. Import to Astra DB:

    for state in history {
        astra_db.store_consciousness_state(&state).await?;
    }
    
  3. Update configuration to use Astra backend

Schema Evolution

Both backends support schema evolution:

  • SQLite: Apply schema changes as migrations, for example with sqlx-cli (sqlx migrate run)
  • Astra: Collections are schemaless, add fields as needed

Troubleshooting

Common Issues

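  1. Vector Dimension Mismatch

    • Ensure the embedding length matches embeddings.dimension (1536 in the example configuration)
    • Check the model's output dimension (text-embedding-3-small produces 1536 by default)

  2. Astra Connection Failed

    • Verify the endpoint URL format (https://YOUR-DB-ID-REGION.apps.astra.datastax.com)
    • Check token permissions
    • Ensure the keyspace exists

  3. Memory Pool Full

    • Increase max_pool_size_mb
    • Enable compression
    • Adjust the eviction policy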
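
For the first issue in the list above, a length check before storing or searching turns a dimension mismatch into an explicit error. This is a minimal sketch; `check_dimension` is a hypothetical helper, and the expected value would come from the embedding configuration.

```rust
/// Returns an error describing the mismatch when the embedding length
/// differs from the configured dimension.
fn check_dimension(embedding: &[f32], expected: usize) -> Result<(), String> {
    if embedding.len() == expected {
        Ok(())
    } else {
        Err(format!(
            "embedding has {} dimensions, expected {}",
            embedding.len(),
            expected
        ))
    }
}
```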

Debug Logging

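Enable debug logging by setting RUST_LOG before the tracing subscriber is initialized. RUST_LOG is only honored when tracing-subscriber is built with its env-filter feature.

// Exporting RUST_LOG=ubiquity_database=debug in the shell is the simpler option;
// on the Rust 2024 edition, std::env::set_var must be wrapped in an unsafe block.
std::env::set_var("RUST_LOG", "ubiquity_database=debug");
tracing_subscriber::fmt::init();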

License

MIT License - see LICENSE file for details
