mecha10-diagnostics

Crates.io: mecha10-diagnostics
lib.rs: mecha10-diagnostics
version: 0.1.25
created_at: 2025-11-25 02:20:25.05206+00
updated_at: 2026-01-01 01:53:48.118869+00
description: Diagnostics and metrics collection for Mecha10 robotics framework
repository: https://github.com/mecha10/mecha10
id: 1949088
size: 153,440
Peter C (PeterChauYEG)

mecha10-diagnostics

Topic-based diagnostics and performance monitoring service for Mecha10.

Overview

The diagnostics service provides comprehensive performance monitoring through the framework's pub/sub system. All diagnostics are published to topics under the /diagnostics namespace, allowing real-time monitoring via CLI, dashboard, or custom subscribers.

Features

  • Topic-based architecture: Seamlessly integrates with Mecha10's pub/sub system
  • Streaming pipeline metrics: Frame pipeline, latency, encoding performance, bandwidth
  • WebRTC metrics: Connection stats, quality metrics (RTT, packet loss, jitter)
  • WebSocket metrics: Connection tracking, message rates
  • Redis metrics: Connection pool, operation latency and throughput
  • Docker metrics: Container CPU, memory, network, and I/O stats
  • System metrics: Host CPU, memory, disk, and network usage
  • Low overhead: Atomic counters on hot paths, background aggregation

Diagnostic Topics

All diagnostics use the /diagnostics namespace:

/diagnostics/streaming/pipeline     # Frame pipeline metrics
/diagnostics/streaming/latency      # Latency measurements
/diagnostics/streaming/encoding     # Encoding performance
/diagnostics/streaming/bandwidth    # Bandwidth usage
/diagnostics/webrtc/connections     # WebRTC peer connections
/diagnostics/webrtc/quality         # RTT, packet loss, jitter
/diagnostics/websocket/connections  # WebSocket connections
/diagnostics/websocket/messages     # WebSocket message stats
/diagnostics/redis/pool             # Redis connection pool
/diagnostics/redis/operations       # Redis operation metrics
/diagnostics/docker/containers      # Docker container stats
/diagnostics/godot/performance      # Godot FPS, frame time, physics
/diagnostics/godot/scene            # Godot node count, memory
/diagnostics/godot/connection       # Godot WebSocket health
/diagnostics/system/resources       # System-wide resources
/diagnostics/node/{id}/health       # Per-node health
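
Because every topic follows the same /diagnostics/<category>/... layout, plain string helpers are enough to build or classify topic names. The following is a minimal sketch using only the standard library; `node_health_topic` and `category_of` are hypothetical helpers for illustration and are not part of the mecha10-diagnostics API.

```rust
/// Namespace prefix shared by every diagnostic topic in the list above.
const DIAGNOSTICS_PREFIX: &str = "/diagnostics/";

/// Hypothetical helper: fill in the `{id}` placeholder of the per-node
/// health topic.
fn node_health_topic(node_id: &str) -> String {
    format!("{}node/{}/health", DIAGNOSTICS_PREFIX, node_id)
}

/// Hypothetical helper: return the category segment of a diagnostic topic,
/// e.g. "streaming" for "/diagnostics/streaming/pipeline".
/// Returns `None` for topics outside the /diagnostics namespace.
fn category_of(topic: &str) -> Option<&str> {
    topic
        .strip_prefix(DIAGNOSTICS_PREFIX)?
        .split('/')
        .next()
        .filter(|segment| !segment.is_empty())
}
```

A custom subscriber could use a helper like `category_of` to route incoming messages by category before decoding their payloads.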

Usage

Publishing Streaming Diagnostics

use mecha10_diagnostics::prelude::*;
use mecha10_core::prelude::*;

// Create collector
let collector = StreamingCollector::new("simulation-bridge");

// Record metrics on hot path (minimal overhead)
collector.record_frame_received();
collector.record_frame_encoded(5000); // encode time in microseconds (5000 µs = 5 ms)
collector.record_frame_sent(10000);   // frame size in bytes (10,000 bytes, about 10 KB)

// Publish aggregated metrics periodically (every 1-5 seconds)
collector.publish_all(&ctx, 400_000).await?;

Subscribing to Diagnostics

use mecha10_diagnostics::prelude::*;
use mecha10_core::prelude::*;

// Subscribe to streaming pipeline metrics
let mut rx = ctx.subscribe(
    Topic::<DiagnosticMessage<StreamingPipelineMetrics>>::new(
        TOPIC_DIAGNOSTICS_STREAMING_PIPELINE
    )
).await?;

while let Some(msg) = rx.recv().await {
    println!("FPS: {}, Frames dropped: {}",
        msg.payload.fps,
        msg.payload.frames_dropped
    );
}
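
The message-handling part of a subscriber can be exercised without a running framework. The sketch below is an illustration only: it swaps the framework subscription for a `std::sync::mpsc` channel, and the `DiagnosticMessage` and `StreamingPipelineMetrics` structs are simplified stand-ins whose field types are assumptions (only the `payload`, `fps`, and `frames_dropped` names come from the example above).

```rust
use std::sync::mpsc;

/// Stand-in for the crate's message envelope (assumed shape).
struct DiagnosticMessage<T> {
    payload: T,
}

/// Stand-in for the crate's pipeline metrics (assumed field types).
struct StreamingPipelineMetrics {
    fps: f64,
    frames_dropped: u64,
}

/// Receive messages until every sender has been dropped, formatting each
/// one the same way as the subscriber loop above.
fn drain_reports(
    rx: mpsc::Receiver<DiagnosticMessage<StreamingPipelineMetrics>>,
) -> Vec<String> {
    let mut lines = Vec::new();
    // `recv` returns `Err` once the channel is closed, which ends the loop.
    while let Ok(msg) = rx.recv() {
        lines.push(format!(
            "FPS: {}, Frames dropped: {}",
            msg.payload.fps, msg.payload.frames_dropped
        ));
    }
    lines
}
```

As with the `while let Some(msg)` loop in the real subscriber, the loop here ends when the channel closes, so any shutdown logic belongs after it.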

Docker Metrics

// Create Docker collector
let docker = DockerCollector::new("diagnostics-service").await;

// Collect metrics for all containers
docker.collect_all_containers(&ctx).await?;

System Metrics

// Create system collector
let system = SystemCollector::new("diagnostics-service");

// Collect and publish system metrics
system.collect_metrics(&ctx).await?;

Godot Metrics

// Create Godot collector
let godot = GodotCollector::new(
    "simulation-bridge",
    "ws://localhost:11008".to_string(),  // Control URL
    "ws://localhost:11009".to_string(),  // Camera URL
);

// Track connection events
godot.set_control_connected(true);
godot.set_camera_connected(true);

// Record messages
godot.record_control_message(now_micros());
godot.record_camera_frame(now_micros());

// Publish connection health
godot.publish_connection_metrics(&ctx).await?;

// Publish performance metrics (data from Godot)
godot.publish_performance_metrics(
    &ctx,
    60.0,   // fps
    60.0,   // target_fps
    16.6,   // frame_time_ms
    2.5,    // physics_time_ms
    10.0,   // render_time_ms
    4.1,    // idle_time_ms
).await?;
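
The sample values above are internally consistent: at 60 FPS the frame budget is 1000 / 60 ≈ 16.67 ms, and the three phase timings sum to 2.5 + 10.0 + 4.1 = 16.6 ms. A small check like the following could validate such numbers before publishing. It is a sketch with hypothetical helper names, and it assumes the physics, render, and idle timings together account for the whole frame.

```rust
/// Frame-time budget in milliseconds for a target frame rate.
fn frame_budget_ms(target_fps: f64) -> f64 {
    1000.0 / target_fps
}

/// Sum of the per-phase timings, assuming they partition the frame.
fn phase_total_ms(physics_ms: f64, render_ms: f64, idle_ms: f64) -> f64 {
    physics_ms + render_ms + idle_ms
}

/// True when a measured frame time fits inside the budget for `target_fps`.
fn within_budget(frame_time_ms: f64, target_fps: f64) -> bool {
    frame_time_ms <= frame_budget_ms(target_fps)
}
```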

CLI Integration

Monitor diagnostics in real time:

# View all diagnostics
mecha10 diagnostics

# Filter by category
mecha10 diagnostics --category streaming
mecha10 diagnostics --category docker

# Live streaming metrics only
mecha10 diagnostics --streaming

# Historical query
mecha10 diagnostics --from "2024-01-01 12:00" --to "2024-01-01 13:00"

# Export to file
mecha10 diagnostics --export metrics.json

Dashboard Integration

The diagnostics service integrates with the dashboard via WebSocket:

  • Real-time charts and visualizations
  • Bottleneck detection
  • Alerting on anomalies
  • Historical analysis

Access at: http://localhost:3000/dashboard/diagnostics

Metric Types

Counter

Monotonically increasing value (e.g., frames received)

let counter = Counter::new();
counter.inc();
counter.add(5);
let total = counter.get();

Gauge

Current value that can go up or down (e.g., queue depth)

let gauge = Gauge::new();
gauge.set(10);
gauge.inc();
gauge.dec();
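
To make the Counter and Gauge semantics concrete, here is a minimal sketch of how such types can be built on standard-library atomics. It is not the crate's implementation: `SketchCounter` and `SketchGauge` are hypothetical names, and the integer widths and the `get` method on the gauge are assumptions.

```rust
use std::sync::atomic::{AtomicI64, AtomicU64, Ordering};

/// Monotonically increasing value; only ever moves upward.
#[derive(Default)]
struct SketchCounter(AtomicU64);

impl SketchCounter {
    fn inc(&self) {
        self.add(1);
    }
    fn add(&self, n: u64) {
        // Relaxed ordering suffices: only the total matters, not its
        // ordering relative to other memory operations.
        self.0.fetch_add(n, Ordering::Relaxed);
    }
    fn get(&self) -> u64 {
        self.0.load(Ordering::Relaxed)
    }
}

/// Current value that may rise or fall, so it is signed and settable.
#[derive(Default)]
struct SketchGauge(AtomicI64);

impl SketchGauge {
    fn set(&self, value: i64) {
        self.0.store(value, Ordering::Relaxed);
    }
    fn inc(&self) {
        self.0.fetch_add(1, Ordering::Relaxed);
    }
    fn dec(&self) {
        self.0.fetch_sub(1, Ordering::Relaxed);
    }
    fn get(&self) -> i64 {
        self.0.load(Ordering::Relaxed)
    }
}
```

Both types take `&self` rather than `&mut self`, which is what lets several threads update the same metric without a lock.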

Histogram

Distribution tracking with percentiles (e.g., encoding latency)

let mut hist = Histogram::for_latency()?;
hist.record(5000); // Record 5000 µs (5 ms)
let p95 = hist.p95();
let p99 = hist.p99();
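
For readers unfamiliar with percentile metrics, the sketch below shows what values such as p95 and p99 mean, using the simple nearest-rank method over a sorted copy of the samples. This is a swapped-in illustration, not how the crate's Histogram computes percentiles; `percentile` is a hypothetical function.

```rust
/// Nearest-rank percentile: the smallest sample such that at least
/// `percent`% of all samples are less than or equal to it.
/// Returns `None` for an empty slice or a percent above 100.
fn percentile(samples: &[u64], percent: usize) -> Option<u64> {
    if samples.is_empty() || percent > 100 {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort_unstable();
    // rank = ceil(percent / 100 * n), computed in integer arithmetic.
    let rank = (percent * sorted.len() + 99) / 100;
    // Ranks are 1-based; a rank of 0 (percent == 0) maps to the minimum.
    Some(sorted[rank.max(1) - 1])
}
```

Sorting on every query is acceptable for a sketch but costs O(n log n) per call, which is why percentile work is usually kept off hot paths.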

Performance

The diagnostics system is designed for minimal overhead:

  • Atomic counters: Lock-free increments on hot paths, with no mutex contention
  • Background aggregation: Expensive operations (percentiles) run off the hot path
  • Rate limiting: Publishing is automatically limited to 1-second intervals
  • Zero-copy: Data is shared through Arc references rather than cloned

Overhead on the streaming pipeline: < 1% CPU, < 1 MB memory
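
The split between cheap hot-path recording and periodic aggregation can be sketched as follows. This is an illustration under stated assumptions, not the crate's code: `FrameStats`, `Snapshot`, and `drain` are hypothetical names, and the aggregation interval is passed in explicitly rather than measured.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::Duration;

/// Raw totals updated on the hot path.
#[derive(Default)]
struct FrameStats {
    frames: AtomicU64,
    bytes: AtomicU64,
}

/// Derived rates computed once per aggregation interval.
#[derive(Debug, PartialEq)]
struct Snapshot {
    fps: f64,
    bits_per_second: f64,
}

impl FrameStats {
    /// Hot path: two relaxed atomic adds, no locking or allocation.
    fn record_frame_sent(&self, size_bytes: u64) {
        self.frames.fetch_add(1, Ordering::Relaxed);
        self.bytes.fetch_add(size_bytes, Ordering::Relaxed);
    }

    /// Background path: reset the totals and convert them to rates.
    /// The two swaps are not atomic as a pair, so a frame recorded in
    /// between may be split across two intervals.
    fn drain(&self, elapsed: Duration) -> Snapshot {
        let frames = self.frames.swap(0, Ordering::Relaxed);
        let bytes = self.bytes.swap(0, Ordering::Relaxed);
        let secs = elapsed.as_secs_f64();
        if secs == 0.0 {
            return Snapshot { fps: 0.0, bits_per_second: 0.0 };
        }
        Snapshot {
            fps: frames as f64 / secs,
            bits_per_second: (bytes * 8) as f64 / secs,
        }
    }
}
```

With 30 frames of 10,000 bytes recorded over one second, `drain` reports 30 FPS and 2,400,000 bits per second, and the totals start again from zero for the next interval.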

Architecture

┌─────────────────────────────────────────┐
│        Diagnostic Publishers            │
│  (Simulation Bridge, Nodes, Services)   │
└────────────────┬────────────────────────┘
                 │
                 ▼
         ┌───────────────┐
         │ Redis Pub/Sub │
         └───────────────┘
                 │
        ┌────────┴────────┐
        ▼                 ▼
   ┌────────┐      ┌──────────┐
   │  CLI   │      │Dashboard │
   └────────┘      └──────────┘
        │                 │
        ▼                 ▼
┌──────────────────────────────┐
│    Telemetry Service         │
│    (Historical Archive)      │
└──────────────────────────────┘

Future Enhancements

  1. Redis instrumentation: Add detailed operation tracking to mecha10_core::Context
  2. WebRTC stats: Extract metrics from WebRTC peer connections
  3. WebSocket tracking: Instrument WebSocket connections
  4. Custom metrics: Allow nodes to publish custom diagnostic metrics
  5. Alerting: Automatic alerts on threshold violations
  6. Profiling integration: Hook into performance profiling tools

License

MIT
