| Crates.io | ringkernel-metal |
| lib.rs | ringkernel-metal |
| version | 0.4.0 |
| created_at | 2025-12-03 16:16:49.595481+00 |
| updated_at | 2026-01-25 21:24:49.977308+00 |
| description | Metal backend for RingKernel - Apple GPU support |
| homepage | https://github.com/mivertowski/RustCompute |
| repository | https://github.com/mivertowski/RustCompute |
| id | 1964464 |
| size | 110,742 |
Apple Metal backend for RingKernel.
Implemented - Full Metal backend with event-driven execution model.
This crate provides GPU compute support for RingKernel on Apple platforms using the Metal framework. It targets macOS, iOS, and Apple Silicon devices with unified memory architecture.
- metal feature enabled
- RingKernelRuntime trait implementation

Metal does not support CUDA-style cooperative groups, so persistent kernels use event-driven execution with host-side dispatch loops. This provides equivalent functionality with different performance characteristics:

| Aspect | CUDA Persistent | Metal Event-Driven |
|---|---|---|
| Kernel lifetime | Long-running | Per-message dispatch |
| Grid synchronization | grid.sync() | Host-driven barriers |
| Command latency | ~0.03µs (mapped memory) | ~10-50µs (dispatch) |
| Best for | Interactive commands | Batch compute |
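
The event-driven column of the table can be illustrated with a host-side dispatch loop. The following is a minimal sketch using only the standard library, not the crate's implementation: `dispatch`, `run_host_loop`, and `process_all` are hypothetical names, and the `dispatch` function merely stands in for encoding and committing a GPU compute pass.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical stand-in for one GPU compute dispatch: doubles every element.
/// A real backend would encode and commit a command buffer here.
fn dispatch(payload: &[u32]) -> Vec<u32> {
    payload.iter().map(|x| x * 2).collect()
}

/// Host-side dispatch loop: block until a message arrives, run exactly one
/// dispatch for it, and send the result back. The loop ends once every
/// sender for `inbox` has been dropped.
fn run_host_loop(inbox: mpsc::Receiver<Vec<u32>>, outbox: mpsc::Sender<Vec<u32>>) -> usize {
    let mut dispatches = 0;
    for message in inbox {
        let result = dispatch(&message);
        dispatches += 1;
        if outbox.send(result).is_err() {
            break; // the consumer went away; stop dispatching
        }
    }
    dispatches
}

/// Feeds all messages through the loop and collects the responses in order,
/// together with the number of dispatches that were issued.
fn process_all(messages: Vec<Vec<u32>>) -> (Vec<Vec<u32>>, usize) {
    let (in_tx, in_rx) = mpsc::channel();
    let (out_tx, out_rx) = mpsc::channel();
    let host = thread::spawn(move || run_host_loop(in_rx, out_tx));
    for message in messages {
        in_tx.send(message).expect("host loop is running");
    }
    drop(in_tx); // closing the inbox lets the loop terminate
    let results: Vec<Vec<u32>> = out_rx.iter().collect();
    let dispatches = host.join().expect("host loop panicked");
    (results, dispatches)
}
```

Each message costs one dispatch, which is why this model suits batch compute better than latency-sensitive interactive commands.
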

    use std::time::Duration;

    use ringkernel_metal::{MetalRuntime, is_metal_available};

    #[tokio::main]
    async fn main() -> Result<(), Box<dyn std::error::Error>> {
        if !is_metal_available() {
            eprintln!("Metal not available");
            return Ok(());
        }

        // Create runtime
        let runtime = MetalRuntime::new().await?;

        // Launch kernel with options
        let kernel = runtime.launch("compute", Default::default()).await?;
        kernel.activate().await?;

        // Send messages (triggers compute dispatch).
        // `envelope` is a message envelope constructed elsewhere.
        kernel.send_envelope(envelope).await?;

        // Receive results
        let response = kernel.receive_timeout(Duration::from_secs(1)).await?;

        // Clean shutdown
        kernel.terminate().await?;
        runtime.shutdown().await?;
        Ok(())
    }

- MetalDevice - Wrapper around metal::Device with capability queries
- MetalBuffer - GPU buffer with StorageModeShared for unified memory
- MetalKernel - Full KernelHandleInner implementation with:
  - new_library_with_source
- MetalRuntime - RingKernelRuntime implementation with K2K broker

Kernel-to-kernel messaging uses a routing table and inbox system:

    // K2K structures
    MetalK2KInboxHeader   // 64 bytes - inbox state with lock/sequence
    MetalK2KRouteEntry    // 32 bytes - route to neighbor
    MetalK2KRoutingTable  // Routing with 2D (4-neighbor) or 3D (6-neighbor) support

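
To make the sizes and the 4-neighbor routing concrete, here is a rough sketch. Only the 64-byte and 32-byte sizes and the presence of a lock and a sequence number come from the listing above; every field name, the `*Sketch` type names, and `neighbors_2d` are assumptions for illustration, not the crate's actual layout.

```rust
use std::mem::size_of;

/// Hypothetical 64-byte inbox header. The field set is a guess; only the
/// total size and the lock/sequence pair are documented.
#[repr(C)]
struct InboxHeaderSketch {
    lock: u32,
    sequence: u32,
    head: u32,
    tail: u32,
    capacity: u32,
    _reserved: [u32; 11], // pads the struct to 64 bytes
}

/// Hypothetical 32-byte route entry describing one neighbor.
#[repr(C)]
struct RouteEntrySketch {
    neighbor_id: u32,
    direction: u32,
    inbox_offset: u64,
    _reserved: [u32; 4], // pads the struct to 32 bytes
}

// Compile-time checks that the sketches match the documented sizes.
const _: () = assert!(size_of::<InboxHeaderSketch>() == 64);
const _: () = assert!(size_of::<RouteEntrySketch>() == 32);

/// Returns the linear ids of the up-to-4 neighbors (west, east, north,
/// south) of tile (x, y) in a `width` x `height` grid. Positions that fall
/// off the grid are skipped, so corner tiles get two neighbors.
fn neighbors_2d(x: u32, y: u32, width: u32, height: u32) -> Vec<u32> {
    let mut ids = Vec::with_capacity(4);
    if x > 0 {
        ids.push(y * width + (x - 1));
    }
    if x + 1 < width {
        ids.push(y * width + (x + 1));
    }
    if y > 0 {
        ids.push((y - 1) * width + x);
    }
    if y + 1 < height {
        ids.push((y + 1) * width + x);
    }
    ids
}
```

Fixed-size `#[repr(C)]` structs are a common choice for data shared with GPU code because the host and the shader must agree on every byte offset.
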
For stencil computations, the MetalHaloExchange manager handles:

    use ringkernel_metal::{MetalHaloExchange, HaloExchangeConfig};

    // Create 2D grid with halo exchange
    let config = HaloExchangeConfig::new_2d(
        8, 8,   // grid: 8x8 tiles
        64, 64, // tile: 64x64 cells
        1,      // halo: 1 cell
    );
    let mut exchange = MetalHaloExchange::new(config);

    // Initialize on device (`device` is an existing Metal device handle)
    exchange.initialize(&device)?;

    // Perform exchange cycle (`tile_buffers` holds the per-tile GPU buffers)
    exchange.exchange(&tile_buffers)?;

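
For readers unfamiliar with halo exchange, the idea can be shown on the CPU. This is a sketch under assumptions, not the crate's GPU implementation: `Tile` and `exchange_east_west` are hypothetical names, tiles are stored row-major with the halo as padding on every side, and only the east/west direction is shown.

```rust
/// One tile stored row-major with `halo` ghost cells on every side.
struct Tile {
    width: usize,  // interior width in cells
    height: usize, // interior height in cells
    halo: usize,   // ghost-cell width
    cells: Vec<f32>,
}

impl Tile {
    fn new(width: usize, height: usize, halo: usize, fill: f32) -> Self {
        let stride = width + 2 * halo;
        let rows = height + 2 * halo;
        Tile { width, height, halo, cells: vec![fill; stride * rows] }
    }

    /// Number of cells in one padded row.
    fn stride(&self) -> usize {
        self.width + 2 * self.halo
    }

    /// Index of the cell at padded coordinates (col, row).
    fn idx(&self, col: usize, row: usize) -> usize {
        row * self.stride() + col
    }
}

/// Copies the easternmost interior columns of `west` into the west halo of
/// `east`, and the westernmost interior columns of `east` into the east
/// halo of `west`. Corner halo cells are left untouched.
fn exchange_east_west(west: &mut Tile, east: &mut Tile) {
    assert_eq!(
        (west.width, west.height, west.halo),
        (east.width, east.height, east.halo),
        "tiles must share one shape"
    );
    let (w, h, halo) = (west.width, west.height, west.halo);
    assert!(w >= halo, "interior must be at least as wide as the halo");
    for row in halo..halo + h {
        for k in 0..halo {
            // Interior columns occupy padded cols halo..halo + w, so the
            // last `halo` interior columns of `west` are cols w..w + halo.
            let from_west = west.cells[west.idx(w + k, row)];
            let to_east = east.idx(k, row);
            east.cells[to_east] = from_west;

            let from_east = east.cells[east.idx(halo + k, row)];
            let to_west = west.idx(halo + w + k, row);
            west.cells[to_west] = from_east;
        }
    }
}
```

With the configuration shown above (64x64 cells, halo of 1), a tile padded this way would hold 66 x 66 = 4356 cells.
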
The crate provides MSL templates for common patterns:
- RING_KERNEL_MSL_TEMPLATE - Base ring kernel with control block, message queues, and HLC
- K2K_HALO_EXCHANGE_MSL_TEMPLATE - K2K communication for stencil computations

| Platform | Status |
|---|---|
| macOS (Apple Silicon) | ✅ Full support |
| macOS (Intel + AMD) | ✅ Full support |
| iOS | ✅ Supported (untested) |
| Linux/Windows | ❌ Stub only |
On non-macOS platforms, MetalRuntime::new() returns BackendUnavailable.
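
Callers that must run on every platform can treat this result as a signal to fall back. The sketch below is an assumption-laden illustration of that pattern: `SketchError`, `SketchRuntime`, `new_runtime`, and `select_backend` are hypothetical stand-ins, the README only names a `BackendUnavailable` result, and the "cpu" fallback is a placeholder for whatever alternative the application has.

```rust
/// Hypothetical error type mirroring the `BackendUnavailable` result.
#[derive(Debug, PartialEq)]
enum SketchError {
    BackendUnavailable(&'static str),
}

/// Hypothetical runtime handle standing in for a real backend.
#[derive(Debug, PartialEq)]
struct SketchRuntime;

/// Returns a runtime only on macOS; elsewhere it reports that the backend
/// is unavailable instead of panicking.
fn new_runtime() -> Result<SketchRuntime, SketchError> {
    if cfg!(target_os = "macos") {
        Ok(SketchRuntime)
    } else {
        Err(SketchError::BackendUnavailable("Metal requires macOS"))
    }
}

/// Picks a backend name, falling back when Metal cannot be used.
fn select_backend() -> &'static str {
    match new_runtime() {
        Ok(_) => "metal",
        Err(SketchError::BackendUnavailable(_)) => "cpu",
    }
}
```
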
Apache-2.0