conveyor-etl-operator

Crates.ioconveyor-etl-operator
lib.rsconveyor-etl-operator
version0.1.0
created_at2025-12-23 03:00:19.34507+00
updated_at2025-12-23 03:00:19.34507+00
descriptionKubernetes operator for Conveyor ETL
homepage
repository
max_upload_size
id2000676
size236,533
Alex Choi (alexchoi0)

documentation

README

conveyor-operator

Kubernetes operator for managing Conveyor resources.

Overview

The operator watches Custom Resource Definitions (CRDs) and reconciles them with the actual cluster state. It manages the lifecycle of router clusters, pipelines, and data pipeline services.

Custom Resources

EtlRouterCluster

Deploys and manages a Raft cluster of router nodes.

apiVersion: conveyor.etl/v1
kind: EtlRouterCluster
metadata:
  name: my-cluster
spec:
  replicas: 3
  image: conveyor-router:latest
  raft:
    electionTimeoutMs: 300
    heartbeatIntervalMs: 100
  resources:
    requests:
      memory: "256Mi"
      cpu: "100m"
    limits:
      memory: "512Mi"
      cpu: "500m"
  storage:
    size: 10Gi
    storageClassName: standard

EtlPipeline

Defines a data pipeline from source through transforms to sink.

apiVersion: conveyor.etl/v1
kind: EtlPipeline
metadata:
  name: user-analytics
spec:
  source: kafka-users
  steps:
    - filter-active
    - enrich-geo
  sink: clickhouse-analytics
  dlq:
    enabled: true
    maxRetries: 3

EtlSource / EtlTransform / EtlSink

Define individual data pipeline services.

apiVersion: conveyor.etl/v1
kind: EtlSource
metadata:
  name: kafka-users
spec:
  grpc:
    endpoint: kafka-source-svc:50051

Architecture

┌─────────────────────────────────────────────────┐
│                  conveyor-operator              │
├─────────────────────────────────────────────────┤
│  ┌───────────────────────────────────────────┐  │
│  │              Controller Manager           │  │
│  │  ┌─────────┐ ┌─────────┐ ┌─────────────┐  │  │
│  │  │ Cluster │ │Pipeline │ │  Resource   │  │  │
│  │  │  Ctrl   │ │  Ctrl   │ │   Ctrls     │  │  │
│  │  └────┬────┘ └────┬────┘ └──────┬──────┘  │  │
│  └───────┼───────────┼─────────────┼─────────┘  │
│          │           │             │            │
│          ▼           ▼             ▼            │
│  ┌───────────────────────────────────────────┐  │
│  │              Kubernetes API               │  │
│  │   (StatefulSets, Services, ConfigMaps)    │  │
│  └───────────────────────────────────────────┘  │
│                      │                          │
│                      ▼                          │
│  ┌───────────────────────────────────────────┐  │
│  │           Conveyor Cluster (gRPC)         │  │
│  └───────────────────────────────────────────┘  │
└─────────────────────────────────────────────────┘

Installation

Install CRDs

kubectl apply -f crates/conveyor-operator/deploy/crds/crds.yaml

Deploy Operator

# Using kustomize
kubectl apply -k crates/conveyor-operator/deploy/operator/

# Using Helm
helm install conveyor-operator crates/conveyor-operator/deploy/helm/conveyor-operator/

Controllers

ClusterController

  • Creates StatefulSet for router nodes
  • Manages headless Service for Raft communication
  • Creates ConfigMap with cluster configuration
  • Polls cluster health and updates status

PipelineController

  • Validates pipeline references (source, transforms, sink exist)
  • Submits pipeline to router cluster via gRPC
  • Updates pipeline status with assignment info

ResourceControllers

  • SourceController, TransformController, SinkController
  • Register/deregister services with router cluster

Development

# Run locally against a cluster
cargo run -p conveyor-operator

# Build Docker image
docker build -f crates/conveyor-operator/Dockerfile -t conveyor-operator:dev .

Helm Values

replicaCount: 1
image:
  repository: conveyor-operator
  tag: latest
rbac:
  create: true
serviceAccount:
  create: true
resources:
  limits:
    cpu: 200m
    memory: 256Mi
Commit count: 0

cargo fmt