# RBIR [![](https://img.shields.io/discord/1283371436773212212?logo=discord&label=discord)](https://discord.gg/SshxvYpn) `RBIR` stands for **Rewrite Bigdata in Rust**. RBIR aims to create a big data ecosystem using Rust. This project declares our manifesto and serves as a collection of RBIR projects and posts for anyone interested in joining this journey. ## Projects ### New Projects - [tikv](https://github.com/tikv/tikv): Distributed transactional key-value database - [databend](https://github.com/datafuselabs/databend/): A Rust data warehouse, alternative to [Snowflake](https://www.snowflake.com/en/) - [quickwit](https://github.com/quickwit-oss/quickwit): A Rust search engine, alternative to [Elasticsearch](https://www.elastic.co/elasticsearch) - [risingwave](https://github.com/risingwavelabs/risingwave): Rust streaming processing engine - [Apache DataFusion](https://github.com/apache/datafusion): Fast, extensible query engine built in rust - [influxdb](https://github.com/influxdata/influxdb): Scalable datastore for metrics, events, and real-time analytics - [greptimedb](https://github.com/GreptimeTeam/greptimedb): Time series database for metrics, logs and events - [Apache HoraeDB (incubating)](https://github.com/apache/horaedb): High-performance, distributed, cloud native time-series database - [paradedb](https://github.com/paradedb/paradedb): Postgres for Search and Analytics - [glaredb](https://github.com/GlareDB/glaredb): An analytics DBMS for distributed data - [fluvio](https://github.com/infinyon/fluvio): Lean and mean distributed stream processing system - [lancedb](https://github.com/lancedb/lancedb): Developer-friendly, database for multimodal AI - [slatedb](https://github.com/slatedb/slatedb): Cloud-native embedded storage engine built on object storage - [daft](https://github.com/Eventual-Inc/Daft): Distributed DataFrame for Python designed for the cloud, powered by Rust - [arroyo](https://github.com/ArroyoSystems/arroyo): Distributed stream processing engine written in Rust, designed to perform stateful computations on data ### New implementations - [arrow-rs](https://github.com/apache/arrow-rs): Rust implementation of [Apache Arrow](https://arrow.apache.org/) - [iceberg-rust](https://github.com/apache/iceberg-rust/): Rust implementation of [Apache Iceberg](https://iceberg.apache.org/) - [paimon-rust](https://github.com/apache/paimon-rust): Rust implementation of [Apache Paimon](https://paimon.apache.org/) - [hudi-rs](https://github.com/apache/hudi-rs): Rust implementation of [Apache Hudi](https://hudi.apache.org/) - [parquet-rs](https://github.com/apache/arrow-rs/tree/master/parquet): Rust implementation of [Apache Parquet](https://parquet.apache.org/) - [avro-rust](https://github.com/apache/avro/tree/main/lang/rust): Rust implementation of [Apache Avro](https://avro.apache.org/) - [orc-rs](https://github.com/datafusion-contrib/datafusion-orc): Rust implementation of [Apache ORC](https://orc.apache.org/) ### New integrations - [Apache OpenDAL](https://github.com/apache/opendal) provides python, nodejs, java, go bindings - Apache Iceberg is now working on [building rust core for pyiceberg](https://github.com/apache/iceberg-rust/pull/518) - Apache Paimon is going to [build paimon-py by its rust core](https://lists.apache.org/thread/q3zxcomfq441t6o8y8dslos1qvb984j0) - [Apache DataFusion Comet](https://github.com/apache/datafusion-comet) is a high-performance accelerator for Apache Spark - [blaze](https://github.com/kwai/blaze): The Blaze accelerator for [Apache Spark](https://spark.apache.org/) leverages native vectorized execution to accelerate query processing - Apache Uniffle is working on it's rust shuffle server at [incubator-uniffle/rust/experimental/server](https://github.com/apache/incubator-uniffle/tree/master/rust/experimental/server) ## Blog - [Rewrite Bigdata in Rust](https://xuanwo.io/2024/07-rewrite-bigdata-in-rust/)