Crates.io | rldb |
lib.rs | rldb |
version | 0.1.13 |
source | src |
created_at | 2024-05-27 08:04:07.162999 |
updated_at | 2024-07-20 21:01:06.626737 |
description | A dynamo-like key/value database written in rust |
homepage | https://github.com/rcmgleite/rldb |
repository | https://github.com/rcmgleite/rldb |
id | 1253169 |
size | 478,065 |
RLDB (Rusty Learning Dynamo Database) is an educational project that provides a Rust implementation of the Amazon Dynamo paper. It aims to help developers and students understand the principles behind distributed key/value data stores.
Feature | Description | Status | Resources |
---|---|---|---|
InMemory Storage Engine | A simple in-memory storage engine | $${\textsf{\color{green}Implemented}}$$ | Designing Data-Intensive Applications - chapter 3 |
LSMTree | An LSMTree backed storage engine | $${\textsf{\color{yellow}TODO}}$$ | Designing Data-Intensive Applications - chapter 3 |
Log Structured HashTable | Similar to the bitcask storage engine | $${\textsf{\color{yellow}TODO}}$$ | Bitcask intro paper |
TCP server | A tokio backed TCP server for incoming requests | $${\textsf{\color{green}Implemented}}$$ | tokio |
PUT/GET/DEL client APIs | TCP APIs for PUT, GET and DELETE | $${\textsf{\color{greenyellow}WIP}}$$ | N/A |
PartitioningScheme via Consistent Hashing | A functional consistent-hashing implementation | $${\textsf{\color{green}Implemented}}$$ | Designing Data-Intensive Applications - chapter 6, Consistent Hashing by David Karger |
Leaderless replication of partitions | Replicating partition data using the leaderless replication approach | $${\textsf{\color{green}Implemented}}$$ | Designing Data-Intensive Applications - chapter 5 |
Quorum | Quorum based reads and writes for tunable consistency guarantees | $${\textsf{\color{green}Implemented}}$$ | Designing Data-Intensive Applications - chapter 5 |
Node discovery and failure detection | A gossip based mechanism to discover cluster nodes and detect failures | $${\textsf{\color{green}Implemented}}$$ | Dynamo Paper |
Re-sharding/rebalancing | Moving data between nodes after cluster state changes | $${\textsf{\color{yellow}TODO}}$$ | Designing Data-Intensive Applications - chapter 6 |
Data versioning | Versioning and conflict detection / resolution (via VersionVectors) | $${\textsf{\color{green}Implemented}}$$ | Vector clock wiki, Lamport clock paper (not that easy to parse) |
Reconciliation via Read repair | GETs can trigger repair in case of missing replicas | $${\textsf{\color{yellow}TODO}}$$ | Dynamo Paper |
Active anti-entropy | Use merkle trees to detect missing replicas and trigger reconciliation | $${\textsf{\color{yellow}TODO}}$$ | Dynamo Paper |
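To make the consistent-hashing row above more concrete, here is a minimal sketch of a hash ring with virtual nodes. It is not taken from the rldb source: the `Ring` type, the `ring_position` helper and the use of the standard library's `DefaultHasher` are all assumptions made for illustration, and rldb's actual partitioning scheme may differ.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Hash any hashable value to a position on the ring (hypothetical helper).
fn ring_position<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

/// A hypothetical consistent-hashing ring: ring positions map to node names.
struct Ring {
    positions: BTreeMap<u64, String>,
    virtual_nodes: u32,
}

impl Ring {
    fn new(virtual_nodes: u32) -> Self {
        Ring { positions: BTreeMap::new(), virtual_nodes }
    }

    /// Place `virtual_nodes` points on the ring for this node.
    fn add_node(&mut self, node: &str) {
        for i in 0..self.virtual_nodes {
            self.positions.insert(ring_position(&(node, i)), node.to_string());
        }
    }

    /// Drop every point that belongs to this node.
    fn remove_node(&mut self, node: &str) {
        self.positions.retain(|_, owner| owner.as_str() != node);
    }

    /// The owner is the first point clockwise from the key's position,
    /// wrapping around to the start of the ring if needed.
    fn owner(&self, key: &str) -> Option<&str> {
        let pos = ring_position(&key);
        self.positions
            .range(pos..)
            .next()
            .or_else(|| self.positions.iter().next())
            .map(|(_, node)| node.as_str())
    }
}

fn main() {
    let mut ring = Ring::new(16);
    for node in ["127.0.0.1:3001", "127.0.0.1:3002", "127.0.0.1:3003"] {
        ring.add_node(node);
    }
    for key in ["foo", "foo2"] {
        println!("{key} -> {:?}", ring.owner(key));
    }
}
```

The property that matters here is that removing a node only reassigns the keys that node owned; every other key keeps its owner, which limits how much data has to move when the cluster changes.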
cargo run --bin rldb-server -- --config-path conf/node_1.json
cargo run --bin rldb-server -- --config-path conf/node_2.json
cargo run --bin rldb-server -- --config-path conf/node_3.json
In this example, we assume the node on port 3001 is the initial cluster node, and we join the other nodes to it.
cargo run --bin rldb-client join-cluster -p 3002 --known-cluster-node 127.0.0.1:3001
cargo run --bin rldb-client join-cluster -p 3003 --known-cluster-node 127.0.0.1:3001
cargo run --bin rldb-client put -p 3001 -k foo -v bar
{"message":"Ok"}
cargo run --bin rldb-client get -p 3001 -k foo
{"values":[{"value":"bar","crc32c":179770161}],"context":"00000001527bd0d79bdb065196e93b951879b64300000000000000000000000000000001"}
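Each value in the response carries a `crc32c` field that a client can use to detect corruption. The sketch below is a plain bitwise CRC-32C (Castagnoli) implementation for illustration. The `crc32c` and `verify` functions are hypothetical names, and it is an assumption that rldb computes the checksum over the raw value bytes; the crate may use a library and a different input layout.

```rust
/// Bitwise CRC-32C (Castagnoli) using the reflected polynomial 0x82F63B78.
fn crc32c(data: &[u8]) -> u32 {
    let mut crc: u32 = 0xFFFF_FFFF;
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            // `mask` is all ones when the low bit is set, all zeros otherwise.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0x82F6_3B78 & mask);
        }
    }
    !crc
}

/// Hypothetical client-side check: recompute the checksum and compare it
/// with the one received alongside the value.
fn verify(value: &[u8], expected: u32) -> bool {
    crc32c(value) == expected
}

fn main() {
    // "123456789" is the standard CRC check input.
    let checksum = crc32c(b"123456789");
    println!("{checksum:#010x}");
    assert!(verify(b"123456789", checksum));
}
```

A table-driven or hardware-accelerated implementation would be faster; the bitwise form is used here only because it is short and easy to follow.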
cargo run --bin rldb-client get -p 3001 -k foo2
{"values":[{"value":"bar2","crc32c":1093081014},{"value":"bar1","crc32c":1383588930},{"value":"bar3","crc32c":3008140469}],"context":"00000003296f248aff807cf05f4bcd0d05a45cc500000000000000000000000000000001527bd0d79bdb065196e93b951879b64300000000000000000000000000000001e975274170197c04e92166baff4d20c900000000000000000000000000000001"}
The context key of the response is what allows subsequent PUTs to resolve the given conflicts. For example:
cargo run --bin rldb-client put -p 3002 -k foo2 -v conflicts_resolved -c 00000003296f248aff807cf05f4bcd0d05a45cc500000000000000000000000000000001527bd0d79bdb065196e93b951879b64300000000000000000000000000000001e975274170197c04e92166baff4d20c900000000000000000000000000000001
{"message":"Ok"}
cargo run --bin rldb-client get -p 3001 -k foo2
{"values":[{"value":"conflicts_resolved","crc32c":3289643150}],"context":"00000003296f248aff807cf05f4bcd0d05a45cc500000000000000000000000000000001527bd0d79bdb065196e93b951879b64300000000000000000000000000000002e975274170197c04e92166baff4d20c900000000000000000000000000000001"}
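In the responses above, one entry of the context appears to go from 1 to 2 after the resolving PUT, which is consistent with the version vectors listed in the feature table. The following is a minimal sketch of that idea, not rldb's actual types: `VersionVector`, `Causality` and the node names are hypothetical, and the wire format of the context string is deliberately not modelled.

```rust
use std::collections::BTreeMap;

/// How one version vector relates to another.
#[derive(Debug, PartialEq)]
enum Causality {
    Equal,
    Before,
    After,
    Concurrent,
}

/// Hypothetical version vector: one counter per node id.
#[derive(Debug, Clone, Default, PartialEq)]
struct VersionVector(BTreeMap<String, u64>);

impl VersionVector {
    /// Record one more write coordinated by `node`.
    fn increment(&mut self, node: &str) {
        *self.0.entry(node.to_string()).or_insert(0) += 1;
    }

    /// Pointwise maximum: the result descends from both inputs.
    fn merge(&mut self, other: &VersionVector) {
        for (node, &counter) in &other.0 {
            let entry = self.0.entry(node.clone()).or_insert(0);
            *entry = (*entry).max(counter);
        }
    }

    /// Compare counters node by node; missing entries count as zero.
    fn compare(&self, other: &VersionVector) -> Causality {
        let (mut less, mut greater) = (false, false);
        for node in self.0.keys().chain(other.0.keys()) {
            let a = self.0.get(node).copied().unwrap_or(0);
            let b = other.0.get(node).copied().unwrap_or(0);
            less |= a < b;
            greater |= a > b;
        }
        match (less, greater) {
            (false, false) => Causality::Equal,
            (true, false) => Causality::Before,
            (false, true) => Causality::After,
            (true, true) => Causality::Concurrent,
        }
    }
}

fn main() {
    // Two writes accepted by different nodes without seeing each other.
    let mut a = VersionVector::default();
    a.increment("node-a");
    let mut b = VersionVector::default();
    b.increment("node-b");
    println!("a vs b: {:?}", a.compare(&b));

    // A PUT that carries the merged context supersedes both versions.
    let mut resolved = a.clone();
    resolved.merge(&b);
    resolved.increment("node-b");
    println!("resolved vs a: {:?}", resolved.compare(&a));
    println!("resolved vs b: {:?}", resolved.compare(&b));
}
```

Under these assumptions, a GET returns several values exactly when their vectors are concurrent, and a PUT that sends back the merged context produces a version that is after all of them, so only one value remains.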
The all-in-one Jaeger Docker image can be used to export rldb traces locally. The steps to get there are:
./local_jaeger.sh
cargo run --bin rldb-server -- --config-path conf/node_1.json --tracing-jaeger
See rldb docs
This project is licensed under the MIT license. See License for details.
This project was inspired by the original Dynamo paper but also by many other authors and resources like:
and many others. When a module in this project is based on specific resources, those resources are listed in the module's documentation.