Crates.io | ddcp |
lib.rs | ddcp |
version | 0.2.4 |
source | src |
created_at | 2023-09-23 04:43:41.88022 |
updated_at | 2023-12-23 23:03:46.44115 |
description | Distributed decentralized database-to-database copy |
homepage | https://gitlab.com/cmars232/ddcp |
repository | https://gitlab.com/cmars232/ddcp.git |
id | 981057 |
size | 11,885,847 |
Database-to-Database Copy (DDCP) over Veilid.
No servers, no cloud. No gods, no masters.
DDCP (Database-to-Database Copy) is two things:
Both operate over Veilid for simple, private, and secure p2p networking with strong cryptographic identities.
Rapid development of local-first distributed applications. Cut out the boilerplate and yak-shaving of APIs and focus on the data.
Convert traditional RDBMS apps into local-first without having to re-architect around low-level CRDTs.
DDCP is still in the early stages of development but is already pretty interesting. Here's what you can do with it:
ddcp init
This registers a Veilid DHT public key for your database.
2023-09-27T16:50:03.646448Z INFO ddcp: Registered database at DHT key="VLD0:73iT7NuqplS2nab1AH4P7lJYiQROR_k4NyUhag_K4DY"
ddcp serve
2023-09-27T16:50:36.319921Z INFO ddcp: Serving database at DHT key="VLD0:73iT7NuqplS2nab1AH4P7lJYiQROR_k4NyUhag_K4DY"
2023-09-27T16:50:36.321171Z INFO ddcp: Database changed, updated status db_version=2 key="VLD0:73iT7NuqplS2nab1AH4P7lJYiQROR_k4NyUhag_K4DY"
This process must remain running in the background in order for remotes to be able to fetch changes from this database.
This process also prints the Veilid public key to give to your peers.
Create some VLCN CRRs (Conflict-free Replicated Relations) in your database. This can be done while DDCP is running and publishing your database. Changes will be automatically picked up by remote subscribers.
ddcp shell < fixtures/cr_tables.sql
ddcp shell < fixtures/ins_test_schema_alice.sql
Add remote peer databases to synchronize with:
ddcp remote add alice 73iT7NuqplS2nab1AH4P7lJYiQROR_k4NyUhag_K4DY
The VLD0: prefix is optional.
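As an illustration of what "optional prefix" means in practice, here is a minimal sketch of how a key argument could be normalized so that both forms refer to the same database. The function name `normalize_dht_key` is hypothetical and is not taken from DDCP's source; only the `VLD0:` prefix itself comes from the log output above.

```rust
/// Prefix seen on the DHT keys in the log output above.
const VLD0_PREFIX: &str = "VLD0:";

/// Hypothetical helper (an assumption, not DDCP's actual API): return the key
/// with exactly one `VLD0:` prefix, whether or not the caller supplied it.
fn normalize_dht_key(input: &str) -> String {
    let trimmed = input.trim();
    match trimmed.strip_prefix(VLD0_PREFIX) {
        // Already prefixed: keep it unchanged.
        Some(_) => trimmed.to_string(),
        // Bare key: add the prefix.
        None => format!("{}{}", VLD0_PREFIX, trimmed),
    }
}
```

With a helper like this, `ddcp remote add alice 73iT...` and `ddcp remote add alice VLD0:73iT...` would resolve to the same key.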
Changes from remotes are automatically applied to the local database while publishing changes.
ddcp serve
2023-09-27T17:01:30.615367Z INFO ddcp: Pulled changes from remote database remote_name="alice" db_version=5
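The db_version values in these log lines suggest a simple way to think about pulling: remember the highest version already pulled from each remote, and fetch only when the remote advertises something newer. The sketch below illustrates that idea under stated assumptions. `PullState`, `plan_pull`, and `record_pull` are hypothetical names, and this is not a description of how DDCP itself tracks remotes.

```rust
use std::collections::HashMap;

/// Hypothetical bookkeeping: the highest db_version already pulled per remote.
#[derive(Default)]
struct PullState {
    last_pulled: HashMap<String, u64>,
}

impl PullState {
    /// Returns the version to request changes *since*, or None when the
    /// remote has advertised nothing newer than what was already pulled.
    fn plan_pull(&self, remote: &str, advertised: u64) -> Option<u64> {
        let seen = self.last_pulled.get(remote).copied().unwrap_or(0);
        if advertised > seen {
            Some(seen)
        } else {
            None
        }
    }

    /// Record a completed pull; the stored version never moves backwards.
    fn record_pull(&mut self, remote: &str, pulled: u64) {
        let entry = self.last_pulled.entry(remote.to_string()).or_insert(0);
        *entry = (*entry).max(pulled);
    }
}
```

A polling loop would call `plan_pull` with the version each remote currently advertises and, after applying the fetched changes, call `record_pull`.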
CRRs have certain restrictions in order to work as expected, or even at all:
ddcp shell does this automatically for you. Keep this in mind when using cr-sqlite from other applications, though.
Docker is probably the easiest way to evaluate DDCP. The OCI image build is based on Debian Bookworm.
docker build -t ddcp .
Usage:
docker run -v $(pwd)/data:/data ddcp:latest init
docker run -v $(pwd)/data:/data ddcp:latest remote add alice 73iT7NuqplS2nab1AH4P7lJYiQROR_k4NyUhag_K4DY
cat fixtures/cr_test_schema.sql | docker run -v $(pwd)/data:/data -i ddcp:latest shell
docker run -v $(pwd)/data:/data ddcp:latest pull
DDCP currently builds in a Nix flake devshell. If you Nix,
nix develop
cargo build
This will probably work, provided compilers and development libraries are installed:
# Install serde tooling; used in cargo build (for now)
./scripts/install_capnproto.sh
./scripts/install_protoc.sh
cargo build --release
You can also try cargo install ddcp, but the build will still require the serde tooling to be installed.
DDCP is still a tech preview. Wire protocol, schemas, and APIs are all unstable and subject to breaking changes.
Several areas where DDCP still needs improvements:
DHT addresses are currently publicly accessible; if you know the address, you can pull from the database. This is OK for public distribution of data sets (blogs, "everybody edits", stuff like that), but not for private apps.
Support sharing for authenticated DHTs.
More control over how changes are merged. In some use cases, you want to merge all peers' changes into a consistent state. In others, you probably want to keep peers' content separate but linked.
Primitives that support these different synchronization patterns.
Filters & transformations on crsql_changes.
Authn and authz (who can change or pull what).
Overcome app_call message size limits. Currently there are no checks on this, so sending large changesets (like pictures of cats) will likely fail in uncontrolled ways. Veilid messages are typically limited to 32k.
32k is probably enough for a row without large column data. Large column data should probably reference block storage, like PostgreSQL's TOAST.
Replace polling with a watch on DHT changes (currently not implemented AFAICT).
Veilid blockstore integration when it's ready.
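To make the message size item above concrete, here is a minimal sketch of one possible mitigation: grouping already-serialized rows into batches that each stay under a size ceiling, and rejecting any single row that cannot fit. The function `batch_rows` and the constant `MAX_MESSAGE_BYTES` are hypothetical, and the ceiling is taken from the "typically limited to 32k" note rather than from any Veilid API. A real implementation would also need to budget for message framing overhead.

```rust
/// Assumed payload ceiling, based on the "typically limited to 32k" note above.
const MAX_MESSAGE_BYTES: usize = 32 * 1024;

/// Hypothetical helper: group serialized rows into batches whose combined
/// size stays within `limit`. Returns Err(index) for the first row that is
/// too large to fit in any single message.
fn batch_rows(rows: &[Vec<u8>], limit: usize) -> Result<Vec<Vec<&[u8]>>, usize> {
    let mut batches = Vec::new();
    let mut current: Vec<&[u8]> = Vec::new();
    let mut current_size = 0usize;

    for (index, row) in rows.iter().enumerate() {
        if row.len() > limit {
            // This row would need block storage or chunking instead.
            return Err(index);
        }
        if current_size + row.len() > limit && !current.is_empty() {
            // Adding this row would overflow the batch: flush and start anew.
            batches.push(std::mem::take(&mut current));
            current_size = 0;
        }
        current.push(row.as_slice());
        current_size += row.len();
    }
    if !current.is_empty() {
        batches.push(current);
    }
    Ok(batches)
}
```

Returning the index of the oversized row lets the caller fail in a controlled way, for example by reporting which row needs to reference block storage.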
Ideas for kinds of data that could be shared:
Name takes inspiration from UUCP and NNCP.
Some of the Veilid core integration was derived from examples in vldpipe.