| Field | Value |
| --- | --- |
| Crates.io | binsync |
| lib.rs | binsync |
| version | 0.0.4 |
| source | src |
| created_at | 2021-10-29 03:09:14.401043 |
| updated_at | 2024-06-21 20:40:20.497037 |
| description | A library for syncing binary files between two locations. |
| homepage | |
| repository | https://github.com/dristic/binsync |
| max_upload_size | |
| id | 473776 |
| size | 93,099 |
A library for syncing large binary data between two locations. It uses content-defined chunking to break files into chunks and transfers only the deltas between the two locations. The library uses a static manifest format as the source, meaning you can generate the manifest and chunks ahead of time and serve them to a large number of destinations.
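To illustrate the idea behind content-defined chunking, here is a minimal, self-contained sketch. It is not binsync's actual algorithm (the crate's chunker may use a different hash and enforce minimum/maximum chunk sizes); it only shows how cut points can be derived from content rather than fixed offsets, so that inserting bytes shifts few chunk boundaries.

```rust
// Sketch of content-defined chunking: a running hash over the bytes is
// reset at each cut, and a cut is placed wherever the hash's low bits
// are zero. Real implementations (e.g. FastCDC-style chunkers) also
// enforce min/max chunk sizes; this sketch omits that for brevity.
fn chunk_boundaries(data: &[u8], window: usize, mask: u32) -> Vec<usize> {
    let mut boundaries = Vec::new();
    let mut hash: u32 = 0;
    for (i, &b) in data.iter().enumerate() {
        hash = hash.wrapping_mul(31).wrapping_add(b as u32);
        if i >= window && (hash & mask) == 0 {
            boundaries.push(i + 1); // cut after this byte
            hash = 0; // the next cut depends only on content after this one
        }
    }
    // Always end the last chunk at the end of the data.
    if boundaries.last() != Some(&data.len()) {
        boundaries.push(data.len());
    }
    boundaries
}

fn main() {
    let data: Vec<u8> = (0u32..10_000)
        .map(|i| (i.wrapping_mul(2654435761) >> 24) as u8)
        .collect();
    let cuts = chunk_boundaries(&data, 64, 0xFF);
    assert_eq!(*cuts.last().unwrap(), data.len());
    println!("{} chunks", cuts.len());
}
```

Because each cut depends only on the bytes since the previous cut, an insertion near the start of a file shifts boundaries only until the hash realigns, and the chunks after that point keep their identity.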
To build the library and run the integration tests:

```sh
$ cargo build
$ cargo test --test integration_test
```
The library has three major components: the manifest, the chunk provider, and the syncer.
The manifest describes the files and chunks from a given source directory. It lists every file and which chunks appear, in which order, in each file. It is meant to be generated ahead of time, and the same manifest can be used for a large number of destinations. Once a manifest is generated for a given folder, it can be serialized into any number of formats, as long as your destination, remote or local, understands how to parse the manifest for use in syncing.
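To make the "any format works" point concrete, here is a toy sketch. The struct, field names, and encoding below are all invented for illustration (binsync defines its own manifest type and format); the only claim is that a manifest is just files mapped to ordered chunk identifiers, so any encoding the destination can parse will do.

```rust
// Hypothetical toy manifest: each file maps to an ordered list of chunk
// hashes. We round-trip it through a simple line-based text encoding.
#[derive(Debug, PartialEq)]
struct ToyManifest {
    files: Vec<(String, Vec<u64>)>, // (path, ordered chunk hashes)
}

fn encode(m: &ToyManifest) -> String {
    m.files
        .iter()
        .map(|(path, hashes)| {
            let list: Vec<String> = hashes.iter().map(|h| h.to_string()).collect();
            format!("{}:{}", path, list.join(","))
        })
        .collect::<Vec<_>>()
        .join("\n")
}

fn decode(s: &str) -> ToyManifest {
    let files = s
        .lines()
        .map(|line| {
            let (path, rest) = line.split_once(':').expect("malformed line");
            let hashes = rest
                .split(',')
                .filter(|p| !p.is_empty())
                .map(|p| p.parse().expect("malformed hash"))
                .collect();
            (path.to_string(), hashes)
        })
        .collect();
    ToyManifest { files }
}

fn main() {
    let manifest = ToyManifest {
        files: vec![("a.bin".into(), vec![1, 2, 3]), ("b.bin".into(), vec![2])],
    };
    let text = encode(&manifest);
    assert_eq!(decode(&text), manifest);
}
```

Note that chunk `2` appears in both files; deduplicating shared chunks this way is what lets a manifest-driven sync transfer each chunk at most once.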
The chunk provider fetches chunk contents for the syncer. It is a trait that can be implemented to suit your needs. A few implementations are provided:

- `CachingChunkProvider` is optimal for local syncing on the same machine. It attempts to read ahead and cache chunks as quickly as possible to maximize memory and disk I/O usage.
- `RemoteChunkProvider` is useful when the source and destination are on different machines. It works similarly to the caching provider but fetches chunks from a remote base URL.

Your application may have different needs, e.g. fetching from S3, making authenticated requests, or fetching from multiple sources.
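The shape of a custom provider might look like the following. This is a hypothetical sketch: the real `ChunkProvider` trait, its method signatures, and its chunk identifier and error types are defined by the binsync crate (see its docs), and the types below are stand-ins invented for illustration.

```rust
use std::collections::HashMap;

// Stand-in types -- the real ones come from the binsync crate.
type ChunkHash = u64;

// Hypothetical shape of a provider trait: given a chunk's identity,
// return its bytes from wherever they live (disk, HTTP, S3, ...).
trait ToyChunkProvider {
    fn get_chunk(&mut self, hash: &ChunkHash) -> std::io::Result<Vec<u8>>;
}

// An in-memory provider, handy for tests or pre-loaded chunk sets.
struct MemoryChunkProvider {
    chunks: HashMap<ChunkHash, Vec<u8>>,
}

impl ToyChunkProvider for MemoryChunkProvider {
    fn get_chunk(&mut self, hash: &ChunkHash) -> std::io::Result<Vec<u8>> {
        self.chunks.get(hash).cloned().ok_or_else(|| {
            std::io::Error::new(std::io::ErrorKind::NotFound, "missing chunk")
        })
    }
}

fn main() {
    let mut provider = MemoryChunkProvider {
        chunks: HashMap::from([(42, b"hello".to_vec())]),
    };
    assert_eq!(provider.get_chunk(&42).unwrap(), b"hello");
    assert!(provider.get_chunk(&7).is_err());
}
```

An S3-backed or authenticated variant would keep the same shape, replacing the `HashMap` lookup with a network fetch.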
The syncer takes a manifest and a chunk provider and runs the syncing logic to transform the target destination into an exact binary replica of the source. It reuses chunks that already exist in the destination folder to reduce the amount of data that needs to be transferred over the network.
```rust
use binsync::{Manifest, CachingChunkProvider, Syncer};

let manifest = Manifest::from_path("foo/source");
let basic_provider = CachingChunkProvider::new("foo/source");

let mut syncer = Syncer::new("foo/destination", basic_provider, manifest);
syncer.sync().unwrap();
```
The integration tests are a great way to understand how things work. They generate a test folder with in/out folders and sync between the two. Inside `tests/common.rs` you can comment out the `Drop` logic to prevent the tests from cleaning up the folders after running.
Pull requests and issues are welcome; response time is currently best effort.
For working on the crate itself, there is a simple `cargo make` task to format, build, and test with all features:

```sh
$ cargo make development
```