Crates.io | mongo_sync |
lib.rs | mongo_sync |
version | 0.1.0 |
source | src |
created_at | 2021-10-08 07:37:57.903127 |
updated_at | 2021-10-08 07:37:57.903127 |
description | instant coding answers via the command line(just like howdoi) |
homepage | |
repository | https://github.com/WindSoilder/mongo_sync |
max_upload_size | |
id | 462221 |
size | 177,349 |
Mongodb realtime synchronizer, which is similar to py-mongo-sync
Use --collection-concurrent to define how many threads sync a database, and --doc-concurrent to define how many threads sync a collection.
Use the --log-path option to write log information to a file; otherwise it is output to stdout.
Requires Mongodb 3.6+ (because the official mongodb driver only supports mongodb 3.6+)
The recommended way to install mongo_sync is using cargo:
cargo +nightly install mongo_sync
You can also download a released binary.
To run the integration tests, set SYNCER_TEST_SOURCE to the uri of a testing mongodb; otherwise mongodb://localhost:27017 will be used.
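The fallback described above can be sketched as follows. This is only an illustration of the rule "use SYNCER_TEST_SOURCE if set, else the local default"; the names `resolve_test_uri` and `test_uri_from_env` are hypothetical and are not taken from the mongo_sync test code.

```rust
/// Default uri named in the text above.
const DEFAULT_TEST_URI: &str = "mongodb://localhost:27017";

/// Hypothetical helper: pick the mongodb uri for the integration tests.
/// `configured` stands for the value of SYNCER_TEST_SOURCE, if it is set.
fn resolve_test_uri(configured: Option<String>) -> String {
    configured.unwrap_or_else(|| DEFAULT_TEST_URI.to_string())
}

/// Read SYNCER_TEST_SOURCE from the process environment and apply the fallback.
fn test_uri_from_env() -> String {
    resolve_test_uri(std::env::var("SYNCER_TEST_SOURCE").ok())
}
```

Calling `test_uri_from_env()` with the variable unset yields the local default, so the tests presumably need a mongodb listening on that address in that case.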
To run the synchronizer, first start oplog_syncer, which syncs the mongodb oplog in real time.
Then you can run db_sync to sync a database in real time.
./target/release/oplog_syncer --src-uri "mongodb://localhost:27017" --oplog-storage-uri "mongodb://localhost:27018/"
db_sync --src-uri "mongodb://localhost:27017/?authSource=admin" --oplog-storage-uri "mongodb://localhost:27018/?authSource=admin" --target-uri "mongodb://localhost:27019" --db test_db
Note that the --oplog-storage-uri passed to oplog_syncer and db_sync must be the same.
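One way to keep the two invocations consistent is to build both from a single shared value. The sketch below only constructs the commands with `std::process::Command` and does not spawn them; the helper names (`oplog_syncer_cmd`, `db_sync_cmd`, `args_of`) are hypothetical, while the flags are the ones shown in the commands above.

```rust
use std::process::Command;

/// Build the `oplog_syncer` invocation (not spawned here).
fn oplog_syncer_cmd(src_uri: &str, oplog_storage_uri: &str) -> Command {
    let mut cmd = Command::new("oplog_syncer");
    cmd.args(["--src-uri", src_uri, "--oplog-storage-uri", oplog_storage_uri]);
    cmd
}

/// Build the `db_sync` invocation (not spawned here).
fn db_sync_cmd(src_uri: &str, oplog_storage_uri: &str, target_uri: &str, db: &str) -> Command {
    let mut cmd = Command::new("db_sync");
    cmd.args(["--src-uri", src_uri, "--oplog-storage-uri", oplog_storage_uri]);
    cmd.args(["--target-uri", target_uri, "--db", db]);
    cmd
}

/// Collect the arguments of a command as owned strings, for inspection.
fn args_of(cmd: &Command) -> Vec<String> {
    cmd.get_args()
        .map(|arg| arg.to_string_lossy().into_owned())
        .collect()
}
```

Passing the same `oplog_storage_uri` variable to both builders means the two programs cannot drift apart through a typo in one of the command lines.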
USAGE:
    oplog_syncer [OPTIONS] --src-uri <src-uri> --oplog-storage-uri <oplog-storage-uri>
FLAGS:
    -h, --help       Prints help information
    -V, --version    Prints version information
OPTIONS:
        --log-path <log-path>
            log file path, if not specified, all log information will be output to stdout
    -o, --oplog-storage-uri <oplog-storage-uri>    target oplog storage uri
    -s, --src-uri <src-uri>                        source database uri, must be a mongodb cluster
USAGE:
    db_sync [OPTIONS] --src-uri <src-uri> --target-uri <target-uri> --oplog-storage-uri <oplog-storage-uri> --db <db>
FLAGS:
    -h, --help       Prints help information
    -V, --version    Prints version information
OPTIONS:
        --collection-concurrent <collection-concurrent>    how many threads to sync a database
    -c, --colls <colls>...
            collections to sync, default sync all collections inside a database
    -d, --db <db>                                          database to sync
        --doc-concurrent <doc-concurrent>                  how many threads to sync a collection
        --log-path <log-path>
            log file path, if no specified, all log information will be output to stdout
    -o, --oplog-storage-uri <oplog-storage-uri>
            mongodb uri which save oplogs, it's saved by `oplog_syncer` binary
    -s, --src-uri <src-uri>                                source mongodb uri
    -t, --target-uri <target-uri>                          target mongodb uri
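The optional db_sync settings listed above could be turned into command-line arguments roughly as follows. The `TuningOptions` struct and `tuning_args` function are hypothetical helpers for illustration; only the flag names come from the help text, and any concrete thread counts or paths you pass are your own choice.

```rust
/// Hypothetical settings holder; the fields mirror the optional db_sync flags.
struct TuningOptions {
    collection_concurrent: Option<u32>,
    doc_concurrent: Option<u32>,
    log_path: Option<String>,
}

/// Turn the settings into command-line arguments, skipping the unset ones.
fn tuning_args(opts: &TuningOptions) -> Vec<String> {
    let mut args = Vec::new();
    if let Some(threads) = opts.collection_concurrent {
        args.push("--collection-concurrent".to_string());
        args.push(threads.to_string());
    }
    if let Some(threads) = opts.doc_concurrent {
        args.push("--doc-concurrent".to_string());
        args.push(threads.to_string());
    }
    if let Some(path) = &opts.log_path {
        args.push("--log-path".to_string());
        args.push(path.clone());
    }
    args
}
```

The resulting vector can be appended to the required arguments (--src-uri, --target-uri, --oplog-storage-uri, --db) when launching db_sync.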
         ┌───────────────┐
         │   target db   │
         └───┬───────────┘
             │
      xxxx   ▼   xxxxxxx
    xx   ┌───────┐       x
    xx   │db_sync│
     x   └───────┘       x
     xxxxxx ▲  ▲        x
         xx │  │    xxxx
            │  │
            │  │
  Full dump │  │Incr dump
            │  │(Real time)
            │  │
            │  │
            │  │
            │  │         ┌─────────────────────┐
            │  └─────────┤oplog storage db     │
            │            └──────▲──────────────┘
            │           xxxxxxx │ xxxxx
            │          x ┌──────┴──────┐x
            │          x │Oplog syncer │x  Sync oplog from source cluster
            │          x └──────▲──────┘x  to oplog storage in real time
            │          xxxxxxxxx│ xxxxxxx
            │                   │
            │                   │
            │                   │
            │            ┌──────┴───────┐
            └────────────┤Source cluster│
                         └──────────────┘
According to the diagram, mongo_sync provides 2 basic programs:
oplog_syncer: syncs the oplog of the source cluster to the oplog storage db
db_sync: syncs data from the source cluster to the target db
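The two stages in the diagram, a full dump followed by an incremental replay of the oplog, can be illustrated with a small in-memory model. This is a conceptual sketch only, not mongo_sync's implementation: the `Op` enum is a heavily simplified stand-in for real mongodb oplog entries, and a `BTreeMap` stands in for a collection.

```rust
use std::collections::BTreeMap;

/// Simplified oplog operation; real mongodb oplog entries carry more fields.
#[derive(Clone, Debug)]
enum Op {
    Insert { id: u32, doc: String },
    Delete { id: u32 },
}

/// A toy "collection": document id mapped to document body.
type Db = BTreeMap<u32, String>;

/// Full dump: copy every document of the source into a fresh target.
fn full_dump(source: &Db) -> Db {
    source.clone()
}

/// Incremental dump: replay the recorded operations on the target, in order.
fn apply_oplog(target: &mut Db, ops: &[Op]) {
    for op in ops {
        match op {
            Op::Insert { id, doc } => {
                target.insert(*id, doc.clone());
            }
            Op::Delete { id } => {
                target.remove(id);
            }
        }
    }
}
```

After `full_dump` the target matches the source as of the dump, and `apply_oplog` then brings it up to date with the changes recorded since.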
This is not a strict benchmark; I only tested it manually.
Scenario:
When the source cluster inserts 50,000 records, how long does the target_db take to synchronize these 50,000 new inserts?
My testing result:
db_sync takes about 50 seconds to sync these inserts, and py-mongo-sync takes about 225 seconds. In this test db_sync is therefore about 4.5 times as fast as py-mongo-sync (about 3.5x faster).
Please note that the 50 seconds figure is not precise; it depends heavily on the performance of your database and of the machine running the sync.
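For reference, the arithmetic behind these figures can be written out as below. The inputs (50,000 records, about 50 and about 225 seconds) are the ones reported above; the function names are illustrative only.

```rust
/// Records synchronized per second.
fn throughput(records: u32, seconds: f64) -> f64 {
    f64::from(records) / seconds
}

/// How many times as fast the quicker run is compared with the slower one.
fn speedup(fast_seconds: f64, slow_seconds: f64) -> f64 {
    slow_seconds / fast_seconds
}
```

With the reported timings, `throughput(50_000, 50.0)` gives 1000 records per second for db_sync, `throughput(50_000, 225.0)` gives roughly 222 records per second for py-mongo-sync, and `speedup(50.0, 225.0)` gives 4.5, which is the same as saying db_sync was 3.5x faster.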
For oplog_syncer, the oplog storage db will create and use a database named source_oplog and a collection named source_oplog. For now this is hardcoded.
For db_sync, the target database will get a new collection named oplog_records, which saves the latest oplog timestamp applied to the database.
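Saving the latest applied oplog timestamp makes it possible to resume without replaying old entries. The sketch below shows that idea in memory, assuming the oplog is ordered by timestamp. The `Checkpoint` struct is a hypothetical stand-in for whatever db_sync stores in oplog_records (its real document layout is not described here), and a plain `u64` replaces mongodb's real timestamp type.

```rust
/// Simplified oplog entry: a timestamp plus a description of the change.
struct OplogEntry {
    ts: u64,
    change: String,
}

/// Hypothetical checkpoint, standing in for the record kept in `oplog_records`.
struct Checkpoint {
    last_applied_ts: u64,
}

/// Apply only the entries newer than the checkpoint, advancing it as we go.
/// Returns the changes that were applied. Assumes `oplog` is sorted by `ts`.
fn resume(checkpoint: &mut Checkpoint, oplog: &[OplogEntry]) -> Vec<String> {
    let mut applied = Vec::new();
    for entry in oplog {
        if entry.ts > checkpoint.last_applied_ts {
            applied.push(entry.change.clone());
            checkpoint.last_applied_ts = entry.ts;
        }
    }
    applied
}
```

Running `resume` twice over the same oplog applies nothing the second time, because the checkpoint already points at the newest entry.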