ATE Datachain ============= ## What is an ATE Datachain? ATE Datachain (database) is a distributed redo log server that the ATE projects can use to remotely store their redo logs. ATE is designed to connect to remotely hosted repositories of state for to meet integrity and availability non-functional requirements. ## What is ATE [See here](../README.md) ## Summary ATE Datachains run a server daemon that listens for connections from clients and serves as a distributed redo log. Other projects use this backend for persistent storage - projects such as - [tokfs](../tokfs/README.md) ## How it works - AteDB will store its log files on a local file-system using a naming convention that follows the URL path of connecting clients. - Log files are loaded and updated on-demand as consumers connect, feed events and disconnect - AteDB will listen on a TCP port (default: 5000) with both IPv4 and IPv6 protocols unless otherwise specified - When a client attempts to access a chain-key that does not exist AteDB will create a new chain and associated log files. - When AteDB creates new chains it will will query an authentication server for a public key of who should have access to write root events into the chain based on a naming convention. e.g. ws://server.com/db/domain.com/name/mydb => mydb chain for user with email name@domain.com. - If you specify the --no-auth command then chains will be created without making any authentication checks which means anyone can create new chains on a first-come-first-served basis and no authentcation server is required. By default all wire messages are clear text but the events that are confidential are encrypted thus giving a good balance between security and speed however you may also increase the security of datachains by using double-encryption. For instance if you specify --wire-encryption 128 then AteDB will negotiate AES 128bit symetric keys using a quantum resistant key exchange that supports perfect forward secrecy symentics. There are two modes that AteDB can run in which have very different characteristics of trust and speed. ### Centralized (default) Repreresents a **centralized** trust model meaning that the communication link itself with a central server becomes a trusted conversation. When the server or client sends its events it will only provide proof of ownership of authorization to write 'once' per connection. This means that writes will be much faster as they only need to compute an asymetric signature once for each key that is used however it also means that you are trusting that the server is doing its job properly and has not been compromised by an attacker who has added their own events. When running in centralized mode it is highly recommended you use wire-encryption to prevent an attacker from injecting fake events into your communication channel. Use the --wire-encryption parameter to enable this. For this reason AteDB will default to using 128bit wire encryption when running in centralized mode which can be disabled using the --no-wire-encryption parameter. ### Distributed Represents **distributed** trust model where the only thing clients and servers really trust is the signatures of each individual event. In the scenario that a server is compromised it is not possible for an attacker to inject their own events into the chains as they do not have ownership of the signing private keys. This mode has a fairly significant impact on write operations as it means events need a asynmetric signature computed on all writes, batching attempts to minimize this cost however if writes are individually committed then a worst case scenario of every IO equals a signature will eventulate. In this mode it is not nessasary to also run wire encryption as all events that require confidentiallity and integrity are protected individually however one can reduce the changes of side channel attacks and denial of service risks through double-encryption. When running in 'distributed' mode the datachain will not make an authentication server requests as there is nothing to gain from this or to validate using. All integrity is provided as a part of the events themselves. Hence in this mode the --auth setting has no effect ## QuickStart ```sh # Installation and upgrade AteDB and auth-server apt install cargo make pkg-config openssl libssl-dev cargo install atedb cargo install auth-server ``` ```sh # Launch AteDB with all the defaults which is a good balance of security, performance # and simplfied setup. This instance will use the default authentication when it creates # new chains setting the root write key to that of the owner. The authentication server # that is queried will default to ws://tokera.sh/auth. # The instance will listen on all ports and all network addresses. atedb solo ``` ```sh # Starts a single instance datachain on the default port (5000) . This mode will use a # distributed trust model which means it does not need an authentication server and will # default to 128bit AES wire encryption atedb --trust distributed solo ``` ```sh # Launch AteDB with the added protection of double-encryption with all communication to this # server protected by 256bit AES encryption using quantum resistant key exchange. This mode # of operation is the most secure but also the least performant. atedb --wire-encryption 256 --trust distributed solo ``` ```sh # Load and store log files in a different path than the default of ~/ate atedb solo ~/another-path/ ``` ```sh # Starts AteDB without an authentication server which will listen on a localhost address for connections. # Wire encryption is disabled even while this instance is running as a centralized trust mode as there # is a low risk attackers could reach the loopback device. atedb --no-auth --no-wire-encryption solo -l 127.0.0.1 -p 5555 ``` ```sh # Starts AteDB using a different DNS server and authentication address that is hosted locally # Also by specifying that we listen on address 0.0.0.0 we purposely limited ourselves to IPv4 auth-server run -l 0.0.0.0 -p 5555 atedb --dns 8.8.4.4 --auth ws://localhost:5555/auth solo -l 0.0.0.0 ``` ```sh # Show log information of the running AteDB RUST_LOG=info atedb solo ``` ## Manual ``` USAGE: atedb [FLAGS] [OPTIONS] FLAGS: -d, --debug Logs debug info to the console --dns-sec Determines if ATE will use DNSSec or just plain DNS -h, --help Prints help information --no-auth Indicates no authentication server will be used meaning all new chains created by clients allow anyone to write new root nodes --no-wire-encryption Disbles wire encryption which would otherwise be turned on when running in 'centralized' mode -v, --verbose Sets the level of log verbosity, can be used multiple times -V, --version Prints version information OPTIONS: -a, --auth URL where the user is authenticated [default: ws://tokera.sh/auth] --dns-server Address that DNS queries will be sent to [default: 8.8.8.8] -t, --trust Trust mode that the datachain server will run under - valid values are either 'distributed' or 'centralized'. When running in 'distributed' mode the server itself does not need to be trusted in order to trust the data it holds however it has a significant performance impact on write operations while the 'centralized' mode gives much higher performance but the server needs to be protected [default: centralized] --wire-encryption Indicates if ATE will use quantum resistant wire encryption (possible values are 128, 192, 256). When running in 'centralized' mode wire encryption will default to 128bit however when running in 'distributed' mode wire encryption will default to off unless explicitly turned on SUBCOMMANDS: help Prints this message or the help of the given subcommand(s) solo Runs a solo ATE datachain and listens for connections from clients -------------------------------------------------------------------------- Runs a solo ATE datachain and listens for connections from clients USAGE: atedb solo [OPTIONS] [logs-path] ARGS: Path to the log files where all the file system data is stored [default: /opt/ate] FLAGS: -h, --help Prints help information -V, --version Prints version information OPTIONS: --compact-mode Mode that the compaction will run under (valid modes are 'never', 'modified', 'timer', 'factor', 'size', 'factor-or-timer', 'size-or-timer') [default: growth-or-timer] --compact-threshold-factor Factor growth in the log file which will trigger compaction - this argument is ignored if you select a compact_mode that has no growth trigger [default: 0.4] --compact-threshold-size Size of growth in bytes in the log file which will trigger compaction (default: 100MB) - this argument is ignored if you select a compact_mode that has no growth trigger [default: 104857600] --compact-timer Time in seconds between compactions of the log file (default: 1 hour) - this argument is ignored if you select a compact_mode that has no timer [default: 3600] -l, --listen IP address that the datachain server will isten on [default: 0.0.0.0] -p, --port Port that the datachain server will listen on [default: 5000] ``` ## Contribution If you would like to help setup a community to continue to develop this project then please contact me at [johnathan.sharratt@gmail.com](johnathan.sharratt@gmail.com)