Crates.io | yama |
lib.rs | yama |
version | 0.4.0 |
source | src |
created_at | 2021-06-16 17:08:08.70214 |
updated_at | 2021-06-16 17:08:08.70214 |
description | Deduplicated, compressed and encrypted content pile manager |
repository | https://bics.ga/reivilibre/yama |
id | 410975 |
size | 186,476 |
note: this readme is not yet updated to reality…
yama [-w|--with [user@host:]path] [--with-encrypted true|false]
In yama.toml, you can configure remotes:
[remote.bob]
encrypted = true
host = "bobmachine.xyz"
user = "bob"
path = "/home/bob/yama"
check: Check repository for consistency
Verifies the full repository satisfies the following consistency constraints:
Usage: yama check [--gc]
The total amount of space occupied, and the amount occupied by unused chunks, are reported.
If --gc is specified, unused chunks will be removed.
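As a rough illustration of the reporting and removal described for check, here is a minimal mark-and-sweep sketch. The names ChunkId, SpaceReport and check, and the in-memory maps, are hypothetical stand-ins for illustration; they are not taken from yama's source.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical chunk identifier; the real ID type is not specified here.
type ChunkId = u64;

/// Space report produced by a check pass (sizes in bytes).
#[derive(Debug, PartialEq)]
struct SpaceReport {
    total_bytes: u64,
    unused_bytes: u64,
}

/// Mark every chunk referenced by some pointer, then measure (and, if
/// `gc` is true, remove) the chunks that no pointer references.
fn check(
    pointers: &HashMap<String, Vec<ChunkId>>, // pointer name -> chunks it uses
    chunks: &mut HashMap<ChunkId, u64>,       // chunk id -> stored size
    gc: bool,
) -> SpaceReport {
    // Mark phase: collect all chunk IDs referenced by any pointer.
    let used: HashSet<ChunkId> = pointers.values().flatten().copied().collect();

    let total_bytes: u64 = chunks.values().sum();
    let unused_bytes: u64 = chunks
        .iter()
        .filter(|(id, _)| !used.contains(*id))
        .map(|(_, size)| *size)
        .sum();

    // Sweep phase: only performed when --gc was requested.
    if gc {
        chunks.retain(|id, _| used.contains(id));
    }

    SpaceReport { total_bytes, unused_bytes }
}
```

Without gc the chunk map is left untouched and only the report is produced; with gc the unreferenced entries are dropped after being measured.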
lsp: List tree pointers
Usage: yama lsp
rmp: Remove tree pointers
Usage: yama rmp pointer/path [--force]
If --force is not specified and the pointer is depended upon by another, then deletion is aborted with an error.
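The dependency check described for rmp could look roughly like the sketch below. It assumes, purely for illustration, that a pointer depends on at most one parent (for example, one it was stored differentially against); the Pointer struct and remove_pointer function are hypothetical. What --force does to the dependents afterwards is not stated above, so the sketch simply leaves them in place.

```rust
use std::collections::HashMap;

/// Hypothetical pointer record: a pointer may name a parent it depends on.
struct Pointer {
    parent: Option<String>,
}

/// Remove `name`, refusing (unless `force` is set) when another pointer
/// still depends on it. On refusal, returns the names of the dependents.
/// Removing a name that does not exist is treated as a no-op here.
fn remove_pointer(
    pointers: &mut HashMap<String, Pointer>,
    name: &str,
    force: bool,
) -> Result<(), Vec<String>> {
    // Find every pointer whose parent is the one being removed.
    let mut dependents: Vec<String> = pointers
        .iter()
        .filter(|(_, p)| p.parent.as_deref() == Some(name))
        .map(|(n, _)| n.clone())
        .collect();
    dependents.sort(); // stable error output regardless of map order

    if !dependents.is_empty() && !force {
        return Err(dependents);
    }
    pointers.remove(name);
    Ok(())
}
```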
store: Store tree into repository
Usage: yama store [--dry-run] [ssh://user@host]/path/to/dir pointer/path [--exclusions path/to/exclusions.txt] [--differential pointer/parent]
The pointer must not already exist; it will be created. If --differential is specified with an existing parent pointer, then the directory listing is stored as a differential list relative to the parent.
The intention of this is to reduce the size of the directory list.
Exclusion lists have pretty much the same format as .gitignore: one glob per line of files to not include, relative to the tree root.
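To make the idea of a differential directory listing concrete, here is a small sketch that records only the entries that differ from the parent. The Entry fields and the Change enum are assumptions made for this example; the real listing format is not described in this README.

```rust
use std::collections::BTreeMap;

/// Hypothetical per-file metadata kept in a directory listing.
#[derive(Debug, Clone, PartialEq)]
struct Entry {
    size: u64,
    mtime: u64,
}

/// One record of a differential listing relative to the parent.
#[derive(Debug, PartialEq)]
enum Change {
    Added(String, Entry),
    Modified(String, Entry),
    Removed(String),
}

/// Record only what differs from the parent listing, so unchanged files
/// add nothing to the child's listing.
fn diff_listing(
    parent: &BTreeMap<String, Entry>,
    current: &BTreeMap<String, Entry>,
) -> Vec<Change> {
    let mut changes = Vec::new();
    for (path, entry) in current {
        match parent.get(path) {
            None => changes.push(Change::Added(path.clone(), entry.clone())),
            Some(old) if old != entry => {
                changes.push(Change::Modified(path.clone(), entry.clone()))
            }
            Some(_) => {} // unchanged: omitted from the differential list
        }
    }
    // Anything the parent had that is now missing is recorded as removed.
    for path in parent.keys() {
        if !current.contains_key(path) {
            changes.push(Change::Removed(path.clone()));
        }
    }
    changes
}
```

For a tree where most files are untouched between backups, the returned list stays short even though the full listing may be large, which is the size reduction the differential option aims for.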
extract: Extract file(s) from repository
Usage: yama extract [--dry-run] pointer/path[:path] [ssh://user@host]/path/to/local/dir[/]
If no path is specified, the root / is extracted. A trailing slash means that the file will be extracted as a child of the specified directory.
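The two argument rules above (an optional :path suffix defaulting to the root, and the trailing-slash convention) can be sketched as follows. Both function names are hypothetical, the sketch assumes pointer names never contain a colon, and it ignores the optional ssh:// prefix.

```rust
/// Split `pointer/path[:path]` into the pointer name and the path inside
/// the tree, defaulting to the root `/` when no path is given.
/// Assumption: pointer names never contain `:`.
fn parse_extract_spec(spec: &str) -> (&str, &str) {
    match spec.split_once(':') {
        Some((pointer, path)) if !path.is_empty() => (pointer, path),
        Some((pointer, _)) => (pointer, "/"),
        None => (spec, "/"),
    }
}

/// Resolve where an extracted file ends up: a trailing slash on the
/// destination means "place the file inside this directory".
fn resolve_destination(source_path: &str, dest: &str) -> String {
    if dest.ends_with('/') {
        // Take the last path component of the source as the file name.
        let name = source_path
            .trim_end_matches('/')
            .rsplit('/')
            .next()
            .unwrap_or("");
        format!("{}{}", dest, name)
    } else {
        dest.to_string()
    }
}
```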
remote: Run operations on a remote repository
Usage: yama remote ssh://user@host/path/to/repo <subcommand>
store: Store local tree into remote repository
Usage is identical to yama store, except the store path must be local.
extract: Extract remote repository into local tree
Usage is identical to yama extract, except the target path must be local.
slave: Remote-controlled yama
Communicates over stdin/stdout to perform specified operations. Used when a yama command involves SSH.
Pointers are stored in pointers.lmdb and chunks are stored in chunks.lmdb.
Exclusion files that are to be used on a recurring basis are expected to be kept in the same directory as the repository.
Chunks are compressed with zstd. It must first be trained, and the resulting training dictionary placed in repo root/zstd.dict.
This dictionary file must not be lost or altered after chunks have been made using it. Doing so will void the integrity of the entire repository.
Chunks are hashed with BLAKE256, and each chunk has its xxHash calculated before it is deduplicated away. (If a collision is detected, the backup is aborted. This is expected never to happen, but we cannot be sure.)
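The collision check described above can be sketched as a store that keys chunks by a primary hash and keeps a secondary hash alongside to verify each deduplication. To stay within the standard library, this sketch substitutes std's DefaultHasher for BLAKE256 and a hand-written FNV-1a for xxHash; both are stand-ins, and ChunkStore, StoreOutcome and Collision are hypothetical names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Stand-in for the primary chunk hash (the README names BLAKE256).
fn primary_id(data: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    data.hash(&mut h);
    h.finish()
}

/// Stand-in for the secondary check hash (the README names xxHash).
/// This is FNV-1a, chosen only because it fits in a few lines.
fn check_hash(data: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for b in data {
        h ^= u64::from(*b);
        h = h.wrapping_mul(0x100000001b3);
    }
    h
}

#[derive(Debug, PartialEq)]
enum StoreOutcome {
    Stored,
    Deduplicated,
}

/// Two different chunks produced the same primary ID.
#[derive(Debug, PartialEq)]
struct Collision {
    id: u64,
}

struct ChunkStore {
    id_fn: fn(&[u8]) -> u64, // injectable so a collision can be simulated
    seen: HashMap<u64, u64>, // primary id -> check hash of the stored chunk
}

impl ChunkStore {
    fn new(id_fn: fn(&[u8]) -> u64) -> Self {
        ChunkStore { id_fn, seen: HashMap::new() }
    }

    /// Store a chunk, deduplicating only when both hashes agree.
    fn store(&mut self, data: &[u8]) -> Result<StoreOutcome, Collision> {
        let id = (self.id_fn)(data);
        let check = check_hash(data);
        match self.seen.get(&id).copied() {
            Some(existing) if existing == check => Ok(StoreOutcome::Deduplicated),
            Some(_) => Err(Collision { id }), // same ID, different content: abort
            None => {
                self.seen.insert(id, check);
                Ok(StoreOutcome::Stored)
            }
        }
    }
}
```

A caller would treat the Err case as fatal and abort the backup, which matches the behaviour stated above.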
Compression is performed on the host where the data resides.
Only required chunks are compressed and sent across the SSH connection.
There needs to be some mechanism to offer, decline and accept chunks, without buffers overflowing and bringing hosts down.
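One possible shape for such a mechanism, offered only as a sketch and not as yama's actual protocol: the sending side offers chunk IDs, the receiving side accepts only those it lacks, and the accepted chunks travel through a bounded channel so the sender blocks instead of buffering without limit. The function names and the in-process channel (standing in for the SSH connection) are assumptions.

```rust
use std::collections::HashSet;
use std::sync::mpsc::sync_channel;
use std::thread;

/// Hypothetical chunk identifier.
type ChunkId = u64;

/// Receiver side of the negotiation: accept only chunks not already held;
/// every other offered chunk is implicitly declined.
fn accept_offers(offered: &[ChunkId], already_have: &HashSet<ChunkId>) -> Vec<ChunkId> {
    offered
        .iter()
        .copied()
        .filter(|id| !already_have.contains(id))
        .collect()
}

/// Send the accepted chunks through a bounded channel. `sync_channel(n)`
/// blocks the sender once `n` chunks are queued, so a slow receiver
/// cannot make the sender's buffer grow without limit.
fn transfer(
    accepted: Vec<(ChunkId, Vec<u8>)>,
    in_flight_limit: usize,
) -> Vec<(ChunkId, Vec<u8>)> {
    let (tx, rx) = sync_channel::<(ChunkId, Vec<u8>)>(in_flight_limit);
    let sender = thread::spawn(move || {
        for chunk in accepted {
            // Blocks while the channel is full (backpressure).
            if tx.send(chunk).is_err() {
                break; // receiver went away
            }
        }
        // `tx` is dropped here, which closes the channel.
    });
    // Drain until the sender closes the channel.
    let received: Vec<(ChunkId, Vec<u8>)> = rx.iter().collect();
    sender.join().expect("sender thread panicked");
    received
}
```

The in-flight limit is the knob that bounds memory use: a limit of 1 keeps at most one chunk queued, at the cost of more waiting between sender and receiver.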
zstd --train FILEs -o zstd.dict
find ~/Programming -size -4k -size +64c -type f -exec grep -Iq . {} \; -printf "%s\n" | jq -s 'add'
find ~/Programming -size -4k -size +64c -type f -exec grep -Iq . {} \; -exec cp {} -t /tmp/d/ \;
du -sh
find > file.list
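wc -l < file.list
→ gives the number of lines
shuf -n 4242 file.list | xargs -x zstd --train -o zstd.dict
→ trains on 4242 files. Chokes if it receives a filename with a space; either re-run until you get a working set, or (with GNU xargs) add -d '\n' so that each line is treated as a single filename.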