Crates.io | arx |
lib.rs | arx |
version | 0.3.1 |
source | src |
created_at | 2022-10-13 14:10:18.475858 |
updated_at | 2024-09-08 18:32:37.00429 |
description | A fast, mountable file archive based on Jubako container. Replacement of tar and zip. |
homepage | https://github.com/jubako/arx |
repository | https://github.com/jubako/arx |
max_upload_size | |
id | 687207 |
size | 89,362 |
Arx is a file archive format based on the jubako container format.
It allow you to create, read, extract file archive (as zip or tar does).
Arx (and Jubako) is in active development. While it works pretty well, I do not recommand to use it to do backups. However, you can use it to transfer data or explore archives.
Jubako is a versatile container format, allowing to store data, compressed or not in a structured way. It main advantage (apart from its versability) is that is designed to allow quick retrieval of data fro the archive without needing to uncompress the whole archive.
Arx use the jubako format and create arx archive which:
Binaries for Windows, MacOS and Linux are available for every release. You can also install arx using Cargo:
cargo install arx
Creating an archive is simple :
arx create -o my_archive.arx -r my_directory
It will one file : my_archive.arx
, which will contains the my_directory
directory.
Extracting (decompressing) an archive is done with :
arx extract my_archive.arx -C my_out_dir
You can list the content of (the list of files in) the archive with :
arx list my_archive.arx
And if you want to access to the content of only one file :
arx dump my_archive.arx my_directory/path/to/my_file > my_file
# or
arx dump my_archive.arx my_directory/path/to/my_file -o my_file
On linux, you can mount the archive using fuse.
mkdir mount_point
arx mount my_archive.arx mount_point
If you don't provide a mount_point
, arx will create a temporary one for you
arx mount my_archive.arx # Will create my_archive.arx.xxxxxx
arx
will be running until you unmount mount_point
.
zip2arx -o my_archive.arx my_zip_archive.zip
tar2arx -o my_archive.arx my_tar_archive.tar.gz
or
tar2arx -o my_archive.arx https://example.com/my_tar_archive.tar.gz
The following compare the performance of Arx to different archive formats.
cp -a
.Tests has been done on different data sets :
Source directory is stored on a sdd. All test are run on a tmpfs (archive and extracted files are stored in memory).
Mount diff time is the time to diff the mounted archive with the source directory
arx mount archive.arx mount_point &
time diff -r mount_point/linux-5.19 linux-5.19
umount mount_point
Mounting the tar and zip archive is made with archivemount
tool.
Squashfs is mounted using kernel. SquashfsFuse is mounted using fuse API.
Arx mount is implemented using fuse API.
Documentation directory only of linux source code:
Type | Creation | Size | Extract | Listing | Mount diff | Dump |
---|---|---|---|---|---|---|
Arx | 150ms963μs | 11.10 MB | 038ms395μs | 004ms051μs | 299ms764μs | 005ms618μs |
FS | 150ms639μs | 38.45 MB | 106ms821μs | 006ms962μs | 077ms414μs | 498μs |
Squashfs | 103ms076μs | 10.60 MB | 098ms787μs | 005ms365μs | 261ms533μs | 002ms088μs |
SquashfsFuse | 097ms863μs | 10.60 MB | - | - | 748ms597μs | - |
Tar | 141ms079μs | 9.68 MB | 065ms744μs | 041ms015μs | 02m41s | 042ms143μs |
Zip | 01s083ms | 15.22 MB | 388ms720μs | 037ms044μs | 03m06s | 014ms088μs |
This is the ratio
Type | Creation | Size | Extract | Listing | Mount diff | Dump |
---|---|---|---|---|---|---|
FS | 100% | 346% | 278% | 172% | 26% | 9% |
Squashfs | 68% | 95% | 257% | 132% | 87% | 37% |
SquashfsFuse | 65% | 95% | - | - | 250% | - |
Tar | 93% | 87% | 171% | 1012% | 53997% | 750% |
Zip | 718% | 137% | 1012% | 914% | 62350% | 251% |
Driver directory only of linux source code:
Type | Creation | Size | Extract | Listing | Mount diff | Dump |
---|---|---|---|---|---|---|
Arx | 01s060ms | 98.23 MB | 241ms699μs | 009ms516μs | 01s290ms | 007ms193μs |
FS | 778ms095μs | 799.02 MB | 523ms191μs | 021ms578μs | 467ms559μs | 495μs |
Squashfs | 829ms886μs | 121.70 MB | 435ms851μs | 012ms289μs | 01s629ms | 002ms190μs |
SquashfsFuse | 829ms237μs | 121.70 MB | - | - | 03s823ms | - |
Tar | 911ms042μs | 97.96 MB | 515ms178μs | 472ms060μs | - | 504ms231μs |
Zip | 20s498ms | 141.91 MB | 03s665ms | 098ms194μs | - | 034ms481μs |
This is the ratio
Type | Creation | Size | Extract | Listing | Mount diff | Dump |
---|---|---|---|---|---|---|
FS | 73% | 813% | 216% | 227% | 36% | 7% |
Squashfs | 78% | 124% | 180% | 129% | 126% | 30% |
SquashfsFuse | 78% | 124% | - | - | 296% | - |
Tar | 86% | 100% | 213% | 4961% | - | 7010% |
Zip | 1932% | 144% | 1516% | 1032% | - | 479% |
Type | Creation | Size | Extract | Listing | Mount diff | Dump |
---|---|---|---|---|---|---|
Arx | 02s104ms | 170.97 MB | 435ms846μs | 022ms238μs | 02s829ms | 010ms613μs |
FS | 01s605ms | 1.12 GB | 01s046ms | 043ms358μs | 943ms546μs | 493μs |
Squashfs | 01s430ms | 201.43 MB | 725ms532μs | 024ms050μs | 03s272ms | 002ms374μs |
SquashfsFuse | 01s417ms | 201.43 MB | - | - | 13s864ms | - |
Tar | 01s479ms | 168.77 MB | 938ms758μs | 799ms550μs | - | 802ms427μs |
Zip | 31s810ms | 252.96 MB | 06s260ms | 256ms137μs | - | 045ms722μs |
This is the ratio
Type | Creation | Size | Extract | Listing | Mount diff | Dump |
---|---|---|---|---|---|---|
FS | 76% | 674% | 240% | 195% | 33% | 5% |
Squashfs | 68% | 118% | 166% | 108% | 116% | 22% |
SquashfsFuse | 67% | 118% | - | - | 490% | - |
Tar | 70% | 99% | 215% | 3595% | - | 7561% |
Zip | 1511% | 148% | 1436% | 1152% | - | 431% |
The kernel compilation is the time needed to compile the whole kernel with the default configuration (-j8). For arx, we are compiling the kernel using the source in the archive mounted in mount_point.
Kernel compilation is made is "real" condition. Source or arx archive are stored on ssd.
Type | Compilation |
---|---|
Arx | 40m |
FS | 32m |
Arx archive are a bit bigger (about 1%) than tar.zst archive but 15% smaller that squashfr. Creation and full extraction time are a bit longer for arx but times are comparable.
Listing files ar accessing individual files from the archive is far more rapid using arx or squash. Access time is almost constant indpendently of the size of the archive. For tar however, time to access individual file is greatly increasing when the archive size is increasing.
Mounting a arx archive make the archive usable without extracting it.
A simple diff -r
takes 4 more time than a plain diff between two directories but
it is a particular use case (access all files "sequentially" and only once).
But for linux documentation arx is 444 time quicker than tar (several hours). The bigger the tar archive is the bigger is this ratio. I haven't try to do a mount-diff for the full kernel.
For kernel compilation, the overhead is about 25%. But on the opposite side, you can compile the kernel without storing 1.3GB of source on your hard drive.