Crates.io | xorsum |
lib.rs | xorsum |
version | 4.0.0 |
source | src |
created_at | 2022-07-10 05:39:03.855371 |
updated_at | 2022-10-26 05:09:43.572476 |
description | Get XOR hash/digest with this command-line tool |
homepage | |
repository | https://github.com/Rudxain/xorsum |
max_upload_size | |
id | 622962 |
size | 28,291 |
It uses the XOR cipher to compute a checksum digest. Basically, it splits the data into chunks whose length is the same as the digest size (padding with 0), and XORs all chunks together into a new chunk that's used as output.
This isn't a good hash function. It lacks the Avalanche Effect, because flipping 1 input bit flips 1 output bit.
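The chunk-and-XOR scheme described above can be sketched in a few lines of Rust. This is a minimal illustration, not the crate's actual source: `xor_digest` and `to_hex` are hypothetical names, and the whole input is assumed to fit in memory as a byte slice.

```rust
/// XOR-fold `data` into a digest of `len` bytes.
/// Hypothetical helper for illustration; not part of xorsum's API.
fn xor_digest(data: &[u8], len: usize) -> Vec<u8> {
    // The accumulator starts as all zeros.
    let mut digest = vec![0u8; len];
    if len == 0 {
        return digest; // `chunks(0)` would panic
    }
    for chunk in data.chunks(len) {
        // `zip` stops at the shorter side, so a short final chunk leaves the
        // remaining slots unchanged, which is the same as padding it with 0.
        for (slot, byte) in digest.iter_mut().zip(chunk) {
            *slot ^= *byte;
        }
    }
    digest
}

/// Render bytes as lowercase hexadecimal.
fn to_hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{:02x}", b)).collect()
}
```

With these definitions, `to_hex(&xor_digest(b"aaaa", 4))` gives `61616161`. Flipping a single input bit (for example, changing the last byte from 0x61 to 0x60) changes only the matching bit of the digest, giving `61616160`.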
The raw digest size is 64 bits (8 bytes) by default, but can be set to any valid `usize` value with the `--length` option. The actual size is bigger because the raw digest is expanded to hexadecimal by default. I chose 8 because CRC32 uses 4 and MD5 uses 16, and to make it easier for simpler implementations to replicate: 64 bits fit within a CPU register and can be emulated using 2 `u32`s.
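As a rough sketch of the "2 `u32`s" idea: XOR has no carries between bits, so a 64-bit chunk can be handled as two independent 32-bit halves. The names `Pair`, `split`, `join`, and `xor_pair` below are hypothetical and only serve the illustration.

```rust
/// A 64-bit value stored as two 32-bit halves: (high, low).
/// Hypothetical type for illustration only.
type Pair = (u32, u32);

/// Split a 64-bit value into its high and low 32-bit halves.
fn split(x: u64) -> Pair {
    ((x >> 32) as u32, x as u32)
}

/// Reassemble the two halves into a 64-bit value.
fn join(p: Pair) -> u64 {
    ((p.0 as u64) << 32) | (p.1 as u64)
}

/// XOR each half independently; no information crosses between halves.
fn xor_pair(a: Pair, b: Pair) -> Pair {
    (a.0 ^ b.0, a.1 ^ b.1)
}
```

Joining the result of `xor_pair` yields the same value as XORing the original `u64`s directly.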
The initialization-vector is hardcoded to be 0.
Name and behavior are heavily influenced by `cksum`, `md5sum`, and `b3sum`.
To install the latest release from the crates.io registry:
cargo install xorsum
This isn't guaranteed to be the latest version, but it will never throw compilation errors.
To install the latest dev crate from GitHub:
cargo install --git https://github.com/Rudxain/xorsum.git
This is the most recent version. Compilation isn't guaranteed. Semver may be broken. And `--help` may not reflect actual program behavior.
To get already-compiled non-dev executables, go to GH releases. `*.elf`s will only be compatible with GNU-Linux x64. `*.exe`s will only be compatible with Windows x64. These aren't setup/installer programs; they are the same executables `cargo` would install, so you should run them from a terminal CLI, not click them.
For a Llamalab Automate implementation, visit XOR hasher.
Argument "syntax":
xorsum [OPTIONS] [FILE]...
For ℹinfo about options, run:
xorsum --help
# let's create an empty file named "a"
echo -n > a
xorsum --length 4 a
# output will be "00000000 a" (without quotes)
# write "aaaa" to this file and rehash it
echo -n aaaa > a
xorsum a -l 4
#out: "61616161 a"
# because "61" is the hex value of the UTF-8 char "a"
# same result when using stdin
echo -n aaaa | xorsum -l4
#61616161 -
xorsum a --brief #`-l 8` is implicit
#6161616100000000
Note: `echo -n` behaves differently depending on OS and binary version; it might include line endings like `\n` (LF) or `\r\n` (CR-LF). The outputs shown in the example are the (usually desired) result of NOT including an EOL. PowerShell will ignore `-n` because `echo` is an alias of `Write-Output` and therefore can't recognize `-n`. `Write-Host -NoNewline` can't be piped or redirected, so it's not a good alternative.
`--length` doesn't truncate the output:
xorsum some_big_file -bl 3 #"00ff55"
xorsum some_big_file -bl 2 #"69aa" NOT "00ff"
As you can see, `-l` can return very different hashes from the same input. This property can be exploited to emulate the Avalanche Effect (to some extent).
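To see why a shorter length isn't a prefix of a longer one, here is a small sketch that folds each byte into the slot given by its index modulo the length. `xor_digest` and `to_hex` are hypothetical helpers (not the crate's code), and the sample input `abcdef` is made up for illustration.

```rust
/// XOR every byte into slot `index % len` of a zeroed digest.
/// Hypothetical helper; assumes the input fits in memory.
fn xor_digest(data: &[u8], len: usize) -> Vec<u8> {
    let mut digest = vec![0u8; len];
    if len == 0 {
        return digest; // avoid `i % 0` below
    }
    for (i, byte) in data.iter().enumerate() {
        // Changing `len` changes which bytes share a slot,
        // so the whole digest is regrouped, not shortened.
        digest[i % len] ^= *byte;
    }
    digest
}

/// Render bytes as lowercase hexadecimal.
fn to_hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{:02x}", b)).collect()
}
```

For `abcdef`, length 3 folds `abc` with `def` and gives `050705`, while length 2 folds `ab`, `cd`, and `ef` and gives `6760`, which is not the first two bytes of the length-3 result.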
If you have 2 copies of a file and 1 is corrupted, you can attempt to "🔺️triangulate" the index of a corrupted byte without manually searching the entire file. This is useful when dealing with big raw-binary files:
xorsum a b
#6c741b7863326b2c a
#6c74187863326b2c b
# the 0-based index is 2 when using `-l 8`
# mathematically, i mod 8 = 2
xorsum a b -l 3
#3d5a0a a
#3d590a b
# i mod 3 = 1
xorsum a b -l 2
#7f12 a
#7c12 b
# i mod 2 = 0
# you can repeat this process with different `-l` values, to solve it easier.
# IIRC, using primes gives you more info about the index
There are programs (like `diff`) that compare bytes for you, and are much more efficient and user-friendly. But if you are into math puzzles, this is a good way to pass the time by solving systems of linear modular equations 🤓.
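If you'd rather let a program solve the puzzle, the following is a minimal brute-force sketch. It assumes exactly one corrupted byte and a known file length; `differing_slot` and `candidate_indices` are hypothetical helpers, not part of xorsum.

```rust
/// Index of the first byte where two hex digests differ, if any.
/// Assumes both strings are equal-length hex with 2 characters per byte.
fn differing_slot(a_hex: &str, b_hex: &str) -> Option<usize> {
    a_hex
        .as_bytes()
        .chunks(2)
        .zip(b_hex.as_bytes().chunks(2))
        .position(|(x, y)| x != y)
}

/// All indices below `file_len` that satisfy `i % modulus == residue`
/// for every `(modulus, residue)` pair. Brute force, for clarity.
fn candidate_indices(constraints: &[(usize, usize)], file_len: usize) -> Vec<usize> {
    (0..file_len)
        .filter(|i| constraints.iter().all(|&(m, r)| m != 0 && i % m == r))
        .collect()
}
```

For the digests shown above, the constraints are (8, 2), (3, 1), and (2, 0). For a hypothetical 48-byte file, that leaves indices 10 and 34. The `-l 2` result adds nothing beyond `-l 8` here, since 2 divides 8. By the Chinese remainder theorem, pairwise coprime lengths (such as distinct primes) pin the index down modulo their product, which is why they are more informative.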
I was surprised that I couldn't find any implementation of a checksum algorithm completely based on the `XOR` op. So I posted this for the sake of completeness, and because I'm learning Rust. I also made this for people with low-power devices.
`sbox` will (probably) have enough bytes to "mix well".
`0.x.y` to reflect the incompleteness of the code. I'm sorry for the inconvenience and potential confusion.