Crates.io | ciid |
lib.rs | ciid |
version | |
source | src |
created_at | 2020-02-20 04:36:43.085353 |
updated_at | 2024-12-12 12:56:52.983379 |
description | `ciid` is a utility to derive a chronologically sortable, unique identifier for images. |
homepage | https://github.com/pablosichert/ciid |
repository | https://github.com/pablosichert/ciid |
max_upload_size | |
id | 210820 |
size | 0 |
`ciid` is a utility to derive a chronologically sortable, unique identifier for images.
Usually, digital cameras and phones assign file names to images with a sequence of only 4 digits (e.g. `IMG_1234.dng`). Those names will easily clash for any sufficiently large number of images.
`ciid` tackles this problem by deriving a hash from the image buffer. Besides yielding an identifier that is very unlikely to clash, this hash can later be used to check the integrity of the image content.
Some image processing programs update metadata of files (e.g. inline JPEG previews, tags, modified date). The resulting `ciid` will be unaffected by those changes, since only the actual image buffer is hashed. This has the nice side effect that proprietary camera RAW file formats and converted `.dng` files will yield the same identifier most of the time.
Here's what a resulting identifier looks like:
01234567890123-a0b1c2d3e4f5a6b7c8d9e0f1a2b3c4d5e6f7a8b9c0d1e2f3a4b5c6d7e8f9a0b1
└─────┬──────┘ └──────────────────────────────┬───────────────────────────────┘
  timestamp                         hash of image buffer
The first part of the identifier encodes the creation date of the image (a Unix timestamp with millisecond precision), while the second part is a hash (SHA-256) based on the contents of the image buffer.
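The two-part layout can be sketched in a few lines of Rust. This is only an illustration of the format described above, not ciid's actual code: `format_identifier` and its parameters are hypothetical names, the hash is passed in as a ready-made hex string rather than computed, and the width of 14 digits is taken from the example identifier.

```rust
/// Joins a zero-padded millisecond timestamp and a hex-encoded hash with `-`.
/// Hypothetical helper for illustration; not part of ciid's public interface.
fn format_identifier(timestamp_ms: u64, hash_hex: &str, min_digits: usize) -> String {
    format!("{:0width$}-{}", timestamp_ms, hash_hex, width = min_digits)
}

fn main() {
    // Placeholder hash copied from the diagram above, not a real image digest.
    let hash = "a0b1c2d3e4f5a6b7c8d9e0f1a2b3c4d5e6f7a8b9c0d1e2f3a4b5c6d7e8f9a0b1";
    // 1_234_567_890_123 ms has 13 digits, so one zero is added on the left.
    let identifier = format_identifier(1_234_567_890_123, hash, 14);
    println!("{}", identifier);
}
```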
The following criteria were considered when choosing the identifier:
Download and run the installation script:
$ curl -s https://raw.githubusercontent.com/pablosichert/ciid/master/bin/install.sh | bash
For help with installing the dependencies, have a look at the install script.
Install the `ciid` binary onto your system via `cargo`:
$ cargo install ciid
$ ciid [FLAGS] <file path>...
| Short | Long | Description |
|---|---|---|
| `-h` | `--help` | Prints help information |
| | `--no-hash` | If provided, the raw image will not be hashed, and no hash will be appended to the file name |
| | `--rename-file` | Renames the file to the derived identifier. Preserves the file extension |
| `-V` | `--version` | Prints version information |
| | `--verify-name` | Verifies if the provided file name is equal to the derived identifier |
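The behavior described for `--rename-file` and `--verify-name` can be approximated with the standard library's path handling. The following is a rough sketch under assumptions: `renamed_path` and `name_matches` are hypothetical helpers, the identifier is a shortened placeholder, and ciid's own implementation may differ.

```rust
use std::path::{Path, PathBuf};

/// Builds the target path for a rename: same directory, identifier as the
/// file stem, original extension preserved. An extension that is not valid
/// UTF-8 is dropped in this simplified sketch.
fn renamed_path(path: &Path, identifier: &str) -> PathBuf {
    match path.extension().and_then(|ext| ext.to_str()) {
        Some(ext) => path.with_file_name(format!("{}.{}", identifier, ext)),
        None => path.with_file_name(identifier),
    }
}

/// Returns true if the file stem (name without extension) equals the identifier.
fn name_matches(path: &Path, identifier: &str) -> bool {
    path.file_stem().and_then(|stem| stem.to_str()) == Some(identifier)
}

fn main() {
    let identifier = "01234567890123-a0b1"; // shortened placeholder identifier
    let target = renamed_path(Path::new("photos/IMG_1234.dng"), identifier);
    println!("{}", target.display());
    println!("{}", name_matches(&target, identifier));
}
```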
| Short | Long | Description |
|---|---|---|
| | `--print <template>` | Prints provided template to stdout, substituting variables with file information. Available variables: `${file_path}`, `${identifier}`, `${date_time}`, `${timestamp}` |
| | `--timestamp-digits <timestamp digits>` | Minimum number of digits the timestamp should carry. Will be padded with zeros from the left |
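To illustrate the kind of substitution `--print` performs, here is a naive sketch that replaces `${name}` placeholders by plain string replacement. The function `render` and the sample values are assumptions made for this example; how ciid itself expands templates is not shown in this document.

```rust
/// Replaces each `${name}` placeholder in `template` with its value.
/// Naive sequential replacement: a value that itself contains a placeholder
/// could be substituted again by a later pair.
fn render(template: &str, variables: &[(&str, &str)]) -> String {
    let mut output = template.to_string();
    for (name, value) in variables {
        // `${{{}}}` formats to `${name}`: doubled braces are literal braces.
        output = output.replace(&format!("${{{}}}", name), value);
    }
    output
}

fn main() {
    let line = render(
        "${file_path} -> ${identifier}",
        &[
            ("file_path", "IMG_1234.dng"),          // sample value
            ("identifier", "01234567890123-a0b1"),  // shortened placeholder
        ],
    );
    println!("{}", line);
}
```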
Name | Description |
---|---|
<file path>... | Path to image file |
Why do we encode the timestamp as `01234567890123` instead of e.g. `2009-02-13 23:31:30.123`?
The timestamp represents an unambiguous<sup>1</sup> single point in time, whereas the date string needs to be contextualized with a time zone. That means that you would either need to annotate the date string with a time zone or change the file name every time you are on a system that uses a different time zone.
Apart from that, the former encoding is more compact.
While unfortunately it's not easy to derive the actual date from the timestamp just by looking at it, you can compare two timestamps chronologically by sorting them by value.
<sup>1</sup> Ignoring leap seconds.
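As a small illustration of comparing by value, the sketch below parses the timestamp prefix out of two identifiers and compares them as integers. The helper `timestamp_ms` and the shortened identifiers are hypothetical. The example timestamp splits into 1234567890 seconds and 123 milliseconds since the Unix epoch, which matches the date string from the question when it is read as UTC.

```rust
/// Parses the timestamp prefix of an identifier shaped like `<timestamp>-<hash>`.
/// Also works when no hash is appended, since the whole string is then the prefix.
fn timestamp_ms(identifier: &str) -> Option<u64> {
    identifier.split('-').next()?.parse().ok()
}

fn main() {
    let earlier = timestamp_ms("01234567890123-a0b1").unwrap();
    let later = timestamp_ms("01234567890124-c2d3").unwrap();
    // Chronological comparison is a plain integer comparison.
    println!("{}", earlier < later);
    // Whole seconds and remaining milliseconds since the Unix epoch.
    println!("{} s + {} ms", earlier / 1000, earlier % 1000);
}
```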
The changelog format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
`01`, `0a`, `10`, `a0` should preserve the order, but get presented in order `0a`, `01`, `10`, `a0`.
`--timestamp-digits` can be used to control how many digits the timestamp should carry at a minimum. Missing zeros will be padded from the left.
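Left-padding matters once timestamps are compared as strings, for example when a file browser sorts by name. The following sketch, with a hypothetical `pad_timestamp` helper and made-up timestamps, contrasts the string sort order of unpadded and padded values. A minimum of 14 digits is assumed here only because the example identifier uses that width.

```rust
/// Left-pads a timestamp with zeros to at least `min_digits` digits.
/// Longer timestamps are left unchanged.
fn pad_timestamp(timestamp_ms: u64, min_digits: usize) -> String {
    format!("{:0width$}", timestamp_ms, width = min_digits)
}

fn main() {
    // Timestamps with 13, 12 and 14 digits, deliberately not in order.
    let timestamps = [1_234_567_890_123u64, 999_999_999_999, 10_000_000_000_000];

    let mut unpadded: Vec<String> = timestamps.iter().map(|t| t.to_string()).collect();
    let mut padded: Vec<String> = timestamps.iter().map(|t| pad_timestamp(*t, 14)).collect();

    // String sorting compares character by character, so values of different
    // lengths only sort chronologically once they are padded to equal width.
    unpadded.sort();
    padded.sort();

    println!("{:?}", unpadded);
    println!("{:?}", padded);
}
```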