Crates.io | sinteflake |
lib.rs | sinteflake |
version | 0.1.0 |
source | src |
created_at | 2024-08-06 08:27:02.834695 |
updated_at | 2024-08-06 08:27:02.834695 |
description | A 64 bits ID generator inspired by Snowflake, but generating very distinct numbers |
repository | https://github.com/SINTEF/sinteflake |
id | 1327026 |
size | 54,714 |
SINTEFlake is a 64-bit ID generator, inspired by Twitter's Snowflake and Sony's Sonyflake.
It generates identifiers that start with a hash or a pseudo-random number instead of a timestamp. As a result, identifiers are not roughly time-ordered, but consecutive identifiers have very different values.
A SINTEFlake ID is composed of:
The fields add up to 63 bits, so that identifiers remain positive when stored as signed 64-bit integers.
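As a small illustration of why 63 bits keep signed values positive, here is a minimal sketch. The `MASK_63` constant and the `to_signed` helper are hypothetical names chosen for this example and are not part of the sinteflake API.

```rust
/// Mask that keeps the lower 63 bits and clears the top bit,
/// which is the sign bit once the value is viewed as an i64.
/// Hypothetical constant for illustration; not part of the sinteflake API.
const MASK_63: u64 = (1u64 << 63) - 1;

/// Drop the top bit of a raw 64-bit value and reinterpret it as i64.
/// The result is never negative because the sign bit is always zero.
fn to_signed(raw: u64) -> i64 {
    (raw & MASK_63) as i64
}
```

Without the mask, a value such as `u64::MAX` would become negative when cast to `i64`. With the mask, the largest possible result is `i64::MAX`.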
Add this to your Cargo.toml:
[dependencies]
sinteflake = "0.1"
use sinteflake::{next_id, next_id_with_hash, set_instance_id, update_time};
set_instance_id(42)?;
let id_a = next_id()?;
let id_b = next_id()?;
let id_c = next_id_with_hash(&[1, 2, 3])?;
let id_d = next_id_with_hash(&[1, 2, 3])?;
update_time()?;
[dependencies]
sinteflake = { version = "0.1", features = ["async"] }
tokio = { version = "1", features = ["full"] }
use sinteflake::{next_id_async, next_id_with_hash_async, set_instance_id_async, update_time_async};
set_instance_id_async(42).await?;
let id = next_id_async().await?;
let id = next_id_with_hash_async(&[1, 2, 3]).await?;
update_time_async().await?;
Please note that the async feature is not enabled by default, and that set_instance_id_async does not set the instance ID of the non-async version.
You can create a custom SINTEFlake instance with your own settings:
use sinteflake::sinteflake::SINTEFlake;
use time::OffsetDateTime;
let mut instance = SINTEFlake::custom(
42, // instance_id
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16], // hash key
123, // counter hash key
OffsetDateTime::from_unix_timestamp(1719792000)?, // epoch
)?;
let id_a = instance.next_id()?;
// ...
Unlike Snowflake (and Sonyflake), SINTEFlake does not intend to be ordered roughly in time. A sequence of IDs generated by SINTEFlake will have very different values. This can be useful for working with zone maps in vertical databases, for example.
The timestamp precision is only 8 seconds. Moreover, the timestamp bits are permuted, so the numeric value of an identifier does not grow steadily with time, and identifiers cannot be used for ordering. The timestamp will overflow after about 544 years, which should be long enough.
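The 544-year figure can be checked with a little arithmetic. The sketch below assumes a 31-bit timestamp field; that width is an inference from the 8-second precision and the 544-year lifetime, not something stated here. All names are hypothetical and not part of the sinteflake API.

```rust
/// Seconds per timestamp tick, as stated in the text.
const TICK_SECONDS: u64 = 8;
/// Assumed width of the timestamp field in bits.
/// This is an inference for the sketch, not confirmed by the source.
const ASSUMED_TIMESTAMP_BITS: u32 = 31;
/// Seconds in an average Julian year (365.25 days).
const SECONDS_PER_YEAR: u64 = 31_557_600;

/// Whole years before a counter of `bits` bits, advancing once every
/// `tick_seconds` seconds, wraps around.
fn years_until_overflow(bits: u32, tick_seconds: u64) -> u64 {
    let ticks = 1u64 << bits;
    ticks * tick_seconds / SECONDS_PER_YEAR
}

/// Tick index for a number of seconds elapsed since the epoch.
/// All instants within the same 8-second window share one tick.
fn tick_of(seconds_since_epoch: u64) -> u64 {
    seconds_since_epoch / TICK_SECONDS
}
```

Under these assumptions, `years_until_overflow(ASSUMED_TIMESTAMP_BITS, TICK_SECONDS)` evaluates to 544, and two instants 8 and 15 seconds after the epoch fall into the same tick.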
This design choice involves slightly higher memory usage and complexity compared to Snowflake, as more numbers need to be tracked for collisions. Not being roughly time-ordered is also a disadvantage in many cases.
You can't be cryptographically secure with only 64 bits. SINTEFlake identifiers are not safe on their own because they are not long enough and can easily be brute-forced.
For reference, the hashing algorithm is SipHash-2-4. The timestamp permutation table uses digits of π and e, which should be nothing-up-my-sleeve enough.
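To show what a keyed SipHash-2-4 computation looks like, here is a minimal sketch using the deprecated `std::hash::SipHasher` from the Rust standard library, which implements SipHash 2-4. It is not the crate's implementation: the `keyed_hash` helper is hypothetical, and splitting the 16-byte key into two little-endian halves is an assumption made for this example.

```rust
#[allow(deprecated)]
use std::hash::{Hasher, SipHasher};

/// Hash `data` with SipHash 2-4 under a 16-byte key.
/// Splitting the key into two little-endian u64 halves is an assumption
/// made for this sketch, not something the crate documents.
#[allow(deprecated)]
fn keyed_hash(key: &[u8; 16], data: &[u8]) -> u64 {
    let mut k0 = [0u8; 8];
    let mut k1 = [0u8; 8];
    k0.copy_from_slice(&key[..8]);
    k1.copy_from_slice(&key[8..]);
    let mut hasher =
        SipHasher::new_with_keys(u64::from_le_bytes(k0), u64::from_le_bytes(k1));
    // Feed the raw bytes directly, without a length prefix.
    hasher.write(data);
    hasher.finish()
}
```

The same key and input always produce the same 64-bit value, while changing either the key or the input produces an unrelated value.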
UUIDs are great but somewhat big. Sometimes, you prefer to work with 64 bits instead of 128 bits. This can be useful for making small performance improvements or for working with systems that do not natively support 128-bit numbers. 64-bit numbers are often compared and processed much faster than strings or byte arrays.
However, UUIDs are almost always a better choice and should be preferred.
cargo test
cargo llvm-cov # Coverage report
cargo bench # Benchmark
cargo bench --bench=bench -- --quick # Quick benchmark
This project is licensed under the Apache License, Version 2.0. See the LICENSE file for details.
Contributions are welcome! Please feel free to submit a Pull Request.