# shabottom ## Abstract > adslhghglfskhfsdfdfgkhlgkdfdsdsg > > ~ some cutie on the internet This document specifies the shabottom algorithm, version 0.2. The shabottom algorithm is a hashing algorithm not suited for cryptographic purposes, but designed to produce output that looks like plausible keysmashes. ## Status of This Memo This document is not an Internet Standards Track specification nor an official IETF publication, RFC or otherwise. It is merely written in the style of an RFC because in my opinion that is a good specification format to use. ## Introduction The original shabottom algorithm [^shabottom] was written by axtlos xenia for personal use, among other things in computer keyboard firmware. The algorithm was designed to produce output that looks like plausible keysmashes, while retaining some useful properties as a hashing algorithm. _Keysmashes_ (also called key smashes or keymashes) are output produced by uncontrollable and semi-random mashing of a computer keyboard, physical or otherwise. Usually keysmashing happens on the home row of a QUERTY keyboard layout or similar, making the letters A, S, D, F, G, H, J, K, and L most common, and producing a degree of bias corresponding to the natural movement patterns of the human hand and the degree of randomness achievable by human movements. Keysmashes are not an accident, but rather an intentional component in online communication. [^Wikipedia] The original version of shabottom is considered version 0.1 in the context of this document. However, that version is not quite platform-independent, as it relies on platform-defined behavior (and sometimes undefined behavior) of the C programming language [^C]. This document instead describes version 0.2 of the shabottom algorithm, which is a platform- and language-independent and fully defined variant that produces similiarly-looking output. All the previously well-defined parts of the algorithm are kept unchanged. ### Terminology Used in This Document The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [^RFC2119] [^RFC8174] when, and only when, they appear in all capitals, as shown here. ## The shabottom algorithm The shabottom algorithm takes as input any number of 8-bit bytes. It produces a 32 character string as output. The character set of the output is undefined, as the mechanisms of the algorithm do not require the use of a specific encoding. However, ASCII [^RFC20] is recommended, as all characters fit within the ASCII character set. The general working principle of the algorithm is as follows: 1. Computation of a simple input checksum. 2. Seeding of a pseudo random number generator with this checksum. 3. Creation of the output string by use of the random number generator. ### Input Checksum The input checksum is computed by adding up all the bytes of the input into one 32-bit unsigned integer. On arithmetic overflow, the carry is simply discarded, yielding normal wrapping behavior. The encoding of text input depends on the file, network stream or other that is being checksummed, and is of no concern to shabottom itself. However, shabottom implementations MUST NOT change the encoding of text input before computing the checksum. ### Pseudo Random Number Generation The pseudo random number generation is performed by a simple linear congruential generator (LCG), which produces the next value `y` from the previous value `x` by `y = (ax + c) mod p`. Here `a`, `c`, and `p` are constant parameters, and the choice of these determines various statistical properties of the generator. For the purposes of shabottom, the generator need only be moderately good and produce natural numbers below 2^32. For this reason, we choose `p = 2^32`, one of the suitable `a`, and an arbitrary choice of odd `c`. These criteria and values are directly taken from L'Ecuyer's tables on good LCG parameters [^LCGTables]. - p = 2^32 = `4294967296` - a = `32310901` - c = `9` The initial value of the generator, the seed, is the input checksum. Any further calls to the `random()` function, as found in the algorithmic description, will advance the random number generator by using the previous output (or the seed) as `x` in the formula `(32310901 * x + 9) mod 4294967296`. In particular, this means that the output of the first call to `random()` is not the seed itself. ### Creation of Output String The creation of the output string is an algorithm best described in pseudocode. Before it runs, the PRNG described in the previous section is seeded with the input checksum, and calls to the `random()` function refer to this PRNG (and modify its internal state). The output string is produced by repeatedly appending characters to it, here represented with the `append()` function. The output characters are taken from the two tables START and END, which are of length 10 and 11 respectively: | Table | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | | ----- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | | START | s | s | s | d | d | d | f | f | f | f | | | END | g | g | g | h | h | h | j | k | l | l | l | ``` // the previous character, since no characters should be repeated variable previous of type character := 'a' // whether to use the start table (false) or the end table (true) variable use_start_or_end of type boolean := true // controls when to switch between the two tables variable switchup of type integer [-3, +3] := 0 // the string always starts with 'a' append(previous) while length of output < 32: // if the length suffices, two characters are appended variable new_char_count := if length of output + 2 < 32 then 2 else 1 if use_start_or_end = false: variable i of type integer [0, 2] := 0 while i < new_char_count: // retrieve a random character from the start table variable r of type integer [0, 9] := random() mod 10 // only use that character if it is different from the last character // (otherwise i is not incremented and we try again) if previous != START[r]: append(START[r]) previous = START[r]; i := i + 1 end if end while // increment switchup while using the start table switchup := if switchup < 0 then 0 else switchup + 1 else: variable i of type integer [0, 2] := 0 while i < new_char_count: // retrieve a random character from the end table variable r of type integer [0, 10] := random() mod 11 // only use that character if it is different from the last character if previous != END[r]: append(END[r]) previous := END[r] i := i + 1 end if end while // decrement switchup while using the end table switchup := if switchup > 0 then 0 else switchup - 1 end if // if no two consecutive characters were from the same table, switch to a random table if switchup < 2 and switchup > -2: use_start_or_end := random() mod 2 = 1 // otherwise, switch to the other table than the one that was just used to break up the chain else: use_start_or_end := switchup >= 0 end if end while ``` ### Binary Output As defined above, shabottom produces text output. If binary output is required, the 32-character output string can be Base-64 [^RFC4648] decoded into a 256-bit number. ## Examples The empty string hashes to `afdfsgjfdlkfdghfdfsklkgfsdfhlgkl`. The string '1234567890' hashes to `afskgfdfshlfsljsdkldfglglgksdglg`. The source code of the Wikipedia page on LGCs [^WikipediaLGC] (using line feeds only) hashes to `adfglfdkgdsghglhlsdfdhljksdsfdfl`. ## Security Considerations shabottom is not a cryptographically secure hashing algorithm. Its output is designed to be aesthetically pleasing, and the degree of randomness required is only so that distinct inputs produce distinct outputs as much as possible and they look "random enough". shabottom MUST NOT be used for cryptographical applications, such as verifying the integrity of a message or file, or deriving a symmetric key for encryption. There are established hashing algorithms known to be secure under a wide variety of attacks, such as the Keccak (SHA-3) [^Keccak] hash functions. ## References ### Normative References [^RFC2119]: Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, . [^RFC8174]: Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017, . [^shabottom]: xenia, axtlos, "the most cryptographically secure checksumming algorithm", July 2024, . [^RFC20]: Cerf, V., "ASCII format for network interchange", STD 80, RFC 20, DOI 10.17487/RFC0020, October 1969, . [^LCGTables]: L'Ecuyer, Pierre (January 1999). "Tables of Linear Congruential Generators of Different Sizes and Good Lattice Structure". Mathematics of Computation. 68 (225): 249–260. doi:10.1090/S0025-5718-99-00996-5, . [^WikipediaLGC]: Wikipedia, "Linear Congruential Generator", . [^RFC4648]: Josefsson, S., "The Base16, Base32, and Base64 Data Encodings", RFC 4648, DOI 10.17487/RFC4648, October 2006, . ### Informative References [^Wikipedia]: Wikipedia, "Keysmash", . [^Keccak]: G. Bertoni, J. Daemen, M. Peeters and G. Van Assche, The Keccak reference, SHA-3 competition (round 3), 2011. ## Author's Addresses what are you, a cop?