subslay

version: 0.1.9
created_at: 2025-07-12 17:15:13.019626+00
updated_at: 2025-07-13 21:42:40.294566+00
description: SubSlay: Text → emoji 💅🏻 Powered by Rust.
repository: https://github.com/8ria/subslay
id: 1749518
size: 7,413,001
AndriaK (8ria)

README

SubSlay converts plain text into emoji sequences using semantic similarity with pre-embedded GloVe vectors and emoji keyword mappings, all bundled inside a pure Rust crate.

use subslay::EmojiStylist;

fn main() {
    let stylist = EmojiStylist::new().unwrap();
    let emojis = stylist.slay("Hello, world!");
    println!("{:?}", emojis); // e.g. ["👋", "🌍"]
}

🧾 What is SubSlay?

SubSlay is a Rust library that transforms ordinary text into emoji sequences that capture the semantic meaning of each word. It uses pre-trained GloVe word embeddings and maps them to emojis using embedded semantic vectors, all packed into the crate itself.

This enables fast, expressive emoji translation fully offline with zero API calls or external resources.

It’s ideal for:

  • ✨ Chat apps
  • 🤖 Discord bots
  • 📱 Social media tools
  • 💻 Terminal utilities

🚀 Features

  • 🦀 Pure Rust: no runtime dependencies
  • 📦 Embedded embeddings: GloVe + emoji vectors bundled as a single embeddings.bin file
  • 🧠 Cosine similarity with emoji semantics for contextual matching
  • 🆘 Fallback handling for unknown or out-of-vocab words
  • 🌐 WASM-compatible: works in browser/Edge without filesystem or network access
  • 💨 Blazing fast & offline: powered by bincode and quantized vector loading
  • 🪶 Tiny footprint: uses f16 (half-precision) vectors for compact size

📦 Installation

Add SubSlay to your Cargo.toml:

[dependencies]
subslay = "0.1.9"

🛠️ Usage

use subslay::EmojiStylist;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let stylist = EmojiStylist::new()?;
    let emojis = stylist.slay("happy pizza");
    println!("{:?}", emojis); // e.g. ["😊", "🍕"]
    Ok(())
}

Example:

use subslay::EmojiStylist;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let stylist = EmojiStylist::new()?;
    let sentence = "I love pizza and cats";
    let emojis = stylist.slay(sentence);
    println!("Emojis: {:?}", emojis);
    // Possible output: ["❤️", "🍕", "😺"]
    Ok(())
}

⚙️ How It Works (Under the Hood)

SubSlay embeds a pre-serialized binary file (embeddings.bin) at compile time using include_bytes!. The file contains:

  • 🔤 A HashMap<String, Vec<f16>> of words mapped to quantized 50D GloVe embeddings
  • 😎 A HashMap<String, Vec<f16>> of emojis mapped to semantic emoji vectors

When .slay(&str) is called:

  1. The input is cleaned so that only letters and spaces remain.

  2. Words are lowercased and split.

  3. Each word:

    • Tries to fetch a GloVe embedding
    • Computes cosine similarity with all emoji vectors
    • Selects the closest emoji
    • Falls back to "@" if no match is found
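Steps 1 and 2 can be sketched with the standard library alone. This is a minimal illustration, not SubSlay's actual code: the function name `tokenize` is hypothetical, and dropping every character that is not a letter or whitespace is an assumption about how the cleaning is done.

```rust
/// Hypothetical helper: keep only letters and whitespace, then lowercase
/// the result and split it into words.
fn tokenize(input: &str) -> Vec<String> {
    // Step 1: drop everything that is neither a letter nor whitespace.
    let cleaned: String = input
        .chars()
        .filter(|c| c.is_alphabetic() || c.is_whitespace())
        .collect();

    // Step 2: lowercase, then split on runs of whitespace.
    cleaned
        .to_lowercase()
        .split_whitespace()
        .map(String::from)
        .collect()
}

fn main() {
    let words = tokenize("Hello, world!");
    println!("{:?}", words); // ["hello", "world"]
}
```

Each resulting word would then go through the lookup described in step 3.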

Cosine similarity is calculated like this:

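use half::f16; // assumption: f16 comes from the `half` crate, which provides to_f32()

// Cosine similarity of two f16 vectors, accumulated in f32.
fn cosine_similarity(a: &[f16], b: &[f16]) -> f32 {
    let dot = a.iter().zip(b.iter()).map(|(x, y)| x.to_f32() * y.to_f32()).sum::<f32>();
    let norm_a = a.iter().map(|x| x.to_f32().powi(2)).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x.to_f32().powi(2)).sum::<f32>().sqrt();
    // A zero vector has no direction, so return the lowest possible score.
    if norm_a == 0.0 || norm_b == 0.0 {
        return -1.0;
    }
    dot / (norm_a * norm_b)
}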
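To illustrate how step 3 might use this function, the sketch below picks the closest emoji for a single word. It works under stated assumptions: it uses f32 and tiny 2-D toy vectors instead of the crate's f16 50-D data so that it needs no external crates, the names `nearest_emoji`, `words`, and `emojis` are hypothetical rather than SubSlay's internals, and the "@" fallback is applied when the word has no embedding.

```rust
use std::collections::HashMap;

// Same formula as above, written for f32 so the sketch is self-contained.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return -1.0;
    }
    dot / (norm_a * norm_b)
}

// Hypothetical helper: return the emoji whose vector is most similar to the
// word's vector, or the "@" fallback when the word is unknown.
fn nearest_emoji(
    word: &str,
    words: &HashMap<String, Vec<f32>>,
    emojis: &HashMap<String, Vec<f32>>,
) -> String {
    let Some(vector) = words.get(word) else {
        return "@".to_string();
    };
    emojis
        .iter()
        .map(|(emoji, v)| (emoji, cosine_similarity(vector, v)))
        .max_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(emoji, _)| emoji.clone())
        .unwrap_or_else(|| "@".to_string())
}

fn main() {
    // Toy data, invented for illustration only.
    let words = HashMap::from([("pizza".to_string(), vec![0.9_f32, 0.1])]);
    let emojis = HashMap::from([
        ("🍕".to_string(), vec![1.0_f32, 0.0]),
        ("😺".to_string(), vec![0.0_f32, 1.0]),
    ]);
    println!("{}", nearest_emoji("pizza", &words, &emojis)); // 🍕
    println!("{}", nearest_emoji("qwzx", &words, &emojis)); // @
}
```

Comparing against every emoji vector is a linear scan, which matches the "computes cosine similarity with all emoji vectors" description and stays cheap as long as the emoji set is small.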

🎯 API Overview

Method | Description
EmojiStylist::new() | Initializes the struct by deserializing the embedded binary file
stylist.slay(&str) | Translates each word in the input into a matching emoji; returns Vec<String>

🧪 Testing

SubSlay includes unit tests for:

  • ✅ Correct output for known words like "pizza"
  • ✅ Full sentence translation
  • ✅ Unknown/fallback behavior
  • ✅ Empty input

Run tests:

cargo test

📁 Package Contents

  • src/lib.rs: Main logic for translation and similarity
  • data/embeddings.bin: Serialized, quantized GloVe and emoji vectors
  • Cargo.toml: Dependency config
  • README.md: You're reading it

🧑‍💻 Developer Notes

  • Embeddings are stored using quantized half-precision floats (f16) to reduce file size
  • All data is serialized using bincode
  • Pre-trained GloVe 50d embeddings are used for English vocabulary
  • The emoji vectors are handcrafted or generated using centroid-like heuristics
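The size benefit of f16 is simple arithmetic: each component takes 2 bytes instead of the 4 bytes of an f32, so a 50-dimensional vector shrinks from 200 to 100 bytes. The sketch below works this out for a whole vocabulary. The 10,000-word figure is an illustrative assumption, not the actual size of SubSlay's vocabulary, the function name `raw_vector_bytes` is hypothetical, and the calculation ignores map keys and serialization overhead.

```rust
/// Raw storage needed for `vectors` embeddings of `dims` components each.
fn raw_vector_bytes(vectors: usize, dims: usize, bytes_per_component: usize) -> usize {
    vectors * dims * bytes_per_component
}

fn main() {
    let vocab = 10_000; // hypothetical vocabulary size, for illustration only
    let dims = 50; // GloVe 50d, as stated above
    let f32_bytes = raw_vector_bytes(vocab, dims, 4);
    let f16_bytes = raw_vector_bytes(vocab, dims, 2);
    println!("f32: {f32_bytes} bytes, f16: {f16_bytes} bytes");
    // prints "f32: 2000000 bytes, f16: 1000000 bytes"
}
```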

💡 Future Plans

  • Weighted phrase-to-emoji aggregation
  • Fuzzy matching for slang and typos
  • Custom user dictionaries or theme packs
  • Parallel or batch .slay() calls
  • Embedding compression + streaming load

📜 License

MIT © AndriaK | GitHub Repository | Live Demo
