compress-json-rs

Crates.iocompress-json-rs
lib.rscompress-json-rs
version0.3.2
created_at2025-04-21 16:49:49.273604+00
updated_at2026-01-02 03:42:15.415531+00
descriptionStore JSON data in a space-efficient compressed form, with round-trip compression/decompression.
homepagehttps://github.com/web-mech/compress-json-rs#readme
repositoryhttps://github.com/web-mech/compress-json-rs
max_upload_size
id1642953
size157,819
Kenny Hash (web-mech)

documentation

https://docs.rs/compress-json-rs

README

compress-json-rs

crates.io docs.rs License

AI-driven Rust port of the JavaScript compress-json library by Beenotung. Store JSON data in a space-efficient compressed form with lossless round-trip compression and decompression.

Table of Contents

Features

  • Full JSON Support: Objects, arrays, strings, numbers, booleans, and null
  • Special Values Support: Infinity, -Infinity, and NaN with dedicated encodings (v3.2.0+)
  • Value Deduplication: Repeated values stored once with reference keys
  • Schema Deduplication: Objects with identical keys share schemas
  • Compact Encoding: Numbers encoded in base-62 format
  • Type Safety: Zero-copy round-trip using serde_json::Value
  • UTF-8 Safe: Full Unicode support for strings
  • No Dependencies on Disk/Network: Fast in-memory compression
  • Cross-Platform Compatible: Data compatible with JavaScript and Python implementations

Installation

Add to your Cargo.toml:

[dependencies]
compress-json-rs = "0.1.0"
serde_json = "1.0"

Quick Start

use compress_json_rs::{compress, decompress};
use serde_json::json;

fn main() {
    // Original JSON data
    let data = json!({
        "user": "Alice",
        "active": true,
        "roles": ["admin", "user"]
    });

    // Compress
    let compressed = compress(&data);
    
    // Decompress
    let restored = decompress(compressed);
    
    assert_eq!(data, restored);
}

Usage Examples

Basic Object Compression

use compress_json_rs::{compress, decompress};
use serde_json::json;

let user = json!({
    "id": 12345,
    "name": "Alice",
    "email": "alice@example.com",
    "settings": {
        "theme": "dark",
        "notifications": true
    }
});

let (values, root) = compress(&user);

// The compressed form is a tuple of:
// - values: Vec<String> - deduplicated value store
// - root: String - key pointing to the root value

println!("Values: {:?}", values);
println!("Root key: {}", root);

// Restore original
let restored = decompress((values, root));
assert_eq!(user, restored);

Array with Repeated Objects

use compress_json_rs::{compress, decompress};
use serde_json::json;

// Arrays of objects with similar schemas benefit most from compression
let data = json!([
    { "type": "fruit", "name": "apple", "color": "red" },
    { "type": "fruit", "name": "banana", "color": "yellow" },
    { "type": "fruit", "name": "cherry", "color": "red" },
]);

let compressed = compress(&data);
let restored = decompress(compressed);

assert_eq!(data, restored);

Serialization for Storage/Transmission

use compress_json_rs::{compress, decompress, Compressed};
use serde_json::json;

let data = json!({
    "items": [
        { "id": 1, "status": "active" },
        { "id": 2, "status": "active" },
        { "id": 3, "status": "pending" }
    ]
});

// Compress
let compressed = compress(&data);

// Serialize to JSON string for storage
let json_str = serde_json::to_string(&compressed).unwrap();
println!("Compressed JSON: {}", json_str);

// Later: deserialize and decompress
let loaded: Compressed = serde_json::from_str(&json_str).unwrap();
let restored = decompress(loaded);

assert_eq!(data, restored);

Working with Files

use compress_json_rs::{compress, decompress, Compressed};
use serde_json::json;
use std::fs;

fn save_compressed(data: &serde_json::Value, path: &str) -> std::io::Result<()> {
    let compressed = compress(data);
    let json = serde_json::to_string(&compressed)?;
    fs::write(path, json)
}

fn load_compressed(path: &str) -> std::io::Result<serde_json::Value> {
    let json = fs::read_to_string(path)?;
    let compressed: Compressed = serde_json::from_str(&json)?;
    Ok(decompress(compressed))
}

// Usage
let data = json!({"key": "value"});
save_compressed(&data, "data.compressed.json").unwrap();
let restored = load_compressed("data.compressed.json").unwrap();

Using Helper Functions

use compress_json_rs::{trim_undefined, trim_undefined_recursively};
use serde_json::{json, Map, Value};

// Remove null values from objects before compression
let mut data: Map<String, Value> = serde_json::from_value(json!({
    "name": "Alice",
    "middleName": null,  // Will be removed
    "age": 30
})).unwrap();

trim_undefined(&mut data);
// data now only contains "name" and "age"

// Recursively remove nulls from nested objects
let mut nested: Map<String, Value> = serde_json::from_value(json!({
    "user": {
        "name": "Bob",
        "extra": null  // Will be removed
    },
    "meta": null  // Will be removed
})).unwrap();

trim_undefined_recursively(&mut nested);

API Reference

Core Functions

/// Compressed representation: (values array, root key)
pub type Compressed = (Vec<String>, String);

/// Key type for value references
pub type Key = String;

/// Compress a JSON value into its compressed form
pub fn compress(o: &serde_json::Value) -> Compressed;

/// Decompress a compressed form back into JSON
pub fn decompress(c: Compressed) -> serde_json::Value;

/// Decode a single key from the values array
pub fn decode(values: &Vec<String>, key: &str) -> serde_json::Value;

Lower-Level API

/// Memory structure for compression state
pub struct Memory { /* internal */ }

/// Create a new memory instance for compression
pub fn make_memory() -> Memory;

/// Add a value to memory, returns its reference key
pub fn add_value(mem: &mut Memory, o: &serde_json::Value) -> Key;

/// Convert memory to the values array
pub fn mem_to_values(mem: &Memory) -> Vec<String>;

Helper Functions

/// Remove keys with null values from an object (shallow)
pub fn trim_undefined(object: &mut Map<String, Value>);

/// Recursively remove keys with null values from nested objects
pub fn trim_undefined_recursively(object: &mut Map<String, Value>);

Configuration

/// Global configuration for compression behavior
pub struct Config {
    /// Whether to sort object keys (default: false)
    pub sort_key: bool,
    /// Whether to error on NaN values (default: false)
    pub error_on_nan: bool,
    /// Whether to error on infinite values (default: false)
    pub error_on_infinite: bool,
}

pub const CONFIG: Config;

Special Values

Updated in v3.4.0 with preserve options

The library supports special floating-point values that are not part of the JSON specification.

Default Behavior (v3.4.0+)

By default, preserve_nan and preserve_infinite are false, so special values become null (like JSON.stringify):

Value Default (preserve_* = false) With preserve_* = true
Infinity null N|+
-Infinity null N|-
NaN null N|0

Config Options

use compress_json_rs::CONFIG;

// Default configuration (v3.4.0+)
// preserve_nan: false       - NaN becomes null
// preserve_infinite: false  - Infinity becomes null  
// error_on_nan: false       - Don't panic on NaN
// error_on_infinite: false  - Don't panic on Infinity

Decoding Special Values

When receiving data from JavaScript/Python implementations that have preserve_* enabled:

use compress_json_rs::{decode_special, is_special_value};

// Check if a value is special
assert!(is_special_value("N|+"));
assert!(is_special_value("N|-"));
assert!(is_special_value("N|0"));

// Decode special values back to f64
let inf = decode_special("N|+");
assert!(inf.is_infinite() && inf.is_sign_positive());

let neg_inf = decode_special("N|-");
assert!(neg_inf.is_infinite() && neg_inf.is_sign_negative());

let nan = decode_special("N|0");
assert!(nan.is_nan());

JSON Compatibility Note

Since JSON doesn't natively support Infinity, -Infinity, or NaN, these values become null when decompressed to serde_json::Value. However, when preserve_* options are enabled, the compressed format preserves the original special values, enabling:

  • Cross-platform data exchange with JavaScript and Python implementations
  • Lossless storage of special values in the compressed form
  • Re-encoding with preserved semantics

String Escaping

Strings that look like special value encodings are automatically escaped:

use compress_json_rs::{compress, decompress};
use serde_json::json;

// String "N|+" is preserved as a string, not treated as Infinity
let data = json!({ "value": "N|+" });
let compressed = compress(&data);
let restored = decompress(compressed);

assert_eq!(restored["value"], "N|+");

How It Works

The compression algorithm works by deduplicating values and encoding references using base-62 keys.

Compression Flow

flowchart TD
    subgraph Input
        A[JSON Value]
    end
    
    subgraph Compression Process
        B[Create Memory Store]
        C{Value Type?}
        D[Encode Boolean]
        E[Encode Number]
        F[Encode String]
        G[Process Array]
        H[Process Object]
        I[Check Value Cache]
        J{Cached?}
        K[Return Existing Key]
        L[Generate New Key]
        M[Store Value]
    end
    
    subgraph Output
        N[Compressed Tuple]
        O["(Vec&lt;String&gt;, Key)"]
    end
    
    A --> B
    B --> C
    C -->|bool| D
    C -->|number| E
    C -->|string| F
    C -->|array| G
    C -->|object| H
    D & E & F --> I
    G --> |each element| C
    H --> |schema + values| C
    I --> J
    J -->|yes| K
    J -->|no| L
    L --> M
    M --> K
    K --> N
    N --> O

Decompression Flow

flowchart TD
    subgraph Input
        A["Compressed (values, root)"]
    end
    
    subgraph Decompression Process
        B[Parse Root Key]
        C[Lookup Value in Store]
        D{Value Prefix?}
        E[Decode Boolean]
        F[Decode Number]
        G[Decode String]
        H[Decode Array]
        I[Decode Object]
        J[Recursive Decode]
    end
    
    subgraph Output
        K[JSON Value]
    end
    
    A --> B
    B --> C
    C --> D
    D -->|"b|"| E
    D -->|"n|"| F
    D -->|"s|" or none| G
    D -->|"a|"| H
    D -->|"o|"| I
    H --> J
    I --> J
    J --> C
    E & F & G --> K
    H & I --> K

Architecture

Module Structure

graph TB
    subgraph Public API
        LIB[lib.rs]
    end
    
    subgraph Core Modules
        CORE[core.rs<br/>compress/decompress]
        MEM[memory.rs<br/>value storage]
        ENC[encode.rs<br/>type encoding]
    end
    
    subgraph Support Modules
        NUM[number.rs<br/>base-62 conversion]
        BOOL[boolean.rs<br/>bool encoding]
        HELP[helpers.rs<br/>utility functions]
        CFG[config.rs<br/>configuration]
        DBG[debug.rs<br/>error handling]
    end
    
    LIB --> CORE
    LIB --> MEM
    LIB --> HELP
    LIB --> CFG
    
    CORE --> MEM
    CORE --> ENC
    
    MEM --> ENC
    MEM --> NUM
    MEM --> CFG
    MEM --> DBG
    
    ENC --> NUM
    ENC --> BOOL

Memory Structure

classDiagram
    class Memory {
        -Vec~String~ store
        -HashMap~String, String~ value_cache
        -HashMap~String, String~ schema_cache
        -usize key_count
    }
    
    class Compressed {
        +Vec~String~ values
        +String root_key
    }
    
    Memory --> Compressed : produces
    
    note for Memory "Stores encoded values with<br/>deduplication via caches"
    note for Compressed "Final output format:<br/>(values, root)"

Compression Format

Value Encoding Prefixes

Prefix Type Example Encoded Original Value
b|T Boolean true b|T true
b|F Boolean false b|F false
n| Number n|42.5 42.5
N|+ Infinity N|+ Infinity
N|- -Infinity N|- -Infinity
N|0 NaN N|0 NaN
s| Escaped string s|n|123 "n|123"
a| Array a|0|1|2 Array with refs 0,1,2
o| Object o|0|1|2 Object with schema ref
(none) Plain string hello "hello"
"" or _ Null `` null

Key Encoding (Base-62)

Keys are encoded using base-62 for compact representation:

Characters: 0-9 A-Z a-z (62 total)

Examples:
  0 -> "0"
  9 -> "9"
  10 -> "A"
  35 -> "Z"
  36 -> "a"
  61 -> "z"
  62 -> "10"
  124 -> "20"

Example Compression

graph LR
    subgraph Original JSON
        A["{ 
  'name': 'Alice',
  'role': 'admin'
}"]
    end
    
    subgraph Compressed Values Array
        B["0: 'name,role'  (schema)
1: 'Alice'
2: 'admin'  
3: 'o|0|1|2'  (object)"]
    end
    
    subgraph Compressed Output
        C["(['name,role', 'Alice', 
  'admin', 'o|0|1|2'], '3')"]
    end
    
    A --> B
    B --> C

Schema Sharing Example

graph TD
    subgraph "Input: Array of Objects"
        A["[
  { id: 1, type: 'A' },
  { id: 2, type: 'B' },
  { id: 3, type: 'A' }
]"]
    end
    
    subgraph "Compressed Values"
        B["0: 'a|id,type'     // shared schema
1: 'n|1'
2: 'A'
3: 'o|0|1|2'       // obj 1
4: 'n|2'
5: 'B'
6: 'o|0|4|5'       // obj 2
7: 'n|3'
8: 'o|0|7|2'       // obj 3 (reuses 'A')
9: 'a|3|6|8'       // root array"]
    end
    
    subgraph Benefits
        C["✓ Schema 'id,type' stored once
✓ Value 'A' stored once
✓ Minimal storage for repetitive data"]
    end
    
    A --> B
    B --> C

Configuration

The library uses a compile-time configuration:

pub const CONFIG: Config = Config {
    sort_key: false,           // Don't sort object keys
    preserve_nan: false,       // Convert NaN to null (like JSON.stringify)
    error_on_nan: false,       // Don't panic on NaN (only when preserve_nan=false)
    preserve_infinite: false,  // Convert Infinity to null (like JSON.stringify)
    error_on_infinite: false,  // Don't panic on Infinity (only when preserve_infinite=false)
};

Behavior Notes (v3.4.0+)

  • NaN and Infinity handling depends on config options:

    Value preserve_* = true preserve_* = false, error_* = true Both false (default)
    NaN Encoded as N|0 Panic Becomes null
    Infinity Encoded as N|+ Panic Becomes null
    -Infinity Encoded as N|- Panic Becomes null
  • Key Order: Object keys maintain insertion order unless sort_key is enabled

  • Unicode: Full UTF-8 support for all string values

Helper Functions

trim_undefined

Removes keys with null values from an object (shallow operation):

use compress_json_rs::trim_undefined;
use serde_json::{json, Map, Value};

let mut obj: Map<String, Value> = serde_json::from_value(json!({
    "a": 1,
    "b": null,
    "c": 3
})).unwrap();

trim_undefined(&mut obj);
// obj = { "a": 1, "c": 3 }

trim_undefined_recursively

Removes null values from nested objects:

use compress_json_rs::trim_undefined_recursively;
use serde_json::{json, Map, Value};

let mut obj: Map<String, Value> = serde_json::from_value(json!({
    "user": {
        "name": "Alice",
        "middleName": null
    },
    "extra": null
})).unwrap();

trim_undefined_recursively(&mut obj);
// obj = { "user": { "name": "Alice" } }

Performance Considerations

Best Use Cases

graph LR
    subgraph "High Compression Ratio"
        A[Arrays of similar objects]
        B[Repeated string values]
        C[Nested objects with shared schemas]
    end
    
    subgraph "Lower Compression Ratio"
        D[Unique primitive values]
        E[Deeply nested unique data]
        F[Large binary-like strings]
    end
    
    A --> G[Excellent]
    B --> G
    C --> G
    
    D --> H[Moderate]
    E --> H
    F --> H

Memory Usage

  • Compression builds an in-memory store with hash maps for deduplication
  • For very large JSON documents, consider streaming or chunked processing
  • The compressed format itself is typically 30-70% smaller for repetitive data

Compression Ratio Examples

Data Type Typical Ratio
API response arrays 40-60% of original
Configuration files 50-70% of original
Unique data 90-100% of original
Highly repetitive 20-40% of original

Testing

Run the test suite:

cargo test

The library includes comprehensive tests covering:

  • Number encoding edge cases
  • Special values (Infinity, -Infinity, NaN)
  • Unicode string handling
  • Empty objects and arrays
  • Null value handling
  • Deeply nested structures
  • Schema deduplication

License

Licensed under the BSD-2-Clause license. See LICENSE for details.


Compatibility

This library is compatible with compress-json v3.4.0+:

Feature JavaScript Python Rust
Basic types
preserve_nan ✅ v3.4.0+ ✅ v3.4.0+
preserve_infinite ✅ v3.4.0+ ✅ v3.4.0+
error_on_nan
error_on_infinite
Schema dedup
Value dedup

Data compressed with any implementation can be decompressed by any other.

Related Projects

Commit count: 12

cargo fmt