unity-asset

crates.io: unity-asset
lib.rs: unity-asset
version: 0.2.0
created_at: 2025-08-27 16:21:58.714004+00
updated_at: 2025-12-26 02:18:56.162309+00
description: A comprehensive Rust library for parsing Unity asset files (YAML and binary formats)
homepage: https://github.com/Latias94/unity-asset
repository: https://github.com/Latias94/unity-asset
id: 1812896
size: 198,006
Latias94 (Latias94)
documentation: https://docs.rs/unity-asset

README

Unity Asset Parser

A Rust implementation of Unity asset parsing, inspired by and learning from UnityPy. This project focuses on parsing Unity YAML and binary formats with Rust's memory safety and performance characteristics.

Project Status

Early Development: This is a learning project and reference implementation. It is not production-ready and has significant limitations compared to mature tools like UnityPy.

What This Project Is

  • Learning Exercise: Understanding Unity's file formats through Rust implementation
  • Parser Focus: Emphasis on parsing and data extraction rather than manipulation
  • Rust Exploration: Exploring how Rust's type system can help with binary parsing
  • Reference Implementation: Code that others can learn from and build upon

What This Project Is NOT

  • UnityPy Replacement: UnityPy remains the most mature Python solution
  • Asset Editor: This is a read-only parser, not an asset creation/editing tool

Architecture

The project uses a workspace structure to organize different parsing capabilities:

unity-asset/
├── unity-asset-core/      # Core data structures and traits
├── unity-asset-yaml/      # YAML file parsing (stable)
├── unity-asset-binary/    # Binary asset parsing (advanced, WIP)
├── unity-asset-decode/    # Optional decode/export helpers (Texture/Audio/Sprite/Mesh)
├── unity-asset-lib/       # Main library crate (published as `unity-asset`)
├── unity-asset-cli/       # CLI tools
│   ├── main.rs           # Synchronous CLI tool
│   └── main_async.rs     # Asynchronous CLI tool (--features async)
└── tests/                # Integration tests and sample files

Crates

  • unity-asset (library): main user-facing API. Start here for Environment (YAML + binary) and YamlDocument.
  • unity-asset-binary (parser): low-level binary parsers (AssetBundle/SerializedFile/WebFile) plus fast object helpers (ObjectHandle).
  • unity-asset-decode (decode/export): optional decode/export helpers behind feature flags (Texture/Audio/Sprite/Mesh).
  • unity-asset-yaml (YAML): YAML-specific parsing/serialization; also re-exported via unity-asset.
  • unity-asset-core (core): shared data structures, errors, and dynamic UnityValue types.
  • unity-asset-cli (CLI): command-line tools (not required for library integration).

Examples are maintained per-crate and are built in CI. For instance:

cargo run -p unity-asset --example env_load_and_list -- tests/samples
cargo run -p unity-asset-binary --example sniff_kind -- tests/samples/char_118_yuki.ab

See docs/EXAMPLES.md for a curated list of runnable recipes.

Current Capabilities

YAML Processing (Complete)

  • Unity YAML format parsing for common file types (.asset, .prefab, .unity)
  • Multi-document parsing support
  • Reference resolution and cross-document linking
  • Filtering and querying capabilities
  • Serialization back to YAML format
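
To make the multi-document point concrete: a Unity YAML file holds several documents, each introduced by a separator line such as `--- !u!114 &11400000` (class ID, then the local file ID used as a reference anchor). The following is a minimal, self-contained sketch of reading those separator lines. It does not use this crate's API; `DocHeader`, `parse_doc_header`, and `list_documents` are hypothetical names chosen for illustration.

```rust
/// Header of one document inside a Unity YAML file, e.g. `--- !u!114 &11400000`.
/// (Hypothetical type for illustration; not part of unity-asset.)
#[derive(Debug, PartialEq)]
struct DocHeader {
    class_id: u32,  // Unity class ID (1 = GameObject, 4 = Transform, 114 = MonoBehaviour)
    file_id: i64,   // local anchor that `{fileID: ...}` references point at
    stripped: bool, // true when the separator ends with the `stripped` marker
}

/// Parse a document separator line; returns None for any other line.
fn parse_doc_header(line: &str) -> Option<DocHeader> {
    let rest = line.strip_prefix("--- !u!")?;
    let mut parts = rest.split_whitespace();
    let class_id: u32 = parts.next()?.parse().ok()?;
    let file_id: i64 = parts.next()?.strip_prefix('&')?.parse().ok()?;
    let stripped = parts.next() == Some("stripped");
    Some(DocHeader { class_id, file_id, stripped })
}

/// Collect the header of every document in a multi-document file.
fn list_documents(text: &str) -> Vec<DocHeader> {
    text.lines().filter_map(parse_doc_header).collect()
}

fn main() {
    let sample = "%YAML 1.1\n\
                  %TAG !u! tag:unity3d.com,2011:\n\
                  --- !u!1 &100000\n\
                  GameObject:\n  m_Name: Player\n\
                  --- !u!4 &400000 stripped\n\
                  Transform:\n  m_GameObject: {fileID: 100000}\n";
    for doc in list_documents(sample) {
        println!("{:?}", doc);
    }
}
```

A real parser also has to read each document body and resolve `{fileID: ...}` references against these anchors; the sketch only covers the document boundaries.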

Binary Asset Processing (Advanced, WIP)

  • AssetBundle structure parsing (UnityFS format)
  • SerializedFile parsing with full object extraction
  • TypeTree structure parsing and dynamic object reading
  • Compression support (LZ4, LZMA, Brotli)
  • Metadata extraction and analysis (experimental; includes dependency graph, best-effort hierarchy/component mapping, and external reference resolution via externals)
  • Performance monitoring and basic statistics
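
As a rough illustration of what telling formats apart involves, the sketch below classifies a file from its leading bytes and decodes the compression scheme from a UnityFS flags word. It is self-contained and does not call this crate. The names `FileKind`, `Compression`, `sniff_kind`, and `compression_from_flags` are hypothetical, and the signature strings and the low-6-bit compression mask are assumptions based on the commonly documented UnityFS layout.

```rust
/// Coarse file kinds that can be told apart from the first bytes alone.
/// (Hypothetical type for illustration; not part of unity-asset.)
#[derive(Debug, PartialEq)]
enum FileKind {
    UnityFs,
    WebFile,
    Yaml,
    Unknown,
}

/// Compression scheme stored in the low 6 bits of the UnityFS flags (assumed layout).
#[derive(Debug, PartialEq)]
enum Compression {
    Uncompressed,
    Lzma,
    Lz4,
    Lz4Hc,
    Other(u32),
}

/// Classify a buffer by its signature bytes.
fn sniff_kind(data: &[u8]) -> FileKind {
    if data.starts_with(b"UnityFS\0") {
        FileKind::UnityFs
    } else if data.starts_with(b"UnityWebData") {
        // Only matches WebFiles that are not wrapped in an outer compression layer.
        FileKind::WebFile
    } else if data.starts_with(b"%YAML") {
        FileKind::Yaml
    } else {
        FileKind::Unknown
    }
}

/// Decode the compression scheme from a UnityFS flags word.
fn compression_from_flags(flags: u32) -> Compression {
    match flags & 0x3F {
        0 => Compression::Uncompressed,
        1 => Compression::Lzma,
        2 => Compression::Lz4,
        3 => Compression::Lz4Hc,
        other => Compression::Other(other),
    }
}

fn main() {
    println!("{:?}", sniff_kind(b"UnityFS\0\0\0\0\x08"));
    println!("{:?}", sniff_kind(b"%YAML 1.1\n"));
    println!("{:?}", compression_from_flags(0x43));
}
```

Files without a recognizable signature (for example standalone SerializedFiles) need header-based heuristics instead, which this sketch leaves out.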

Object Processing (Partial)

  • AudioClip: Full format support (Vorbis, MP3, WAV, AAC) via unity-asset-decode (Symphonia-based decoder)
  • Texture2D: Complete parsing + best-effort decoding + PNG export via unity-asset-decode
  • Sprite: Full metadata extraction + atlas support + image cutting via unity-asset-decode
  • Mesh: Structure parsing + vertex data extraction + basic export via unity-asset-decode
  • GameObject/Transform: Basic TypeTree-based hierarchy & component mapping (best-effort; still WIP)
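
The "image cutting" step for sprites comes down to mapping a sprite rectangle onto texture pixels. The sketch below shows one way to do that, assuming the sprite rect uses a bottom-left origin (Unity's convention) while the decoded image uses a top-left origin. It is not taken from unity-asset-decode; `CropBox` and `sprite_crop_box` are hypothetical names, and rotated or tightly packed atlas sprites are out of scope.

```rust
/// Pixel rectangle with a top-left origin, as most image libraries expect.
/// (Hypothetical type for illustration; not part of unity-asset-decode.)
#[derive(Debug, PartialEq)]
struct CropBox {
    left: u32,
    top: u32,
    width: u32,
    height: u32,
}

/// Convert a bottom-left-origin sprite rect into a top-left-origin crop box.
/// Returns None when the rect is empty, negative, or falls outside the texture.
fn sprite_crop_box(
    tex_width: u32,
    tex_height: u32,
    x: f32,
    y: f32,
    w: f32,
    h: f32,
) -> Option<CropBox> {
    if x < 0.0 || y < 0.0 || w <= 0.0 || h <= 0.0 {
        return None;
    }
    let (left, bottom) = (x.round() as u32, y.round() as u32);
    let (width, height) = (w.round() as u32, h.round() as u32);
    if width == 0 || height == 0 {
        return None;
    }
    let right = left.checked_add(width)?;
    let upper = bottom.checked_add(height)?;
    if right > tex_width || upper > tex_height {
        return None;
    }
    // Flip the vertical axis: the top edge is measured down from the image top.
    Some(CropBox { left, top: tex_height - upper, width, height })
}

fn main() {
    // A 32x16 sprite placed 8 pixels above the bottom edge of a 128x64 atlas.
    println!("{:?}", sprite_crop_box(128, 64, 16.0, 8.0, 32.0, 16.0));
}
```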

CLI Tools (Usable, WIP)

  • Synchronous CLI for file inspection and batch processing
  • Asynchronous CLI with concurrent processing and progress tracking
  • Export capabilities (PNG, OGG, WAV, basic mesh formats)
  • Comprehensive metadata analysis and reporting
  • Basic progress reporting
  • list-objects: list objects from SerializedFiles/bundles (path_id/class_id/type/name) using TypeTree fast paths
  • export-serialized: export objects from .asset/.assets by scanning objects directly (raw .bin, optional best-effort decode)

Known Limitations

  • Some advanced Unity asset types are not yet implemented (MonoBehaviour scripts, complex shaders)
  • Objects are read-only (no writing back to Unity formats)
  • Some edge cases in LZMA decompression may fail on corrupted data
  • Advanced texture formats (DXT, ETC, ASTC) require the texture-advanced feature of unity-asset-decode
  • Audio decoding requires the audio feature of unity-asset-decode (Symphonia integration)
  • Large file performance could be optimized further
  • Error messages could be more user-friendly
  • Some metadata/dependency/hierarchy analyses are currently simplified placeholders
  • Object data is lazily accessed by default; use SerializedFileParser::from_bytes_with_options(data, true) if you explicitly need per-object preloaded buffers
  • For large objects without TypeTree, raw bytes are not expanded into _raw_data for performance; use UnityObject::raw_data()
  • TypeTree byte-like fields are represented as UnityValue::Bytes (not Array<Integer>) for performance: TypelessData and vector<UInt8/SInt8/char>. Use UnityValue::as_bytes() to access them.
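
The last point is easier to see with numbers. The sketch below uses a simplified stand-in enum (`Value`, hypothetical, not the crate's `UnityValue`) to compare storing 1024 bytes as a packed byte buffer against storing them as an array of boxed integer values. `payload_size` is likewise a hypothetical helper that counts only the directly owned element storage.

```rust
use std::mem::size_of;

/// Simplified stand-in for a dynamic value type.
/// (Hypothetical; the crate's real UnityValue has more variants.)
#[allow(dead_code)]
#[derive(Debug)]
enum Value {
    Integer(i64),
    Bytes(Vec<u8>),
    Array(Vec<Value>),
}

impl Value {
    /// Borrow the packed bytes, if this value is a byte buffer.
    fn as_bytes(&self) -> Option<&[u8]> {
        match self {
            Value::Bytes(b) => Some(b),
            _ => None,
        }
    }
}

/// Bytes of element storage directly owned by the value (nested payloads ignored).
fn payload_size(v: &Value) -> usize {
    match v {
        Value::Integer(_) => 0,
        Value::Bytes(b) => b.len(),
        Value::Array(items) => items.len() * size_of::<Value>(),
    }
}

fn main() {
    let packed = Value::Bytes(vec![0u8; 1024]);
    let boxed = Value::Array((0..1024).map(Value::Integer).collect());
    println!("packed: {} bytes", payload_size(&packed));
    println!("boxed:  {} bytes", payload_size(&boxed));
    println!("packed has as_bytes: {}", packed.as_bytes().is_some());
}
```

Each array element costs at least the size of an `i64`, so the boxed form is at least eight times larger here, before counting the per-element work of building it.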

Quick Start

Requirements

  • Rust stable (edition = "2024")
  • cargo-nextest (recommended for running tests)

Installation

Note: crates.io publishing is planned, but not guaranteed yet. For now, to try it out:

# Clone and build from source
git clone https://github.com/Latias94/unity-asset.git
cd unity-asset

# Build the library
cargo build --workspace

# Run tests (recommended)
cargo nextest run --workspace

# Try the CLI tools
cargo run --bin unity-asset -- --help
cargo run --features async --bin unity-asset-async -- --help

For maintainers: the tag-driven publish/release process is documented in docs/RELEASING.md.

Once v0.2.0 is published to crates.io, you'll be able to install it with:

# Add to your Cargo.toml
[dependencies]
unity-asset = "0.2.0"

# Install CLI tools
cargo install unity-asset-cli

By default, unity-asset-cli is built without decode/export helpers to keep builds lighter. To enable --decode support and best-effort exporters:

cargo install unity-asset-cli --features decode

Testing Status

We have basic tests for core functionality, but this is not a comprehensive test suite. Some tests pass, others reveal limitations in our implementation.

Comparison with UnityPy

UnityPy is a mature, feature-complete Python library for Unity asset manipulation. This Rust project is:

  • Much less mature: UnityPy has years of development and community contributions
  • More limited: We focus on parsing; manipulation is unsupported and export is best-effort
  • Learning-oriented: This project helps understand Unity formats through Rust
  • Experimental: Many features are incomplete or missing

If you need a production tool for Unity asset processing, use UnityPy instead.

Basic Usage Examples

YAML File Parsing

use unity_asset::{YamlDocument, UnityDocument};

// Load a Unity YAML file
let (doc, warnings) = YamlDocument::load_yaml_with_warnings("ProjectSettings.asset", false)?;
for w in warnings {
    eprintln!("warning: {}", w);
}

// Get basic information
println!("Found {} entries", doc.entries().len());

// Try to find specific objects (may not work for all files)
if let Ok(_settings) = doc.get(Some("PlayerSettings"), None) {
    println!("Found PlayerSettings");
}

UnityPy-like Environment (YAML + Binary)

use unity_asset::environment::{Environment, EnvironmentObjectRef};
use unity_asset_binary::typetree::JsonTypeTreeRegistry;
use std::sync::Arc;

let mut env = Environment::new();

// Optional: provide an external TypeTree registry for stripped assets (best-effort).
// This can improve coverage when `enableTypeTree = false` in serialized files.
// let registry = JsonTypeTreeRegistry::from_path("typetree_registry.json")?;
// env.set_type_tree_registry(Some(Arc::new(registry)));

env.load("tests/samples")?;

// `path_id` is only unique within a single SerializedFile.
// Use `BinaryObjectKey` when you need a globally-unique handle you can round-trip later.
let sources = env.binary_sources();
if let Some((_kind, source_path)) = sources.first() {
    let keys = env.find_binary_object_keys_in_source(source_path, 1);
    if let Some(key) = keys.first() {
        let _parsed = env.read_binary_object_key(key)?;
    }
}

// Unity `PPtr` resolution needs a context object because `fileID` indexes into the context file's externals.
// For the common case `fileID=0`, it points to the same SerializedFile as the context.
if let Some(obj_ref) = env.find_binary_object(1) {
    let _pptr_obj = env.read_binary_pptr(&obj_ref, 0, 1)?;
}

// AssetBundles often expose a container mapping from asset paths to objects.
// This is the primary discovery mechanism in UnityPy.
// Note: This is best-effort; when TypeTree is stripped, we fall back to a raw binary parser for `m_Container`.
let container = env.find_binary_object_keys_in_bundle_container("Assets/");
for (asset_path, key) in container.into_iter().take(10) {
    let _obj = env.read_binary_object_key(&key)?;
    println!("{} -> path_id={}", asset_path, key.path_id);
}

for obj in env.objects() {
    match obj {
        EnvironmentObjectRef::Yaml(class) => {
            let _ = &class.class_name;
        }
        EnvironmentObjectRef::Binary(obj_ref) => {
            // Parse on-demand (best-effort)
            let _parsed = obj_ref.read()?;
            let _key = obj_ref.key();
        }
    }
}
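
The comments in the example above note that a `PPtr` needs a context file because `fileID` indexes into that file's externals. The sketch below spells that rule out with stand-in types, under the usual Unity convention that `fileID = 0` means "same file", `fileID = n` means "external number n" (1-based), and `pathID = 0` is a null reference. It does not use this crate's API; `PPtr`, `FileContext`, `Resolved`, and `resolve_pptr` are hypothetical names.

```rust
/// Raw pointer fields as stored in a serialized object (hypothetical stand-in).
#[derive(Debug, Clone, Copy)]
struct PPtr {
    file_id: i32,
    path_id: i64,
}

/// The SerializedFile a PPtr was read from, with its external file list.
struct FileContext {
    name: String,
    externals: Vec<String>,
}

/// Outcome of resolving a PPtr against a context file.
#[derive(Debug, PartialEq)]
enum Resolved<'a> {
    Null,
    Object { file: &'a str, path_id: i64 },
    MissingExternal(i32),
}

fn resolve_pptr<'a>(ctx: &'a FileContext, pptr: PPtr) -> Resolved<'a> {
    if pptr.path_id == 0 {
        return Resolved::Null; // null reference
    }
    if pptr.file_id == 0 {
        // Same file as the context object.
        return Resolved::Object { file: ctx.name.as_str(), path_id: pptr.path_id };
    }
    if pptr.file_id < 0 {
        return Resolved::MissingExternal(pptr.file_id);
    }
    // file_id >= 1 here: externals are addressed 1-based.
    match ctx.externals.get(pptr.file_id as usize - 1) {
        Some(ext) => Resolved::Object { file: ext.as_str(), path_id: pptr.path_id },
        None => Resolved::MissingExternal(pptr.file_id),
    }
}

fn main() {
    let ctx = FileContext {
        name: "CAB-main".to_string(),
        externals: vec!["sharedassets0.assets".to_string()],
    };
    println!("{:?}", resolve_pptr(&ctx, PPtr { file_id: 0, path_id: 1 }));
    println!("{:?}", resolve_pptr(&ctx, PPtr { file_id: 1, path_id: 42 }));
}
```

The resolved (file, path_id) pair is also why a bare `path_id` is not a global handle: the same `path_id` can exist in several files.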

CLI Usage

# Parse a single YAML file
cargo run --bin unity-asset -- parse-yaml -i ProjectSettings.asset

# List bundle nodes (files) for debugging/inspection
cargo run --bin unity-asset -- list-bundle -i tests/samples/char_118_yuki.ab --filter "CAB-" --verbose

# Find objects via AssetBundle `m_Container` (discovery)
cargo run --bin unity-asset -- find-object -i tests/samples/char_118_yuki.ab --pattern "Assets/" --limit 20 --verbose

# Filter by type (useful for scripts/batch workflows)
cargo run --bin unity-asset -- find-object -i tests/samples/char_118_yuki.ab --class-name "Texture2D" --limit 20

# Filter by object name (best-effort; requires TypeTree and a name field)
cargo run --bin unity-asset -- find-object -i tests/samples/char_118_yuki.ab --name "yuki" --limit 20 --verbose

# Dump an external TypeTree registry (best-effort fallback for stripped assets)
cargo run --bin unity-asset -- dump-typetree-registry -i tests/samples -o typetree_registry.json --version-prefix

# Use an external TypeTree registry during discovery/inspection (best-effort)
cargo run --bin unity-asset -- --typetree-registry typetree_registry.json find-object -i tests/samples --pattern "Assets/" --limit 20 --verbose

# `--typetree-registry` also accepts a UnityPy-compatible `.tpk` file and can be repeated:
cargo run --bin unity-asset -- \
    --typetree-registry typetree_registry.json \
    --typetree-registry unitypy.tpk \
    find-object -i tests/samples --pattern "Assets/" --limit 20 --verbose

# Inspect a single object (TypeTree / Null-field debugging)
# - Easiest: copy/paste the `key=bok2|...` line from `find-object --verbose` and use `--key`.
# - Or pass the location fields explicitly (use `--kind serialized` for standalone `.assets` files).
cargo run --bin unity-asset -- inspect-object -i tests/samples --key 'bok2|bundle|0|1|<outer_len>|tests/samples/char_118_yuki.ab|0|' \
    --max-depth 6 --max-items 200 --filter "m_StreamData"
# If you suspect TypeTree mismatches, enable fail-fast parsing and print warnings:
# `--strict` (fail-fast) and `--show-warnings` (print TypeTree warnings)

# Scan PPtr references without fully parsing objects (fast dependency/graph workflows)
cargo run --bin unity-asset -- scan-pptr -i tests/samples/char_118_yuki.ab --kind bundle --asset-index 0 --limit 5
cargo run --bin unity-asset -- scan-pptr -i tests/samples/char_118_yuki.ab --kind bundle --asset-index 0 --class-id 114 --json

# Build a best-effort dependency graph via TypeTree PPtr scanning
cargo run --bin unity-asset -- deps -i tests/samples/char_118_yuki.ab --kind bundle --asset-index 0 --format summary
cargo run --bin unity-asset -- deps -i tests/samples/char_118_yuki.ab --kind bundle --asset-index 0 --format dot --max-edges 2000 > graph.dot

# Export objects from AssetBundles via `m_Container` (UnityPy-like workflow)
cargo run --bin unity-asset -- export-bundle -i tests/samples -o out/ --pattern "Assets/" --limit 50
# Decode known types (best-effort):
# - `AudioClip`: export embedded/streamed audio bytes (e.g. `.ogg`) or decode to `.wav` fallback
# - `Texture2D`: decode and export as `.png`
# - `Sprite`: resolve referenced `Texture2D`, crop sprite rect, and export as `.sprite.png`
cargo run --bin unity-asset -- export-bundle -i tests/samples -o out/ --pattern "Assets/" --decode --limit 50
#
# Parallelize export/decode work:
cargo run --bin unity-asset -- export-bundle -i tests/samples -o out/ --pattern "Assets/" --decode --jobs 0
#
# Filter by type (reduces work on large bundles):
cargo run --bin unity-asset -- export-bundle -i tests/samples -o out/ --pattern "Assets/" --class-name "Texture2D" --decode --jobs 0
#
# Re-runs: keep output deterministic and resumable via a manifest
# - `--continue-on-error` records failures as `status=failed` with an error string
# - `--resume` skips already-exported entries (when the previous output path still exists)
# - `--retry-failed-from` re-exports only entries that failed previously
cargo run --bin unity-asset -- export-bundle -i tests/samples -o out/ --pattern "Assets/" --decode \
    --manifest out/manifest.json --continue-on-error --jobs 0
cargo run --bin unity-asset -- export-bundle -i tests/samples -o out/ --pattern "Assets/" --decode \
    --retry-failed-from out/manifest.json --manifest out/manifest.retry.json --continue-on-error --jobs 0
cargo run --bin unity-asset -- export-bundle -i tests/samples -o out/ --pattern "Assets/" --decode \
    --resume out/manifest.retry.json --manifest out/manifest.resume.json --jobs 0
#
# Note: outputs are raw SerializedFile object bytes (`.bin`), not necessarily the original file format.
#
# Note (streamed resources): some `AudioClip`/`Texture2D` objects reference external `.resS`/`.resource`
# files that are not embedded inside the bundle. `export-bundle --decode` will try:
# 1) resource nodes inside the same bundle (UnityFS)
# 2) sibling resource files on disk (same directory / `StreamingAssets/`)
# If the resource file is missing, it falls back to exporting raw `.bin`.

# Try async processing (experimental)
cargo run --features async --bin unity-asset-async -- \
    parse-yaml -i Assets/ --recursive --progress
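
The re-run flags described in the export-bundle comments above (`--resume` and `--retry-failed-from`) amount to a per-entry decision based on the previous manifest. The sketch below models that decision as stated in those comments. It is an illustration of the described semantics, not the CLI's actual implementation; `Status`, `Mode`, and `should_export` are hypothetical names.

```rust
/// Outcome recorded for an entry in a previous manifest (hypothetical model).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Status {
    Exported,
    Failed,
}

/// How the current run treats a previous manifest.
#[derive(Debug, Clone, Copy)]
enum Mode {
    Fresh,       // no previous manifest
    Resume,      // like `--resume`
    RetryFailed, // like `--retry-failed-from`
}

/// Decide whether an entry should be exported in this run.
/// `previous` is None when the entry does not appear in the old manifest.
fn should_export(mode: Mode, previous: Option<Status>, output_exists: bool) -> bool {
    match mode {
        Mode::Fresh => true,
        // Skip only entries that succeeded before and whose output is still on disk.
        Mode::Resume => !(previous == Some(Status::Exported) && output_exists),
        // Re-export only entries that failed before.
        Mode::RetryFailed => previous == Some(Status::Failed),
    }
}

fn main() {
    println!("{}", should_export(Mode::Resume, Some(Status::Exported), true));
    println!("{}", should_export(Mode::Resume, Some(Status::Exported), false));
    println!("{}", should_export(Mode::RetryFailed, Some(Status::Failed), false));
}
```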

Architecture Details

This project is organized as a Rust workspace with separate crates for different concerns:

  • unity-asset-core: Core data structures and traits
  • unity-asset-yaml: YAML format parsing
  • unity-asset-binary: Binary format parsing (AssetBundle, SerializedFile)
  • unity-asset-decode: Optional decode/export helpers (Texture/Audio/Sprite/Mesh)
  • unity-asset-lib: Main library crate (published as unity-asset)
  • unity-asset-cli: Command-line tools (published as unity-asset-cli)

Acknowledgments

This project is a learning exercise inspired by and learning from several excellent projects:

UnityPy by @K0lb3

  • The gold standard for Unity asset manipulation
  • Our primary reference for understanding Unity formats
  • Test cases and expected behavior patterns

unity-rs by @yuanyan3060

  • Pioneering Rust implementation of Unity asset parsing
  • Architecture and parsing technique inspiration
  • Binary format handling examples

unity-yaml-parser by @socialpoint-labs

  • Original inspiration for this project
  • YAML format expertise and reference resolution patterns
  • Clean API design principles

License

This project is licensed under the MIT License - see the LICENSE file for details.

Commit count: 28
