voicepeak-cli

Crates.io: voicepeak-cli
lib.rs: voicepeak-cli
version: 0.6.0
created_at: 2025-06-28 10:01:25.308434+00
updated_at: 2025-07-10 12:57:07.130141+00
description: A command-line interface wrapper for VOICEPEAK text-to-speech software
homepage: https://github.com/petamorikei/voicepeak-cli
repository: https://github.com/petamorikei/voicepeak-cli
max_upload_size
id: 1729696
size: 71,463
Petamori (petamorikei)



A command-line interface wrapper for VOICEPEAK text-to-speech software with preset management and automatic audio playback.

What's Different from the Original VOICEPEAK Command?

This wrapper enhances the original VOICEPEAK CLI with several powerful features:

  • 🎵 Auto-play with mpv - Automatically plays generated audio when no output file is specified
  • 📝 Voice presets - Save and reuse combinations of narrator, emotions, and pitch settings
  • 📜 Long text support - Automatically splits texts longer than 140 characters and merges audio chunks
  • 🔧 Advanced playback modes - Choose between batch (generate all → merge → play) or sequential (generate → play one by one)
  • 🔄 Pipe input support - Accept text from stdin: echo "text" | vp
  • 🔇 Clean output - Suppresses technical output by default (use --verbose to see debug info)
  • ⚙️ Configuration file - Store your preferred settings in ~/.config/vp/config.toml

Key Benefits

  • Enhanced Workflow: No need to manually save and play audio files - just run and listen
  • Batch Processing: Handle long documents without worrying about character limits
  • Flexible Input: Works with direct text, files, or piped input from other commands
  • Personalization: Save your favorite voice configurations for consistent results
  • Professional Output: Clean interface with optional verbose mode for debugging

Requirements

  • macOS
  • VOICEPEAK installed at /Applications/voicepeak.app/
  • mpv for audio playback (install via Homebrew: brew install mpv)
  • ffmpeg for batch mode and multi-chunk file output (install via Homebrew: brew install ffmpeg)

Installation

From crates.io (Recommended)

cargo install voicepeak-cli

From source

  1. Clone this repository
  2. Build and install:
    cargo install --path .
    

Usage

Basic Usage

# Simple text-to-speech (requires preset or --narrator)
vp "こんにちは、世界!"

# With explicit narrator
vp "こんにちは、世界!" --narrator "夏色花梨"

# Save to file instead of auto-play
vp "こんにちは、世界!" --narrator "夏色花梨" -o output.wav

# Read from file
vp -t input.txt --narrator "夏色花梨"

# Pipe input
echo "こんにちは、世界!" | vp --narrator "夏色花梨"
cat document.txt | vp -p karin-happy
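For scripting around the commands above, the invocation can be assembled programmatically. The following is a minimal sketch, assuming vp is on PATH; the helper names build_vp_command and speak are hypothetical and not part of voicepeak-cli, and only flags listed in this README are used.

```python
import subprocess
from typing import List, Optional


def build_vp_command(
    text: str,
    narrator: Optional[str] = None,
    preset: Optional[str] = None,
    out: Optional[str] = None,
) -> List[str]:
    """Assemble an argv list for `vp` using only flags listed in this README."""
    cmd = ["vp", text]
    if narrator is not None:
        cmd += ["--narrator", narrator]
    if preset is not None:
        cmd += ["-p", preset]
    if out is not None:
        cmd += ["-o", out]
    return cmd


def speak(text: str, **options: Optional[str]) -> None:
    """Run vp and raise CalledProcessError if it exits with a non-zero status."""
    subprocess.run(build_vp_command(text, **options), check=True)
```

Passing the arguments as a list avoids shell quoting problems with spaces or non-ASCII text. For example, speak("Hello", preset="karin-happy") would run vp "Hello" -p karin-happy.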

Using Presets

# List available presets
vp --list-presets

# Use a preset
vp "こんにちは、世界!" -p karin-happy

# Override preset settings
vp "こんにちは、世界!" -p karin-normal --emotion "happy=50"

Voice Controls

# Control speech parameters
vp "こんにちは、世界!" --narrator "夏色花梨" --speed 120 --pitch 50

# List available narrators
vp --list-narrator

# List emotions for a specific narrator
vp --list-emotion "夏色花梨"

Text Length Handling

# Allow automatic text splitting (default)
vp "very long text..."

# Strict mode: reject texts longer than 140 characters
vp "text" --strict-length
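To illustrate what automatic splitting might look like, here is a minimal sketch that keeps sentences whole where possible and stays within the 140-character limit. The actual splitting strategy of voicepeak-cli is not described in this README, so the function split_text, the punctuation set, and counting by Unicode code points are all assumptions.

```python
import re
from typing import List

MAX_CHARS = 140  # limit stated in this README


def split_text(text: str, limit: int = MAX_CHARS) -> List[str]:
    """Split text into chunks of at most `limit` characters.

    Sentences are kept whole where possible; a single sentence longer than
    the limit is cut into fixed-size pieces.
    """
    # Break after Japanese or Western sentence-ending punctuation or newlines.
    sentences = [s for s in re.split(r"(?<=[。!?.!?\n])", text) if s.strip()]
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        # Hard-cut sentences that cannot fit in one chunk on their own.
        pieces = [sentence[i:i + limit] for i in range(0, len(sentence), limit)]
        for piece in pieces:
            if len(current) + len(piece) <= limit:
                current += piece
            else:
                chunks.append(current)
                current = piece
    if current:
        chunks.append(current)
    return chunks
```

With --strict-length, the equivalent check would simply reject any input where len(text) exceeds the limit instead of calling a splitter like this.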

Playback Modes

# Batch mode: generate all chunks first, merge, then play (default)
vp "long text" --playback-mode batch

# Sequential mode: generate and play chunks one by one
vp "long text" --playback-mode sequential

# Long text file output (uses ffmpeg to merge chunks)
vp "very long text" -o output.wav

# Sequential mode plays each chunk as it is generated, so ffmpeg is not needed
vp "long text" --playback-mode sequential
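The difference between the two modes is the order of operations. The sketch below shows that ordering only; the generate, merge, and play callables are hypothetical stand-ins for VOICEPEAK, ffmpeg, and mpv, and the real implementation may differ (for example, it may skip merging when there is a single chunk).

```python
from typing import Callable, List


def play_chunks(
    chunks: List[str],
    mode: str,
    generate: Callable[[str], str],
    merge: Callable[[List[str]], str],
    play: Callable[[str], None],
) -> None:
    """Drive generation and playback in the order each mode describes."""
    if mode == "batch":
        # Generate every chunk, merge the audio files, then play once.
        files = [generate(chunk) for chunk in chunks]
        play(merge(files))
    elif mode == "sequential":
        # Generate and play each chunk before moving to the next one.
        for chunk in chunks:
            play(generate(chunk))
    else:
        raise ValueError(f"unknown playback mode: {mode}")
```

Batch mode gives uninterrupted playback at the cost of waiting for all chunks, while sequential mode starts speaking sooner and never calls merge.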

Configuration

Configuration is stored in ~/.config/vp/config.toml. The file is automatically created on first run.

Example Configuration

default_preset = "karin-custom"

[[presets]]
name = "karin-custom"
narrator = "夏色花梨"
emotions = [
    { name = "hightension", value = 10 },
    { name = "sasayaki", value = 20 },
]
pitch = 30
speed = 120

[[presets]]
name = "karin-normal"
narrator = "夏色花梨"
emotions = []

[[presets]]
name = "karin-happy"
narrator = "夏色花梨"
emotions = [{ name = "hightension", value = 50 }]

Configuration Fields

  • default_preset: Optional. Preset to use when no -p option is specified
  • presets: Array of voice presets

Preset Fields

  • name: Unique preset identifier
  • narrator: Voice narrator name
  • emotions: Array of emotion parameters with name and value
  • pitch: Optional pitch adjustment (-300 to 300)
  • speed: Optional speed adjustment (50 to 200)
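The field rules above can be checked before a preset is used. The following is a rough sketch operating on a preset as a plain dictionary (the shape a TOML parser would produce for one [[presets]] entry). The function validate_preset is hypothetical, and treating a missing emotions key as an empty list is an assumption; the ranges come from this README.

```python
from typing import Any, Dict, List

PITCH_RANGE = (-300, 300)  # range given in this README
SPEED_RANGE = (50, 200)    # range given in this README


def validate_preset(preset: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in one preset; empty means it looks valid."""
    problems: List[str] = []
    for key in ("name", "narrator"):
        if not isinstance(preset.get(key), str) or not preset[key]:
            problems.append(f"{key} must be a non-empty string")
    for emotion in preset.get("emotions", []):
        if not isinstance(emotion.get("name"), str) or not isinstance(emotion.get("value"), int):
            problems.append(f"emotion entry needs a string name and an integer value: {emotion}")
    for key, (low, high) in (("pitch", PITCH_RANGE), ("speed", SPEED_RANGE)):
        value = preset.get(key)  # optional fields: None means "not set"
        if value is not None and not (low <= value <= high):
            problems.append(f"{key} must be between {low} and {high}")
    return problems
```

Returning a list of messages rather than raising on the first error lets a caller report every problem in a config file at once.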

Command-Line Options

Usage: vp [OPTIONS] [TEXT]

Arguments:
  [TEXT]  Text to say (or pipe from stdin)

Options:
  -t, --text <FILE>              Text file to say
  -o, --out <FILE>               Path of output file (optional - will play with mpv if not specified)
  -n, --narrator <NAME>          Name of voice
  -e, --emotion <EXPR>           Emotion expression (e.g., happy=50,sad=50)
  -p, --preset <NAME>            Use voice preset
      --list-narrator            Print voice list
      --list-emotion <NARRATOR>  Print emotion list for given voice
      --list-presets             Print available presets
      --speed <VALUE>            Speed (50 - 200)
      --pitch <VALUE>            Pitch (-300 - 300)
      --strict-length            Reject input longer than 140 characters (default: false, allows splitting)
      --playback-mode <MODE>     Playback mode: sequential or batch (default: batch)
  -v, --verbose                  Enable verbose output (show VOICEPEAK debug messages)
  -h, --help                     Print help
  -V, --version                  Print version
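The --emotion option takes a comma-separated list of name=value pairs. As an illustration of that format, here is a small parsing sketch; parse_emotion_expr is a hypothetical helper, and how the tool itself validates names or values is not covered by this README.

```python
from typing import Dict


def parse_emotion_expr(expr: str) -> Dict[str, int]:
    """Parse an expression such as "happy=50,sad=50" into {"happy": 50, "sad": 50}."""
    emotions: Dict[str, int] = {}
    for part in expr.split(","):
        name, sep, value = part.partition("=")
        if not sep or not name.strip():
            raise ValueError(f"expected name=value, got {part!r}")
        emotions[name.strip()] = int(value)  # int() raises ValueError on non-numeric input
    return emotions
```

Valid emotion names depend on the narrator; vp --list-emotion prints them for a given voice.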

Parameter Priority

When multiple sources specify the same parameter, the priority order is:

  1. Command-line options (highest priority)
  2. Preset values
  3. Default values / none (lowest priority)

For example:

  • vp "text" -p my-preset --pitch 100 uses pitch=100 (CLI override)
  • vp "text" -p my-preset uses preset's pitch value
  • vp "text" --narrator "voice" uses no pitch adjustment
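The priority order amounts to taking the first value that is set. A minimal sketch, with resolve as a hypothetical helper name:

```python
from typing import Optional, TypeVar

T = TypeVar("T")


def resolve(cli_value: Optional[T], preset_value: Optional[T], default: Optional[T] = None) -> Optional[T]:
    """Pick the first value that is set: CLI option, then preset, then default."""
    for candidate in (cli_value, preset_value):
        if candidate is not None:
            return candidate
    return default
```

The check uses "is not None" rather than truthiness so that an explicit value of 0 (for example --pitch 0) still overrides the preset. Mirroring the examples above, resolve(100, 30) gives 100, resolve(None, 30) gives 30, and resolve(None, None) gives None.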

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contributing

Contributions are welcome! Please see CONTRIBUTING.md for detailed guidelines on how to contribute to this project.
