prepis

Crates.io	prepis
lib.rs	prepis
version	0.2.1
created_at	2025-07-20 20:54:56.815111+00
updated_at	2025-07-20 21:35:47.942397+00
description	A command line utility that uses Amazon Transcribe to create video transcriptions
homepage	https://rup12.net
repository	https://github.com/darko-mesaros/prepis
id	1761388
size	156,882

Darko Mesaros (darko-mesaros)

Prepis - Video Transcription CLI

Sometimes you just need to have a transcription of video and audio files, very very fast. Well, now you don't need to leave the command line for this.

Features

🎥 Multiple Format Support - MP4, MOV, AVI, FLV, MP3, WAV, FLAC, M4A, WebM, MKV
☁️ AWS Integration - Uses Amazon Transcribe for high-quality transcription
📊 Progress Tracking - Real-time status updates with visual indicators
🛡️ Error Handling - Comprehensive error messages with helpful guidance
🧹 Auto Cleanup - Automatically removes temporary S3 files
😎 Emojis - I like to have my CLI output feature a lot of emojis, you've been warned

Prerequisites

Rust (latest stable version)
AWS Account with appropriate permissions
AWS CLI configured or environment variables set
S3 Bucket for temporary file storage

Required AWS Permissions

Your AWS credentials need the following permissions:

s3:PutObject - Upload files to S3
s3:DeleteObject - Clean up temporary files
s3:ListAllMyBuckets - Validate credentials (this is the IAM action behind the ListBuckets API call)
transcribe:StartTranscriptionJob - Start transcription jobs
transcribe:GetTranscriptionJob - Check job status

Installation

Clone the repository:

git clone https://github.com/darko-mesaros/prepis
cd prepis

Build the project:

cargo build --release

(Optional) Install globally:

cargo install --path .

Usage

Basic Usage

cargo run -- <VIDEO_FILE> <S3_BUCKET>

Examples

# Transcribe a local MP4 file
cargo run -- my-video.mp4 my-transcription-bucket

# Transcribe an audio file
cargo run -- podcast.mp3 my-transcription-bucket

# Using the installed binary
prepis presentation.mov company-transcripts

Help

cargo run -- --help

Configuration

AWS Credentials

The application uses the standard AWS credential chain:

Environment Variables:

export AWS_ACCESS_KEY_ID=your-access-key
export AWS_SECRET_ACCESS_KEY=your-secret-key
export AWS_DEFAULT_REGION=us-east-1

AWS CLI Configuration:

```
aws configure
```

IAM Roles (when running on EC2)

AWS Profiles:

```
export AWS_PROFILE=your-profile
```

S3 Bucket Setup

Create an S3 bucket for temporary file storage, or reuse an existing one. If you reuse a bucket, keep in mind that prepis writes temporary objects into it:

aws s3 mb s3://your-transcription-bucket

Note: Files are automatically deleted after transcription completes.

How It Works

Validation - Checks file format, size (max 2GB), and existence
Upload - Securely uploads file to your S3 bucket with unique naming
Transcription - Starts Amazon Transcribe job with English language detection
Polling - Monitors job status with exponential backoff (5s → 30s intervals)
Retrieval - Downloads and parses transcription results
Display - Shows formatted transcription text
Cleanup - Removes temporary S3 files
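
The polling step can be sketched as follows. This is a minimal illustration, not the code in src/aws/transcribe.rs: the README only states that the intervals grow from 5s to 30s, so the doubling factor and the names next_delay and delay_schedule are assumptions.

```rust
use std::time::Duration;

/// First wait between status checks (5s, as stated in the README).
const INITIAL_DELAY: Duration = Duration::from_secs(5);
/// Upper bound on the wait between status checks (30s, as stated in the README).
const MAX_DELAY: Duration = Duration::from_secs(30);

/// Returns the delay that follows `current`: double it, but never exceed MAX_DELAY.
/// The factor of 2 is an assumption made for this sketch.
fn next_delay(current: Duration) -> Duration {
    current.saturating_mul(2).min(MAX_DELAY)
}

/// Builds the first `n` delays of the schedule, starting at INITIAL_DELAY.
fn delay_schedule(n: usize) -> Vec<Duration> {
    std::iter::successors(Some(INITIAL_DELAY), |d| Some(next_delay(*d)))
        .take(n)
        .collect()
}
```

Under these assumptions the waits are 5s, 10s, 20s, and then 30s for every later status check. A real polling loop would sleep for each delay and then call GetTranscriptionJob until the job completes or fails.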

Project Structure

The codebase is organized into the following modules:

src/
├── main.rs              # Entry point, CLI handling and workflow orchestration
├── progress.rs          # Handles displaying the upload progress bar
├── error.rs             # Error types and user-friendly error display
├── models.rs            # Data structures and enums
├── utils.rs             # Utility functions for generating keys and job names
├── aws/
│   ├── mod.rs           # AWS module exports
│   ├── client.rs        # AWS client initialization and configuration
│   ├── s3.rs            # S3 operations for file upload and cleanup
│   └── transcribe.rs    # Transcribe job management and result processing
└── file/
    ├── mod.rs           # File module exports
    └── validation.rs    # File validation and transcription saving
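
As a rough idea of what the key and job name helpers in utils.rs might do, here is a sketch. The naming scheme (the prepis- prefix, the prepis-tmp/ key prefix, the Unix timestamp) and the function names are hypothetical and not taken from the repository. What is not an assumption: Amazon Transcribe job names may only contain letters, digits, '.', '_' and '-', and are limited to 200 characters, which is why other characters are replaced.

```rust
use std::path::Path;

/// Replaces every character outside `[0-9A-Za-z._-]` with `-`.
fn sanitize(input: &str) -> String {
    input
        .chars()
        .map(|c| {
            if c.is_ascii_alphanumeric() || matches!(c, '.' | '_' | '-') {
                c
            } else {
                '-'
            }
        })
        .collect()
}

/// Hypothetical job name: prepis-<sanitized stem>-<unix_secs>.
/// The stem is cut to 150 characters so the full name stays under 200.
fn job_name(file: &Path, unix_secs: u64) -> String {
    let stem = file.file_stem().and_then(|s| s.to_str()).unwrap_or("media");
    let short: String = sanitize(stem).chars().take(150).collect();
    format!("prepis-{}-{}", short, unix_secs)
}

/// Hypothetical S3 key: prepis-tmp/<unix_secs>-<sanitized file name>.
fn s3_key(file: &Path, unix_secs: u64) -> String {
    let name = file.file_name().and_then(|s| s.to_str()).unwrap_or("media");
    format!("prepis-tmp/{}-{}", unix_secs, sanitize(name))
}
```

Passing the timestamp in as an argument keeps the helpers deterministic; a caller would typically derive it from std::time::SystemTime.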

Supported File Formats

Video	Audio	Max Size	Max Duration
MP4, MOV, AVI, FLV, WebM, MKV	MP3, WAV, FLAC, M4A	2GB	4 hours
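
The format and size limits above translate into a small check like the one below. This is a sketch under assumptions, not the code in src/file/validation.rs: the names check_media and MediaKind are made up for illustration, and 2GB is interpreted as 2 * 1024^3 bytes. The duration limit is not checked here because it cannot be read from the file size or extension.

```rust
use std::path::Path;

/// Size limit from the table above, interpreted as 2 * 1024^3 bytes (an assumption).
const MAX_SIZE_BYTES: u64 = 2 * 1024 * 1024 * 1024;

const VIDEO_EXTENSIONS: [&str; 6] = ["mp4", "mov", "avi", "flv", "webm", "mkv"];
const AUDIO_EXTENSIONS: [&str; 4] = ["mp3", "wav", "flac", "m4a"];

#[derive(Debug, PartialEq)]
enum MediaKind {
    Video,
    Audio,
}

/// Classifies a file by its extension (case-insensitive) and enforces the size limit.
fn check_media(path: &Path, size_bytes: u64) -> Result<MediaKind, String> {
    let ext = path
        .extension()
        .and_then(|e| e.to_str())
        .map(|e| e.to_ascii_lowercase())
        .ok_or_else(|| format!("{} has no file extension", path.display()))?;

    let kind = if VIDEO_EXTENSIONS.contains(&ext.as_str()) {
        MediaKind::Video
    } else if AUDIO_EXTENSIONS.contains(&ext.as_str()) {
        MediaKind::Audio
    } else {
        return Err(format!("unsupported format: .{}", ext));
    };

    if size_bytes > MAX_SIZE_BYTES {
        return Err(format!(
            "file is {} bytes, above the {} byte limit",
            size_bytes, MAX_SIZE_BYTES
        ));
    }
    Ok(kind)
}
```

A caller would get size_bytes from std::fs::metadata(path)?.len(), which also fails when the file does not exist, so the existence check comes for free.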

Error Handling

The application provides helpful error messages for common issues:

File not found: Checks file path and permissions
Unsupported format: Lists supported file formats
AWS credential issues: Guides through credential setup
S3 access problems: Verifies bucket permissions
Transcription failures: Shows detailed error reasons
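
One way to model these cases is an error enum whose Display output carries the hint. The sketch below only uses the standard library; the type name PrepisError, its variants, and the message wording are hypothetical and do not reflect the actual contents of src/error.rs.

```rust
use std::fmt;

/// Hypothetical error type covering the cases listed above.
#[derive(Debug)]
enum PrepisError {
    FileNotFound(String),
    UnsupportedFormat(String),
    CredentialsNotFound,
    BucketAccessDenied(String),
    TranscriptionFailed(String),
}

impl fmt::Display for PrepisError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            PrepisError::FileNotFound(path) => write!(
                f,
                "❌ File not found: {}. Check the file path and permissions.",
                path
            ),
            PrepisError::UnsupportedFormat(ext) => write!(
                f,
                "❌ Unsupported format: .{}. Supported: MP4, MOV, AVI, FLV, WebM, MKV, MP3, WAV, FLAC, M4A.",
                ext
            ),
            PrepisError::CredentialsNotFound => write!(
                f,
                "❌ AWS credentials not found. Run `aws configure` or set the AWS_* environment variables."
            ),
            PrepisError::BucketAccessDenied(bucket) => write!(
                f,
                "❌ Access denied for bucket {}. Verify that it exists, its permissions, and the region.",
                bucket
            ),
            PrepisError::TranscriptionFailed(reason) => {
                write!(f, "❌ Transcription job failed: {}", reason)
            }
        }
    }
}

// Implementing std::error::Error lets the type work with `?` and Box<dyn Error>.
impl std::error::Error for PrepisError {}
```

Keeping the guidance inside Display means the top-level handler can simply print the error and exit, without matching on every variant again.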

Troubleshooting

Common Issues

"AWS credentials not found"
- Configure AWS CLI: aws configure
- Set environment variables
- Check IAM permissions

"S3 bucket access denied"
- Verify bucket exists: aws s3 ls s3://your-bucket
- Check bucket permissions
- Ensure correct region

"File format not supported"
- Check supported formats list
- Verify file extension
- Try converting file format

"Transcription job failed"
- Check file isn't corrupted
- Verify audio quality
- Ensure file size < 2GB

Getting Help

Check AWS CloudTrail for detailed error logs
Verify IAM permissions in AWS Console
Test AWS credentials: aws sts get-caller-identity

Contributing

Fork the repository
Create a feature branch
Make your changes
Add tests if applicable
Submit a pull request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

Built with the AWS SDK for Rust
Uses Amazon Transcribe for speech-to-text conversion
Inspired by the need for simple, reliable transcription tools
