| Crates.io | srt_subtitles_parser |
| lib.rs | srt_subtitles_parser |
| version | 0.1.3 |
| created_at | 2025-11-05 20:28:45.417227+00 |
| updated_at | 2025-11-17 09:59:30.505165+00 |
| description | A Rust-based parser that processes .srt (SubRip Subtitle) files and converts them into a structured data format |
| homepage | |
| repository | https://github.com/annnzinch/srt-subtitles-parser |
| max_upload_size | |
| id | 1918571 |
| size | 34,924 |
Crate: https://crates.io/crates/srt_subtitles_parser
Docs: https://docs.rs/srt_subtitles_parser
Srt Subtitles Parser is a Rust-based parser for .srt (SubRip Subtitle) files. It reads an .srt file, validates its structure, and extracts the subtitle entries, each consisting of an index number, a start timestamp, an end timestamp, and one or more lines of subtitle text. The result is a structured data format that other code can consume directly.
The parser processes SRT subtitle files with the following structure:
1
00:00:00,000 --> 00:00:02,500
Welcome to the Example Subtitle File!

2
00:00:03,000 --> 00:00:06,000
This is a demonstration of SRT subtitles.
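Each subtitle entry consists of an index number, a timecode line in the form HH:MM:SS,mmm --> HH:MM:SS,mmm, one or more lines of text, and a blank line that closes the entry.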
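To make the timecode format concrete, here is a minimal std-only sketch that splits a timecode line by hand. It is not the crate's implementation (the crate uses a Pest grammar); `parse_timestamp`, `parse_timecode`, and the `Hmsm` alias are hypothetical names introduced only for illustration.

```rust
/// (hours, minutes, seconds, milliseconds); hypothetical alias for this sketch.
type Hmsm = (u32, u32, u32, u32);

/// Hypothetical helper (not part of the crate): parses `HH:MM:SS,mmm`.
fn parse_timestamp(s: &str) -> Option<Hmsm> {
    // Split "HH:MM:SS" from "mmm" at the comma.
    let (hms, millis) = s.split_once(',')?;
    let mut parts = hms.split(':');
    let (h, m, sec) = (parts.next()?, parts.next()?, parts.next()?);
    // Reject extra ':'-separated fields.
    if parts.next().is_some() {
        return None;
    }
    // Enforce the fixed digit widths of the format (2, 2, 2, 3).
    let fixed = |field: &str, width: usize| {
        field.len() == width && field.bytes().all(|b| b.is_ascii_digit())
    };
    if !(fixed(h, 2) && fixed(m, 2) && fixed(sec, 2) && fixed(millis, 3)) {
        return None;
    }
    Some((
        h.parse::<u32>().ok()?,
        m.parse::<u32>().ok()?,
        sec.parse::<u32>().ok()?,
        millis.parse::<u32>().ok()?,
    ))
}

/// Hypothetical helper: parses a full line such as
/// `00:00:03,000 --> 00:00:06,000` into (start, end).
fn parse_timecode(line: &str) -> Option<(Hmsm, Hmsm)> {
    let (start, end) = line.split_once("-->")?;
    // `trim` drops any surrounding whitespace around the arrow.
    Some((parse_timestamp(start.trim())?, parse_timestamp(end.trim())?))
}
```

Calling `parse_timecode("00:00:03,000 --> 00:00:06,000")` yields the start and end fields as two tuples, while a malformed field such as `0:00:02,500` makes `parse_timestamp` return `None`.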
The parser uses Pest grammar with the following rules:
WHITESPACE = _{ " " | "\t" }
NEWLINE = _{ "\r\n" | "\n" }
index = @{ ASCII_DIGIT+ }
hours = @{ ASCII_DIGIT{2} }
minutes = @{ ASCII_DIGIT{2} }
seconds = @{ ASCII_DIGIT{2} }
milliseconds = @{ ASCII_DIGIT{3} }
timestamp = { hours ~ ":" ~ minutes ~ ":" ~ seconds ~ "," ~ milliseconds }
timecode = { timestamp ~ WHITESPACE* ~ "-->" ~ WHITESPACE* ~ timestamp }
text_line = @{ (!NEWLINE ~ ANY)+ }
text_content = { text_line ~ (NEWLINE ~ text_line)* }
subtitle_block = {
    index ~ NEWLINE ~
    timecode ~ NEWLINE ~
    text_content ~ NEWLINE ~
    NEWLINE
}
subtitle_file = {
    SOI ~
    (subtitle_block)+ ~
    NEWLINE* ~
    EOI
}
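The block structure described by `subtitle_block` can also be illustrated without Pest. The following std-only sketch is an assumption-laden stand-in, not the crate's code: `split_blocks` is a hypothetical function that splits the input on blank lines and checks that each block has an index line, a timecode line, and at least one text line. It is more lenient than the grammar, since it tolerates extra blank lines between blocks.

```rust
/// Hypothetical helper (not part of the crate). Returns one
/// (index, timecode line, text) triple per subtitle block.
fn split_blocks(input: &str) -> Result<Vec<(u32, String, String)>, String> {
    // Normalise CRLF so both NEWLINE alternatives are treated alike.
    let normalised = input.replace("\r\n", "\n");
    let mut blocks = Vec::new();

    // Blocks are separated by an empty line.
    for raw in normalised.split("\n\n") {
        let raw = raw.trim_matches('\n');
        if raw.is_empty() {
            continue; // trailing newlines or extra blank lines
        }
        let mut lines = raw.lines();

        // First line: the numeric index.
        let index_line = lines.next().ok_or("missing index line")?;
        let index = index_line
            .trim()
            .parse::<u32>()
            .map_err(|e| format!("bad index {index_line:?}: {e}"))?;

        // Second line: the timecode, which must contain the arrow.
        let timecode = lines
            .next()
            .ok_or_else(|| format!("block {index}: missing timecode"))?;
        if !timecode.contains("-->") {
            return Err(format!("block {index}: bad timecode {timecode:?}"));
        }

        // Remaining lines: the subtitle text (at least one line).
        let text = lines.collect::<Vec<_>>().join("\n");
        if text.is_empty() {
            return Err(format!("block {index}: missing text"));
        }
        blocks.push((index, timecode.to_string(), text));
    }
    Ok(blocks)
}
```

A grammar-based parser gives more precise error positions than this kind of manual splitting, which is one reason to prefer the Pest rules above for real input.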
The parsing process reads the input, validates it against the grammar, and extracts each subtitle entry.
The parser produces the following structured data:
pub struct SubtitleFile {
    pub subtitles: Vec<Subtitle>,
}

pub struct Subtitle {
    pub index: u32,
    pub start: Timestamp,
    pub end: Timestamp,
    pub text: String,
}

pub struct Timestamp {
    pub hours: u32,
    pub minutes: u32,
    pub seconds: u32,
    pub milliseconds: u32,
}
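As an illustration of how these structures map back onto the SRT text format, the sketch below renders a `Subtitle` as an SRT block through `std::fmt::Display`. The struct definitions are local copies so that the example compiles on its own, and the `Display` implementations are an assumption for illustration; the crate itself may or may not provide them.

```rust
use std::fmt;

// Local copies of the structs so the sketch is self-contained.
pub struct Timestamp {
    pub hours: u32,
    pub minutes: u32,
    pub seconds: u32,
    pub milliseconds: u32,
}

pub struct Subtitle {
    pub index: u32,
    pub start: Timestamp,
    pub end: Timestamp,
    pub text: String,
}

// Hypothetical impl: zero-padded HH:MM:SS,mmm.
impl fmt::Display for Timestamp {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "{:02}:{:02}:{:02},{:03}",
            self.hours, self.minutes, self.seconds, self.milliseconds
        )
    }
}

// Hypothetical impl: index, timecode, text, then the blank line
// that closes a block.
impl fmt::Display for Subtitle {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "{}\n{} --> {}\n{}\n\n",
            self.index, self.start, self.end, self.text
        )
    }
}
```

With these impls, `subtitle.to_string()` produces text in the same shape as the input blocks, so a parsed file could be written back out after editing.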
The structured subtitle data can be used for:
Serialization: conversion to JSON using Serde
Deserialization: conversion from JSON using Serde
Text Analysis: extracting text for translation or word count
Quality Control: detecting timing errors, missing indices, or overlapping subtitles
Statistics: calculating total duration, average subtitle length, reading speed
Timecode Manipulation: shifting all timestamps by a fixed offset
Time Conversion: converting timestamps to/from milliseconds for calculations
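The time conversion, timecode manipulation, and quality control items can be sketched with a few small functions. Everything below is hypothetical and std-only: `to_millis`, `from_millis`, `shift`, and `overlaps` are illustrative names rather than the crate's API, and the derives on `Timestamp` are an assumption made so that values can be copied and compared.

```rust
// Local copy of the struct; the derives are assumptions for this sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Timestamp {
    pub hours: u32,
    pub minutes: u32,
    pub seconds: u32,
    pub milliseconds: u32,
}

/// Total milliseconds since 00:00:00,000.
fn to_millis(t: Timestamp) -> u64 {
    ((u64::from(t.hours) * 60 + u64::from(t.minutes)) * 60 + u64::from(t.seconds)) * 1_000
        + u64::from(t.milliseconds)
}

/// Inverse of `to_millis`.
fn from_millis(total: u64) -> Timestamp {
    Timestamp {
        hours: (total / 3_600_000) as u32,
        minutes: (total / 60_000 % 60) as u32,
        seconds: (total / 1_000 % 60) as u32,
        milliseconds: (total % 1_000) as u32,
    }
}

/// Shifts a timestamp by a signed offset, saturating at zero.
fn shift(t: Timestamp, offset_ms: i64) -> Timestamp {
    let shifted = (to_millis(t) as i64 + offset_ms).max(0) as u64;
    from_millis(shifted)
}

/// Returns the positions of cues that start before the previous cue ends.
/// Each cue is a (start, end) pair.
fn overlaps(cues: &[(Timestamp, Timestamp)]) -> Vec<usize> {
    cues.windows(2)
        .enumerate()
        .filter(|(_, pair)| to_millis(pair[1].0) < to_millis(pair[0].1))
        .map(|(i, _)| i + 1)
        .collect()
}
```

Working in milliseconds keeps the arithmetic simple: shifting is a single addition, and an overlap check is a comparison of two integers.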
Example input:
1
00:00:00,000 --> 00:00:02,500
Welcome to the Example Subtitle File!

2
00:00:03,000 --> 00:00:06,000
This is a demonstration of SRT subtitles.
Example output, serialized to JSON:
{
  "subtitles": [
    {
      "index": 1,
      "start": {
        "hours": 0,
        "minutes": 0,
        "seconds": 0,
        "milliseconds": 0
      },
      "end": {
        "hours": 0,
        "minutes": 0,
        "seconds": 2,
        "milliseconds": 500
      },
      "text": "Welcome to the Example Subtitle File!"
    },
    {
      "index": 2,
      "start": {
        "hours": 0,
        "minutes": 0,
        "seconds": 3,
        "milliseconds": 0
      },
      "end": {
        "hours": 0,
        "minutes": 0,
        "seconds": 6,
        "milliseconds": 0
      },
      "text": "This is a demonstration of SRT subtitles."
    }
  ]
}