diokaratzas/swift-subtitle-kit
A Swift 6 library for parsing, converting, resyncing, and saving subtitle files.
Highlights
- Object-first API -- parse into a
Subtitlevalue, then convert, resync, inspect, or save from that same value. - 9 built-in formats with a unified timing/text model (
SubtitleDocument). - Auto-detection by file extension, filename, and content sniffing.
- Round-trip fidelity -- metadata, styles, and cue attributes survive parse/serialize cycles within the same format.
- Extensible -- add new formats by conforming to
SubtitleFormatand registering at runtime. - Concurrency-ready -- all model types are value types and
Sendable. The global registry is thread-safe. - Zero dependencies -- only Foundation.
Supported Formats
| Format | Extension | Type | Notes | | --- | --- | --- | --- | | SubRip | .srt | Time-based | Most widely used subtitle format | | WebVTT | .vtt | Time-based | W3C web standard; supports cue IDs, settings, and metadata blocks | | SubViewer | .sbv | Time-based | YouTube caption format | | MicroDVD | .sub | Frame-based | Requires a frame rate (fps) to convert to/from time-based formats | | Sub Station Alpha | .ssa | Time-based | SSA v4; preserves styles and script info metadata | | Advanced SSA | .ass | Time-based | ASS v4+; superset of SSA with richer styling | | LRC | .lrc | Time-based | Synchronized lyrics; end times are inferred from the next cue | | SAMI | .smi | Time-based | Microsoft format; HTML-based cue content | | JSON | .json | Time-based | Generic array-of-objects interchange format | | Custom | any | any | User-defined via SubtitleFormat protocol |
Installation
Add the package to your Package.swift:
dependencies: [
.package(url: "https://github.com/dioKaratzas/swift-subtitle-kit.git", from: "1.0.0")
]Then add the library product to your target dependencies:
.target(
name: "MyApp",
dependencies: [
.product(name: "SubtitleKit", package: "swift-subtitle-kit")
]
)Requirements: Swift tools-version 6.0, iOS 13+, macOS 10.15+, tvOS 13+, watchOS 6+.
Documentation
- API Docs: https://diokaratzas.github.io/swift-subtitle-kit/documentation/subtitlekit/
Quick Start
Parse with Auto-Detection
import SubtitleKit
let subtitle = try Subtitle.parse(rawSRTText)
print(subtitle.formatName) // "srt"
print(subtitle.cues.count) // number of timed cuesParse with Explicit Format
let subtitle = try Subtitle.parse(rawText, format: .vtt)Parse with Detection Hints
let subtitle = try Subtitle.parse(rawText, fileName: "episode.srt")
// or
let subtitle = try Subtitle.parse(rawText, fileExtension: "vtt")Load from File
let subtitle = try Subtitle.load(from: fileURL)
// Extension of the URL is used for format detectionAccessing Cue Data
let subtitle = try Subtitle.parse(content, format: .srt)
for cue in subtitle.cues {
print("[\(cue.startTime)ms - \(cue.endTime)ms] \(cue.plainText)")
}
// SubtitleEntry includes cues, metadata, and styles
for entry in subtitle.entries {
switch entry {
case .cue(let cue): print(cue.plainText)
case .metadata(let meta): print("\(meta.key)")
case .style(let style): print(style.name)
}
}SubtitleCue fields:
| Property | Type | Description | | --- | --- | --- | | id | Int | Cue sequence number | | cueIdentifier | String? | Optional format-specific ID (e.g., WebVTT cue IDs) | | startTime | Int | Start time in milliseconds | | endTime | Int | End time in milliseconds | | duration | Int | Computed: endTime - startTime | | rawText | String | Original cue text (may contain formatting tags) | | plainText | String | Tag-stripped and normalized cue text | | frameRange | FrameRange? | Frame numbers for frame-based formats (MicroDVD) | | attributes | [SubtitleAttribute] | Format-specific key/value attributes |
Converting Formats
Get Converted Text
let vttText = try subtitle.text(format: .vtt, lineEnding: .lf)Get a Re-Parsed Subtitle Object
let vttSubtitle = try subtitle.convert(to: .vtt, lineEnding: .lf)
print(vttSubtitle.formatName) // "vtt"One-Shot Static Conversion
let output = try Subtitle.convert(
rawSRTText,
from: .srt,
to: .vtt,
lineEnding: .lf
)Serialize Back to Source Format
let text = try subtitle.text() // uses source format and line endingResyncing Timestamps
Offset Shift
// Shift all cues forward by 2 seconds
let shifted = subtitle.resync(.init(offset: 2_000))Ratio Scaling
// Speed up by 5%
let faster = subtitle.resync(.init(ratio: 1.05))Combined
let adjusted = subtitle.resync(.init(offset: 500, ratio: 0.98))Custom Transform
let custom = subtitle.resync { start, end, frame in
(start + 100, end + 300, frame)
}Mutating Variant
var mutable = subtitle
mutable.applyResync(.init(offset: -500))Cleaning Cue Text
let cleaned = subtitle.clean([
.removeSDH,
.removeWatermarks,
.removeSpeakerLabels,
.removeCuesContainingMusicNotes,
.removeAllLineBreaks,
.mergeCuesWithSameText,
.fixUppercaseText,
.removeCurlyBracketTags,
.removeHTMLTags
])Use clean() with no arguments to apply all built-in operations in default order, or applyClean(...) for the mutating variant.
If you need change stats/details (remaining/removed/modified and cue-by-cue status):
let result = subtitle.cleanWithReport([.removeSDH, .removeSpeakerLabels])
print(result.report.originalCueCount)
print(result.report.remainingCueCount)
print(result.report.modifiedCueCount)
print(result.report.removedCueCount)
for change in result.report.changes {
print(change.cueID, change.status, change.changedBy)
}Saving to File
// Infer format from the destination file extension
try subtitle.save(to: outputURL)
// Explicit format and line ending
try subtitle.save(to: outputURL, format: .srt, lineEnding: .crlf)Format Detection
Detection follows a strict priority order:
- Explicit format in
SubtitleParseOptions.format - File extension argument (
fileExtension:) - Filename extension extracted from
fileName: - Content sniffing via registered
canParsechecks
Content sniffing order: VTT, LRC, SMI, ASS, SSA, SUB, SRT, SBV, JSON.
// Detect without parsing
let format = Subtitle.detectFormat(in: rawText)
let format = Subtitle.detectFormat(in: rawText, fileName: "track.srt")Custom Format Extension
1. Conform to SubtitleFormat
import SubtitleKit
struct PipeFormat: SubtitleFormat {
let name = "pipe"
let aliases = ["pipe", "pip"]
func canParse(_ content: String) -> Bool {
content.split(whereSeparator: \.isNewline).contains { line in
line.split(separator: "|", maxSplits: 2).count == 3
}
}
func parse(_ content: String, options: SubtitleParseOptions) throws(SubtitleError) -> SubtitleDocument {
var entries: [SubtitleEntry] = []
for (i, line) in content.split(whereSeparator: \.isNewline).enumerated() {
let parts = line.split(separator: "|", maxSplits: 2, omittingEmptySubsequences: false)
guard parts.count == 3,
let start = Int(parts[0]),
let end = Int(parts[1])
else {
throw SubtitleError.malformedBlock(format: name, details: String(line))
}
entries.append(.cue(.init(
id: i + 1, startTime: start, endTime: end,
rawText: String(parts[2]), plainText: String(parts[2])
)))
}
return SubtitleDocument(formatName: name, entries: entries)
}
func serialize(_ document: SubtitleDocument, options: SubtitleSerializeOptions) throws(SubtitleError) -> String {
let lines = document.cues.map { "\($0.startTime)|\($0.endTime)|\($0.rawText)" }
return lines.joined(separator: options.lineEnding.value)
+ (lines.isEmpty ? "" : options.lineEnding.value)
}
}2. Add Static Accessor
extension SubtitleFormat where Self == PipeFormat {
static var pipe: SubtitleFormat { PipeFormat() }
}3. Register and Use
SubtitleFormatRegistry.register(.pipe)
let parsed = try Subtitle.parse("0|1000|Hello\n", format: .pipe)
let srt = try parsed.text(format: .srt)SAMI-Specific Options
Format-specific serialization options are grouped in SubtitleSerializeOptions rather than polluting every method signature. For SAMI output:
let options = SubtitleSerializeOptions(
format: .smi,
lineEnding: .crlf,
sami: .init(title: "My Subtitles", languageName: "English", closeTags: true)
)
let smiText = try subtitle.text(using: options)
try subtitle.save(to: outputURL, using: options)Concurrency and Thread Safety
- Model types (
Subtitle,SubtitleDocument,SubtitleCue,SubtitleEntry, etc.) are all value types conforming toSendable. They are safe to pass across actor/task boundaries. - Registry --
SubtitleFormatRegistry.currentis protected by an internal lock. The staticregister(_:)method is atomic. - Parsers and serializers are stateless struct methods. As long as custom format implementations are also
Sendable, concurrent parsing across different tasks is safe. - In tests, call
SubtitleFormatRegistry.resetCurrent()in adeferblock to prevent cross-test leakage.
Error Handling
All errors are typed as SubtitleError, which conforms to LocalizedError:
| Case | Meaning | | --- | --- | | unsupportedFormat(String) | Named format not registered | | unableToDetectFormat | No format matched by hints or content | | malformedBlock(format:details:) | Structural parse error in a format block | | invalidTimestamp(format:value:) | Unparseable timestamp string | | unsupportedVariant(format:details:) | Recognized but unsupported format variant | | invalidFrameRate(Double) | Non-positive or invalid FPS value |
do {
let subtitle = try Subtitle.parse(brokenText)
} catch let error as SubtitleError {
print(error.errorDescription ?? "Unknown subtitle error")
}API Reference
Main Types
| Type | Role | | --- | --- | | Subtitle | Primary entry point: parse, convert, resync, save | | SubtitleDocument | Unified document model (entries array + format name) | | SubtitleEntry | Enum: .cue, .metadata, .style | | SubtitleCue | Timed cue with text, timestamps, and attributes | | SubtitleMetadata | Key/value metadata from format headers | | SubtitleStyle | Named style with key/value fields | | SubtitleAttribute | Key/value pair used across cues, metadata, styles |
Configuration Types
| Type | Role | | --- | --- | | SubtitleParseOptions | Format hints, FPS, whitespace preservation | | SubtitleSerializeOptions | Target format, line ending, FPS, SAMI-specific options | | SubtitleResyncOptions | Offset, ratio, frame-value mode | | LineEnding | .lf or .crlf |
Format System
| Type | Role | | --- | --- | | SubtitleFormat | Protocol for format adapters | | SubtitleFormatRegistry | Registration, resolution, and detection | | SRTFormat, VTTFormat, ... | Built-in format implementations |
Limitations and Edge Cases
- Cross-family conversion normalizes style/metadata to the lowest common denominator. ASS/SSA styles are best preserved when staying in the ASS/SSA family.
- SAMI HTML semantics (classes, nested tags) are simplified to plain text when converting to non-SAMI formats.
- MicroDVD (
.sub) stores frame numbers; converting to/from time-based formats requires an accuratefpsvalue. The default is 25 fps. - LRC has no explicit end times; SubtitleKit infers end times from the start of the next cue (final cue gets a 2-second default duration).
- JSON format uses
JSONSerializationfor broad compatibility; the schema follows the subsrt-ts convention (array of objects withtype,start,end,content,textfields). - BOM handling -- UTF-8 byte order marks are stripped during normalization. The
sourceHadByteOrderMarkproperty records whether one was present. - Line endings -- source line endings (LF vs CRLF) are detected and preserved by default. Override with the
lineEnding:parameter on any serialization method.
Package Metadata
Repository: diokaratzas/swift-subtitle-kit
Default branch: main
README: README.md