atacan/AssemblyAI

Swift SDK for AssemblyAI - Speech-to-Text, Transcription, Speaker Diarization, and more. Supports iOS, macOS, tvOS, watchOS, and Linux.

Features

Full AssemblyAI API coverage
Built with Apple's Swift OpenAPI Generator
Async/await support
Type-safe API
Cross-platform: macOS, iOS, tvOS, watchOS, and Linux

Requirements

Swift 5.9+
macOS 13+ / iOS 16+ / tvOS 16+ / watchOS 9+

Installation

Swift Package Manager

Add the following to your Package.swift:

dependencies: [
    .package(url: "https://github.com/atacan/AssemblyAI.git", from: "1.0.0")
]

Then add the dependency to your target:

.target(
    name: "YourTarget",
    dependencies: [
        .product(name: "AssemblyAI", package: "AssemblyAI"),
        .product(name: "AssemblyAITypes", package: "AssemblyAI"),
    ]
)

Xcode

File → Add Package Dependencies
Enter: https://github.com/atacan/AssemblyAI.git
Select your desired version

Quick Start

Setup

import AssemblyAI
import AssemblyAITypes
import OpenAPIAsyncHTTPClient

// Create the client with your API key
let client = Client(
    serverURL: try AssemblyAITypes.Servers.Server1.url(),
    transport: AsyncHTTPClientTransport(),
    middlewares: [
        AuthenticationMiddleware(apiKey: "your-api-key")
    ]
)

Transcribe an Audio File

import Foundation

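// 1. Upload your audio file
// `audioFileURL` is a placeholder; point it at a local audio file.
let audioFileURL = URL(fileURLWithPath: "path/to/audio.mp3")
let audioData = try Data(contentsOf: audioFileURL)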
let uploadResponse = try await client.uploadFile(
    .init(body: .binary(.init(audioData)))
)
let uploadUrl = try uploadResponse.ok.body.json.upload_url

// 2. Start transcription
let transcriptResponse = try await client.createTranscript(
    .init(body: .json(.init(
        value1: .init(audio_url: uploadUrl),
        value2: .init()
    )))
)
var transcript = try transcriptResponse.ok.body.json

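// 3. Poll until the transcript reaches a terminal state.
// Also stopping on `.error` keeps the loop from running forever when a job fails
// (assumes the generated status enum exposes an `.error` case).
while transcript.status != .completed && transcript.status != .error {
    try await Task.sleep(for: .seconds(1))
    let response = try await client.getTranscript(
        .init(path: .init(transcript_id: transcript.id))
    )
    transcript = try response.ok.body.json
}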

// 4. Get the transcription text
print("Transcription: \(transcript.text ?? "")")
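
Polling once per second works for short clips, but longer audio may call for fewer requests. The sketch below is one possible approach, not part of the SDK: a hypothetical `pollingDelays` helper that produces a capped exponential backoff schedule. The function name, parameters, and default values are assumptions chosen for illustration.

```swift
/// Hypothetical helper (not part of the AssemblyAI package).
/// Returns the delay, in seconds, to wait before each polling attempt.
/// Delays grow geometrically from `initial` and never exceed `cap`.
func pollingDelays(
    initial: Double = 1,
    factor: Double = 2,
    cap: Double = 30,
    attempts: Int
) -> [Double] {
    var delays: [Double] = []
    var current = initial
    for _ in 0..<max(0, attempts) {
        delays.append(min(current, cap))
        current *= factor
    }
    return delays
}

// With the defaults, six attempts wait 1, 2, 4, 8, 16, and 30 seconds.
let schedule = pollingDelays(attempts: 6)
print(schedule)
```

Inside the polling loop, each delay could replace the fixed one-second sleep, for example `try await Task.sleep(for: .seconds(delay))`. Bounding the number of attempts also gives the loop a natural timeout.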

Transcribe from URL

If your audio is already hosted online:

let response = try await client.createTranscript(
    .init(body: .json(.init(
        value1: .init(audio_url: "https://example.com/audio.mp3"),
        value2: .init()
    )))
)

API Reference

Available Endpoints

| Method | Description |
|--------|-------------|
| uploadFile | Upload a local audio file |
| createTranscript | Start a new transcription |
| getTranscript | Get transcription status/result |
| listTranscripts | List all transcripts |
| deleteTranscript | Delete a transcript |
| getSubtitles | Get subtitles (SRT/VTT) |
| getSentences | Get transcript split by sentences |
| getParagraphs | Get transcript split by paragraphs |
| wordSearch | Search for words in transcript |
| createLemurTask | Use LeMUR AI models |

Transcription Options

Enable additional features when creating a transcript:

// `audioUrl` is the upload URL from the upload step or any publicly accessible audio URL.
let response = try await client.createTranscript(
    .init(body: .json(.init(
        value1: .init(audio_url: audioUrl),
        value2: .init(
            speaker_labels: true,      // Speaker diarization
            auto_chapters: true,       // Auto chapters
            entity_detection: true,    // Entity detection
            sentiment_analysis: true,  // Sentiment analysis
            auto_highlights: true,     // Key phrases
            language_code: .en_us      // Language
        )
    )))
)

Working with Results

// Get words with timestamps
if let words = transcript.words {
    for word in words {
        print("\(word.text) [\(word.start)ms - \(word.end)ms]")
    }
}

// Get speaker-labeled utterances (populated when speaker_labels is enabled)
if let utterances = transcript.utterances {
    for utterance in utterances {
        print("Speaker \(utterance.speaker): \(utterance.text)")
    }
}

// Get chapters (populated when auto_chapters is enabled)
if let chapters = transcript.chapters {
    for chapter in chapters {
        print("\(chapter.headline): \(chapter.summary)")
    }
}
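
The word timestamps printed above are millisecond offsets, which are hard to read for long recordings. As a minimal sketch, the hypothetical `formatTimestamp` helper below (not part of the SDK) converts a millisecond offset into `HH:MM:SS.mmm`. It assumes the offsets are available as `Int` values.

```swift
/// Left-pads a non-negative integer with zeros up to `width` characters.
func zeroPadded(_ value: Int, width: Int) -> String {
    let digits = String(value)
    return String(repeating: "0", count: max(0, width - digits.count)) + digits
}

/// Hypothetical helper: formats a millisecond offset as "HH:MM:SS.mmm".
/// Negative inputs are clamped to zero.
func formatTimestamp(milliseconds: Int) -> String {
    let ms = max(0, milliseconds)
    let hours = ms / 3_600_000
    let minutes = (ms / 60_000) % 60
    let seconds = (ms / 1_000) % 60
    let millis = ms % 1_000
    return "\(zeroPadded(hours, width: 2)):\(zeroPadded(minutes, width: 2)):"
        + "\(zeroPadded(seconds, width: 2)).\(zeroPadded(millis, width: 3))"
}

// 3,723,004 ms is 1 hour, 2 minutes, 3 seconds, and 4 milliseconds.
print(formatTimestamp(milliseconds: 3_723_004))
```

In the word loop, the raw values could then be printed as `formatTimestamp(milliseconds: word.start)`, assuming `word.start` is an `Int`.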

Environment Variables

For local development, create a .env file containing your AssemblyAI API key:

API_KEY=your_assemblyai_api_key
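
The README does not say how the `.env` file is consumed, so the following is only a sketch of one way an application might read the key itself. `parseDotEnv` and `resolveAPIKey` are hypothetical helpers, not part of the package. The parser handles plain `KEY=value` lines and `#` comments only, with no quoting or escaping.

```swift
import Foundation

/// Hypothetical helper: parses `KEY=value` lines from the text of a .env file.
/// Blank lines and lines starting with `#` are skipped; quotes are not handled.
func parseDotEnv(_ contents: String) -> [String: String] {
    var values: [String: String] = [:]
    for rawLine in contents.split(whereSeparator: \.isNewline) {
        let line = rawLine.trimmingCharacters(in: .whitespaces)
        guard !line.isEmpty, !line.hasPrefix("#"),
              let separator = line.firstIndex(of: "=") else { continue }
        let key = line[..<separator].trimmingCharacters(in: .whitespaces)
        let value = line[line.index(after: separator)...]
            .trimmingCharacters(in: .whitespaces)
        if !key.isEmpty { values[key] = value }
    }
    return values
}

/// Hypothetical helper: prefers the process environment,
/// then falls back to values parsed from a .env file.
func resolveAPIKey(
    environment: [String: String] = ProcessInfo.processInfo.environment,
    dotEnv: [String: String] = [:]
) -> String? {
    environment["API_KEY"] ?? dotEnv["API_KEY"]
}

let parsed = parseDotEnv("# local settings\nAPI_KEY=your_assemblyai_api_key\n")
print(resolveAPIKey(environment: [:], dotEnv: parsed) ?? "API_KEY not set")
```

The resolved value could then be passed to `AuthenticationMiddleware(apiKey:)` instead of a hard-coded string, which keeps the key out of source control.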

Examples

See the Tests directory for more usage examples.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is available under the MIT License. See the LICENSE file for details.

Package Metadata

Repository: atacan/AssemblyAI

Stars: 0

Forks: 0

Open issues: 0

Default branch: main

Primary language: swift

License: MIT

Topics: asr, assemblyai, ios, macos, speech-to-text, spm, swift, swift-package-manager, transcription, voice-recognition

README: README.md