m1guelpf/swift-realtime-openai
This library provides a simple interface for implementing multi-modal conversations using OpenAI's new Realtime API.
Installation
Swift Package Manager
The Swift Package Manager lets developers integrate packages into their Xcode projects and Swift packages, and is fully integrated into the Swift compiler.
SPM Through Xcode Project
- File > Swift Packages > Add Package Dependency
- Add https://github.com/m1guelpf/swift-realtime-openai.git
- Select "Branch" with "main"
SPM Through Xcode Package
Once you have your Swift package set up, add the Git link within the dependencies value of your Package.swift file.
dependencies: [
    .package(url: "https://github.com/m1guelpf/swift-realtime-openai.git", .branch("main"))
]
Getting started
You can build an iMessage-like app with built-in AI chat in less than 60 lines of code (UI included!):
import SwiftUI
import RealtimeAPI

struct ContentView: View {
    @State private var newMessage: String = ""
    @State private var conversation = try! Conversation()

    // Only the user/model messages, skipping other entries such as function calls.
    var messages: [Item.Message] {
        conversation.entries.compactMap { switch $0 {
            case let .message(message): return message
            default: return nil
        } }
    }

    var body: some View {
        VStack(spacing: 0) {
            ScrollView {
                VStack(spacing: 12) {
                    ForEach(messages, id: \.id) { message in
                        MessageBubble(message: message)
                    }
                }
                .padding()
            }

            HStack(spacing: 12) {
                HStack {
                    TextField("Chat", text: $newMessage, onCommit: { sendMessage() })
                        .frame(height: 40)
                        .submitLabel(.send)

                    if newMessage != "" {
                        Button(action: sendMessage) {
                            Image(systemName: "arrow.up.circle.fill")
                                .resizable()
                                .aspectRatio(contentMode: .fill)
                                .frame(width: 28, height: 28)
                                .foregroundStyle(.white, .blue)
                        }
                    }
                }
                .padding(.leading)
                .padding(.trailing, 6)
                .overlay(RoundedRectangle(cornerRadius: 20).stroke(.quaternary, lineWidth: 1))
            }
            .padding()
        }
        .navigationTitle("Chat")
        .navigationBarTitleDisplayMode(.inline)
        // Connect as soon as the view appears.
        .task { try! await conversation.connect(ephemeralKey: YOUR_EPHEMERAL_KEY_HERE) }
    }

    func sendMessage() {
        guard newMessage != "" else { return }

        Task {
            try await conversation.send(from: .user, text: newMessage)
            newMessage = ""
        }
    }
}
Or, if you just want a simple app that lets the user talk and the AI respond:
import SwiftUI
import RealtimeAPI

struct ContentView: View {
    @State private var conversation = try! Conversation()

    var body: some View {
        Text("Say something!")
            .task { try! await conversation.connect(ephemeralKey: YOUR_EPHEMERAL_KEY_HERE) }
    }
}
Architecture
Conversation
The Conversation class provides a high-level interface for managing a conversation with the model. It wraps the RealtimeAPI class and handles the details of sending and receiving messages, managing the conversation history, recording the user's mic, and playing model responses as they stream in.
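As a rough mental model of the history handling described above, the sketch below keeps a list of entries and derives a message-only view from it. Every name here (Entry, Message, History) is a hypothetical stand-in for illustration, not the library's actual types, and the real class does considerably more (audio capture, playback, interruption handling).

```swift
// Hypothetical stand-ins, NOT the library's types.
struct Message: Equatable {
    let id: String
    let role: String
    let text: String
}

enum Entry: Equatable {
    case message(Message)
    case functionCall(name: String)
    case functionCallOutput(String)
}

struct History {
    // Full history, including function calls and their outputs.
    private(set) var entries: [Entry] = []

    // Message-only view, derived by filtering the full history.
    var messages: [Message] {
        entries.compactMap { entry in
            if case let .message(message) = entry { return message }
            return nil
        }
    }

    mutating func append(_ entry: Entry) {
        entries.append(entry)
    }
}
```

Keeping a single source of truth (entries) and computing the filtered view on demand avoids having two lists that could drift out of sync.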
Reading messages
You can access the messages in the conversation through the messages property. Note that this won't include function calls and their responses, only the messages between the user and the model. To access the full conversation history, use the entries property. For example:
ScrollView {
    ScrollViewReader { scrollView in
        VStack(spacing: 12) {
            ForEach(conversation.messages, id: \.id) { message in
                MessageBubble(message: message).id(message.id)
            }
        }
        .onReceive(conversation.messages.publisher) { _ in
            withAnimation { scrollView.scrollTo(conversation.messages.last?.id, anchor: .center) }
        }
    }
}
Customizing the session
You can customize the current session using the setSession(_: Session) or updateSession(withChanges: (inout Session) -> Void) methods. Both require that a session has already been established, so call them from a whenConnected(_: @Sendable () async throws -> Void) callback, or await waitForConnection() first. For example:
try await conversation.whenConnected {
    try await conversation.updateSession { session in
        // update system prompt
        session.instructions = "You are a helpful assistant."
        // enable transcription of users' voice messages
        session.inputAudioTranscription = Session.InputAudioTranscription()
        // ...
    }
}
Manually sending messages
To send a text message, call send(from: Item.ItemRole, text: String, response: Response.Config? = nil), providing the role of the sender (.user, .assistant, or .system) and the contents of the message. You can optionally also provide a Response.Config object to customize the response, such as enabling or disabling function calls.
To manually send an audio message (or part of one), call send(audioDelta: Data, commit: Bool = false) with a valid audio chunk. If commit is true, the model will consider the message finished and begin responding to it. Otherwise, it might wait for more audio depending on your Session.turnDetection settings.
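One way to feed a larger recording through send(audioDelta:commit:) is to split it into chunks and commit only on the last one. The helper below is a minimal sketch: audioChunks(from:chunkSize:) is a hypothetical name, not part of the library, and it assumes the bytes are already in an audio format the session accepts.

```swift
import Foundation

/// Hypothetical helper (not part of the library): splits raw audio bytes into
/// fixed-size chunks and flags only the final chunk for commit.
func audioChunks(from audio: Data, chunkSize: Int) -> [(delta: Data, commit: Bool)] {
    precondition(chunkSize > 0, "chunkSize must be positive")

    var chunks: [(delta: Data, commit: Bool)] = []
    var offset = audio.startIndex

    while offset < audio.endIndex {
        let end = min(offset + chunkSize, audio.endIndex)
        // subdata(in:) copies the range into a fresh, zero-indexed Data value.
        let delta = audio.subdata(in: offset..<end)
        chunks.append((delta: delta, commit: end == audio.endIndex))
        offset = end
    }

    return chunks
}
```

Each resulting pair can then be passed along in a loop, for example with try await conversation.send(audioDelta: chunk.delta, commit: chunk.commit). An empty input yields no chunks, so nothing is sent or committed.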
Manually sending events
To manually send an event to the API, use the send(event: RealtimeAPI.ClientEvent) method. Note that this bypasses some of the logic in the Conversation class, such as handling interrupts, so prefer the other methods whenever possible.
RealtimeAPI
To interact with the API directly, create a new instance of RealtimeAPI, providing one of the available connectors. Helper methods let you create an instance from an ephemeral key (WebRTC), an API key (WebSocket), or a URLRequest, like so:
let api = RealtimeAPI.webRTC(ephemeralKey: YOUR_EPHEMERAL_KEY, model: .gptRealtime) // or RealtimeAPI.webRTC(connectingTo: URLRequest)
let api = RealtimeAPI.webSocket(authToken: YOUR_OPENAI_API_KEY, model: .gptRealtime) // or RealtimeAPI.webSocket(connectingTo: URLRequest)
You can listen for new events through the events property, like so:
for try await event in api.events {
    switch event {
    case let .sessionCreated(event):
        print(event.session.id)
    default:
        break // ignore other server events in this example
    }
}
To send an event to the API, call the send method with a ClientEvent instance:
try await api.send(event: .updateSession(session))
try await api.send(event: .appendInputAudioBuffer(encoding: audioData))
try await api.send(event: .createResponse())
License
This project is licensed under the MIT License - see the LICENSE file for details.
Package Metadata
Repository: m1guelpf/swift-realtime-openai
Default branch: main
README: README.md