# SwiftRL
On-device **reinforcement learning** for Swift. Built on [SwiftGrad](https://github.com/SwiftAutograd/SwiftGrad)'s autograd engine.
## Status
SwiftRL is in active development. The autograd foundation (SwiftGrad) is complete and tested.
## The Problem
There is no maintained reinforcement learning library for Swift. The one prior attempt, swift-rl, was abandoned in 2021 when Swift for TensorFlow was archived. Every mainstream RL tool today - Stable-Baselines3, CleanRL, RLlib, Unity ML-Agents - requires Python and cannot run on iOS.
Meanwhile:
- AI in gaming is a $5.85B market growing to $38B by 2034
- Mobile holds 52% of the AI gaming market
- Games using adaptive AI see ~30% higher engagement
- Apple has 28 million registered developers with zero RL tools
## Why Swift?
| Advantage | Why It Matters for RL |
|---|---|
| Real-time performance | Policy updates within 16 ms frame budgets. No GIL, no GC pauses. |
| Privacy by default | RL agents learn from user behavior that never leaves the device. |
| Native game integration | Direct access to SpriteKit, RealityKit, and GameplayKit game loops. |
| Unified memory | Apple Silicon shares CPU/GPU memory - no data copies for training. |
| visionOS exclusive | Spatial computing is Swift-only. Adaptive spatial agents require Swift. |
## Planned Architecture
```
SwiftRL
├── Core
│   ├── Environment   - Protocol: step(action) → (state, reward, done)
│   ├── ReplayBuffer  - Uniform and prioritized experience replay
│   ├── Policy        - Protocol for policy networks
│   └── Trainer       - Training loop orchestration
├── Algorithms
│   ├── REINFORCE     - Simplest policy gradient
│   ├── DQN           - Deep Q-Network with a target network
│   ├── A2C           - Advantage Actor-Critic
│   └── PPO           - Proximal Policy Optimization
├── Environments
│   ├── GridWorld     - Navigation with obstacles
│   ├── CartPole      - Classic control benchmark
│   └── Bandit        - Multi-armed bandit
└── Optimizers
    ├── SGD           - Stochastic gradient descent (from SwiftGrad)
    └── Adam          - Adaptive moment estimation
```
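The `Environment` protocol is the contract every algorithm and trainer builds on. As a rough illustration of the `step(action) → (state, reward, done)` shape above, here is a minimal sketch with a trivial bandit conformance; all names and signatures here are assumptions, not the final API.

```swift
// Hypothetical sketch of the planned Environment protocol.
// Names, signatures, and types are assumptions, not the final API.
protocol Environment {
    associatedtype Action
    associatedtype Observation

    /// Reset the environment and return the initial observation.
    mutating func reset() -> Observation

    /// Apply an action; return the next observation, the reward,
    /// and whether the episode has terminated.
    mutating func step(_ action: Action) -> (observation: Observation, reward: Double, done: Bool)
}

// Smallest possible conformance: a two-armed bandit where every
// step is a complete episode, so `done` is always true.
struct TwoArmedBandit: Environment {
    let payouts = [0.3, 0.7]  // win probability per arm

    mutating func reset() -> Int { 0 }  // single dummy state

    mutating func step(_ action: Int) -> (observation: Int, reward: Double, done: Bool) {
        let reward = Double.random(in: 0..<1) < payouts[action] ? 1.0 : 0.0
        return (0, reward, true)
    }
}
```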
## Planned Usage

```swift
import SwiftRL

// Define an environment
let env = GridWorld(size: 8)

// Create a policy network (powered by SwiftGrad)
let policy = MLP(inputSize: env.observationSize, layerSizes: [64, 32, env.actionCount])

// Train with DQN
let agent = DQN(
    policy: policy,
    learningRate: 0.001,
    gamma: 0.99,
    epsilon: DecayingEpsilon(start: 1.0, end: 0.01, decay: 0.995)
)

// Training loop
for episode in 0..<1000 {
    let reward = agent.train(environment: env)
    if episode % 100 == 0 {
        print("Episode \(episode): reward = \(reward)")
    }
}

// Use the trained agent
let action = agent.act(observation: env.reset())
```
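`DecayingEpsilon` above controls the exploration schedule: start fully random, then decay toward mostly greedy behavior. It is not implemented yet; one plausible reading, assuming multiplicative decay clamped at `end`, is:

```swift
// Hypothetical epsilon schedule matching DecayingEpsilon(start:end:decay:).
// Assumes multiplicative decay clamped at `end`; the real API may differ.
struct DecayingEpsilon {
    private(set) var value: Double
    let end: Double
    let decay: Double

    init(start: Double, end: Double, decay: Double) {
        self.value = start
        self.end = end
        self.decay = decay
    }

    /// Advance the schedule one step and return the new epsilon.
    mutating func step() -> Double {
        value = max(end, value * decay)
        return value
    }
}
```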
## Target Use Cases

| Use Case | RL Algorithm | Platform |
|---|---|---|
| Adaptive game NPCs | PPO / DQN | iOS, visionOS |
| Dynamic difficulty | Contextual bandits → PPO | iOS |
| Smart notifications | Multi-armed bandit | iOS, watchOS |
| Spatial agents | PPO with continuous actions | visionOS |
| Automated playtesting | DQN / A2C | macOS |
| Personalized fitness | Contextual bandits | watchOS, iOS |
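For the bandit rows above (smart notifications, personalized fitness), the core algorithm is small enough to sketch in full. This illustrative epsilon-greedy agent uses the incremental mean update Q(a) ← Q(a) + (r − Q(a)) / n(a); none of these names are SwiftRL API.

```swift
// Illustrative epsilon-greedy multi-armed bandit (not SwiftRL API).
struct EpsilonGreedyBandit {
    var values: [Double]  // running mean reward per arm
    var counts: [Int]     // pulls per arm
    let epsilon: Double

    init(arms: Int, epsilon: Double = 0.1) {
        values = Array(repeating: 0, count: arms)
        counts = Array(repeating: 0, count: arms)
        self.epsilon = epsilon
    }

    /// Explore with probability epsilon, otherwise exploit the best arm.
    func selectArm() -> Int {
        if Double.random(in: 0..<1) < epsilon {
            return Int.random(in: 0..<values.count)
        }
        return values.indices.max(by: { values[$0] < values[$1] })!
    }

    /// Incremental mean update: Q(a) += (reward - Q(a)) / n(a).
    mutating func update(arm: Int, reward: Double) {
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / Double(counts[arm])
    }
}
```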
## Demo Apps
See SwiftRLDemos for playable iOS apps showcasing SwiftRL:
- Snake - DQN learns to hunt food in real-time
- 2048 - Policy gradient discovers tile-merging strategies
- Connect Four - Self-play TD learning (see the sketch after this list)
- Blackjack - Monte Carlo policy evaluation
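The TD learning mentioned for Connect Four is the classic temporal-difference value update V(s) ← V(s) + α[r + γV(s′) − V(s)]. A minimal tabular sketch, with all names illustrative rather than SwiftRL API:

```swift
// Illustrative tabular TD(0) value update (not SwiftRL API).
// V(s) <- V(s) + alpha * (r + gamma * V(s') - V(s))
struct TDValueTable {
    var values: [String: Double] = [:]  // state key -> estimated value
    let alpha: Double                   // learning rate
    let gamma: Double                   // discount factor

    mutating func update(state: String, reward: Double, nextState: String, done: Bool) {
        let current = values[state, default: 0]
        let bootstrap = done ? 0 : values[nextState, default: 0]
        values[state] = current + alpha * (reward + gamma * bootstrap - current)
    }
}
```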
## Part of the SwiftAutograd Organization
| Repository | Description | Status |
|---|---|---|
| SwiftGrad | Autograd engine | Working |
| SwiftRL | Reinforcement learning (you are here) | In development |
| SwiftRLDemos | Demo apps | Planned |
## Research & Inspiration
- micrograd by Andrej Karpathy - the autograd engine SwiftGrad is modeled on
- CleanRL - single-file RL implementations we aim to match in clarity
- Unity ML-Agents - the closest analog (but Python-dependent, desktop-only training)
- Stable-Baselines3 - the API design standard for RL libraries
- Apple ML Research - 8+ RL papers published 2023-2025
## Contributing
SwiftRL is in early development. If you're interested in contributing, open an issue to discuss before submitting a PR.
## License
MIT - see LICENSE.