vincentamato/mlxdinov3
Swift port of Meta's [DINOv3](https://arxiv.org/abs/2508.10104) using [MLX Swift](https://github.com/ml-explore/mlx-swift).
Installation
Add to your Package.swift:
dependencies: [
    .package(url: "https://github.com/vincentamato/MLXDINOv3.git", from: "2.0.0")
]
Converting Weights
The Convert target downloads a Hugging Face checkpoint and converts it to MLX format. Only ViT models are supported.
./mlx-run.sh Convert facebook/dinov3-vits16-pretrain-lvd1689m ./Models/dinov3-vits16-mlx
Usage
import AppKit
import MLX
import MLXDINOv3
let model = try DinoVisionTransformer.loadPretrained(from: "Models/dinov3-vits16-mlx")
let image = NSImage(contentsOfFile: "image.jpg")!
let inputs = try ImageProcessor()(image)
let outputs = model(inputs)
print("CLS token shape:", outputs.clsToken.shape)
print("Patch tokens shape:", outputs.patchTokens.shape)
print("Last hidden state shape:", outputs.lastHiddenState.shape)
Testing
Tests use xcodebuild because MLX depends on the Metal backend (swift test won't work). Before running tests, convert the dinov3-vits16-pretrain-lvd1689m model, making sure it is saved to the test resources directory.
# Convert the model
./mlx-run.sh Convert facebook/dinov3-vits16-pretrain-lvd1689m Tests/MLXDINOv3Tests/Resources
# Run tests
xcodebuild test -scheme MLXDINOv3-Package -destination 'platform=macOS' -derivedDataPath .build/xcode
Tests download PyTorch reference outputs from Hugging Face and compare against them using cosine similarity and relative L2 error.
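For readers unfamiliar with the two comparison metrics, here is a minimal sketch of how they can be computed on flattened outputs. It uses plain Swift arrays rather than MLX arrays, and the function names, sample values, and the absence of any pass/fail threshold are assumptions for illustration; the package's actual test code may differ.

```swift
/// Cosine similarity between two equally sized vectors: (a . b) / (||a|| * ||b||).
/// Returns NaN if either vector has zero norm.
func cosineSimilarity(_ a: [Float], _ b: [Float]) -> Float {
    precondition(a.count == b.count, "vectors must have the same length")
    var dot: Float = 0
    var normA: Float = 0
    var normB: Float = 0
    for (x, y) in zip(a, b) {
        dot += x * y
        normA += x * x
        normB += y * y
    }
    return dot / (normA.squareRoot() * normB.squareRoot())
}

/// Relative L2 error of `actual` against `reference`:
/// ||actual - reference|| / ||reference||.
/// Returns NaN or infinity if the reference has zero norm.
func relativeL2Error(actual: [Float], reference: [Float]) -> Float {
    precondition(actual.count == reference.count, "vectors must have the same length")
    var diffSquared: Float = 0
    var referenceSquared: Float = 0
    for (x, r) in zip(actual, reference) {
        diffSquared += (x - r) * (x - r)
        referenceSquared += r * r
    }
    return diffSquared.squareRoot() / referenceSquared.squareRoot()
}

// Hypothetical values standing in for a flattened MLX output
// and the matching PyTorch reference output.
let reference: [Float] = [0.5, -1.0, 2.0, 0.25]
let actual: [Float] = [0.5001, -0.9999, 2.0002, 0.2499]

print("cosine similarity:", cosineSimilarity(actual, reference))
print("relative L2 error:", relativeL2Error(actual: actual, reference: reference))
```

A cosine similarity close to 1 indicates the two outputs point in the same direction, while a relative L2 error close to 0 indicates their magnitudes also agree, which is why the two metrics are usually checked together.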
License
MIT. See LICENSE.
Pretrained weights are under Meta's DINOv3 License.
Package Metadata
Repository: vincentamato/mlxdinov3
Default branch: main
README: README.md