truewebber/swift-protoparser

A Swift library for parsing Protocol Buffers `.proto` files (proto2 & proto3) into AST and descriptors without `protoc`.

Overview

SwiftProtoParser enables native parsing of Protocol Buffers schema files directly in Swift without requiring the protoc compiler. This is useful for building code generation tools, schema analyzers, API documentation generators, and other applications that need to process .proto files at runtime.

Installation

Add to your Package.swift:

dependencies: [
    .package(url: "https://github.com/truewebber/swift-protoparser.git", from: "0.8.4")
]
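
For context, a complete manifest might look like the sketch below. The package and target name `MyTool` are hypothetical, and the product name `SwiftProtoParser` is an assumption based on the module imported in the usage examples; check the library's own Package.swift if resolution fails.

```swift
// swift-tools-version: 5.10
import PackageDescription

let package = Package(
    name: "MyTool",  // hypothetical package name
    platforms: [.macOS(.v12)],
    dependencies: [
        .package(url: "https://github.com/truewebber/swift-protoparser.git", from: "0.8.4")
    ],
    targets: [
        .executableTarget(
            name: "MyTool",  // hypothetical target name
            dependencies: [
                // Assumption: the library product is named "SwiftProtoParser",
                // matching the module name used in `import SwiftProtoParser`.
                .product(name: "SwiftProtoParser", package: "swift-protoparser")
            ]
        )
    ]
)
```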

Basic Usage

Parsing Proto Files

import SwiftProtoParser

// Parse a single .proto file
let result = SwiftProtoParser.parseFile("user.proto")
switch result {
case .success(let descriptorSet):
    // The requested file comes last, after its dependencies.
    if let file = descriptorSet.file.last {
        print("Package: \(file.package)")
        print("Messages: \(file.messageType.map { $0.name })")
        print("Services: \(file.service.map { $0.name })")
    }
case .failure(let error):
    print("Parse error: \(error.localizedDescription)")
}

Working with Imports

// Parse with import resolution
let result = SwiftProtoParser.parseFile(
    "api.proto",
    importPaths: [
        "/path/to/proto/files",
        "/path/to/google/protobuf"
    ]
)

// Parse an entire directory (separate name avoids redeclaring `result`)
let directoryResult = SwiftProtoParser.parseDirectory(
    "/path/to/proto/files",
    recursive: true,
    importPaths: ["/path/to/imports"]
)

Note: SwiftProtoParser does not bundle the well-known types (google/protobuf/*.proto). If your .proto files import them, add the directory that contains the google/protobuf/ folder to importPaths. You can obtain the files from the official protobuf repository or from a local protoc installation (the include root is typically /usr/local/include or /usr/include).
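
As a rough illustration of the note above, the following sketch probes a few candidate include roots for google/protobuf/descriptor.proto and reports the first match. The helper `wellKnownTypesRoot` is hypothetical (it is not part of SwiftProtoParser), and the /opt/homebrew/include candidate is an assumption about Homebrew installs on Apple Silicon.

```swift
import Foundation

/// Hypothetical helper: returns the first candidate directory that contains
/// google/protobuf/descriptor.proto, or nil if none does.
func wellKnownTypesRoot(
    among candidates: [String],
    fileManager: FileManager = .default
) -> String? {
    candidates.first { root in
        let probe = URL(fileURLWithPath: root)
            .appendingPathComponent("google/protobuf/descriptor.proto")
            .path
        return fileManager.fileExists(atPath: probe)
    }
}

// Candidate include roots; the Homebrew path is an assumption.
let candidates = ["/usr/local/include", "/usr/include", "/opt/homebrew/include"]
if let root = wellKnownTypesRoot(among: candidates) {
    print("Add to importPaths: \(root)")
} else {
    print("No well-known types found; fetch them from the protobuf repository")
}
```

The returned directory is the one to pass in importPaths, since imports such as google/protobuf/timestamp.proto are resolved relative to it.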

Working with Descriptors

// parseFile returns a FileDescriptorSet — all files in topological order
let result = SwiftProtoParser.parseFile("user.proto")
switch result {
case .success(let descriptorSet):
    // Iterate over all resolved files (dependencies first, requested file last)
    for file in descriptorSet.file {
        print("File: \(file.name)")
        print("Package: \(file.package)")
        for message in file.messageType {
            print("  message \(message.name): \(message.field.count) fields")
        }
    }
case .failure(let error):
    print("Error: \(error)")
}
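
The loop above only visits top-level messages. A minimal sketch of descending into nested types is shown below; it relies only on the SwiftProtobuf descriptor types (`Google_Protobuf_FileDescriptorProto`, `Google_Protobuf_DescriptorProto`), and the helper names are hypothetical. It assumes the target also depends on SwiftProtobuf so that the module can be imported.

```swift
import SwiftProtobuf

/// Hypothetical helper: collects fully qualified message names,
/// descending into nested types depth-first.
func qualifiedMessageNames(
    in messages: [Google_Protobuf_DescriptorProto],
    prefix: String
) -> [String] {
    messages.flatMap { message -> [String] in
        let fullName = prefix.isEmpty ? message.name : "\(prefix).\(message.name)"
        return [fullName] + qualifiedMessageNames(in: message.nestedType, prefix: fullName)
    }
}

/// Hypothetical helper: all message names in one file, prefixed by its package.
func allMessageNames(in file: Google_Protobuf_FileDescriptorProto) -> [String] {
    qualifiedMessageNames(in: file.messageType, prefix: file.package)
}
```

Synthetic map entry messages are nested types too, so they will appear in this list alongside hand-written nested messages.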

Features

  • Proto3 and Proto2 Support: Both syntax versions fully supported, including required fields, extensions, group fields, and default values
  • AST Generation: Parse files into structured Abstract Syntax Tree
  • Descriptor Building: Generate Google_Protobuf_FileDescriptorProto compatible with SwiftProtobuf
  • Map Fields: Full support for map types with automatic synthetic entry message generation (protoc-compatible)
  • Dependency Resolution: Handle import statements and multi-file dependencies
  • Custom Options (Extensions): Full support for extend google.protobuf.* and the (ext).sub_field qualified option name syntax
  • Extension Range Options: Full support for extensions N to M [declaration = { … }] syntax
  • Proto3 Optional Fields: optional fields in proto3 generate synthetic oneofs and proto3_optional = true, matching protoc output exactly
  • Reserved Ranges: reserved N to max; syntax supported in both messages and enums
  • Scope-Aware Type Resolution: Strict protobuf scoping rules with sibling nested-type resolution (matches protoc behavior)
  • Qualified Types: Nested message references and well-known types resolved from disk via importPaths
  • Well-Known Types Verified: Integration-tested against protoc-generated descriptors for all google/protobuf/*.proto
  • Performance Caching: Content-based caching with 85%+ hit rates
  • Incremental Parsing: Only re-parse changed files in large projects
  • Streaming Support: Memory-efficient parsing of large files (>50MB)
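
Two of the features above (proto3 optional fields and map fields) leave visible traces in the descriptors. The sketch below shows one way to detect them using only SwiftProtobuf's descriptor types; the helper names are hypothetical, and the code assumes the descriptors follow protoc conventions as the list states.

```swift
import SwiftProtobuf

/// Hypothetical helper: names of fields declared with the proto3 `optional` keyword.
func proto3OptionalFieldNames(in message: Google_Protobuf_DescriptorProto) -> [String] {
    message.field.filter { $0.proto3Optional }.map { $0.name }
}

/// Hypothetical helper: names of the synthetic nested messages generated
/// for `map<K, V>` fields (marked with the map_entry message option).
func mapEntryNames(in message: Google_Protobuf_DescriptorProto) -> [String] {
    message.nestedType.filter { $0.options.mapEntry }.map { $0.name }
}
```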

Supported Features

The library supports both the proto3 and proto2 specifications, including:

Core Features

syntax = "proto3";
package example.v1;

import "google/protobuf/timestamp.proto";

// Messages with all field types
message User {
  string name = 1;
  int32 age = 2;
  repeated string emails = 3;
  map<string, string> metadata = 4;
  google.protobuf.Timestamp created_at = 5;
  
  // Nested messages
  message Address {
    string street = 1;
    string city = 2;
  }
  
  // Oneof groups
  oneof contact {
    string email = 10;
    string phone = 11;
  }
}

// Enums
enum Status {
  STATUS_UNSPECIFIED = 0;
  STATUS_ACTIVE = 1;
}

// Request type used by the service below
message GetUserRequest {
  string name = 1;
}

// Services with streaming
service UserService {
  rpc GetUser(GetUserRequest) returns (User);
  rpc StreamUsers(stream GetUserRequest) returns (stream User);
}

Custom Options (Extend)

import "google/protobuf/descriptor.proto";

extend google.protobuf.FileOptions {
  string api_version = 50001;
}

extend google.protobuf.MessageOptions {
  bool enable_validation = 50002;
}

option (api_version) = "v1.0";

message ValidatedMessage {
  option (enable_validation) = true;
  string email = 1;
}

Performance Features

Caching

Content-based AST caching is built into the library. Repeated parsing of the same file content returns a cached result without re-running the Lexer and Parser, achieving 85%+ hit rates in typical workflows. Cache management is automatic and requires no configuration.
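
To make the idea of content-based caching concrete, here is a minimal sketch. It is not the library's implementation: `ContentCache` is a hypothetical type that keys results by the source text itself, whereas a real cache would more likely key by a content digest.

```swift
/// Hypothetical content-keyed cache; illustrates the technique only.
struct ContentCache<Value> {
    private var storage: [String: Value] = [:]
    private(set) var hits = 0
    private(set) var misses = 0

    /// Returns the cached value for `content`, computing and storing it on a miss.
    mutating func value(for content: String, compute: (String) -> Value) -> Value {
        if let cached = storage[content] {
            hits += 1
            return cached
        }
        misses += 1
        let value = compute(content)
        storage[content] = value
        return value
    }

    /// Fraction of lookups served from the cache (0 when nothing was looked up).
    var hitRate: Double {
        let total = hits + misses
        return total == 0 ? 0 : Double(hits) / Double(total)
    }
}

var cache = ContentCache<Int>()
let source = "syntax = \"proto3\";"
_ = cache.value(for: source) { $0.count }  // miss: computed and stored
_ = cache.value(for: source) { $0.count }  // hit: same content, no recomputation
print("Hit rate: \(cache.hitRate)")
```

Because the key depends only on content, renaming or touching a file without changing its text still produces a hit.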

Incremental Parsing

An incremental parsing engine tracks file modification timestamps and content hashes across multiple calls. Only files that have actually changed — and their direct dependents — are re-parsed. This reduces parsing time by 60–80% for large proto directories with infrequent changes.
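
The selection rule described above (changed files plus their direct dependents) can be sketched as follows. This is an illustration under stated assumptions, not the library's engine: `filesToReparse` is a hypothetical function, and the digests and import map are supplied by the caller.

```swift
/// Hypothetical helper: returns the files to re-parse, i.e. files whose digest
/// changed (or that are new) plus files that directly import one of them.
func filesToReparse(
    previousHashes: [String: String],
    currentHashes: [String: String],
    imports: [String: [String]]  // file -> files it imports
) -> Set<String> {
    // A file counts as changed if it is new or its digest differs.
    let changed = Set(currentHashes.filter { previousHashes[$0.key] != $0.value }.keys)
    // Direct dependents: files importing at least one changed file.
    let dependents = imports.filter { entry in
        entry.value.contains { changed.contains($0) }
    }.keys
    return changed.union(dependents)
}

let toReparse = filesToReparse(
    previousHashes: ["a.proto": "1", "b.proto": "2", "c.proto": "3"],
    currentHashes: ["a.proto": "1", "b.proto": "9", "c.proto": "3"],
    imports: ["a.proto": ["b.proto"], "c.proto": []]
)
print(toReparse.sorted())
```

Here b.proto changed and a.proto imports it, so both are selected, while c.proto is left alone.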

Benchmarking

An internal benchmarking system measures single-file, string, directory, and descriptor generation throughput with statistical analysis (mean, median, standard deviation). It is used for regression detection in the development workflow.
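
For reference, the three statistics mentioned above can be computed as in the sketch below. The `Summary` type and `summarize` function are hypothetical and unrelated to the library's internal benchmarking code; the standard deviation shown is the population form (dividing by n), which is an assumption about the intended definition.

```swift
/// Hypothetical summary of a set of timing samples.
struct Summary {
    let mean: Double
    let median: Double
    let standardDeviation: Double
}

/// Summarizes samples; returns nil for an empty input.
/// Uses the population standard deviation (divides by n).
func summarize(_ samples: [Double]) -> Summary? {
    guard !samples.isEmpty else { return nil }
    let n = Double(samples.count)
    let mean = samples.reduce(0, +) / n
    let sorted = samples.sorted()
    let mid = sorted.count / 2
    // Even count: average the two middle values; odd count: take the middle one.
    let median = sorted.count % 2 == 0 ? (sorted[mid - 1] + sorted[mid]) / 2 : sorted[mid]
    let variance = samples.reduce(0) { $0 + ($1 - mean) * ($1 - mean) } / n
    return Summary(mean: mean, median: median, standardDeviation: variance.squareRoot())
}
```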

Requirements

  • Swift 5.10+
  • macOS 12.0+ / iOS 15.0+ / watchOS 8.0+ / tvOS 15.0+
  • Linux (Ubuntu 20.04+)

Use Cases

  • Protocol Buffer code generators for Swift
  • Schema validation and linting tools
  • API documentation generators from .proto files
  • Proto file analysis and visualization tools
  • Dynamic proto file processing without protoc
  • Build systems requiring schema introspection

Known Limitations

No Type-Linking for Custom Options

SwiftProtoParser parses .proto files without performing type-linking, the step in which a compiler such as protoc resolves custom option names to their extension field types. As a result, custom options (defined via extend) are stored as uninterpreted_option entries on the corresponding options message (for example FieldOptions.uninterpreted_option) with fully populated NamePart arrays, rather than as typed extension fields.

This matches the behaviour of bufbuild/protocompile when run in unlinked mode, and is sufficient for schema analysis, documentation, and code generation use cases that do not require resolved option values at runtime.

Standard built-in options (deprecated, go_package, java_package, etc.) are stored in their typed fields, exactly as protoc would encode them.

If your use case requires fully resolved option types, run protoc to generate a FileDescriptorSet and load that directly.
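
When unresolved options are enough, their names can be read back from the NamePart arrays. The sketch below assumes the custom options of a message end up in its MessageOptions uninterpreted_option list; it uses only SwiftProtobuf's generated descriptor types, and the helper names are hypothetical.

```swift
import SwiftProtobuf

/// Hypothetical helper: renders an option name the way it is written in source,
/// wrapping extension parts in parentheses, e.g. "(ext).sub_field".
func renderedName(of option: Google_Protobuf_UninterpretedOption) -> String {
    option.name
        .map { $0.isExtension ? "(\($0.namePart))" : $0.namePart }
        .joined(separator: ".")
}

/// Hypothetical helper: rendered names of all unresolved options on a message.
func customOptionNames(of message: Google_Protobuf_DescriptorProto) -> [String] {
    message.options.uninterpretedOption.map(renderedName(of:))
}
```

The option value itself stays in whichever raw value field the parser filled in (identifier, integer, double, string, or aggregate), since no type information is available to interpret it.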

Testing

The test suite contains 1690 tests; line, function, and region coverage figures are listed under Running tests.

Reference descriptor generation

# Generate well-known type reference descriptors
Scripts/generate_well_known_descriptors.sh

# Generate handcrafted and client-proto reference descriptors
Scripts/generate_extension_option_descriptors.sh

Running tests

# Run all tests
swift test

# Run with coverage
make test
make coverage

Test coverage: 96.49% (lines), 93.21% (functions), 94.18% (regions)

Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines on:

  • Code style and formatting
  • Testing requirements (90%+ coverage for new code)
  • Pull request process
  • Development workflow

License

MIT License. See LICENSE for details.

Package Metadata

Repository: truewebber/swift-protoparser

Default branch: master

README: README.md