haven-apps/havenfeedparser
A pure Swift package for fetching and parsing RSS 2.0, Atom 1.0, and JSON Feed 1.1 feeds. Built with strict Swift 6 concurrency and zero third-party dependencies.
Requirements
- iOS 18+ / macOS 15+ / tvOS 18+ / watchOS 11+ / visionOS 2+
- Swift 6.2+
- No external dependencies (Foundation, XMLParser, CryptoKit only)
Installation
Add as a local package dependency in Xcode, or reference it in your Package.swift:
.package(url: "https://github.com/Haven-Apps/HavenFeedParser")
Usage
Parse from Data
import HavenFeedParser
let service = FeedService() // default cache: 300 s, 200 entries
let feed = try await service.parseFeed(from: data)
print(feed.title) // "My Blog"
print(feed.feedFormat) // .rss, .atom, or .json
for item in feed.items {
    print(item.title, item.link)
}
Fetch and Parse from URL
let feed = try await service.parseFeed(from: url)
Conditional Refresh (ETag / If-Modified-Since)
let updated = try await service.refreshFeed(feed)
// Returns the same feed if the server responded 304 Not Modified
Cache Configuration
let service = FeedService(
    cacheMaxAge: 600,   // cache feeds for 10 minutes
    cacheMaxEntries: 50 // keep at most 50 feeds in memory
)
Format-Specific Parsers
let rssFeed = try await RSSParser().parse(rssData)
let atomFeed = try await AtomParser().parse(atomData)
let jsonFeed = try await JSONFeedParser().parse(jsonData)
Format Detection
let format = try FeedDetector.detect(data) // .rss, .atom, or .json
Architecture
Sources/HavenFeedParser/
├── Models/ Sendable, Codable value types (Feed, FeedItem, FeedAuthor, …)
├── Parsers/ FeedParsing conformances (RSSParser, AtomParser, JSONFeedParser)
├── Services/ Actor-isolated FeedService, FeedFetcher, FeedCache
├── Utilities/ DateParser, FeedDetector, HTMLSanitizer
├── Protocols/ FeedParsing, FeedProviding
└── Errors/ FeedError enum
Key Types
| Type | Description |
|---|---|
| FeedService | Unified public API: auto-detects format, fetches, parses, and caches (configurable max age and entry count) |
| Feed | Parsed feed with title, link, items, iTunes metadata, ETag, etc. |
| FeedItem | Single article/episode with authors, categories, enclosures |
| FeedAuthor | Author name, email, and URL |
| FeedCategory | Topic or tag with an optional domain/scheme |
| FeedEnclosure | Media attachment (audio, video, etc.) with URL, size, and MIME type |
| FeediTunesInfo | iTunes/podcast metadata (artwork, duration, episode/season numbers) |
| FeedFormat | Enum: .rss, .atom, .json |
| FeedError | Typed errors for network, parsing, and validation failures |
| FeedDetector | Inspects raw data to determine the feed format |
| HTMLSanitizer | Strips HTML tags and decodes entities to plain text |
| DateParser | Parses RFC 822, ISO 8601, and common date formats found in feeds |
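The FeedDetector row describes inspecting raw data to pick a format. The package's actual detection logic is not shown here, so the following is only a minimal sketch of one plausible approach (sniffing the first bytes). `SketchFormat` and `sniffFormat` are hypothetical names, not part of the package.

```swift
import Foundation

// Hypothetical stand-in for the package's FeedFormat enum.
enum SketchFormat: Equatable {
    case rss, atom, json
}

/// Guesses the feed format from the first kilobyte of the payload.
/// Returns nil when nothing recognisable is found.
func sniffFormat(_ data: Data) -> SketchFormat? {
    // Trim whitespace and a possible byte-order mark before inspecting.
    let ignorable = CharacterSet.whitespacesAndNewlines
        .union(CharacterSet(charactersIn: "\u{FEFF}"))

    // Lossy decoding never fails, even if the prefix cuts a multi-byte character.
    let head = String(decoding: data.prefix(1024), as: UTF8.self)
        .trimmingCharacters(in: ignorable)
        .lowercased()

    // JSON Feed documents are JSON objects, so they open with a brace.
    if head.hasPrefix("{") { return .json }

    // For XML, look for the root element after any prolog or comments.
    // RSS is checked first because RSS feeds can contain elements such as
    // <feedburner:info> that would otherwise match the Atom test.
    if head.contains("<rss") { return .rss }
    if head.contains("<feed") { return .atom }
    return nil
}
```

A real detector would likely parse the XML root element properly; substring matching is just the shortest way to show the idea.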
FeedItem Identity
FeedItem conforms to Identifiable. The raw feed identifier (guid, id, etc.) is stored in feedID, which may be nil if the source omitted it. The computed id property always returns a non-nil, deterministic String via the following fallback chain:
- feedID: used when present and non-empty.
- Composite key: a combination of link, title, and datePublished, when at least one is available.
- SHA-256 hash (via CryptoKit): of summary and content, as a last resort.
The computed id is not serialised by Codable; only feedID is persisted. After decoding, id is recomputed from the item's fields so it remains stable across process launches (Swift's randomised hashValue is never used).
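The fallback chain can be sketched as follows. This is an illustration under assumptions, not the package's implementation: `SketchItem` and `stableDigest` are hypothetical names, and the `|` separator and the way the date is encoded in the composite key are invented for the example. Where CryptoKit is unavailable, the sketch falls back to a simple FNV-1a hash purely so that it still compiles.

```swift
import Foundation
#if canImport(CryptoKit)
import CryptoKit
#endif

/// Deterministic digest of a string (unlike Swift's per-process hashValue).
func stableDigest(_ text: String) -> String {
    #if canImport(CryptoKit)
    return SHA256.hash(data: Data(text.utf8))
        .map { String(format: "%02x", $0) }
        .joined()
    #else
    // Stand-in for platforms without CryptoKit: 64-bit FNV-1a.
    var hash: UInt64 = 0xcbf2_9ce4_8422_2325
    for byte in text.utf8 {
        hash ^= UInt64(byte)
        hash = hash &* 0x0000_0100_0000_01b3
    }
    return String(hash, radix: 16)
    #endif
}

// Hypothetical stand-in for FeedItem with only the fields the chain needs.
struct SketchItem {
    var feedID: String?
    var link: String?
    var title: String?
    var datePublished: Date?
    var summary: String?
    var content: String?

    var id: String {
        // 1. Prefer the identifier supplied by the feed.
        if let feedID, !feedID.isEmpty { return feedID }

        // 2. Otherwise build a composite key from whichever fields exist.
        let parts: [String] = [
            link,
            title,
            datePublished.map { String($0.timeIntervalSince1970) }
        ].compactMap { $0 }
        if !parts.isEmpty { return parts.joined(separator: "|") }

        // 3. Last resort: hash the body text.
        return stableDigest((summary ?? "") + (content ?? ""))
    }
}
```

Because every branch depends only on stored fields, recomputing `id` after decoding yields the same string in every process.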
item.feedID // "article-123" or nil — the raw value from the feed
item.id // always non-nil, deterministic, safe for SwiftUI List/ForEach
HTML Sanitization
Use HTMLSanitizer.plainText(from:) to convert HTML content to plain text:
let plain = HTMLSanitizer.plainText(from: item.summary ?? "")
Podcast Support
First-class iTunes namespace support including:
- Feed-level: author, subtitle, summary, explicit, artwork
- Episode-level: duration, episode/season numbers, episode type, per-episode artwork
- Enclosures for audio/video attachments
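Episode durations in podcast feeds commonly appear as `HH:MM:SS`, `MM:SS`, or a bare number of seconds. How the package normalises them is not documented here, so the following is only a rough sketch of one way to do it; `durationSeconds(fromITunes:)` is a hypothetical helper, not a package API.

```swift
import Foundation

/// Converts an itunes:duration string to whole seconds.
/// Accepts "HH:MM:SS", "MM:SS", or plain seconds; returns nil otherwise.
func durationSeconds(fromITunes raw: String) -> Int? {
    let fields = raw
        .trimmingCharacters(in: .whitespacesAndNewlines)
        .split(separator: ":", omittingEmptySubsequences: false)
    guard (1...3).contains(fields.count) else { return nil }

    var total = 0
    for field in fields {
        // Every field must be a non-negative integer.
        guard let value = Int(field), value >= 0 else { return nil }
        // Shift the running total up one unit (hours → minutes → seconds).
        total = total * 60 + value
    }
    return total
}
```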
Security
- SSRF protection: FeedFetcher blocks file://, ftp://, and other non-HTTP schemes. Redirect targets are validated against a comprehensive blocklist of private and reserved IPv4 CIDR ranges (RFC 1918 private, loopback, link-local, CGNAT/shared per RFC 6598, multicast, reserved/future-use, and broadcast) as well as IPv6 loopback, unspecified, link-local, unique-local, documentation (2001:db8::/32), and IPv4-mapped addresses. DNS resolution via getaddrinfo checks all resolved addresses.
- Exotic IPv4 literal blocking: Non-standard IPv4 representations that some HTTP clients accept, namely bare decimal (2130706433), hexadecimal (0x7f000001), and octal-dotted (0177.0.0.1), are parsed and checked against the same blocklist.
- DNS rebinding mitigation: After the HTTP connection is established, FeedFetcher validates the resolved remote IP address against the SSRF blocklist a second time, mitigating TOCTOU / DNS-rebinding attacks where DNS returns a different (private) address after the pre-flight check.
- Streaming response with size enforcement: HTTP responses are streamed via URLSession.bytes(for:) with an incremental byte count. If the response exceeds 10 MB the download is aborted immediately, preventing memory exhaustion from oversized responses.
- Sanitized error messages: Network errors returned to callers are sanitized to avoid leaking internal server details, filesystem paths, or other sensitive information from NSError.localizedDescription.
- Input limits: XML parsers cap text accumulation at 1 MB per element and item counts at 10,000. JSON parser rejects payloads over 50 MB.
- HTML sanitization: HTMLSanitizer strips tags and decodes entities but does not produce sanitized HTML. Treat its output as plain text only.
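To illustrate the exotic-literal check, here is a minimal sketch that normalises decimal, hexadecimal, and octal IPv4 literals to a 32-bit value (following the classic `inet_aton` conventions) and tests it against a few of the IPv4 ranges named above. `parseIPv4Literal` and `isBlocked` are hypothetical names, the range list is deliberately partial, and FeedFetcher's real implementation may differ.

```swift
import Foundation

/// Parses an IPv4 literal in dotted, bare-decimal, hex, or octal form.
/// Returns nil when the host is not an IPv4 literal (e.g. a DNS name).
func parseIPv4Literal(_ host: String) -> UInt32? {
    let parts = host.split(separator: ".", omittingEmptySubsequences: false)
    guard (1...4).contains(parts.count) else { return nil }

    var values: [UInt64] = []
    for part in parts {
        let text = part.lowercased()
        let value: UInt64?
        if text.hasPrefix("0x") {
            value = UInt64(text.dropFirst(2), radix: 16)  // hexadecimal
        } else if text.count > 1, text.hasPrefix("0") {
            value = UInt64(text, radix: 8)                // octal
        } else {
            value = UInt64(text, radix: 10)               // decimal
        }
        guard let parsed = value else { return nil }
        values.append(parsed)
    }

    // Leading parts are single octets; the last part fills the remaining bits.
    let lastBits = UInt64(8 * (5 - values.count))  // 32, 24, 16, or 8
    guard let last = values.last,
          values.dropLast().allSatisfy({ $0 <= 0xFF }),
          last < (1 << lastBits) else { return nil }

    var address: UInt64 = 0
    for octet in values.dropLast() { address = (address << 8) | octet }
    address = (address << lastBits) | last
    return UInt32(address)
}

/// Checks an address against a partial list of private and reserved ranges.
func isBlocked(_ address: UInt32) -> Bool {
    let ranges: [(network: UInt32, prefix: UInt32)] = [
        (0x7F00_0000, 8),   // 127.0.0.0/8     loopback
        (0x0A00_0000, 8),   // 10.0.0.0/8      RFC 1918
        (0xAC10_0000, 12),  // 172.16.0.0/12   RFC 1918
        (0xC0A8_0000, 16),  // 192.168.0.0/16  RFC 1918
        (0xA9FE_0000, 16),  // 169.254.0.0/16  link-local
        (0x6440_0000, 10),  // 100.64.0.0/10   CGNAT (RFC 6598)
        (0xE000_0000, 4),   // 224.0.0.0/4     multicast
        (0xF000_0000, 4),   // 240.0.0.0/4     reserved, includes broadcast
    ]
    return ranges.contains { range in
        let mask = UInt32.max << (32 - range.prefix)
        return (address & mask) == range.network
    }
}
```

All three literals from the list above (2130706433, 0x7f000001, 0177.0.0.1) normalise to the same value, 127.0.0.1, so a single range check covers every spelling.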
Concurrency
All public API is async/await. Parsers and services are Sendable structs or actors, safe for concurrent use under strict Swift 6 concurrency checking.
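As a rough illustration of why actor isolation makes a shared service safe to call from many tasks, the sketch below defines a small cache bounded by age and entry count and writes to it from 100 concurrent tasks. `SketchCache` and `demo()` are hypothetical names; the package's own FeedCache is not shown here and may evict differently. The default limits mirror the 300 s and 200 entries mentioned for FeedService.

```swift
import Foundation

/// Hypothetical actor-isolated cache bounded by age and entry count.
actor SketchCache<Value: Sendable> {
    private var entries: [String: (value: Value, storedAt: Date)] = [:]
    private let maxAge: TimeInterval
    private let maxEntries: Int

    init(maxAge: TimeInterval = 300, maxEntries: Int = 200) {
        self.maxAge = maxAge
        self.maxEntries = maxEntries
    }

    var count: Int { entries.count }

    /// Returns the cached value, or nil if it is missing or older than maxAge.
    func value(for key: String, now: Date = Date()) -> Value? {
        guard let entry = entries[key] else { return nil }
        if now.timeIntervalSince(entry.storedAt) > maxAge {
            entries[key] = nil  // drop the expired entry
            return nil
        }
        return entry.value
    }

    /// Stores a value, evicting the oldest entries once the limit is exceeded.
    func store(_ value: Value, for key: String, now: Date = Date()) {
        entries[key] = (value: value, storedAt: now)
        while entries.count > maxEntries,
              let oldest = entries.min(by: { $0.value.storedAt < $1.value.storedAt }) {
            entries[oldest.key] = nil
        }
    }
}

/// Writes from 100 concurrent tasks; the actor serialises every mutation.
func demo() async -> Int {
    let cache = SketchCache<String>(maxAge: 600, maxEntries: 50)
    await withTaskGroup(of: Void.self) { group in
        for index in 0..<100 {
            group.addTask {
                await cache.store("feed \(index)", for: "https://example.com/\(index)")
            }
        }
    }
    return await cache.count
}
```

No locks are needed: each `await cache.store(...)` hops onto the actor, so the dictionary is never mutated by two tasks at once.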
License
BSD 3-Clause — see LICENSE.md.
Package Metadata
Repository: haven-apps/havenfeedparser
Default branch: main
README: README.md