Parser types
How we extract releases from a source — the difference between a stable adapter and the AI fallback, and what to expect from each.
Every source needs a parser — the code that turns a release-notes page or GitHub repo into a structured list of releases we can analyze. There are two kinds: a purpose-built stable parser, or the AI fallback for anything without dedicated support.
Stability tiers
Stable: a purpose-built adapter for a known source. Releases are extracted deterministically. If the upstream page changes in a way that breaks the adapter, fetches return zero entries with a logged warning so we notice quickly. No per-poll LLM cost.
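For illustration, a stable adapter might look like the sketch below. The `Release` shape and the GitHub example are assumptions made for this page, not our actual internals:

```ts
// Release shape used throughout these sketches -- an assumption,
// not our actual internal type.
type Release = { version: string; date: string; notes: string };

// Hypothetical stable adapter for GitHub releases. It reads a known,
// documented JSON structure, so extraction is deterministic and no
// LLM call is involved.
async function githubReleases(repo: string): Promise<Release[]> {
  const res = await fetch(`https://api.github.com/repos/${repo}/releases`);
  const body = await res.json();

  // If the upstream shape ever changes, return zero entries with a
  // logged warning instead of guessing.
  if (!Array.isArray(body)) {
    console.warn(`stable adapter broke for ${repo}: unexpected response shape`);
    return [];
  }

  return body.map((r) => ({
    version: r.tag_name ?? "",
    date: r.published_at ?? "",
    notes: r.body ?? "",
  }));
}
```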
AI fallback: we don't have a deterministic parser for the source yet, so the page text is sent to a language model that extracts a release list. Accuracy depends on how structured the page is. Output may miss or duplicate entries when the page changes — verify against the source URL before acting on a release. We cache results by content hash so unchanged pages don't re-bill.
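The caching step is worth seeing concretely. A minimal sketch, assuming a SHA-256 hash of the stripped page text as the cache key; `llmExtract` is a hypothetical stand-in for the real model request:

```ts
import { createHash } from "node:crypto";

type Release = { version: string; date: string; notes: string };

// Hypothetical stand-in for the actual model request.
declare function llmExtract(pageText: string): Promise<Release[]>;

// Cache keyed by a SHA-256 hash of the stripped page text: an
// unchanged page produces the same key, so no new LLM call is billed.
const cache = new Map<string, Release[]>();

async function extractWithCache(pageText: string): Promise<Release[]> {
  const key = createHash("sha256").update(pageText).digest("hex");
  const hit = cache.get(key);
  if (hit) return hit;

  const releases = await llmExtract(pageText);
  cache.set(key, releases);
  return releases;
}
```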
The current catalog of stable parsers is at Sources → Catalog.
What about other URLs?
Any URL that isn't handled by a stable parser is routed through the AI fallback. We fetch the page, strip it to plain text, and ask a language model to return a structured list of releases.
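Conceptually that's a three-step pipeline: fetch, strip, extract. A sketch under those assumptions; `chatCompletion`, `stripToText`, and the prompt wording are all illustrative, not our actual implementation:

```ts
type Release = { version: string; date: string; notes: string };

// Illustrative model client; the prompt wording below is an assumption.
declare function chatCompletion(prompt: string): Promise<string>;

// Naive HTML-to-text pass, for illustration only.
function stripToText(html: string): string {
  return html
    .replace(/<script[\s\S]*?<\/script>/gi, "")
    .replace(/<style[\s\S]*?<\/style>/gi, "")
    .replace(/<[^>]+>/g, " ")
    .replace(/\s+/g, " ")
    .trim();
}

async function aiFallback(url: string): Promise<Release[]> {
  const html = await (await fetch(url)).text(); // raw HTML only; no JS runs
  const text = stripToText(html);
  const answer = await chatCompletion(
    `List every release on this page as a JSON array of ` +
      `{version, date, notes} objects:\n\n${text}`
  );
  return JSON.parse(answer) as Release[];
}
```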
When the fallback works well
- The page is a single, scrollable changelog.
- Each release has a clear heading (e.g. `## v1.2.3 — 2025-04-01`).
- The page renders server-side (no JavaScript hydration required to see the changelog).
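For a concrete picture, an invented changelog shaped like this tends to extract cleanly:

```markdown
# Changelog

## v1.2.3 — 2025-04-01
- Fixed crash on empty config files.

## v1.2.2 — 2025-03-18
- Added retry logic to the sync job.
```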
When it struggles
- Paginated changelogs. We only see the first page of HTML.
- JS-rendered pages. We fetch raw HTML — no headless browser. If the changelog only appears after a client-side fetch, we won't see it.
- Mixed content. Pages that intermix unrelated content with the changelog may produce noisy extractions.
If a URL keeps producing wrong or missing releases, tell us — we'll prioritize a stable adapter. See Reference → Parser stability.
Why we don't run AI on stable parsers
Stable parsers extract release data deterministically from page structure, so there's nothing for the model to do. Skipping the LLM call keeps polling cheap and predictable — the only AI cost on a stable source is the one-time diff analysis when a new release is detected.
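Putting it together, per-poll routing looks roughly like the sketch below; `adapters`, `aiFallback`, and `analyzeDiff` are illustrative names, not real API:

```ts
type Release = { version: string; date: string; notes: string };

// Illustrative names: a registry of stable adapters, the AI fallback,
// and the one-time diff analysis run when a new release appears.
declare const adapters: Map<string, (url: string) => Promise<Release[]>>;
declare function aiFallback(url: string): Promise<Release[]>;
declare function analyzeDiff(release: Release): Promise<void>;

async function poll(sourceId: string, url: string, seen: Set<string>) {
  // Stable path is deterministic: no LLM call during polling.
  const adapter = adapters.get(sourceId);
  const releases = adapter ? await adapter(url) : await aiFallback(url);

  for (const release of releases) {
    if (!seen.has(release.version)) {
      seen.add(release.version);
      await analyzeDiff(release); // the only AI spend on a stable source
    }
  }
}
```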