Parser types

How we extract releases from a source: the difference between a stable adapter and the AI fallback, and what to expect from each.

Every source needs a parser: the code that turns a release-notes page or GitHub repo into a structured list of releases we can analyze. We use one of two: a purpose-built stable parser, or the AI fallback for anything without dedicated support.

Stability tiers

Stable parser

Purpose-built adapter for a known source. Releases are extracted deterministically. If the upstream page changes in a way that breaks the adapter, fetches return zero entries with a logged warning so we notice quickly. No per-poll LLM cost.
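As an illustration of the "zero entries with a logged warning" behavior, here is a minimal sketch of a deterministic adapter. The heading format, the logger name, and the `parse_releases` function are assumptions made for the example, not the actual adapter code.

```python
import logging
import re

# Hypothetical logger name; a real adapter would use its own.
logger = logging.getLogger("parsers.example_adapter")

# Assumed page convention for this sketch: headings like "## v1.2.3, 2025-04-01".
HEADING = re.compile(r"^## v(\d+\.\d+\.\d+), (\d{4}-\d{2}-\d{2})$", re.MULTILINE)


def parse_releases(page_text: str, source_url: str) -> list[dict]:
    """Extract releases deterministically; warn when nothing matches."""
    releases = [
        {"version": m.group(1), "date": m.group(2)}
        for m in HEADING.finditer(page_text)
    ]
    if not releases:
        # An empty result on a known source usually means the layout changed.
        logger.warning(
            "adapter matched zero releases for %s; page layout may have changed",
            source_url,
        )
    return releases
```

The same input always yields the same output, and a broken layout surfaces as an empty list plus a warning rather than as a plausible-looking but wrong result.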

Best-effort (AI fallback)

We don't have a deterministic parser for the source yet, so the page text is sent to a language model that extracts a release list. Accuracy depends on how structured the page is. Output may miss or duplicate entries when the page changes; verify against the source URL before acting on a release. We cache results by content hash so unchanged pages don't re-bill.
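Caching by content hash can be sketched roughly as follows. The in-memory dictionary, the choice of SHA-256, and the `extract` callback (standing in for the language-model call) are assumptions for illustration only.

```python
import hashlib
from typing import Callable

# Hypothetical in-memory cache; a real service would use a persistent store.
_cache: dict[str, list[dict]] = {}


def content_key(page_text: str) -> str:
    """Derive a cache key from the page text (SHA-256 is an assumption)."""
    return hashlib.sha256(page_text.encode("utf-8")).hexdigest()


def extract_with_cache(
    page_text: str, extract: Callable[[str], list[dict]]
) -> list[dict]:
    """Call the (billable) extractor only when the page text has changed."""
    key = content_key(page_text)
    if key not in _cache:
        _cache[key] = extract(page_text)
    return _cache[key]
```

With this shape, polling an unchanged page repeatedly triggers one extraction, and any change to the text produces a new key and a fresh extraction.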

The current catalog of stable parsers is at Sources → Catalog.

The parser kind also determines the provenance of the source's trust score: GitHub-backed parsers get OpenSSF Scorecard scores (or a GitHub-signals composite), VS Code / Open VSX extensions get marketplace signals fed through an AI assessor, and AI-fallback sources get an AI-only assessment based on the limited public signals we can collect.

What about other URLs?

Any URL that isn't handled by a stable parser is routed through the AI fallback. We fetch the page, strip it to plain text, and ask a language model to return a structured list of releases.
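The "strip it to plain text" step might look something like the sketch below, which uses only Python's standard-library `html.parser`. The class and function names are hypothetical, and the real pipeline may strip pages differently.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping script/style/noscript contents."""

    SKIP = {"script", "style", "noscript"}

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.chunks.append(text)


def html_to_text(html: str) -> str:
    """Return the page's visible text, one chunk per line."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)
```

Because this works on the raw HTML only, any changelog that is injected later by client-side JavaScript never reaches the text that is handed to the model.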

What we accept

Fallback URLs must be https:// and must resolve to a public IP. We reject hostnames that resolve to loopback, RFC1918 private space, link-local addresses (including cloud-metadata endpoints), or any other non-routable range, since fetching those would point our backend at our own internal services. Redirects are followed for at most three hops, and each hop is re-validated against the same rules.
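The acceptance rules above can be sketched with Python's standard `ipaddress` and `socket` modules. The function names, the injectable `resolve` parameter, and the use of `is_global` as the "public IP" test are assumptions for illustration, not the production checks.

```python
import ipaddress
import socket
from typing import Callable
from urllib.parse import urlsplit

MAX_REDIRECT_HOPS = 3  # the redirect cap described above


def resolve_host(hostname: str) -> list[str]:
    """Resolve a hostname to the IP addresses it currently maps to."""
    infos = socket.getaddrinfo(hostname, 443, proto=socket.IPPROTO_TCP)
    # Drop any IPv6 scope id ("%eth0") before returning the address.
    return sorted({info[4][0].split("%")[0] for info in infos})


def is_public_ip(address: str) -> bool:
    """True only for globally routable addresses."""
    ip = ipaddress.ip_address(address)
    if isinstance(ip, ipaddress.IPv6Address) and ip.ipv4_mapped is not None:
        ip = ip.ipv4_mapped  # judge ::ffff:a.b.c.d by its IPv4 address
    return ip.is_global


def validate_fallback_url(
    url: str, resolve: Callable[[str], list[str]] = resolve_host
) -> None:
    """Raise ValueError unless the URL is https and resolves publicly."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        raise ValueError("fallback URLs must use https://")
    if not parts.hostname:
        raise ValueError("URL has no hostname")
    addresses = resolve(parts.hostname)
    if not addresses or not all(is_public_ip(a) for a in addresses):
        raise ValueError("host resolves to a non-public address")


def validate_redirect_chain(
    urls: list[str], resolve: Callable[[str], list[str]] = resolve_host
) -> None:
    """urls[0] is the requested URL; each later entry is one redirect hop."""
    if len(urls) - 1 > MAX_REDIRECT_HOPS:
        raise ValueError("too many redirects")
    for url in urls:
        validate_fallback_url(url, resolve)
```

A check like this is only meaningful if the fetch then connects to the address that was validated; otherwise the hostname could resolve differently between the check and the request.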

When the fallback works well

  • The page is a single, scrollable changelog.
  • Each release has a clear heading (e.g. ## v1.2.3, 2025-04-01).
  • The page renders server-side (no JavaScript hydration required to see the changelog).

When it struggles

  • Paginated changelogs. We only see the first page of HTML.
  • JS-rendered pages. We fetch raw HTML, with no headless browser. If the changelog only appears after a client-side fetch, we won't see it.
  • Mixed content. Pages that intermix unrelated content with the changelog may produce noisy extractions.

If a URL keeps producing wrong or missing releases, ask us to build a deterministic adapter instead. There's a Request a deterministic tracker button right in the Add source modal when a URL routes to the fallback, and we prioritize by demand. See Reference → Parser stability.

Because the fallback is best-effort and its first sync counts toward your plan's fair-use allowance, adding one is an explicit opt-in in the modal (a confirmation checkbox); stable-parser sources skip that step.

Why we don't run AI on stable parsers

Stable parsers extract release data deterministically from page structure, so there's nothing for the model to do. Skipping the LLM call keeps polling cheap and predictable. The only AI cost on a stable source is the one-time diff analysis when a new release is detected.

AI-fallback sources are different: the first time anyone tracks one, its initial sync runs an AI extraction of the page plus summaries of the releases found, and that one-time work counts toward the fair-use allowance of the user who added it. See Fair use.

Frequently asked questions

What if a repository doesn't use GitHub releases?

GitHub repos work best when they publish tagged releases, which give clean version boundaries for diff analysis. If a project publishes its releases somewhere else, add it by pasting the release-notes or changelog URL instead. When we have a deterministic parser for that page, it's tracked reliably with no per-poll LLM cost. Otherwise it routes to the best-effort AI fallback, which you opt into explicitly: results can miss or duplicate entries, and the first sync counts toward your plan's fair-use allowance. Generic sources still get AI release summaries, but git diff analysis and risk scoring run only on GitHub repository sources.

How do I request support for a new release-notes source?

When a URL routes to the best-effort AI fallback, the Add source modal shows a Request a deterministic tracker button. Add an optional product name and an example release link, then submit. We log every request so we can prioritize by demand and build a stable, AI-free parser, usually within a few days. You can also email info@devupdate.io with the URL. See requesting a tracker for the full flow.