Risk scoring

Every release we analyze gets a risk score: a single integer from 0 to 100 meant to answer "how nervous should I be about merging this update?"

The score bands

Band	Meaning
0–39 (Low)	Patch-level changes, dependency bumps, doc-only edits. Usually safe to update without a careful read.
40–69 (Medium)	Feature additions or refactors. Worth a glance at the summary before merging.
70–100 (High)	Documented breaking changes, undocumented signature changes, or security-sensitive edits. Read the summary in full before updating.

Wherever the score appears (the Pulse feed and every release card on a source's detail page) it renders as a colour-coded Risk NN badge so you can skim by heat rather than reading numbers: teal for Low, amber for Medium, coral for High.
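As a rough illustration, the band thresholds and badge colours above can be written as a small lookup. This is a minimal sketch, not the product's code: the function and constant names (`band_for`, `badge_for`, `BADGE_COLOURS`) are hypothetical, and only the thresholds and colours come from this page.

```python
# Hypothetical helper names; thresholds and colours are the ones listed above.
BADGE_COLOURS = {"Low": "teal", "Medium": "amber", "High": "coral"}


def band_for(score: int) -> str:
    """Map a 0-100 risk score to its band name."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    if score >= 70:
        return "High"
    if score >= 40:
        return "Medium"
    return "Low"


def badge_for(score: int) -> tuple[str, str]:
    """Return the badge label ("Risk NN") and its colour."""
    return f"Risk {score}", BADGE_COLOURS[band_for(score)]
```

For example, `badge_for(72)` returns `("Risk 72", "coral")`.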

In the email digest, a release flagged with documented breaking changes is never presented below the Medium band, even if the raw inputs scored it lower, so a card never shows the contradictory pairing "Low risk + breaking changes". The underlying 0–100 score shown elsewhere is unchanged; this is a presentation floor specific to the digest.
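The digest floor can be pictured as a clamp on the displayed band only. The sketch below assumes a hypothetical `digest_band` function and a boolean flag for documented breaking changes; neither name comes from the product, and the score itself is never altered.

```python
def digest_band(score: int, has_documented_breaking_changes: bool) -> str:
    """Band shown in the email digest (hypothetical helper).

    The raw score is left untouched; only the displayed band is raised
    to Medium when documented breaking changes are present.
    """
    band = "High" if score >= 70 else "Medium" if score >= 40 else "Low"
    if has_documented_breaking_changes and band == "Low":
        return "Medium"  # presentation floor, digest only
    return band
```

Under these assumptions, a release scoring 25 with documented breaking changes appears as Medium in the digest while its score stays at 25 everywhere else.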

"High signal" releases #

A busy project can publish dozens of releases, most of them routine. To make the relevant ones stand out, a release card is tagged High signal when it ships documented breaking changes or security updates, or when it scores in the High band (70+). This is the same filter that drives the Pulse "High-signal releases" section, so a release reads as relevant the same way on both surfaces. The tag keys off content as well as the number: a release with documented breaking changes is flagged High signal even when its raw churn-based score lands in the Low band, so you won't skim past it.
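Read as a predicate, the High signal rule is a simple "any of three" check. The following is a sketch under stated assumptions: the `Release` record and its field names are hypothetical, and only the three conditions are taken from the description above.

```python
from dataclasses import dataclass

HIGH_BAND_MIN = 70  # lower bound of the High band


@dataclass
class Release:
    # Hypothetical record; field names are illustrative only.
    score: int
    has_documented_breaking_changes: bool = False
    has_security_updates: bool = False


def is_high_signal(release: Release) -> bool:
    """True when any one of the three High signal conditions holds."""
    return (
        release.has_documented_breaking_changes
        or release.has_security_updates
        or release.score >= HIGH_BAND_MIN
    )
```

Note that `Release(score=20, has_documented_breaking_changes=True)` is High signal here even though 20 falls in the Low band, which is the content-over-number behaviour described above.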

What goes into the score

The exact weights are tuned over time, but the inputs are stable:

Code churn. Lines added and removed across the diff. A 20-line patch is rarely as risky as a 2,000-line refactor.
Files modified. Wide changes (many files touched) score higher than narrow ones.
Documented breaking changes. Anything the maintainer explicitly flagged in the release notes.
Undocumented changes. Signature changes, removed exports, behavior shifts that the diff shows but the changelog doesn't mention. See Undocumented changes.
Security signals. Indicators of CVE relevance, dependency vulnerabilities, and auth- or crypto-related code paths.

How to use the number

Treat the score as a triage signal, not a verdict. Two specific patterns to watch:

Low score, undocumented changes flagged. Even a low overall score is worth reading if undocumented changes are present: the score averages many signals, and a single one can still bite you.
High score from churn alone. A large but mechanical refactor (e.g. a monorepo restructure) can push the score high without containing any behavior change. The summary will say so; read it before you assume the worst.
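If you script your own triage on top of the score, the two patterns above might be encoded along these lines. This is a minimal sketch: the `ReleaseSummary` record, its fields (in particular `churn_only`, standing in for "the summary says the change is mechanical"), and the returned strings are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ReleaseSummary:
    # Hypothetical record; field names are illustrative only.
    score: int
    has_undocumented_changes: bool = False
    churn_only: bool = False  # assumed flag: summary reports no behavior change


def triage(release: ReleaseSummary) -> str:
    """Suggest how closely to read a release, using the bands on this page."""
    if release.score < 40 and release.has_undocumented_changes:
        return "read the summary: low score, but undocumented changes are flagged"
    if release.score >= 70 and release.churn_only:
        return "read the summary: the high score may come from mechanical churn"
    if release.score >= 70:
        return "read the summary in full"
    if release.score >= 40:
        return "glance at the summary"
    return "usually safe to update"
```

The special cases are checked before the plain band rules so that a low number alone never hides a flagged undocumented change.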

If you find scores consistently miscalibrated for a specific source, tell us. Tuning happens at the model and prompt level, not per-user, but feedback shapes the next iteration.

How does this differ from the source trust score?

The risk score asks "how risky is this particular release?" and looks at the diff between two consecutive versions of the same project. The trust score asks the orthogonal question: "how comfortable should I be depending on this project at all?" A well-maintained source can publish a high-risk release (an intentional breaking change), and a low-trust source can publish a low-risk release (a one-line patch). The dangerous combination is low trust + high risk: the alerts surface treats those as the strongest signal to stop and read.
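Because the two scores are independent, the "stop and read" case is just the intersection of the two conditions. The sketch below takes trust as a plain boolean, since this page does not describe the trust score's scale; the function name and parameters are hypothetical.

```python
def is_stop_and_read(risk_score: int, source_is_low_trust: bool) -> bool:
    """True for the low trust + high risk combination (hypothetical helper).

    `source_is_low_trust` is an assumed input; how a source is classified
    as low trust is outside the scope of this sketch.
    """
    return source_is_low_trust and risk_score >= 70
```

A high-risk release from a trusted source, or a low-risk release from a low-trust one, returns `False` here: each is worth attention on its own terms, but neither is the combined case singled out above.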