Risk scoring

How the 0–100 risk number is computed and how to read it.

Every release we analyze gets a risk score — a single integer from 0 to 100 meant to answer "how nervous should I be about merging this update?"

The score bands

Band             Meaning
0–39 (Low)       Patch-level changes, dependency bumps, doc-only edits. Usually safe to update without a careful read.
40–69 (Medium)   Feature additions or refactors. Worth a glance at the summary before merging.
70–100 (High)    Documented breaking changes, undocumented signature changes, or security-sensitive edits. Read the summary in full before updating.
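
If you consume the score programmatically, the band boundaries above translate directly into a lookup. The following is a minimal sketch; the function name `risk_band` is our own and is not part of any published API.

```python
def risk_band(score: int) -> str:
    """Map a 0-100 risk score to its band name, using the thresholds in the table."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    if score <= 39:
        return "Low"      # 0-39: usually safe to update without a careful read
    if score <= 69:
        return "Medium"   # 40-69: glance at the summary before merging
    return "High"         # 70-100: read the summary in full before updating
```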

What goes into the score

The exact weights are tuned over time, but the inputs are stable:

  • Code churn. Lines added and removed across the diff. A 20-line patch is rarely as risky as a 2,000-line refactor.
  • Files modified. Wide changes — many files touched — score higher than narrow ones.
  • Documented breaking changes. Anything the maintainer explicitly flagged in the release notes.
  • Undocumented changes. Signature changes, removed exports, behavior shifts that the diff shows but the changelog doesn't mention. See Undocumented changes.
  • Security signals. Indicators of CVE relevance, dependency vulnerabilities, auth/crypto-related code paths.
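
To make the idea of combining these inputs concrete, here is a purely illustrative sketch. The actual scoring method and weights are not published on this page, so everything below is an assumption for demonstration: the `ReleaseSignals` type, the `illustrative_score` function, the point values, and the saturation thresholds.

```python
from dataclasses import dataclass


@dataclass
class ReleaseSignals:
    lines_changed: int         # lines added plus removed across the diff
    files_modified: int        # number of files touched
    documented_breaking: int   # breaking changes flagged in the release notes
    undocumented_changes: int  # changes visible in the diff but not the changelog
    security_signals: int      # CVE, dependency, or auth/crypto indicators


# Hypothetical tuning: (max points, count at which the input saturates).
# These numbers are invented for illustration and are NOT the real weights.
TUNING = {
    "lines_changed": (45, 2000),
    "files_modified": (30, 50),
    "documented_breaking": (40, 2),
    "undocumented_changes": (40, 3),
    "security_signals": (40, 2),
}


def illustrative_score(signals: ReleaseSignals) -> int:
    """Combine the five inputs into a 0-100 integer (illustration only)."""
    total = 0.0
    for name, (max_points, saturation) in TUNING.items():
        # Clamp each input to [0, saturation] so no single input grows unbounded.
        value = max(0, min(getattr(signals, name), saturation))
        total += max_points * value / saturation
    return min(100, round(total))
```

With these made-up numbers, a 2,000-line, 50-file refactor with no other signals lands in the High band from churn alone, while a small patch carrying a single undocumented change stays in the Low band.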

How to use the number

Treat the score as a triage signal, not a verdict. Two specific patterns to watch:

  • Low score, undocumented changes flagged. Even a low overall score is worth reading if undocumented changes are present. The score averages many signals, so one specific signal can still bite you when the total looks safe.
  • High score from churn alone. A large but mechanical refactor (e.g. a monorepo restructure) can push the score high without containing any behavior change. The summary will say so — read it before you assume the worst.
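
The two patterns above can be folded into a small triage helper. This is a sketch under assumptions: the `Analysis` type and its field names are hypothetical, not a documented response format, and the returned strings are only suggestions.

```python
from dataclasses import dataclass


@dataclass
class Analysis:
    score: int                 # 0-100 risk score
    undocumented_changes: int  # number of undocumented changes flagged (hypothetical field)


def reading_advice(analysis: Analysis) -> str:
    """Suggest how closely to read a release summary before updating."""
    if analysis.score >= 70:
        # High band: the summary also says whether the score is churn-only.
        return "read the full summary"
    if analysis.undocumented_changes > 0:
        # A single flagged signal can matter even when the total is low.
        return "read the undocumented changes"
    if analysis.score >= 40:
        return "glance at the summary"
    return "usually safe to update"
```

Checking the undocumented-changes flag before the Medium and Low bands is deliberate: it keeps a low total from hiding the one signal most likely to break your code.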

If you find scores consistently miscalibrated for a specific source, tell us. Tuning happens at the model and prompt level, not per-user, but feedback shapes the next iteration.