Vulnerability scanners used to emit a few dozen CVEs a week. Then the LLM-assisted reporters arrived (agentic pentesters, AI-assisted CVE triagers, supply-chain scanners that re-evaluate every dependency on every commit) and the weekly count went up by an order of magnitude. The 2024 NVD backlog made it worse: thousands of CVEs sat unscored for months, then landed in one batch.
The bottleneck is not finding vulnerabilities anymore. The bottleneck is deciding which of the ones you've already found actually matter for your systems.
That is what effective-risk rescoring does.
The one-paragraph mechanism
Take a CVE. Take a structured description of the asset it might affect — exposure, criticality, data classes, the controls actually in place. Run both through a deterministic rule engine that knows how to adjust the CVSS vector: a remote-code-execution CVE against an isolated, MFA-gated, WAF-fronted internal service is not the same risk as the same CVE against an internet-exposed unpatched edge box. The engine returns an effective score, every altered metric annotated with the rule that changed it, and a citation chain back to the framework or threat-intel source the rule encodes.
Same input today and same input next quarter produce the same output. That is the property that makes it audit-defensible.
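A minimal sketch of that mechanism, under stated assumptions: the rule IDs, asset fact names, and citation strings below are hypothetical placeholders, not entries from the real rule library, and the sketch only appends CVSS v3.1 modified (environmental) metrics to the vector instead of computing a numeric score. What it illustrates is the shape of the idea: rules evaluated in a fixed order, each altered metric annotated with the rule that changed it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    rule_id: str         # hypothetical identifier
    requires: frozenset  # asset facts that must all be present for the rule to fire
    metric: str          # CVSS v3.1 modified metric the rule sets
    value: str           # value assigned to that metric
    citation: str        # placeholder for the framework or threat-intel source


# Illustrative rules only; the real rule library is not reproduced here.
RULES = (
    Rule("R-ISOLATED", frozenset({"network_isolated"}), "MAV", "A", "placeholder citation"),
    Rule("R-MFA", frozenset({"mfa_enforced"}), "MPR", "L", "placeholder citation"),
)


def rescore(base_vector, asset_facts):
    """Apply every matching rule in a fixed order and record provenance."""
    facts = frozenset(asset_facts)
    altered = [
        {"metric": r.metric, "value": r.value, "rule": r.rule_id, "citation": r.citation}
        for r in RULES
        if r.requires <= facts
    ]
    suffix = "".join(f"/{a['metric']}:{a['value']}" for a in altered)
    return {"effective_vector": base_vector + suffix, "altered_metrics": altered}


base = "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H"
result = rescore(base, {"network_isolated", "mfa_enforced"})
print(result["effective_vector"])
```

Because the rule tuple is fixed and nothing is sampled, calling `rescore` twice with the same arguments returns identical results, which is the reproducibility property described above.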
Data flow
flowchart LR
A["Vulnerability scanners<br/>+ AI reporters<br/>+ NVD feed"] -->|CVE ID + base CVSS| B["Customer's MCP agent"]
C["Asset description<br/>exposure / criticality<br/>data classes"] --> B
D["Control evidence<br/>attested vs evidenced"] --> B
B -->|effective_risk_inline_batch| E["Effective-risk engine"]
E --> F{"Rule library evaluates<br/>CVE × asset × controls"}
F -->|rule fires| G["Altered metric<br/>+ citation"]
F -->|no rule, gap detected| H["Context revision proposal<br/>ask analyst for X"]
G --> I["Effective score<br/>+ provenance chain"]
H --> I
I --> J["Ticket: suppress / escalate /<br/>request more info"]
Every arrow is a contract, not a guess. The engine refuses to silently degrade — if a control claim cannot be evaluated, it says so and proposes the missing input.
Why CVSS base scores alone do not work anymore
Four well-known problems, sharpened by the LLM-era volume:
- CVSS base is asset-agnostic. A 9.8 against an isolated lab box and a 9.8 against your customer-facing payments API are the same number. Your analyst has to write the difference in a comment, every time.
- CVSS temporal and environmental metrics are rarely populated. The vectors exist; the discipline of filling them in across thousands of vulnerabilities does not survive a real backlog.
- KEV listing changes the priority but not the base score. A medium that appears in CISA's Known Exploited Vulnerabilities (KEV) catalog is being actively exploited in the wild and should outrank a non-KEV high. CVSS does not encode that ordering for you.
- Compensating controls are invisible. The fact that the affected service sits behind an authenticated reverse proxy with rate limits and request-body inspection is not in the CVE record. It is in your head, or your runbook, or nowhere.
The engine encodes all four as rules. Each one fires deterministically against the inputs you give it, and each firing carries a citation to the framework or threat-intel record the rule derives from.
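As an illustration of the KEV point only, here is a small sketch of a triage ordering in which KEV-listed findings sort ahead of everything else and score breaks ties. The `Finding` type, the CVE IDs, and the sort key are assumptions made for this example; the engine's actual promotion rules are not shown in this article.

```python
from typing import NamedTuple


class Finding(NamedTuple):
    cve_id: str        # placeholder IDs below, not real CVEs
    base_score: float  # CVSS base score
    in_kev: bool       # listed in the KEV catalog


def triage_order(findings):
    """KEV-listed findings first, then by descending score, then by ID for stability."""
    return sorted(findings, key=lambda f: (not f.in_kev, -f.base_score, f.cve_id))


queue = triage_order([
    Finding("CVE-0000-0001", 8.8, False),  # non-KEV high
    Finding("CVE-0000-0002", 5.4, True),   # KEV-listed medium
    Finding("CVE-0000-0003", 9.8, False),  # non-KEV critical
])
print([f.cve_id for f in queue])
```

The ID is included in the sort key so that two findings with equal scores always come out in the same order, which keeps the queue reproducible between runs.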
What the numbers actually look like
A typical Patch-Tuesday dump lands ~120 new CVEs on a mid-sized engineering org. Run them through effective-risk rescoring against a realistic asset inventory — most services are not internet-exposed, most have at least two of the standard compensating controls in place, a chunk are not affected because the vulnerable component is not in the call path — and the queue collapses to ~12-15 that move the needle. Of those, two or three usually carry KEV-listed promotions that bump them above what their base CVSS would suggest.
The ratio is the point. An analyst can work 12-15 well-scoped tickets with cited rationale in an afternoon. They cannot work 120 in a week.
Why deterministic, not "just ask an LLM"
The temptation in 2026 is to feed the CVE plus a system description into a model and ask "is this exploitable here." That works for exploration. It fails for everything downstream:
- Audit. A regulator or auditor (a supervisor enforcing DORA or NIS2, the insurer underwriting your cyber policy) will ask why you suppressed CVE-X against system-Y. "Our LLM said so on a Tuesday" is not a defensible answer.
- Reproducibility. The same input next quarter should produce the same score. LLMs do not guarantee that.
- Drift detection. When a rule changes — because a threat-intel source updated, because a control mapping moved — every score that depended on it can be re-run and the deltas reported. LLM outputs cannot be diffed cleanly.
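The drift-detection point can be sketched as a diff between two score snapshots taken before and after a rule change. This is an assumption-laden illustration: the snapshot layout (a dict keyed by CVE ID and asset ID), the placeholder IDs, and the scores are invented for the example and do not describe how the engine stores results.

```python
def diff_scores(before, after):
    """Return (key, old_score, new_score) for every entry whose score changed."""
    changed = []
    for key in sorted(before.keys() & after.keys()):
        if before[key] != after[key]:
            changed.append((key, before[key], after[key]))
    return changed


# Hypothetical snapshots: the same findings scored under two rule versions.
scores_rules_v1 = {
    ("CVE-0000-0001", "payments-api"): 6.1,
    ("CVE-0000-0002", "lab-box"): 3.2,
}
scores_rules_v2 = {
    ("CVE-0000-0001", "payments-api"): 7.4,  # a rule changed, so the score moved
    ("CVE-0000-0002", "lab-box"): 3.2,       # unaffected by the change
}

for key, old, new in diff_scores(scores_rules_v1, scores_rules_v2):
    print(key, old, "->", new)
```

A diff like this is only meaningful because each snapshot is a pure function of its inputs and rule version; with non-deterministic outputs, every entry could differ for reasons unrelated to the rule change.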
The customer-facing agent is the right place for the LLM. The scoring decision is the wrong place. Effective risk splits those concerns: your agent (Claude Desktop, Copilot Studio, Cursor, anything that speaks MCP) calls a deterministic engine, presents the cited result to the analyst, and helps them write the ticket.
Two ways to invoke it
The engine is exposed through the Ansvar gateway as two tools, both gated to Team and Company tiers:
- effective_risk_inline: one CVE × one inline asset description × one set of controls. Use it during a workflow when you already have the asset in the workflow frame.
- effective_risk_inline_batch: same shape, but up to 100 CVE IDs against the same asset. The Patch-Tuesday triage call.
Both return the same result shape: effective_score, altered_metrics with rule provenance, applicable_controls showing which fired versus which did not, and — when the engine cannot decide without more input — a context_revision_proposed result class carrying specific proposals for what to ask the analyst next. No silent fallbacks. No fabricated reductions.
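A hedged sketch of how a client might turn one such result into a ticket action. The names effective_score and context_revision_proposed come from the description above; the `result_class` key, the `proposals` key, and the 4.0 suppression threshold are assumptions made for illustration, so check the published schema for the real field layout.

```python
def ticket_action(result, suppress_below=4.0):
    """Map one engine result to suppress / escalate / request_more_info.

    The threshold and the "result_class" key are assumptions for this sketch.
    """
    if result.get("result_class") == "context_revision_proposed":
        # The engine could not decide; ask the analyst instead of guessing.
        return "request_more_info"
    if result["effective_score"] < suppress_below:
        return "suppress"
    return "escalate"


# Hypothetical results shaped after the fields named in the text.
scored = {"effective_score": 2.9, "altered_metrics": [], "applicable_controls": []}
undecided = {
    "result_class": "context_revision_proposed",
    "proposals": ["Is the service reachable from the internet?"],
}

print(ticket_action(scored))     # suppress
print(ticket_action(undecided))  # request_more_info
```

Checking for the undecided case first mirrors the no-silent-fallback behaviour: a missing input never collapses into a low score by default.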
For repeated scoring against the same asset over months (the audit-ledger use case), register the asset once via the persistent effective_risk tool. Each call then carries a stable scoring_context_id that anchors the audit trail.
Where this lives in the platform
Effective-risk rescoring is one of the gateway's first-class capabilities, alongside the workflow engine (threat modelling, gap analysis, DPIA), the architecture knowledge tools, and the document-citation surface. The same MCP agent that runs your STRIDE threat model can call the engine mid-walk to score the residual risk of each identified threat. The same agent that drafts your DORA gap analysis can re-score every open vulnerability against the article-level requirement it maps to.
The product is not "another scoring tool." The product is deterministic risk reasoning, callable from the agent you already use, with citations every regulator already accepts.
Try it
Tier requirement. The effective_risk_inline and effective_risk_inline_batch tools are gated to Team and Company subscriptions. Free and Premium tiers do not include them. See /pricing for the tier matrix.
If your AI client speaks MCP and you are on Team or Company:
- Connect it to the gateway: https://gateway.ansvar.eu/mcp, OAuth 2.1, two-minute setup.
- Hand your agent a CVE list and an asset description and ask: "Use the Ansvar gateway to compute effective risk for each of these CVEs against this asset, and tell me which ones I can suppress."
- The agent calls effective_risk_inline_batch, returns the rescored list with rule citations, and you triage the short tail.
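Normally the agent's MCP client builds the tool call for you. For readers curious what goes over the wire, the sketch below splits a long CVE list into batches of at most 100 (the limit stated earlier) and wraps each batch in an MCP `tools/call` JSON-RPC request. The argument names `cve_ids`, `asset`, and `controls` are assumptions, not the published schema, and the sketch stops short of sending anything or handling OAuth.

```python
import json
from itertools import islice

MAX_BATCH = 100  # batch limit stated in the article


def batch_requests(cve_ids, asset, controls):
    """Yield one JSON-RPC tools/call request per batch of up to MAX_BATCH CVE IDs."""
    ids = iter(cve_ids)
    request_id = 0
    while chunk := list(islice(ids, MAX_BATCH)):
        request_id += 1
        yield {
            "jsonrpc": "2.0",
            "id": request_id,
            "method": "tools/call",
            "params": {
                "name": "effective_risk_inline_batch",
                # Argument names are hypothetical; consult the tool schema.
                "arguments": {"cve_ids": chunk, "asset": asset, "controls": controls},
            },
        }


# Placeholder CVE IDs and asset description, for illustration only.
cves = [f"CVE-0000-{n:04d}" for n in range(1, 251)]
asset = {"exposure": "internal", "criticality": "high"}
requests = list(batch_requests(cves, asset, controls=["waf", "mfa"]))
print(len(requests), "requests;", json.dumps(requests[0])[:60], "...")
```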
If you want the longer mechanism, the engine and the rule library are open: effective-risk-mcp and effective-risk-rules. The rules are versioned, the schema is published, and every score the gateway returns can be reproduced by running the engine yourself against the same inputs.
The asymmetry is on your side: an attacker can generate more CVEs against you than you can read. A deterministic engine that suppresses the inapplicable ones with cited reasoning is how you keep the queue workable without hiring a second triage team.