Blog

How we threat-model AI systems: STRIDE meets MCP

STRIDE was built for monolithic services in 1999. We adapted it for agentic systems where the threat surface is a 367-MCP fleet — and shipped it as a workflow your own AI client invokes through the Ansvar gateway.

We threat-model every change that crosses a trust boundary. That's a high bar when you operate an MCP gateway that fans out to 367 downstream sources across 119 jurisdictions — but the alternative is shipping security-debt that compounds at the speed of LLM tool calls.

This is how we actually do it. The same workflow your AI agent can drive end-to-end through the gateway.

Why STRIDE needed adapting#

STRIDE was Microsoft's 1999 framework for desktop and monolithic web apps. The six categories — Spoofing, Tampering, Repudiation, Information Disclosure, Denial of Service, Elevation of Privilege — were defined when a "data flow" was a function call across a process boundary, and an "external entity" was a human at a keyboard.

For an MCP system, the data flow looks nothing like that:

mermaid
flowchart TB
  A[Customer's AI client] -->|OAuth-authenticated MCP| B[Gateway]
  B --> C{Routing decision}
  C -->|1 of ~367 MCPs| D[Fan-out: 5–50 parallel calls]
  D --> E[Upstream regulator]
  D --> F[Law database]
  D --> G[Sector MCP]
  E & F & G --> H[Gateway parses + cites]
  H --> I[LLM synthesizes]
  I --> J[Customer sees a cited answer]

Every arrow is a potential threat boundary. Most of them weren't in scope when STRIDE was written.

Our adapted category definitions#

We rewrote the six categories to mean what they need to mean for an agent-based system:

Category Classical meaning What it means for an MCP gateway
S — Spoofing Pretending to be another user Pretending to be another tool in the tool list, or another agent in a multi-agent workflow
T — Tampering Modifying data in transit Prompt injection in upstream content that hijacks the orchestrating LLM
R — Repudiation "I didn't do that" An agent action with no audit trail back to the customer who triggered it
I — Info Disclosure Leaking sensitive data Context-window leakage across tenants when the same LLM context is reused
D — Denial of Service Service unavailable Quota exhaustion on a single tool causing cascade failure across the fan-out
E — Elevation of Privilege Becoming admin A free tier customer reaching a company tier tool via tool-list pollution

The mapping isn't 1:1 — and that's the point. Each row encodes a real incident class we've either had to fix or watch competitors get hit with.

Running STRIDE through the gateway — step by step#

This is the actionable bit. The same workflow runs whether you're using Claude Desktop, Copilot Studio, Cursor, or a custom MCP client. The shape below is captured from the live gateway, not from documentation.

0. Connect once#

Point your MCP client at https://gateway.ansvar.eu/mcp. OAuth 2.1 with Dynamic Client Registration — paste the URL, sign in, your client sees the gateway's ~25 tools. Full setup walkthroughs for each client live at /setup.

1. Start the workflow#

Ask your agent to run a STRIDE threat model. Under the hood it makes one tool call:

text
start_workflow(
  workflow_type="threat_model",
  entity_description="Payments API: PCI-DSS scope, processes ~500k tx/day,
                     deployed on AWS Fargate behind ALB. New SDK exposes
                     a delegated-payment endpoint to partner agents."
)

Three real prompt shapes that all hit this same tool — pick whichever fits your client:

text
# Natural language — agent picks the workflow
"Run a STRIDE threat model on our payments API. Here's the design doc."

# Direct MCP tool call
@ansvar-gateway start_workflow workflow_type=threat_model entity_description="..."

# Slash-command form (Claude Desktop with the Ansvar prompt pack)
/ansvar:threat_model

The gateway returns a workflow_id (UUID). Save it — if your OAuth token expires mid-walk (the typical walk is 2–4 hours), you re-authenticate and call resume_workflow(workflow_id=...) to pick up exactly where you stopped. Server-side state persists.

2. Stage 1 — System scoping#

The agent calls get_current_step(workflow_id) and gets back this prompt verbatim:

Describe the system to be threat modeled: its purpose, main components, technologies used, and deployment environment.

What are the key assets or data that need protection?

The required-fields contract is just two slots: system_description and key_assets. The agent presents these to you as a form, fills them from your design doc, and submits with submit_response(workflow_id, step_id="scoping.system_description", responses={...}, user_acknowledged=true).

3. Stage 2 — DFD generation#

This is where the gateway earns its keep. Stage 2 prompt:

Invoke /threat-modeler-dfd with the system description and any uploaded documents. Submit the returned artifact: components, data_flows, trust_boundaries, assets, plus the dfd_mermaid render and any structural_warnings.

/threat-modeler-dfd is a dedicated MCP prompt that produces a structured DFD from your scope: a list of components, data flows between them, the trust boundaries those flows cross, the assets at each component, and a Mermaid diagram you can paste straight into a PR.

Required fields for this stage: components, data_flows, trust_boundaries, assets, dfd_mermaid. The workflow's quality gate refuses to advance if any is missing — no half-built DFDs leak through to the STRIDE walk.

4. Stage 3 — Per-boundary STRIDE walk#

For every trust boundary the DFD identified, the workflow walks through the six STRIDE categories. For each (boundary × category) pair the agent fills:

  • Threat description — the specific attack that fits this category at this boundary
  • Likelihood and consequence scores, on the bands the workflow is configured to use (severity_levels is one of the overridable_configurable settings — defaults are low/medium/high/critical, but you can override per assessment)
  • Existing mitigations — controls already in place that reduce likelihood or consequence
  • Residual risk after those mitigations
  • Citation evidence — gateway searches the security-frameworks MCP for STRIDE source + the threat-stripes corpus for similar incidents

If the boundary is "free-text input from a partner agent crosses into the payment-orchestration LLM," the gateway will surface prompt-injection mitigations from OWASP's LLM Top 10, fold in any case-law on similar incident classes (premium tier), and rank by relevance.

5. Stage 4 — Scoring + refusal modes#

Each threat with residual_risk >= severity_levels.high gets a refusal-mode entry: what does the system return when this threat fires? An exception with a clear error code? A degraded response? Silent success? Silent success is the worst answer and it's almost always the default if you don't think about it. The workflow forces you to think about it.

6. Stage 5 — Report#

generate_report(workflow_id) re-resolves every citation (paragraph-hash drift check, per our citation contract), assembles the threat table, scores aggregated risk, and outputs:

  • A cited threat model in your tier's redaction schema
  • The DFD Mermaid you can drop into the design doc
  • A treatment_plan linking each high-residual threat to a planned mitigation
  • A GRC-ready export (CSV + JSON) keyed for ISO 27001 A.5–A.18 mapping if you provided that framework

Inspecting what came out#

While the walk is in progress (or after generate_report completes), you can introspect:

text
get_workflow_threats(workflow_id="e7d60b1f-...")

…to get back the threats with severity ratings and recommended mitigations identified so far. Useful for filtering to high-residual threats only before a steering review.

What's configurable#

Pulled from the gateway's list_workflow_types response — the knobs that ship today for threat_model:

Override What it controls Default
threat_categories The mnemonic set walked per boundary STRIDE (6 categories)
severity_levels Bands for likelihood / consequence scoring low / medium / high / critical

You set overrides at start_workflow time. For more invasive change (different mnemonic, e.g. STRIDE-LM that adds Lateral Movement), the right path is to fork the workflow YAML in ansvar-workflow-mcp and submit a PR — the workflow definitions are public.

LINDDUN — the privacy threat model#

LINDDUN is a separate base workflow (workflow_type="linddun"), not a variant of STRIDE. Same stage shape, different question set:

  • Linkability — can two pieces of data be associated to the same person?
  • Identifiability — can the data be tied to a specific individual?
  • Non-repudiation — can the user deny an action they took?
  • Detectability — can an outside observer infer that data exists at all?
  • Disclosure of information — can data be exposed beyond the intended recipient?
  • Unawareness — does the user know what data is being processed and why?
  • Non-compliance — is the processing lawful, proportionate, documented?

Stage 1's prompt asks for the system description plus any ROPA (Record of Processing Activities) you have — the privacy framing pulls in different upstream evidence than STRIDE's security framing.

Configurable: linddun_categories and harm_bands — the privacy harm taxonomy used to score severity differs from STRIDE's severity_levels because privacy harm and security severity aren't the same dimension.

Invoke it the same way as STRIDE — your agent calls start_workflow(workflow_type="linddun", entity_description=...) and walks the same get_current_step → submit_response loop. Recommended for DPIAs that go beyond Article 35 checkboxes — the ones an auditor will actually read.

TARA — UNECE R155 / ISO 21434 automotive cyber#

TARA (Threat Analysis and Risk Assessment) for automotive isn't a separate workflow type — it's the same threat_model workflow with the UNECE R155 control catalog plugged in via the framework parameter:

text
start_workflow(
  workflow_type="threat_model",
  framework="unece_r155",
  entity_description="ECU + telematics gateway, type-approval target market: EU + UK"
)

The gateway loads workflows/controls/unece_r155.yaml from the workflow MCP — 11 R155 controls covering CSMS requirements (7.2.1–7.2.5) and per-vehicle-type requirements (7.3.1–7.3.6). The per-control walk in stage 3 then iterates R155's controls instead of STRIDE's six, fetching ISO 21434 mappings, type-approval guidance from the relevant authority MCPs (KBA for German OEMs, RDW for Dutch, VCA for UK), and any case-law on similar incident classes.

You can also combine — framework=unece_r155 plus a custom threat_categories override gets you STRIDE-LM-flavored questions per R155 control. For ISO 21434 Annex E feasibility scoring, override severity_levels to ["very_low", "low", "medium", "high", "very_high"] to match the standard's bands.

Frameworks you can plug into threat_model and gap_analysis today#

Pulled from workflows/controls/ in the workflow MCP:

framework= Catalog Best paired with
stride (default) 6 STRIDE categories threat_model
unece_r155 11 R155 controls — automotive CSMS + vehicle type threat_model (TARA)
nis2 NIS2 Annex I + II controls gap_analysis
dora DORA Articles 5–24 ICT-risk controls gap_analysis
cra EU Cyber Resilience Act essential requirements gap_analysis

New catalogs land in workflows/controls/*.yaml — adding one is a YAML PR, not a code change.

What AI adds that STRIDE didn't anticipate#

Three threat classes that didn't exist when STRIDE was written. We treat them as first-class STRIDE categories in the workflow, not afterthoughts:

Prompt injection as a first-class threat#

Any upstream content the LLM reads is attacker-controlled if the publisher is untrusted. For an MCP that ingests EU regulator data, the publisher is trusted (vetted source-authority registry). For an MCP that scrapes random PDFs a customer uploads, the publisher is the customer's adversary, not the customer.

Mitigation pattern: structural isolation between content the LLM reasons about and instructions the LLM follows. The gateway never passes raw upstream text into the system prompt. Tool results are sandboxed in a tool-result block the LLM treats as data.

Citation fabrication#

If your model generates citations from training data instead of tool results, you have a fabrication threat. We mitigate with a deterministic grounding ratio check (GROUNDING_MIN_RATIO=0.4) that hard-refuses below threshold. This is a hard failure, not a soft degradation — see our no-silent-fallbacks rule.

Fan-out blast radius#

A single customer query can fan out to 50+ downstream MCPs. If one of those MCPs is compromised (or just buggy), the bad data lands in the answer the LLM synthesizes. We mitigate per-MCP envelope validation at the gateway, plus a contract test that asserts each MCP's _citation shape on every PR merge.

How we publish what we find#

Public threat models live at ansvar.eu/security. Per-feature models live in ADR-prefixed design docs in our architecture-documentation repo. Every customer-tier threat model is a workflow output in the gateway, not a PDF we email — so the next time the threat surface changes, the threat model regenerates with citations.

That's the asymmetric advantage of building threat models inside the platform you're modeling: the model and the system stay in sync because they share the source.

Try it yourself#

If you have an MCP-capable client:

  1. Connect it to the gatewayhttps://gateway.ansvar.eu/mcp, OAuth 2.1 + Dynamic Client Registration, two-minute setup
  2. Ask your agent: "Use the Ansvar gateway to run a STRIDE threat model on this data-flow diagram."
  3. The agent calls start_workflow(workflow_type="threat_model", entity_description=...), captures the workflow_id, and walks you through scoping → DFD → per-boundary STRIDE → scoring → report.

For LINDDUN, swap "STRIDE" for "LINDDUN privacy" in the prompt — the agent picks workflow_type="linddun" automatically. For TARA, ask for "TARA per UNECE R155" — the agent calls start_workflow(workflow_type="threat_model", framework="unece_r155", ...). The Ansvar gateway exposes 12 workflow types today (8 base + 4 jurisdictional variants) plus 5 control-catalog frameworks you can plug in; your agent's list_workflow_types call shows the full menu.

Want to read more?#

Cluster topics on the slate:

  • LINDDUN deep-dive: practical privacy threat modeling for agent systems
  • DPIA-DE vs. DPIA-SE: what jurisdictional variant overlays actually change
  • DORA digital-operational-resilience testing without breaking prod
  • The audit-ledger pattern: cryptographic audit trails for AI tool calls
  • How we map regulatory citations across 119 jurisdictions without duplicating work

Subscribe to the RSS feed, or connect the Ansvar gateway to your AI client and ask it to summarize the latest post.

Frequently asked

How do I actually run a STRIDE threat model through Ansvar?
Connect your MCP client (Claude Desktop, Copilot Studio, Cursor, or any OAuth 2.1 MCP client) to gateway.ansvar.eu, then ask your agent to start a STRIDE threat model. The agent invokes the gateway's start_workflow tool with workflow_type='threat_model', captures the returned workflow_id, and walks you through the stages — system scope, DFD generation, per-boundary STRIDE walk, severity scoring, mitigations — calling submit_response between each. A typical walk is 2-4 hours. Resume any time with the workflow_id.
Does STRIDE still work for AI systems?
Yes, but five of the six categories shift meaning when the data flow is agent → MCP → tool → upstream source. Tampering becomes prompt injection; Information Disclosure becomes context-window leakage; Elevation of Privilege becomes tool-permission escalation across the fan-out.
What's the difference between threat_model, linddun, and TARA in the gateway?
threat_model is STRIDE — security threats by element — and is also the workflow type for TARA: same workflow with framework='unece_r155' loads the UNECE R155 control catalog instead of STRIDE's six. linddun is a separate base workflow for privacy threats (Linkability/Identifiability/Non-repudiation/Detectability/Disclosure/Unawareness/Non-compliance). All three share the same scoping → DFD → per-element walk → scoring shape, but the question set and upstream enrichment differ per framework.
What's different about modeling an MCP gateway versus a REST API?
Three things: (1) the tool list is discovered dynamically per session, so the threat surface is enumerated at runtime, not at design time; (2) one logical user request can fan out to 50+ downstream tools, so blast radius is multiplicative; (3) the LLM is both the orchestrator and an untrusted parser of upstream content.
Do you publish your threat models?
The infrastructure threat model is public at ansvar.eu/security. Per-feature models live in ADR-prefixed design docs in the architecture-documentation repo. Customer-tier threat models are workflow outputs in the gateway, not PDFs — they regenerate from source when the threat surface changes.