A DPIA is two jobs wearing one name. One job is clerical: screen the processing, identify which legal triggers fire, write down what the system does, collect the evidence, attach the right article references. The other job is judgment: decide whether the processing is necessary and proportionate, decide whether the residual risk is acceptable, decide whether to consult the supervisory authority. The first job eats most of the hours. The second job is the one a regulator actually holds you accountable for.
Most "AI DPIA" tooling gets this backwards. It tries to automate the judgment — generate a risk score, generate an acceptance, generate a sign-off — and leaves the clerical evidence work as a manual afterthought, which is exactly the part that falls apart under audit. We built the opposite. The cited-evidence trail is the automatable core. The judgment stays with a named human, and the machine's job is to put the right facts and the right article text in front of that human so the decision is defensible.
This post draws the line in detail, against the actual text of Article 35 GDPR rather than a paraphrase of it.
What Article 35 actually requires
Article 35(1) GDPR sets the obligation: where a type of processing is likely to result in a high risk to the rights and freedoms of natural persons, the controller must carry out a DPIA before the processing starts. The word "before" is doing real work: a DPIA produced after a breach is not a DPIA, it is a post-mortem.
Article 35(3) names three cases where the high-risk threshold is presumed met and a DPIA is mandatory:
- A systematic and extensive evaluation of personal aspects relating to natural persons based on automated processing, including profiling, on which decisions are based that produce legal effects or similarly significantly affect the person.
- Large-scale processing of special categories of data (the Article 9 categories — health, biometrics, political opinions, and the rest) or of personal data relating to criminal convictions and offences (Article 10).
- Systematic monitoring of a publicly accessible area on a large scale.
Article 35(4) then requires each national supervisory authority to publish its own list of processing operations that always require a DPIA. The Dutch AP, the French CNIL, and the Swedish IMY each maintain one. A processing activity that triggers none of the Article 35(3) cases can still land on a mandatory list in the jurisdiction where it runs, which is why DPIA screening has to be jurisdiction-aware, not just GDPR-aware. We cover this directly: the gateway routes a DPIA screening across the law and data-protection corpora for 30 audited jurisdictions, so the AP's national list is checked, not assumed.
And Article 35(7) defines what the assessment must contain. Four parts, and the structure matters because it is where the automatable/judgment line falls almost perfectly:
- (a) a systematic description of the processing operations and purposes, including, where applicable, the legitimate interest pursued
- (b) an assessment of the necessity and proportionality of the processing in relation to the purposes
- (c) an assessment of the risks to the rights and freedoms of data subjects
- (d) the measures envisaged to address the risks, including safeguards, security measures, and mechanisms to ensure the protection of personal data
Part (a) is clerical. Parts (b) and (c) are judgment. Part (d) is a mix — the menu of safeguards is clerical, the residual-risk acceptance is judgment.
The WP248 screening: the most automatable step
Before you decide a DPIA is mandatory under Article 35(3), you usually run the screen the Article 29 Working Party set out in WP248. It lists nine high-risk indicators, and the working rule is that processing meeting two or more of them is likely high-risk and needs a DPIA:
- Evaluation or scoring, including profiling
- Automated decision-making with legal or similarly significant effect
- Systematic monitoring
- Sensitive data or data of a highly personal nature
- Data processed on a large scale
- Matching or combining datasets
- Data concerning vulnerable data subjects
- Innovative use or application of new technological solutions
- Processing that prevents data subjects from exercising a right or using a service
This is the single most automatable step in the whole exercise, and it is the one teams most often do badly by hand — because doing it by hand means re-reading the WP248 definitions every time and arguing about whether 40,000 records counts as "large scale" for your sector.
Hand an AI agent a system description and it can mark each of the nine indicators present or not-present with a one-line rationale, count them, and recommend a scope. We run exactly this in the DPIA workflow: for each indicator the agent states present or not-present with a rationale, and the count drives the recommendation — three or more indicators, or special-category data at scale, pushes the assessment to a comprehensive scope. The key discipline is that the agent verifies the EDPB-defined terms against the actual definitions before it decides — it does not get to invent what "large scale" means. When the description says "we monitor employee endpoints," the agent checks the systematic-monitoring definition and the vulnerable-data-subjects indicator (employees are vulnerable in the employment relationship, a point WP248 makes explicitly) rather than guessing.
A screen is a classification task with a fixed rubric and verifiable inputs. That is the sweet spot for automation. The output is not a decision — it is a recommendation with its working shown, which a DPO confirms in seconds instead of assembling from scratch in an hour.
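The counting rule described above can be sketched in a few lines. This is a minimal illustration, not the workflow's actual implementation: the indicator keys, the `Finding` and `recommend_scope` names, and the scope labels other than "comprehensive" are all hypothetical, and "special-category data at scale" is approximated here by the sensitive-data and large-scale indicators firing together.

```python
from dataclasses import dataclass

# The nine WP248 indicators as short keys (key names are illustrative).
WP248_INDICATORS = (
    "evaluation_or_scoring",
    "automated_decision_making",
    "systematic_monitoring",
    "sensitive_data",
    "large_scale",
    "matching_or_combining_datasets",
    "vulnerable_data_subjects",
    "innovative_technology",
    "prevents_right_or_service",
)


@dataclass(frozen=True)
class Finding:
    indicator: str
    present: bool
    rationale: str  # one-line reason, kept so the DPO can check the working


def recommend_scope(findings):
    """Count the indicators marked present and recommend a scope.

    Returns a recommendation only; confirming it is left to the DPO.
    """
    by_name = {f.indicator: f for f in findings}
    unassessed = [n for n in WP248_INDICATORS if n not in by_name]
    if unassessed:
        raise ValueError(f"indicators not assessed: {unassessed}")

    present = [n for n in WP248_INDICATORS if by_name[n].present]
    # Assumption: these two indicators together stand in for
    # "special-category data at scale".
    special_at_scale = "sensitive_data" in present and "large_scale" in present

    if len(present) >= 3 or special_at_scale:
        scope = "comprehensive"
    elif len(present) == 2:
        scope = "standard"  # hypothetical label
    else:
        scope = "no_dpia_indicated"  # hypothetical label

    return {
        "present": present,
        "count": len(present),
        "dpia_likely_required": len(present) >= 2,  # WP248 working rule
        "scope": scope,
        "status": "recommendation_awaiting_dpo_confirmation",
    }
```

Refusing to score until all nine indicators carry a rationale is a deliberate choice in this sketch: a silent "not assessed" would otherwise read as "not present".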
The systematic description: assembly, not judgment
Article 35(7)(a) wants a systematic description of the processing operations and purposes. In practice this is the longest section of any real DPIA and almost none of it is judgment. It is data assembly: what personal data, in what categories, from what sources, for what purposes, on what legal basis, shared with which processors, retained for how long, transferred where.
If your organisation has already produced a data flow diagram or a threat model of the system, most of these facts already exist as structured records. The agent's job is to pull them, lay them out against the Article 35(7)(a) headings, and flag what is missing. When a customer uploads an existing system description, the workflow extracts the data categories, data subjects, purposes, legal basis, and processors from the document before it asks a single question, and only asks follow-ups for what is genuinely not present in the upload.
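As a rough sketch of "extract first, then ask only about the gaps", the description can be held as a structured record whose empty fields drive the follow-up questions. The `ProcessingDescription` class and its field names are assumptions made for illustration; they simply mirror the items listed above.

```python
from dataclasses import dataclass, field, fields
from typing import Optional


@dataclass
class ProcessingDescription:
    """Facts an Article 35(7)(a) description needs (field names are illustrative)."""
    data_categories: list = field(default_factory=list)
    data_subjects: list = field(default_factory=list)
    purposes: list = field(default_factory=list)
    legal_basis: Optional[str] = None
    processors: list = field(default_factory=list)
    sources: list = field(default_factory=list)
    retention: Optional[str] = None
    transfers: list = field(default_factory=list)


def missing_fields(desc: ProcessingDescription) -> list:
    """Return the names of fields the upload did not populate, in declaration order.

    Simplification: an empty value is treated as "not stated". A fuller version
    would distinguish "none" (e.g. no transfers) from "unknown".
    """
    return [f.name for f in fields(desc) if not getattr(desc, f.name)]


extracted = ProcessingDescription(
    data_categories=["contact details", "transaction history"],
    data_subjects=["customers"],
    purposes=["fraud prevention"],
    legal_basis="Article 6(1)(f)",
    processors=["payment provider"],
)
follow_up_questions = missing_fields(extracted)
```

Only the fields returned by `missing_fields` would become follow-up questions; everything already present in the upload is left alone.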
The one place judgment creeps into part (a) is the legal basis. If the description claims a legal basis — say Article 6(1)(f) legitimate interest for fraud prevention — the agent pulls the actual text of that article through the gateway and checks the stated basis against the processing purpose. If someone cites consent for a legally mandated KYC/AML process, that is inconsistent, and the agent flags it conversationally and suggests the more defensible basis. But it flags; it does not overwrite. The controller still owns the legal-basis decision. The agent's contribution is that it noticed, with the article text in hand, before the supervisory authority did.
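The "flag, do not overwrite" behaviour can be illustrated with a single consistency rule. This is a deliberately tiny sketch: `check_legal_basis` and `BasisFlag` are hypothetical names, only the consent-versus-legal-obligation case from the example above is encoded, and the suggestion of Article 6(1)(c) (legal obligation) for a legally mandated process is this sketch's assumption about what "more defensible" would mean there.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BasisFlag:
    stated_basis: str
    suggested_basis: str
    reason: str


def check_legal_basis(stated_basis: str, legally_mandated: bool) -> Optional[BasisFlag]:
    """Return a flag when the stated basis conflicts with the purpose, else None.

    The stated basis is never modified here; changing it stays with the controller.
    """
    if legally_mandated and stated_basis == "Article 6(1)(a)":
        return BasisFlag(
            stated_basis=stated_basis,
            suggested_basis="Article 6(1)(c)",
            reason="consent cited for processing the controller is legally required to perform",
        )
    return None


kyc_flag = check_legal_basis("Article 6(1)(a)", legally_mandated=True)
```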
Necessity and proportionality: the hard human line
Article 35(7)(b) is where automation stops doing the deciding. Necessity and proportionality is a legal judgment about whether the same purpose could be achieved with less data or less intrusive means, and whether the intrusion is justified by the benefit. No model gets to make that call on a controller's behalf.
What the engine can do is build the case file the judgment needs. It pulls the Article 5 principles — purpose limitation, data minimisation, storage limitation — and the relevant legal-basis article, and lays the processing against them. Where the basis is legitimate interest, it walks the three-part test structure: the purpose test (is the interest legitimate, not merely convenient), the necessity test (could the purpose be met with less data), and the balancing test (do the data subjects' rights override the controller's interest). It cross-checks the data minimisation claim against the data categories actually extracted in the description — if special-category data is collected but the stated purpose could be met without it, that contradiction surfaces as a flag.
That is the boundary, stated plainly: the engine assembles the three-part test, populates it with cited article text and the system's own facts, and identifies the tensions. The DPO reads it and writes the conclusion. The machine prepares the argument; the human owns the verdict. Trying to automate the verdict is how you get a DPIA that reads fluently and falls over the first time a regulator asks why.
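One way to picture that boundary in code is a case file that can compute its own tensions but cannot conclude without a named person. The class and field names below are hypothetical, and `needed_for_purpose` is assumed to come from the extracted description rather than being inferred by the code.

```python
from dataclasses import dataclass
from typing import Optional

# The special categories listed in Article 9(1) GDPR, as short keys.
ARTICLE_9_CATEGORIES = frozenset({
    "racial_or_ethnic_origin", "political_opinions",
    "religious_or_philosophical_beliefs", "trade_union_membership",
    "genetic", "biometric", "health", "sex_life_or_sexual_orientation",
})


@dataclass
class LegitimateInterestCaseFile:
    purpose: str
    collected: set           # data categories extracted from the description
    needed_for_purpose: set  # categories the stated purpose is said to require
    purpose_test_notes: str = ""
    necessity_test_notes: str = ""
    balancing_test_notes: str = ""
    conclusion: Optional[str] = None
    concluded_by: Optional[str] = None

    def tensions(self) -> list:
        """List data collected beyond what the stated purpose needs."""
        flags = []
        for category in sorted(self.collected - self.needed_for_purpose):
            kind = "special-category" if category in ARTICLE_9_CATEGORIES else "personal"
            flags.append(
                f"{kind} data '{category}' collected but not needed for '{self.purpose}'"
            )
        return flags

    def conclude(self, verdict: str, dpo_name: str) -> None:
        """Record the verdict; refuses to record one without a named human."""
        if not dpo_name.strip():
            raise ValueError("a named person must own the conclusion")
        self.conclusion = verdict
        self.concluded_by = dpo_name


case = LegitimateInterestCaseFile(
    purpose="fraud prevention",
    collected={"transaction_history", "health"},
    needed_for_purpose={"transaction_history"},
)
```

Nothing in the class produces a verdict; `conclude` only records one that a person supplies.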
Risk to rights and freedoms: structured, then judged
Article 35(7)(c) wants an assessment of risks to the rights and freedoms of data subjects — not business risk, not risk to the controller. This distinction trips up a lot of DPIAs, which quietly slide into assessing reputational and financial risk to the company instead of harm to the person.
The WP248 harm categories give a structure the agent can work against: physical harm, material damage, non-material damage (discrimination, reputational harm to the individual, psychological distress), loss of control over personal data, limitation of rights, and the heightened impact on vulnerable groups. The agent identifies at least one risk per applicable category and names the specific GDPR right each risk threatens, so the assessment stays anchored to data-subject harm rather than drifting into enterprise risk.
Scoring the risks — likelihood against severity on a matrix — is structured enough to automate the bookkeeping, but the severity judgment for harm to a person is not a number a model should assign unilaterally. We treat the scoring as something the analyst reviews at every step. The matrix is consistent; the inputs to it are reviewed by a human who understands the data subjects in question. This is the same principle behind our effective-risk engine for vulnerabilities: the machinery is deterministic, the judgment of what a score means for these people stays with a person, and every step carries its provenance so the reasoning is reproducible rather than vibes.
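A minimal sketch of "deterministic matrix, human-reviewed inputs" might look like this. The 3x3 scale, the thresholds, and every name here are illustrative assumptions, not the scoring the product uses.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Risk:
    description: str
    right_threatened: str  # the data-subject right at stake
    likelihood: int        # 1 (unlikely) to 3 (likely)
    severity: int          # 1 (limited) to 3 (severe); proposed until reviewed
    severity_reviewed_by: Optional[str] = None


def matrix_level(likelihood: int, severity: int) -> str:
    """Map likelihood x severity to a level. Thresholds are illustrative."""
    for value in (likelihood, severity):
        if value not in (1, 2, 3):
            raise ValueError("likelihood and severity must be 1, 2 or 3")
    product = likelihood * severity
    if product >= 6:
        return "high"
    if product >= 3:
        return "medium"
    return "low"


def proposed_level(risk: Risk) -> str:
    """Level based on unreviewed inputs; suitable for a draft only."""
    return matrix_level(risk.likelihood, risk.severity)


def confirmed_level(risk: Risk) -> str:
    """Level for the record; requires that a named analyst reviewed severity."""
    if not risk.severity_reviewed_by:
        raise ValueError("severity has not been reviewed by a named analyst")
    return matrix_level(risk.likelihood, risk.severity)


draft = Risk(
    description="monitoring data used in disciplinary decisions",
    right_threatened="right to object",
    likelihood=2,
    severity=3,
)
```

The same inputs always give the same level; what changes between `proposed_level` and `confirmed_level` is only whether a person has stood behind the severity.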
Safeguards and the Article 36 trigger
Article 35(7)(d) asks for the measures that address the risks — the safeguards, the security measures, the mechanisms. The menu side of this is automatable: for each identified risk, propose technical and organisational measures, and map them to the relevant security requirements. Encryption, pseudonymisation, access controls, and the rest map cleanly to Article 32 (security of processing) and Article 25 (data protection by design and by default), and pulling the article text for each gives the safeguard a citation rather than a hand-wave.
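The menu side can be sketched as a lookup from each identified risk to proposed measures that carry their citations with them. The catalogue below is hypothetical and cites at article level only; in the workflow described here the article text itself would be pulled from the corpus rather than hard-coded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measure:
    name: str
    citations: tuple  # article references attached to the safeguard


# Hypothetical catalogue; entries and keys are illustrative.
CATALOGUE = {
    "encryption": Measure("encryption at rest and in transit", ("GDPR Art. 32",)),
    "pseudonymisation": Measure("pseudonymisation", ("GDPR Art. 32", "GDPR Art. 25")),
    "access_controls": Measure("role-based access controls", ("GDPR Art. 32",)),
}


def propose_measures(risk_to_keys: dict) -> dict:
    """Resolve each risk's measure keys against the catalogue.

    Raises KeyError for an unknown key, so no uncited measure slips in.
    """
    return {
        risk: [CATALOGUE[key] for key in keys]
        for risk, keys in risk_to_keys.items()
    }


def unaddressed(proposals: dict) -> list:
    """Risks that currently have no proposed measure at all."""
    return [risk for risk, measures in proposals.items() if not measures]


proposals = propose_measures({
    "unauthorised access": ["encryption", "access_controls"],
    "re-identification": ["pseudonymisation"],
    "function creep": [],
})
```

Risks returned by `unaddressed` are the ones that go straight into the residual-risk discussion.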
The judgment side is the residual risk. After the safeguards, is the remaining risk acceptable? That is an accountability decision the controller owns, and it cascades directly into Article 36: where a DPIA indicates the processing would result in a high risk in the absence of measures taken by the controller to mitigate the risk, the controller must consult the supervisory authority before processing. Deciding the residual risk is still high enough to trip Article 36 is a call with real regulatory consequence, and it is squarely a human one. The engine prepares the input — it surfaces the residual-risk picture and the Article 36 text — and a named person decides whether to file. No automation gets to decide on its own that you do or do not need to talk to your regulator.
Why the cited-evidence trail is the part worth automating
Here is the asymmetry that drove the whole design. The judgment in a DPIA takes a skilled DPO maybe a few hours across all four parts of Article 35(7). The evidence assembly and citation work takes days, and it is the part that decays — references go stale, an article gets amended, a national list gets updated, and the DPIA on file now cites something that no longer says what it claimed.
So we automate the durable, verifiable part. Every regulatory reference in a DPIA produced through the Ansvar gateway is resolved against a live corpus rather than a model's memory. The agent calls the gateway's search and get_provision tools to pull the actual current text of Article 35, the Article 5 principles, the legal-basis article, Article 32, and any supervisory-authority guidance that applies in the jurisdiction. A validate_citation call confirms a provision still says what the draft claims and flags it if the text has been amended. The citation check is deterministic — the same DPIA re-run next year resolves to the same provisions, and any drift since the last run shows up as a diff rather than a silent stale reference.
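The drift check can be sketched against an in-memory stand-in for the corpus. Everything here is an assumption made for illustration: the real gateway is reached through MCP tool calls whose signatures are not shown in this post, so `InMemoryCorpus`, `cite`, and this local `validate_citation` function only mimic the idea of comparing the text as cited with the text as it stands now. The provision ID and texts are placeholders, not real legal text.

```python
import difflib
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    provision_id: str
    text_as_cited: str  # provision text captured when the DPIA was drafted


class InMemoryCorpus:
    """Stand-in for a live corpus; a real one would be queried over the gateway."""

    def __init__(self, provisions: dict):
        self._provisions = dict(provisions)

    def get_provision(self, provision_id: str) -> str:
        return self._provisions[provision_id]


def cite(corpus: InMemoryCorpus, provision_id: str) -> Citation:
    """Capture the current text alongside the reference."""
    return Citation(provision_id, corpus.get_provision(provision_id))


def validate_citation(corpus: InMemoryCorpus, citation: Citation) -> list:
    """Return an empty list if the text is unchanged, else a unified diff."""
    current = corpus.get_provision(citation.provision_id)
    if current == citation.text_as_cited:
        return []
    return list(difflib.unified_diff(
        citation.text_as_cited.splitlines(),
        current.splitlines(),
        fromfile="as cited",
        tofile="current",
        lineterm="",
    ))


# Placeholder ID and text, not real legal text.
last_year = InMemoryCorpus({"EXAMPLE:1": "placeholder text, first version"})
this_year = InMemoryCorpus({"EXAMPLE:1": "placeholder text, amended version"})

citation = cite(last_year, "EXAMPLE:1")
drift = validate_citation(this_year, citation)
```

An unchanged provision yields an empty list, so a re-run with no amendments produces no noise; an amended one yields the diff a reviewer needs.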
This matters because a DPIA is an accountability document. When a supervisory authority opens an inquiry, the question is rarely "did you write a DPIA"; it is "show me the assessment and the basis for your conclusions." A DPIA whose every legal reference resolves to live, current provision text, whose screening shows its working against the WP248 indicators, and whose necessity argument is built from the actual Article 5 principles is a document that survives that conversation. A DPIA assembled from a model's recollection of GDPR, however fluent, is one amended article away from being wrong on the record.
The split is the product. The agent your team already uses — Claude, Copilot, Cursor, any MCP client — drives the workflow: it screens, it assembles the description, it pulls and validates every citation, it structures the necessity test and the risk register. The DPO does the three things only a DPO can do: judge necessity and proportionality, accept or reject the residual risk, decide on Article 36 consultation. Nobody automates the judgment. Everybody should automate the file.
If you want to see the line in practice, the DPIA workflow runs through the gateway on Premium (€249/month per seat), grounded against 30 audited jurisdictions and 61 EU regulations. Connect your MCP client and walk a real assessment in an afternoon — the quickstart is two minutes of OAuth setup. The same gateway that grounds your DPIA also runs your AI Act readiness and gap-analysis work, against the same cited corpus, so the evidence trail is consistent across every assessment your team produces.