Opinion

The Regulatory Stack, Part 3: How the EU AI Act Made Verification a Legal Question, Not a Best-Practice One

The EU AI Act binds providers and deployers to a continuous, evidenced verification posture by December 2027. Here is the operational reality and an architectural answer.


The music has stopped for compliance theatre. The EU AI Act binds providers and deployers to a continuous, evidenced verification posture. Here is the operational reality - and an architectural answer.

The Deployer’s Trap

There is a comforting lie circulating in enterprise boardrooms: “We buy our models from OpenAI, Anthropic, or Microsoft, so the regulatory risk belongs to them.”

That is structurally wrong, and the AI Act is unambiguous about why.

The Act draws a hard legal line between a provider (the entity that develops an AI system or general-purpose AI model, or has one developed, and places it on the market or puts it into service under its own name or trademark) and a deployer (the entity that uses an AI system under its authority in the course of a professional activity). A professional user of a foundation model wrapped in custom prompts, vector-embedded grounding, and business-process tool calls is, at minimum, a deployer under Article 3(4) (European Parliament and Council, 2024). But role allocation is not automatic: depending on who develops the system, whose name it carries when put into service, who alters its intended purpose, and whether the customisation amounts to a “substantial modification,” the same enterprise can be a deployer, a downstream provider, or both under Article 25 - inheriting the heavier obligations as the role changes.

Under the provisional 7 May 2026 political agreement on the AI Omnibus, the expected application date for stand-alone Annex III high-risk systems is 2 December 2027, while high-risk systems embedded in regulated products are expected to move to 2 August 2028 (European Commission, 2026a). That is roughly eighteen months from publication of this article. That is the engineering deadline.

The Act builds the deployer’s obligations directly into Article 26 - use according to the provider’s instructions, competent human oversight, ensuring that input data under the deployer’s control is relevant and sufficiently representative, monitoring of operation, notifying or suspending the system if a risk arises, and retention of system-generated logs under the deployer’s control for at least six months (European Parliament and Council, 2024, Article 26). Those deployer obligations only work if the provider-side control stack is real. Articles 9, 15, and 72 do not classify a system as high-risk - Article 6 and the Annexes do that - but once a system is high-risk, these provisions define much of the operating burden: lifecycle risk management, robustness and cybersecurity, and post-market monitoring.

Regulators have made one thing explicit: the verification gap is the deployer’s operational problem, even where the statutory obligation sits with the provider. OpenAI is not going to sign an attestation for your memory partitions, your tool integrations, or your runtime loops. The market surveillance authority will turn up at your door, not theirs.

The era of treating AI safety as a best-practice opinion is dead. The AI Act has made verification a binding legal question.

The Triad of Liability: Articles 9, 15, and 72

These three articles define the substantive operating burden once a system is high-risk: lifecycle risk management (Article 9), accuracy/robustness/cybersecurity (Article 15), and provider post-market monitoring (Article 72). Classification of a system as high-risk comes from Article 6 and the Annexes, not from the Triad. The Triad obligations sit on the provider - but a deployer’s exposure rises sharply the moment customisation, re-branding, or a change of intended purpose tips the entity into a downstream-provider role under Article 25. Most enterprise deployments operate close to that line. The architectural answer is the same in either configuration: a deterministic control plane beneath the model.

Article 9 - Continuous Risk Management

Article 9 mandates a risk management system “understood as a continuous iterative process planned and run throughout the entire lifecycle of a high-risk AI system, requiring regular systematic review and updating” (European Parliament and Council, 2024, Article 9). In practice that means identifying, analysing, evaluating, and mitigating risks, with testing at any point during development, in any event before the system is placed on the market or put into service, and review continuing afterwards.

Most organisations still treat risk management as a design-time artefact: a document signed off before launch and revisited annually. That model is obsolete the moment you ship an autonomous agent, because the agent changes its execution path based on natural-language inputs the risk register never anticipated. If your risk management does not run at runtime, it does not exist.

The architectural primitive that satisfies Article 9 at runtime is Policy-as-Code. Before an agent can execute a tool call, the request is intercepted at the gateway and evaluated by a logic-based policy engine - Open Policy Agent (OPA), Cedar, or equivalent - against declarative rules co-versioned with the agent. The safety logic is decoupled, by construction, from the model’s probabilistic reasoning. Risk decisions become deterministic, evidenced, and replayable.
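As a rough illustration of that pattern, here is a minimal sketch of a deterministic policy decision point in plain Python. It is a stand-in for a real engine such as OPA or Cedar, not their API. The tool names, rule identifiers, and `POLICY_VERSION` string are hypothetical, and the evaluation order (deny rules first, then allow rules, otherwise default-deny) is one assumed convention among several.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ToolCall:
    agent_id: str
    tool: str
    args: dict


@dataclass(frozen=True)
class Rule:
    rule_id: str
    effect: str  # "allow" or "deny"
    matches: Callable[[ToolCall], bool]


@dataclass(frozen=True)
class Decision:
    allowed: bool
    rule_id: str
    policy_version: str


# Hypothetical version label, assumed to be released together with the agent.
POLICY_VERSION = "2026-05-21.1"

# Hypothetical declarative rules for an applicant-tracking-system (ATS) agent.
RULES = (
    Rule("deny-bulk-export", "deny",
         lambda c: c.tool == "ats.export" and c.args.get("limit", 0) > 100),
    Rule("allow-read-candidate", "allow",
         lambda c: c.tool == "ats.read_candidate"),
    Rule("allow-small-export", "allow",
         lambda c: c.tool == "ats.export"),
)


def decide(call: ToolCall) -> Decision:
    """Evaluate a tool call: deny rules win, then allow rules, else default-deny."""
    for rule in RULES:
        if rule.effect == "deny" and rule.matches(call):
            return Decision(False, rule.rule_id, POLICY_VERSION)
    for rule in RULES:
        if rule.effect == "allow" and rule.matches(call):
            return Decision(True, rule.rule_id, POLICY_VERSION)
    return Decision(False, "default-deny", POLICY_VERSION)
```

Because `decide` depends only on the request and the rule set, the same request against the same policy version always yields the same decision. That property is what makes the result replayable as evidence. Each decision carries the rule that produced it and the policy version in force.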

Article 15 - Robustness and Cybersecurity

Article 15 requires high-risk AI systems to achieve “an appropriate level of accuracy, robustness, and cybersecurity,” and to perform consistently in those respects throughout the lifecycle. The Act then names the threat model with unusual specificity: data poisoning, model poisoning, adversarial examples, model evasion, and confidentiality attacks (European Parliament and Council, 2024, Article 15).

A system prompt instructing the agent to disregard any “ignore previous instructions” text it encounters is not an architecture. It is a wish. Prompt injection is the SQL injection of our era, and the legacy mitigations - input filters, regex denylists, hopeful system prompts - fail with depressing reliability against an attacker who controls the document the agent is about to read.

Article 15 is satisfied at the architectural layer, not the prompt layer. That means strict schema enforcement on every tool envelope (no free-form arguments crossing the trust boundary), input provenance tagging so that untrusted context is treated as data, not instructions, and, where the data classification warrants it, hardware-backed Trusted Execution Environments (Intel TDX, AMD SEV-SNP, Arm CCA) to protect data in use. The UK ICO has been blunt that AI security is “a present-day data protection duty” under Article 32 UK GDPR - the same direction in which the EDPB, CNIL, and BfDI are already moving (ICO, 2025).
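To make the first two controls concrete, the sketch below shows strict envelope validation and provenance tagging using only the Python standard library. The tool name `ats.update_status`, its argument schema, the allowed status values, and the `Tagged` structure are all hypothetical. A production system would more likely use a schema language such as JSON Schema, and separating channels reduces the injection surface rather than eliminating it.

```python
from dataclasses import dataclass

# Hypothetical schema: tool name -> {argument name: required Python type}.
SCHEMAS = {
    "ats.update_status": {"candidate_id": str, "status": str},
}
# Hypothetical closed set of values the status argument may take.
ALLOWED_STATUS = {"shortlist", "reject", "needs_human_review"}


class EnvelopeError(ValueError):
    """Raised when a tool envelope does not match its schema exactly."""


@dataclass(frozen=True)
class Tagged:
    text: str
    source: str    # e.g. "system", "operator", "uploaded_cv"
    trusted: bool


def validate_envelope(tool: str, args: dict) -> dict:
    """Reject unknown tools, extra or missing arguments, and wrong types."""
    schema = SCHEMAS.get(tool)
    if schema is None:
        raise EnvelopeError(f"unknown tool: {tool}")
    if set(args) != set(schema):
        diff = sorted(map(str, set(args) ^ set(schema)))
        raise EnvelopeError(f"unexpected or missing arguments: {diff}")
    for name, expected in schema.items():
        if type(args[name]) is not expected:
            raise EnvelopeError(f"{name} must be {expected.__name__}")
    if tool == "ats.update_status" and args["status"] not in ALLOWED_STATUS:
        raise EnvelopeError(f"status not permitted: {args['status']}")
    return dict(args)


def assemble_context(parts: list[Tagged]) -> dict:
    """Keep trusted instructions and untrusted material in separate channels."""
    return {
        "instructions": [p.text for p in parts if p.trusted],
        "data": [{"source": p.source, "text": p.text}
                 for p in parts if not p.trusted],
    }
```

The design choice is that nothing free-form crosses the boundary. An argument the schema does not name is a validation failure, and text from an uploaded CV can only ever appear in the `data` channel.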

Article 72 - Post-Market Monitoring

Article 72 obliges providers to “actively and systematically collect, document and analyse” performance data from across the deployed fleet, throughout the system’s lifetime, against a documented post-market monitoring plan that forms part of the Annex IV technical documentation (European Parliament and Council, 2024, Article 72). Article 73 then layers a serious-incident reporting obligation on top.

Article 72 is a provider obligation. The deployer’s operational exposure runs through Article 26: monitor the system, keep system-generated logs under the deployer’s control for at least six months, and notify or suspend the system if a risk arises. The practical effect is the same as if Article 72 fell directly on the deployer. First, you cannot supply the provider with the telemetry needed under Article 72 if your own logs are application-grade. Second, when a regulator asks why your agent denied a loan, redirected a payment, or flagged a customer at 14:32 GMT, “Error: 500” is not an answer - it is a regulatory failure.

The architectural answer is Semantic Observability with Verifiable Audit Logs. Capture decision-relevant inputs and outputs, model configuration, tool envelopes, retrieved-context identifiers or hashes, policy decisions, human-oversight actions, and incident signals as structured events. Write them to a hash-chained, tamper-evident ledger with deterministic replay. The black box becomes a glass box. A replay is evidence. A log is not.
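A hash-chained ledger can be sketched in a few lines. The version below is an in-memory illustration under stated assumptions: events are JSON-serialisable dictionaries, SHA-256 is the chosen digest, and the field names (`prev`, `event`, `hash`) are hypothetical rather than taken from any particular product.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def _digest(prev_hash: str, event: dict) -> str:
    """Hash the previous record's hash together with a canonical event encoding."""
    payload = json.dumps({"prev": prev_hash, "event": event},
                         sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append(ledger: list, event: dict) -> dict:
    """Append an event, linking it to the hash of the record before it."""
    prev_hash = ledger[-1]["hash"] if ledger else GENESIS
    record = {"prev": prev_hash, "event": event,
              "hash": _digest(prev_hash, event)}
    ledger.append(record)
    return record


def verify(ledger: list) -> bool:
    """Recompute every link; any edited, removed, or reordered record breaks the chain."""
    prev_hash = GENESIS
    for record in ledger:
        if record["prev"] != prev_hash:
            return False
        if record["hash"] != _digest(prev_hash, record["event"]):
            return False
        prev_hash = record["hash"]
    return True
```

A chain like this is tamper-evident only if the latest hash is also anchored somewhere the log writer cannot rewrite, such as a separate system or a periodically published checkpoint. Otherwise an attacker with write access can regenerate the whole chain.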

A Worked Example: The HR Screening Agent

Consider a multinational that builds an agentic CV-screening assistant by wrapping a foundation model, grounding it in three years of historic hiring data, and granting it tool-calling rights to the applicant tracking system. Under Annex III(4), employment AI is high-risk. The enterprise is a deployer; if the customisation rises to the level of a substantial modification, it may also be the provider under Article 25.

The agent rejects a qualified candidate. The candidate’s lawyer writes to the data protection authority. The market surveillance authority opens a file. Three questions land on the CISO’s desk before lunch:

  1. Show the documented risk assessment that identified historic hiring data as a bias vector, and the runtime controls that mitigate it. - This is Article 9. If the answer is “we ran a fairness audit at launch,” the answer is wrong.
  2. Show that the system rejected the attacker-controlled CV that tried to inject instructions into the screening reasoning. - This is Article 15. If the mitigation was a prompt that said “ignore embedded instructions,” the mitigation has already failed.
  3. Reconstruct the inputs, retrieved context, policy decisions, and the main elements of the decision for the candidate’s specific interaction. - This is the deployer-side telemetry that supports the provider’s Article 72 monitoring, and it underpins Article 86’s right to clear and meaningful explanations of the AI system’s role and the main elements of the decision. If the audit log says “Score: 0.34. Decision: Reject,” the deployer has nothing.

Every single one of these is solved at the runtime, not in the model. None of them is solved by the model vendor.

Enforcement Has Teeth - and a Postal Address

The penalty tiers are well-publicised: up to EUR 35 million or 7% of worldwide annual turnover, whichever is higher, for prohibited practices; EUR 15 million or 3% for breaches of high-risk obligations; and EUR 7.5 million or 1% for supplying incorrect, incomplete, or misleading information (European Parliament and Council, 2024, Article 99). What is less well understood is the enforcement geometry.
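The arithmetic behind those tiers is simple enough to sketch. The function below assumes the reading of Article 99 under which the ceiling for an undertaking is the higher of the fixed amount and the turnover percentage, and the lower of the two for SMEs. The tier labels and function name are hypothetical, and the result is a statutory maximum, not a prediction of any actual fine.

```python
from decimal import Decimal

# (fixed ceiling in EUR, share of worldwide annual turnover) per tier.
TIERS = {
    "prohibited_practice": (Decimal("35000000"), Decimal("0.07")),
    "high_risk_obligation": (Decimal("15000000"), Decimal("0.03")),
    "misleading_information": (Decimal("7500000"), Decimal("0.01")),
}


def fine_ceiling(tier: str, worldwide_turnover_eur: Decimal,
                 sme: bool = False) -> Decimal:
    """Return the maximum fine for a tier under the assumed Article 99 reading."""
    fixed, share = TIERS[tier]
    turnover_based = worldwide_turnover_eur * share
    # Assumption: higher of the two for undertakings, lower of the two for SMEs.
    return min(fixed, turnover_based) if sme else max(fixed, turnover_based)
```

For an enterprise with a worldwide annual turnover of EUR 2 billion, the 3% tier works out to EUR 60 million, four times the EUR 15 million fixed figure. For large groups the percentage is the number that matters.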

The European AI Office sits inside the Commission and supervises general-purpose AI models directly. Everything else - the high-risk regime that catches the bulk of enterprise deployments - is enforced by national market surveillance authorities designated by each Member State (European Commission, 2025a). Those authorities have powers to conduct remote monitoring, demand source code and training data, propose joint investigations, and impose penalties. In several Member States they will be the existing data protection authority. France’s CNIL has already positioned itself for the role; Germany’s BfDI and the Italian Garante are moving in the same direction (CNIL, 2024).

This matters because the practical enforcement posture will not be uniform. A deployer operating across the EU should expect their first knock to come from whichever Member State authority is most active - and to be judged against guidance the Commission only began issuing in mid-2026 (European Commission, 2026b). Build to the strictest national interpretation, not the average.

Moving from Checklist to Operating Model

When the legal mandates are mapped onto a standard application stack, the gap is structural. ISO 27001 controls, application logs, and a vendor questionnaire do not satisfy Article 9, Article 15, or Article 72 for an agentic deployment - not because the controls are bad, but because they were designed for systems whose execution path was deterministic.

The shift required is from “make the model smarter” to “build a governed runtime.” Compliance becomes a continuous build artefact rather than a quarterly internal panic. The Triad maps cleanly to four primitives: policy-as-code at the gateway, schema-enforced tool envelopes with TEE-backed workloads where data class demands it, hash-chained semantic observability with deterministic replay, and non-human identity (SPIFFE/SPIRE) so every action is attributable to a specific, ephemeral, configuration-bound agent instance.
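The last primitive, a configuration-bound identity, can be illustrated without a running SPIRE deployment. The sketch below only formats and checks a SPIFFE-style URI (`spiffe://<trust domain>/<path>`). It does not issue or verify real SVIDs. The trust domain, the path layout, and the function names are hypothetical choices made for this example.

```python
import hashlib
import json
import uuid
from urllib.parse import urlsplit

TRUST_DOMAIN = "agents.example.org"  # hypothetical trust domain


def config_fingerprint(config: dict) -> str:
    """Short, stable fingerprint of the agent's configuration."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def mint_agent_id(agent_name: str, config: dict) -> str:
    """Build an ID unique to this instance and bound to this configuration."""
    instance = uuid.uuid4().hex  # fresh per agent instance
    return (f"spiffe://{TRUST_DOMAIN}/agent/{agent_name}"
            f"/cfg/{config_fingerprint(config)}/inst/{instance}")


def matches_config(agent_id: str, config: dict) -> bool:
    """Check that an ID belongs to our trust domain and to this exact configuration."""
    parts = urlsplit(agent_id)
    segments = parts.path.strip("/").split("/")
    return (
        parts.scheme == "spiffe"
        and parts.netloc == TRUST_DOMAIN
        and len(segments) == 6
        and segments[0] == "agent"
        and segments[2] == "cfg"
        and segments[3] == config_fingerprint(config)
    )
```

Binding the fingerprint into the identity means that a change of model, prompt, or tool grant produces a different identity. An audit record signed under the old one can then never be attributed to the new configuration.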

An open control-plane framework - GATE is one expression of this shape - wires those primitives into the path of every tool call by default, with mappings to ISO/IEC 42001 and the NIST AI RMF Generative AI Profile (NIST, 2024) emitted as evidence. The point is the architectural pattern, not the brand.

The Sakura Position

The AI Act has permanently shifted the economics of enterprise AI. Move fast and break things is no longer an entrepreneurial badge; it is an executive liability with a published price list.

Three immediate moves for any enterprise inside scope:

  1. Stop relying on vendor disclosures: Treat foundation-model providers as suppliers of raw, ungoverned inference. The burden of securing the context, the memory partitions, and the tool surface is yours. The model card is a starting point, not a defence.

  2. Enforce the tool-gateway invariant: No autonomous agent talks directly to an internal system. Every call traverses a deterministic policy decision point with a structured audit trail, or the runtime kills the session. The exception is the breach.

  3. Upgrade logs to evidence: If your post-market monitoring cannot deterministically replay an anomalous interaction from a hash-chained record, it will not survive a regulatory inquiry. Treat evidence collection as an infrastructure requirement, not a developer convenience.
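Moves 2 and 3 can be sketched together: a gateway that fails closed, and a replay check that re-evaluates the recorded trail. This is a minimal illustration, not a reference implementation. The `policy` function, the tool name, and the trail record shape are hypothetical, and the trail here is a plain list rather than a hash-chained store.

```python
from dataclasses import dataclass, field


class SessionKilled(RuntimeError):
    """Raised when the gateway terminates an agent session."""


def policy(tool: str, args: dict) -> bool:
    """Hypothetical deterministic policy: a pure function of its inputs."""
    return tool == "ats.read_candidate" and isinstance(args.get("candidate_id"), str)


@dataclass
class Gateway:
    tools: dict                       # tool name -> callable
    trail: list = field(default_factory=list)
    alive: bool = True

    def call(self, tool: str, args: dict):
        """Route every tool call through the policy, record it, and fail closed."""
        if not self.alive:
            raise SessionKilled("session already terminated")
        allowed = policy(tool, args)
        self.trail.append({"tool": tool, "args": dict(args), "allowed": allowed})
        if not allowed:
            self.alive = False        # a denied call ends the session
            raise SessionKilled(f"denied: {tool}")
        return self.tools[tool](**args)


def replay(trail: list) -> bool:
    """Re-evaluate each recorded request; True only if every decision reproduces."""
    return all(policy(r["tool"], r["args"]) == r["allowed"] for r in trail)
```

Replay works here only because the policy is a pure function of the recorded inputs. Anything the decision depended on that was not written to the trail, such as retrieved context or the policy version in force, would have to be captured too before the same claim could be made about a real system.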

Autonomy without architecture is an invitation to an audit. The work is in the wiring.

If your leadership team is mapping the operational realities of the AI Act onto your current deployments - or you need to re-architect to meet the December 2027 high-risk deadline - schedule a conversation. The strategic framework is in The Executive’s AI Playbook; the technical implementation maps live in our Trustworthy Agentic AI Blueprint. Get in touch.


Disclosure: The author is a lifetime member of the OWASP Foundation. This article reflects an independent reading of public OWASP material and does not represent the views of the Foundation.

Disclosure: Sakura Sky implements the Governed Agent Trust Environment (GATE) in client engagements. GATE is an open framework, published under CC BY 4.0 at deterministicagents.ai, and is vendor-neutral. The framework was authored by Andrew Stevens; readers should weight the GATE references in this article accordingly.

Not legal advice: This article provides general commentary on the EU AI Act and related instruments for an engineering and executive audience. It is not legal advice and is not a substitute for advice from qualified counsel. Readers should obtain independent legal advice on how the AI Act, GDPR, the Cyber Resilience Act, sector-specific regulation, and national implementing measures apply to their specific products, deployments, and jurisdictions.

References

CNIL, 2024. AI Act: Data Protection Authorities want to be in charge of high-risk systems. Paris: Commission Nationale de l’Informatique et des Libertés. Available at: https://www.cnil.fr/en/ai-act-data-protection-authorities-want-be-charge-high-risk-systems [Accessed 21 May 2026].

European Commission, 2025a. Market Surveillance Authorities under the AI Act. Brussels: Directorate-General for Communications Networks, Content and Technology. Available at: https://digital-strategy.ec.europa.eu/en/policies/market-surveillance-authorities-under-ai-act [Accessed 21 May 2026].

European Commission, 2026a. AI Act: Implementation timeline following the political agreement of 7 May 2026. Brussels: Directorate-General for Communications Networks, Content and Technology. Available at: https://digital-strategy.ec.europa.eu/en/policies/regulatory-framework-ai [Accessed 21 May 2026].

European Commission, 2026b. Guidelines for providers and deployers of AI high-risk systems. Brussels: Directorate-General for Communications Networks, Content and Technology. Available at: https://digital-strategy.ec.europa.eu/en/policies/guidelines-ai-high-risk-systems [Accessed 21 May 2026].

European Parliament and Council, 2024. Regulation (EU) 2024/1689 of the European Parliament and of the Council of 13 June 2024 laying down harmonised rules on artificial intelligence (Artificial Intelligence Act). Official Journal of the European Union, L 2024/1689, 12 July. Available at: https://eur-lex.europa.eu/eli/reg/2024/1689/oj [Accessed 21 May 2026].

Information Commissioner’s Office (ICO), 2025. AI and data protection: security expectations under Article 32 UK GDPR. Wilmslow: ICO. Available at: https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/artificial-intelligence/ [Accessed 21 May 2026].

National Institute of Standards and Technology (NIST), 2024. Artificial Intelligence Risk Management Framework: Generative Artificial Intelligence Profile. NIST AI 600-1. Gaithersburg, MD: U.S. Department of Commerce. Available at: https://doi.org/10.6028/NIST.AI.600-1 [Accessed 21 May 2026].

Sakura Sky and Stevens, A., 2026. The Trustworthy Agentic AI Blueprint: 16 Missing Primitives for Enterprise Autonomy, Version 1.0.4. Available at: https://whitepaper.download/trustworthyagenticai/ [Accessed 21 May 2026].