There is a moment in every enterprise AI project when the demo stops being the point.
The demo worked. The agent read the documents, reasoned through the problem, and produced the right answer. Now someone asks: can we put this in production? And the honest answer - the one that derails projects and delays roadmaps - is: not safely, not yet.
The capability exists. The infrastructure to govern it does not.
That gap is what I am trying to close. Today I am publishing GATE - the Governed Agent Trust Environment - as a fully open framework for anyone building or deploying agentic AI at enterprise scale.
The problem is architectural, not one of prompting
When people talk about making AI agents “safer,” the conversation usually lands on alignment, fine-tuning, and system prompts. These matter. But they are the wrong layer for enterprise governance.
System prompts are instructions to the model. They are probabilistic controls - the model will usually follow them, and usually is not good enough when the agent has access to your ERP system, your customer database, or your financial accounts.
Consider what an enterprise agent actually does. It reads documents. It retrieves context from vector databases. It calls APIs. It sends emails. It modifies records. Each of these is a real-world action with real consequences. In some cases - transferring funds, deploying infrastructure, publishing external communications - those consequences are irreversible.
The question for enterprise deployment is not “is the model well-aligned?” It is: can you prove every action was authorised? Can you reproduce exactly what happened after an incident? Can you stop a runaway agent within five seconds? Can you show an auditor a tamper-evident record of every decision the system made?
Prompt guardrails cannot answer these questions. They are invisible to auditors, bypassable by adversarial inputs, and produce no verifiable evidence trail.
What is needed is a control plane - a deterministic layer that wraps the probabilistic model and enforces governance at the boundaries where side effects actually occur.
What GATE is
GATE defines 16 controls organised into four layers.
The first layer - Identity and Integrity - establishes the foundation. Every agent instance gets a unique, short-lived cryptographic identity bound to its runtime artifacts: the container image, the policy bundle, the prompt configuration. No shared service accounts. No long-lived API keys. If an agent is compromised, its identity is revoked instantly and the blast radius is contained to that instance.
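The binding can be pictured as a digest over the exact artifact set, checked on every use. A minimal Python sketch follows; the function names are hypothetical, and a bare hash stands in for what would really be a signed, short-lived credential:

```python
import hashlib
import time

def mint_identity(image_digest: str, policy_hash: str, prompt_hash: str,
                  ttl_seconds: int = 900) -> dict:
    """Bind a short-lived identity to the exact runtime artifacts.

    Illustrative only: a real implementation would issue a signed
    credential, not a bare hash, and revocation would be centralised.
    """
    binding = hashlib.sha256(
        "|".join([image_digest, policy_hash, prompt_hash]).encode()
    ).hexdigest()
    return {"binding": binding,
            "expires_at": time.time() + ttl_seconds,
            "revoked": False}

def identity_valid(identity: dict, image_digest: str,
                   policy_hash: str, prompt_hash: str) -> bool:
    # Recompute the binding: any change to image, policy, or prompt
    # configuration invalidates the identity, as does expiry or revocation.
    expected = hashlib.sha256(
        "|".join([image_digest, policy_hash, prompt_hash]).encode()
    ).hexdigest()
    return (not identity["revoked"]
            and time.time() < identity["expires_at"]
            and identity["binding"] == expected)
```

Because the identity is derived from the artifacts rather than merely associated with them, swapping in a different policy bundle or prompt configuration silently yields an invalid identity.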
The second layer - Runtime Enforcement - is where actions are actually controlled. A Tool Gateway sits between the agent runtime and every tool it can call. The agent proposes actions; the gateway authenticates the agent, validates the request against a schema, evaluates a policy-as-code ruleset, checks invariants, enforces budgets, and emits evidence before anything executes. The agent never has direct network access to tool backends. No bypass paths exist.
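The gateway's ordering - authenticate, validate, evaluate policy, check invariants, charge budgets, emit evidence, then execute - can be sketched as a pipeline. The stage names below are illustrative, not the gate-python API:

```python
from dataclasses import dataclass, field

@dataclass
class ProposedAction:
    agent_id: str
    tool: str
    args: dict
    # Evidence accumulated as the gateway evaluates the action.
    evidence: list = field(default_factory=list)

def gate(action, *, authenticate, validate_schema, evaluate_policy,
         check_invariants, charge_budget, emit_evidence, execute):
    """Run every control before any side effect occurs.

    Each stage is a caller-supplied callable returning (ok, reason);
    the agent never reaches `execute` directly.
    """
    for stage in (authenticate, validate_schema, evaluate_policy,
                  check_invariants, charge_budget):
        ok, reason = stage(action)
        action.evidence.append((getattr(stage, "__name__", "stage"), ok, reason))
        if not ok:
            emit_evidence(action)   # denied actions still leave a trail
            return {"status": "denied", "reason": reason}
    emit_evidence(action)
    return {"status": "executed", "result": execute(action)}
```

The design point the sketch illustrates: evidence is emitted on both the allow and deny paths, and the only call site for the tool backend sits after every check has passed.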
Alongside policy evaluation, GATE separates invariant checking into its own hardened bundle. Policy decides whether an action is allowed in a given context. Invariants decide whether an action is permissible at all. These are different questions. A policy might permit a funds transfer given the right conditions; an invariant says no single transfer may ever exceed a hard limit, regardless of what policy says. Invariants are non-overridable at runtime. If an organisation needs to override one in an emergency, it requires a signed break-glass record and dual approval.
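A toy example of the split, with made-up limits: the policy is conditional and tunable by policy authors, while the invariant is an unconditional ceiling that no approval path can reach past:

```python
# Hypothetical illustration of the policy/invariant split (not GATE's API).
POLICY_LIMIT = 10_000        # context-dependent, tunable by policy authors
INVARIANT_HARD_CAP = 50_000  # non-overridable at runtime

def policy_allows(amount: int, approved_by_manager: bool) -> bool:
    # Policy is conditional: manager approval raises the effective limit.
    return amount <= POLICY_LIMIT or approved_by_manager

def invariant_holds(amount: int) -> bool:
    # Invariant is unconditional: no approval path exceeds the hard cap.
    return amount <= INVARIANT_HARD_CAP

def transfer_permitted(amount: int, approved_by_manager: bool = False) -> bool:
    # Invariants are evaluated independently of policy, never overridden by it.
    return invariant_holds(amount) and policy_allows(amount, approved_by_manager)
```

Here `transfer_permitted(60_000, approved_by_manager=True)` is still `False`: the policy condition is satisfied, but the invariant is not, and the invariant wins.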
The third layer - Observability and Forensics - produces the evidence. Every governed action generates a policy decision record, a hash-chained ledger event committed to WORM storage, and a replay trace step. The replay trace captures all non-determinism at the tool boundary: model configuration, bundle hashes, retrieved context hashes, and full request/response snapshots. Given a run ID, an operator can reproduce exactly what happened - serving tool responses from stored snapshots rather than live systems - without relying on the model producing the same output twice.
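The hash-chained ledger is simple to sketch. The field names below are illustrative, not the normative audit ledger event schema in gate-contracts:

```python
import hashlib
import json

def canonical(obj) -> bytes:
    # Deterministic serialisation so the same event always hashes the same.
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode()

def append_event(chain: list, payload: dict) -> dict:
    # Each event commits to its predecessor's hash, so any later edit to
    # an earlier event breaks every hash that follows it.
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    body = {"seq": len(chain), "prev_hash": prev_hash, "payload": payload}
    event = {**body, "hash": hashlib.sha256(canonical(body)).hexdigest()}
    chain.append(event)
    return event

def verify_chain(chain: list) -> bool:
    prev = "0" * 64
    for event in chain:
        body = {k: v for k, v in event.items() if k != "hash"}
        if event["prev_hash"] != prev:
            return False
        if hashlib.sha256(canonical(body)).hexdigest() != event["hash"]:
            return False
        prev = event["hash"]
    return True
```

Committing such a chain to WORM storage is what makes the record tamper-evident: an attacker who modifies one event must rewrite every subsequent hash, and the write-once medium prevents that.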
The fourth layer - Orchestration and Ecosystem - governs the network. Multi-agent messages are signed, versioned, and nonce-protected to prevent replay and spoofing. The orchestration control plane enforces backpressure, safe rollout, and rollback. Continuous adversarial validation runs in CI to gate deployments against attack scenarios.
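The nonce and signature checks on inter-agent messages can be sketched as follows. GATE specifies ES256 signatures; this stdlib-only illustration substitutes a shared-key HMAC, so treat it as the shape of the check, not the mechanism:

```python
import hashlib
import hmac
import json
import secrets

SHARED_KEY = b"demo-key"  # stand-in: GATE uses ES256 key pairs, not a shared HMAC key

def sign_message(sender: str, body: dict) -> dict:
    # Versioned, nonce-carrying envelope signed over its canonical bytes.
    envelope = {"v": "1.0", "sender": sender,
                "nonce": secrets.token_hex(16), "body": body}
    payload = json.dumps(envelope, sort_keys=True, separators=(",", ":")).encode()
    envelope["sig"] = hmac.new(SHARED_KEY, payload, hashlib.sha256).hexdigest()
    return envelope

def accept_message(envelope: dict, seen_nonces: set) -> bool:
    sig = envelope.get("sig")
    unsigned = {k: v for k, v in envelope.items() if k != "sig"}
    payload = json.dumps(unsigned, sort_keys=True, separators=(",", ":")).encode()
    expected = hmac.new(SHARED_KEY, payload, hashlib.sha256).hexdigest()
    if not (sig and hmac.compare_digest(sig, expected)):
        return False                       # spoofed or altered message
    if envelope["nonce"] in seen_nonces:
        return False                       # replayed message
    seen_nonces.add(envelope["nonce"])
    return True
```

The two failure modes map directly to the threats named above: a bad signature catches spoofing and tampering, and a repeated nonce catches replay.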
Why open
I considered whether this should be a product. The answer was no, at least not at this stage.
The problem with proprietary governance frameworks is that they create a dependency at exactly the layer you least want a single-vendor dependency: the layer that controls what your agents are allowed to do.
More importantly, the field needs a shared vocabulary. Right now every team building agentic AI is solving the same governance problems from scratch, in different ways, producing incompatible evidence formats that cannot be correlated across systems. GATE is an attempt to establish common contracts - JSON Schema definitions for tool envelopes, policy decision records, audit ledger events, replay traces - that implementations can interoperate around.
The framework paper, JSON Schema contracts, OPA/Rego policy and invariant bundles, a Python reference library, conformance checks, and six operational runbooks are all published today under open licences. The schemas and documentation are CC BY 4.0. The code is MIT.
What is published today
The framework is available at deterministicagents.ai, with the following components published across separate versioned repositories:
- gate-contracts contains the normative JSON Schema definitions for all control plane events - the tool envelope, policy decision record, audit ledger event, replay trace, HITL decision record, and multi-agent message envelope schemas. It also includes the canonical JSON specification with implementations in Python, Node.js, and Go, and the machine-readable C01-C16 control catalog.
- gate-python is a reference implementation of the contracts in Python. It provides working implementations of canonical JSON hashing, envelope construction, hash-chained ledger event building and verification, replay trace recording, ES256 action signing and verification, and JSON Schema validation. Three runnable examples demonstrate a full tool gateway flow including HITL, invariant checking, and end-to-end evidence chain verification.
- gate-policies contains the OPA/Rego baseline policy bundle, a separately versioned invariant bundle, a full unit test suite, an Agent Bill of Materials example for a bounded-tier financial agent, and a worked HITL integration example.
- gate-conformance contains 15 conformance checks with test procedures and evidence requirements, a fillable conformance report template, BigQuery SQL queries for evidence chain traversal and integrity verification, and six Day-2 operational runbooks covering break-glass stop, policy rollback, incident replay, HITL outage, invariant bundle update, and agent decommission.
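The canonical JSON idea that underpins interoperable evidence hashing - one byte representation per value, regardless of key order - can be approximated in a few lines. The normative rules live in gate-contracts and cover more cases (number and string encoding, for instance), so this is only the intuition:

```python
import hashlib
import json

def canonical_json(obj) -> bytes:
    # Sorted keys plus minimal separators yields one byte string per value,
    # so semantically equal documents hash identically. A rough approximation
    # of a canonical JSON scheme, not the normative gate-contracts rules.
    return json.dumps(obj, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")

# The same envelope written with different key orderings...
a = {"tool": "send_email", "args": {"to": "x@example.com", "cc": []}}
b = {"args": {"cc": [], "to": "x@example.com"}, "tool": "send_email"}

# ...produces identical hashes, which is what lets evidence records be
# correlated and verified across independent implementations.
assert hashlib.sha256(canonical_json(a)).hexdigest() == \
       hashlib.sha256(canonical_json(b)).hexdigest()
```

Without a canonical form, two conformant implementations could emit byte-different records for the same event and every cross-system hash comparison would fail.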
What comes next
The automated conformance runner - a CLI tool that runs all 15 checks against a live environment and produces a machine-readable conformance report - is in development for v1.3. Formal mappings to specific regulatory requirements are also on the list: EU AI Act, DORA for financial services, and HIPAA for healthcare deployments.
If you are building agentic AI infrastructure, deploying agents in a regulated environment, or advising organisations on AI governance, I would welcome your feedback. The framework is designed to be implementable, not just readable. If you find something missing, overconstrained, or wrong for your context, the issues and discussions are open.
GATE is published at deterministicagents.ai. The strategic companion to this framework is the Trustworthy Agentic AI Blueprint.