the framework

The three-layer
governance model.

Governance of agent actions is three layers, not two. Definition decides what bad is. Enforcement stops the known-bad. Evidence proves who answered. The market is collapsing all three into one, enforcement, and selling the middle as the whole stack. This is the framework that separates them, and the lens that organizes the foundational positions.

George Vagenas · Vesterales · Updated June 2026

The market conversation is collapsing governance to a single layer: enforcement. Deterministic runtime tooling, policy-as-code, kill switches, immutable logs, is being sold and argued as the governance answer. The runtime engine is real and necessary. It is also the middle of three layers, not the whole stack.

Above it sits the layer that decides what the engine should enforce, and answers when the rule itself was wrong. Below it sits the layer that proves, later, who authorized the scope and whether the right policy was in force. Each layer answers a different question.

01Definition

What is this agent allowed to do, on whose behalf, within what limits, and who answers when the rule itself is wrong?

02Enforcement

Is the agent doing only that, right now, at runtime?

03Evidence

Can you prove, after the fact, who decided both?

Enforcement stops the known-bad. Definition decides what bad is. The record proves who answered for it.

Enforcement is necessary, not sufficient.

Definition

What the agent may do, and who answers when the rule is wrong.

Definition is where governance is decided. It sets authorization scope: what an agent may commit to on a principal's behalf, within what limits, under whose authority. It sets data boundaries: what data may move, and under what conditions. It sets liability: who answers when the agent acts or fails.

A deterministic engine enforces whatever it was given. If the terms are wrong or unauthorized, the engine reliably enforces the wrong thing. The quality of governance is set above the runtime, not inside it.

A second problem lives here: who is qualified to author the rule. Point-and-click policy authoring lowers the skill needed to write a rule, not the skill needed to write the right one. Operating an agent and being qualified to define its governance are different jobs.

A deterministic engine enforces whatever it was given. If the terms are wrong, it reliably enforces the wrong thing.

This is the layer the AGF pillars set, and where Positions 01, 02, and 03 operate.

Enforcement

Is the agent doing only that, right now, at runtime?

Enforcement stops the known-bad at runtime. The credible version is deterministic and in-band: the control sits in the path of the action and blocks it before it happens, rather than one model judging another model after the fact. The human belongs upstream, writing policy, not watching transactions at machine speed.

Enforcement is the layer the tooling competes on. It is necessary and replicable. It is not where governance is decided or proven. When vendors compete on policy evaluation speed, runtime monitoring coverage, and risk-framework mapping, differentiation compresses and procurement compares price sheets. That makes enforcement the commoditizing middle: the part of the stack everyone builds. Definition and evidence do not commoditize the same way, because they are organizational and accountability decisions, not runtime features.

Enforcement is necessary, not sufficient.

This is the layer Position 02 sits above, and the market timing Position 03 describes.

Evidence

Can you prove, after the fact, who answered?

Evidence proves who answered. It is the record that binds an enforced action back to the human who set the scope, so accountability can be established later, when someone asks who authorized this and whether the right policy was in force at the time. This is not paperwork after the fact. It is what makes a named owner's accountability real rather than nominal.

The hard, unsolved part is that a traceable record is not yet an attributable one. A log of what happened is not the same as proof of who authorized it, and that the authority was valid at the moment of action. Terms say what an agent may do. Only a conformance record proves what it did.

A traceable record is not yet an attributable one.

Position 05 (Identity Answers Who, Governance Answers What) lives here: identity proves who is acting and on whose behalf; the evidence record proves what was permitted and that the agent did only that. Position 04 locates the missing record, and the standards track has not yet closed the gap. [2]

the synthesis

How the layers organize the positions

An organizing lens, not a replacement for the positions.

The three-layer model locates where each foundational position operates.

Position 01, Direction, Not Access Definition, named at the executive level.
Position 02, Rule Definition, Not Enforcement The definition-above-enforcement relationship.
Position 03, The Terms Layer Defines the Category Terms above tooling, applied to category formation.
Position 04, Governance Debt Compounds Debt accrues when agents run without definition-layer terms and without an evidence-layer record.

The model's contribution is the explicit third layer. The positions already hold definition. Naming evidence as a peer layer to definition and enforcement is what is new.

# References

EU AI Act, Regulation (EU) 2024/1689. Article 9 (risk management system) is the definition layer. Article 12 (logging capabilities) is enforcement visibility. Both required, independently. eur-lex.europa.eu
NIST NCCoE. "Accelerating the Adoption of Software and AI Agent Identity and Authorization," concept paper, 5 February 2026. Poses binding agent actions to human authorization and non-repudiation as open problems: the evidence-layer gap. csrc.nist.gov

let's talk

Which layer
is your gap?

If your organization is deploying agents and you are not sure whether the gap is definition, enforcement, or evidence, that is the conversation. Most take 20 minutes.

Start the conversation

The three-layergovernance model.