Databricks Unity AI Gateway Explained: Governance, Guardrails, and Cost Control for Production AI Agents
Summary
Unity AI Gateway is Databricks' enterprise AI governance layer, built on Unity Catalog, that extends governance from your data and AI assets to the runtime behaviour of AI systems: the model calls, tool invocations, and agent workflows themselves. Announced at the Data + AI Summit 2026, it gives you one place to control which agents can use which models and tools, to apply guardrails against PII leakage and prompt injection, to attribute and cap AI spend, and to trace every model and tool call. The core is generally available; the more advanced runtime controls are in Beta. If you are moving from a few AI demos to a fleet of production agents, this is the control plane the rest of it has to sit on. The catch is in the name: it is an extension of Unity Catalog, so it is only as strong as the governance foundation underneath it.
Authored by: Technical Director
Reviewed by: Managing Partner
TL;DR
- What it is: governance for the AI runtime, built on Unity Catalog. It governs models, agents, MCP services, skills, and tools, and the interactions between them.
- Four things it controls: access (who and which agent can use which model or tool), guardrails (PII, prompt injection, jailbreaks, unsafe content), cost (token-level attribution and hard spend caps), and observability (end-to-end tracing and payload logging into Unity Catalog).
- MCP governance: managed connections to tools like Slack, Jira, GitHub, SharePoint and Google Drive, with SQL-based access policies and on-behalf-of access so an agent only sees what the user is allowed to see.
- Status: core Unity AI Gateway is generally available; contextual service policies, customizable LLM guardrails, and MCP payload logging are in Beta.
- The catch: it extends Unity Catalog. It governs the runtime, but it cannot fix an ungoverned catalog underneath it. Get the foundation right first.
What is Unity AI Gateway?
For a few years, Unity Catalog has been the place you govern data and AI assets in Databricks: tables, models, permissions, lineage, audit. Unity AI Gateway extends that same governance model to what happens at runtime, when an agent actually calls a model, invokes a tool, or talks to an external system.
In Databricks' own words, it delivers "centralized governance, security controls, cost management, and agent monitoring for enterprise AI," across AI providers, coding agents, agent frameworks, enterprise applications, and custom AI systems. The important word is centralized. Instead of every team wiring its own model keys, its own logging, and its own ad-hoc guardrails, you configure governance once and apply it across the estate.
The problem it solves is the one every enterprise hits the moment AI stops being a pilot. As Databricks puts it: "As organizations scale from individual AI applications to fleets of agents connected to models, MCP services, APIs, and enterprise tools, governance challenges expand beyond model access alone." One clever agent is a demo. A hundred agents calling models, tools, and each other is an estate, and estates need governance, security, and cost control or they become chaos.
What it actually governs
Unity AI Gateway governs two layers.
The assets, registered and discoverable in Unity Catalog:
- Foundation models from multiple providers, plus Databricks-hosted models
- External models (Claude, GPT, Gemini, Llama), including via provider-native APIs
- MCP services (the Model Context Protocol connections agents use to reach tools)
- Agents and skills (reusable agent endpoints)
The interactions, controlled by runtime policies:
- Model calls
- Tool invocations
- Agent workflows
That second layer is what is genuinely new. Governing which models exist is table stakes. Governing what an agent is allowed to do at the moment it acts, on behalf of a specific user, is the part that makes production agents safe.
The four control planes
1. Access and policy
You set fine-grained access policies on AI assets based on the model provider, its country of origin, its approval status, and governed tags. Every asset sits in a central inventory with lineage and audit trails. So "only approved, EU-hosted models may be used by agents that touch regulated data" stops being a wiki page and becomes an enforced policy.
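To make the quoted rule concrete, here is a minimal Python sketch of the decision it describes. It is not the Unity AI Gateway API: `ModelAsset`, its fields, and `may_use` are hypothetical names chosen for illustration, and a real policy would be defined against Unity Catalog metadata rather than in application code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelAsset:
    # Hypothetical stand-in for a catalogued model and its governance metadata.
    name: str
    provider: str
    hosting_region: str  # assumed values such as "EU" or "US"
    approved: bool
    tags: frozenset = frozenset()


def may_use(model: ModelAsset, agent_touches_regulated_data: bool) -> bool:
    """Only approved, EU-hosted models may be used by agents that touch regulated data."""
    if not agent_touches_regulated_data:
        # This particular rule places no restriction on other agents.
        return True
    return model.approved and model.hosting_region == "EU"


eu_model = ModelAsset("model-a", "provider-x", "EU", approved=True)
us_model = ModelAsset("model-b", "provider-y", "US", approved=True)
```

With these two assets, an agent that touches regulated data may call `eu_model` but not `us_model`, while an agent that does not touch regulated data may call either.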
For tools, Service Policies for MCP (Beta) are written as SQL-based Unity Catalog functions applied directly to an MCP service. They can read agent identity, user context, and request parameters, and they allow, deny, or require approval for an action. Concrete examples Databricks gives: restrict which documents an agent can read, limit write actions to a user's own resources, and require human approval for sensitive operations such as a code push to GitHub or a write to a protected folder. The Gateway intercepts each service call, so enforcement is deterministic and auditable, not a matter of hoping the model behaves.
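The allow, deny, or require-approval logic can be sketched in a few lines. Real service policies are SQL-based Unity Catalog functions; the Python below only illustrates the shape of the decision, and every name in it (`ServiceCall`, `Decision`, `SENSITIVE_ACTIONS`, `evaluate`) is an assumption made for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    REQUIRE_APPROVAL = "require_approval"


@dataclass(frozen=True)
class ServiceCall:
    # Hypothetical view of what a policy can read: agent identity,
    # user context, and request parameters.
    agent_id: str
    user: str
    service: str         # e.g. "github"
    action: str          # e.g. "read", "write", "push"
    resource_owner: str  # owner of the resource being acted on


# Assumed list of operations that need a human in the loop.
SENSITIVE_ACTIONS = {("github", "push")}


def evaluate(call: ServiceCall) -> Decision:
    if (call.service, call.action) in SENSITIVE_ACTIONS:
        return Decision.REQUIRE_APPROVAL
    if call.action == "write" and call.resource_owner != call.user:
        # Limit write actions to the user's own resources.
        return Decision.DENY
    return Decision.ALLOW
```

Because the decision is a pure function of the call, the same request always gets the same answer, which is what makes enforcement at the interception point auditable.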
2. Guardrails
AI guardrails (Beta) protect against PII exposure, prompt injection, jailbreaks, unsafe content, and policy violations. They can be applied to inputs, outputs, or both, and they work by real-time evaluation using a model and a prompt rather than rigid pre-built filters, so you can encode business-specific rules. Requests or responses that contain regulated data can be blocked outright. Every evaluation is logged centrally into Unity Catalog for production visibility.
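As a rough illustration of checking both inputs and outputs, the sketch below wraps a model call with a pluggable judge. The judge here is a keyword stub so the example runs on its own; in the setup described above it would be a model evaluating the text against a rule prompt. All names are hypothetical and none of this is the Gateway's actual interface.

```python
from typing import Callable

# (rule_prompt, text) -> True when the text violates the rule.
Judge = Callable[[str, str], bool]

BLOCKED_INPUT = "[blocked: input violates policy]"
BLOCKED_OUTPUT = "[blocked: output violates policy]"


def guarded_call(prompt: str, model: Callable[[str], str], judge: Judge, rule: str) -> str:
    """Evaluate the request before the model sees it and the response before the caller does."""
    if judge(rule, prompt):
        return BLOCKED_INPUT
    response = model(prompt)
    if judge(rule, response):
        return BLOCKED_OUTPUT
    return response


# Stand-ins for the demo only: a real judge would be an LLM evaluation.
RULE = "Flag any text that contains a bank account number."


def stub_judge(rule: str, text: str) -> bool:
    return "IBAN" in text


def stub_model(prompt: str) -> str:
    return "Your IBAN is DE00 0000" if "account" in prompt else "Hello"
```

Keeping the rule as a prompt rather than a hard-coded filter is what lets a business-specific policy be changed without changing the wrapper.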
3. Cost control
This is the one that surprises people. Agents are expensive, and the cost is invisible until the bill arrives. Unity AI Gateway gives you unified AI spend visibility across Databricks-hosted models, frontier models, coding agents, and custom agents, with token-level cost attribution by user, team, tool, and use case, stored in Unity Catalog-governed inference tables. You can set per-user alerts and hard spend caps that stop runaway usage automatically. It also offers smart routing recommendations, for example sending a simple task to a smaller model and reserving a frontier model for the hard ones, so you are not paying premium rates for trivial calls.
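Token-level attribution plus a hard cap comes down to simple bookkeeping, sketched below under stated assumptions. `SpendTracker`, its method names, and the prices are invented for illustration; the real feature records usage in governed inference tables rather than in memory.

```python
from collections import defaultdict


class SpendTracker:
    """Attribute cost per (user, team) and refuse calls that would exceed a per-user cap."""

    def __init__(self, cap_per_user_usd: float):
        self.cap = cap_per_user_usd
        self.spend = defaultdict(float)  # keyed by (user, team)

    def record(self, user: str, team: str, tokens: int, usd_per_1k_tokens: float) -> bool:
        cost = tokens / 1000 * usd_per_1k_tokens
        user_total = sum(v for (u, _), v in self.spend.items() if u == user)
        if user_total + cost > self.cap:
            return False  # hard cap: the call is rejected, nothing is recorded
        self.spend[(user, team)] += cost
        return True

    def by_team(self) -> dict:
        totals = defaultdict(float)
        for (_, team), value in self.spend.items():
            totals[team] += value
        return dict(totals)


tracker = SpendTracker(cap_per_user_usd=1.0)
first = tracker.record("ana", "risk", tokens=1000, usd_per_1k_tokens=0.5)
second = tracker.record("ana", "risk", tokens=1000, usd_per_1k_tokens=0.5)
third = tracker.record("ana", "risk", tokens=1000, usd_per_1k_tokens=0.5)
```

In this run the first two calls bring the user exactly to the cap and are accepted; the third would exceed it and is refused before any spend is attributed.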
4. Observability
End-to-end agent tracing captures every model interaction and MCP tool call into Unity Catalog system tables, giving you a queryable record for debugging, monitoring, and compliance auditing. You can explore coding-agent logs in natural language through Genie, and Lakewatch integration flags suspicious activity and policy violations for investigation. Payload logging for MCP (Beta) captures every request and response across model and tool calls.
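To show what a queryable record buys you, here is a small sketch of an audit question asked over trace events. The event fields and the `denied_tool_calls` helper are assumptions for illustration and do not reflect the actual system table schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceEvent:
    # Hypothetical columns; the real system tables may look different.
    trace_id: str
    agent_id: str
    user: str
    kind: str      # "model_call" or "tool_call"
    target: str    # model name or MCP service
    decision: str  # "allow", "deny", or "require_approval"


def denied_tool_calls(events: list, user: str) -> list:
    """Answer the audit question: which tool calls made for this user were denied?"""
    return [
        e for e in events
        if e.user == user and e.kind == "tool_call" and e.decision == "deny"
    ]


events = [
    TraceEvent("t1", "agent-1", "ana", "model_call", "model-a", "allow"),
    TraceEvent("t1", "agent-1", "ana", "tool_call", "jira", "deny"),
    TraceEvent("t2", "agent-2", "bob", "tool_call", "github", "require_approval"),
]
```

Against governed tables the same question would be a SQL filter; the point is that every model and tool call lands in one place with enough context to be filtered by user, agent, and outcome.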
Why it matters, and the catch in the name
Most enterprises want agents and are quietly terrified of three things: a security incident, an uncontrolled bill, and a compliance breach. Unity AI Gateway is a direct answer to all three. If you are putting agents into production, it is the control plane you build on, and we expect it to become a default part of any serious enterprise Databricks AI architecture.
But read the name again. It is Unity AI Gateway, an extension of Unity Catalog. It governs the runtime brilliantly, and it assumes the catalog underneath it is already governed. A policy that says "only agents that touch non-regulated data may use this model" is meaningless if your data is not classified and tagged in the first place. On-behalf-of access only protects users if their permissions in Unity Catalog actually reflect who should see what. Cost attribution by team only works if your assets are organised by team. The Gateway is the steering and the brakes. Unity Catalog is the road.
This is the same pattern we wrote about with Databricks CustomerLake and across the rest of the Summit 2026 announcements: the platform got dramatically more capable, and every new capability quietly assumes a governed foundation. AI governance before AI chaos, not after.
The European angle
For regulated European enterprises this is more than a convenience. Enforcing model country-of-origin, blocking responses that contain regulated data, requiring approval for sensitive actions, logging every call into a governed table, and scoping an agent to exactly what a user is permitted to see all map directly onto GDPR obligations and the expectations of a risk team. The integrations Databricks is lining up with identity providers (Okta, Ping Identity, SailPoint, Saviynt) and AI-security vendors (CrowdStrike, Palo Alto Networks, Zscaler and others) point the same way: this is being built for enterprises that have to answer to auditors. The opportunity is to design that governance in deliberately, at the data layer, rather than discover after an incident that nobody could say what an agent did.
How to get "AI-governance-ready"
You do not need to wait. The work that makes Unity AI Gateway effective is foundation work that is valuable on its own.
- Govern the catalog first. Classify and tag data, get permissions in Unity Catalog reflecting who should actually see what, and establish lineage. This is the substrate the Gateway reads from. (See our Unity Catalog guide.)
- Inventory your AI assets. Know which models, MCP services, and agents exist, who owns them, and which are approved. You cannot govern what you have not catalogued.
- Get identity and on-behalf-of right. Agents should act with the user's permissions, not a shared service account that sees everything. That depends on your identity model.
- Define your policy model and cost boundaries. Decide what an agent is allowed to do, what needs human approval, and what the spend caps are, before you have a fleet, not after.
- Design residency and audit in. For regulated workloads, settle where data and inference physically sit, and make sure every call is logged into a governed table.
Get those right and you can turn on Unity AI Gateway and have it actually mean something. Skip them and you have a governance dashboard sitting on top of an ungoverned estate.
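The identity point in the checklist is the easiest to show in code. In the sketch below, on-behalf-of access is modelled as the intersection of what the agent is scoped to and what the user is granted; the function and resource names are hypothetical, and real enforcement depends on your identity provider and Unity Catalog permissions.

```python
def effective_resources(agent_scope: set, user_grants: set) -> set:
    """On-behalf-of: the agent sees only what both it and the user are allowed to see."""
    return agent_scope & user_grants


def shared_account_resources(agent_scope: set, service_account_grants: set) -> set:
    """Anti-pattern for comparison: the user's own permissions never enter the decision."""
    return agent_scope & service_account_grants


agent_scope = {"sales.orders", "hr.salaries", "finance.ledger"}
ana_grants = {"sales.orders"}
service_account_grants = {"sales.orders", "hr.salaries", "finance.ledger"}
```

Here the on-behalf-of result for this user is just `sales.orders`, while the shared-account variant exposes all three resources to whoever happens to be talking to the agent.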
The Cosmos Thrace perspective
We are a Databricks Silver Partner, and governance is the unglamorous layer we have always worked in: getting the data foundation classified, permissioned, and trustworthy enough that the clever things on top are safe to run. We have delivered dozens of data platform implementations across Europe, many on Databricks, with more than $50M saved for clients in 2025, a 100% client retention rate, and 106 million data points moved daily.
Our honest read on Unity AI Gateway: it is one of the most important enterprise announcements of the Summit, and it is a genuine consulting moment rather than a hype cycle. The appetite for agents is enormous, and the blocker is almost never the model. It is the fear of running them without control, and Unity AI Gateway is the answer to that fear, provided the Unity Catalog foundation beneath it is real. The enterprises that win with agentic AI will be the ones who governed the foundation first. AI governance before AI chaos.
Sources
Databricks blog: AI governance at Data + AI Summit 2026 — What's new with Unity AI Gateway
Databricks blog: What's new in Unity AI Gateway: service policies, guardrails, observability, and cost controls
Databricks product page: Unity AI Gateway
Databricks docs: Unity AI Gateway for agents and LLMs
What people ask about Unity AI Gateway
What is Unity AI Gateway?
It is Databricks' enterprise AI governance layer, built on Unity Catalog. It extends governance from data and AI assets to the AI runtime, controlling how agents access models and tools, applying security guardrails, attributing and capping cost, and tracing every model and tool call. It was announced at the Data + AI Summit 2026.
How does Unity AI Gateway relate to Unity Catalog?
It is an extension of Unity Catalog. Where Unity Catalog governs data and AI assets (tables, models, permissions, lineage), Unity AI Gateway governs the runtime interactions between them: model calls, tool invocations, and agent workflows. It assumes a governed catalog underneath it.
Is Unity AI Gateway generally available?
The core capabilities are generally available, including unified spend visibility, cost attribution, hard spend caps, smart routing, runtime policy controls, tracing, and the Unity Catalog extension to models, MCP services, agents, and skills. More advanced controls, including contextual service policies for MCP, customizable LLM guardrails, and MCP payload logging, are in Beta.
What are AI guardrails in Unity AI Gateway?
Customizable, model-based guardrails that protect against PII exposure, prompt injection, jailbreaks, unsafe content, and policy violations. They can be applied to inputs, outputs, or both, can block requests or responses containing regulated data, and are logged centrally in Unity Catalog. (Beta.)
How does Unity AI Gateway control AI cost?
It gives unified AI spend visibility with token-level cost attribution by user, team, tool, and use case, stored in Unity Catalog-governed inference tables. You can set per-user alerts and hard spend caps that stop runaway usage automatically, and use smart routing to send simple tasks to cheaper models.
What is MCP governance?
MCP (Model Context Protocol) is how agents reach tools like Slack, Jira, GitHub, SharePoint, and Google Drive. Unity AI Gateway governs those connections with SQL-based service policies, and supports on-behalf-of access so an agent acting for a user only sees and does what that user is permitted to, rather than using a shared account with broad access.
How do we get ready for Unity AI Gateway?
Govern the Unity Catalog foundation first (classify and tag data, fix permissions, establish lineage), inventory your AI assets, get identity and on-behalf-of access right, define your policy and cost-cap model, and design residency and audit in. Unity AI Gateway is only as strong as that foundation.