Databricks Agent Bricks Explained: The Production Agent Platform (and the 99% Nobody Budgets For)
Summary
At the Data + AI Summit 2026, Databricks expanded Agent Bricks from a quality-focused agent builder into a full developer platform for production AI agents. Its central argument is the one every team that has shipped an agent already knows in their bones: the core agent loop is about 1% of the work, and the other 99% is hidden technical debt, token capacity, deployment, security, evaluation, monitoring, context, and sharing. Agent Bricks is Databricks productising that 99%. With more than 100,000 agents already built on it and over a quadrillion tokens processed a year, plus broad model choice (now including Grok through a SpaceX partnership), support for any agent framework, managed memory, secure sandboxed execution, and governance through Unity AI Gateway, it is a serious bid to be where enterprises run agents. The thing it does not remove is the need for a governed data foundation underneath it all.
Last Updated
Published
Authored By
Technical Director
Reviewed By
Managing Partner
TL;DR
- What changed: Agent Bricks went from "build a good agent" to "run a fleet of production agents," owning the unglamorous 99%: deployment, evaluation, monitoring, security, memory, context, and cost.
- The framing: the agent loop is 1% of the work. The 99% is the technical debt that kills agent projects after the demo.
- Model choice, no lock-in: OpenAI, Anthropic, Gemini, Qwen, Kimi, and Grok (via a SpaceX partnership), plus fine-tuned and Databricks custom models. Bring any harness too (LangGraph, CrewAI, Claude Code SDK, Omnigent).
- Production plumbing: managed agent memory via Lakebase, Databricks Sandbox for secure isolated execution, Document Intelligence SQL functions (GA), agentic search, and deployment that autoscales.
- The catch: Agent Bricks handles the production engineering. It still assumes governed data and real business context underneath. The agent is only as good as the foundation it reads from.
The reframe that matters: 1% loop, 99% debt
Anyone who has put an agent into production recognises this immediately. Writing the loop, the bit where a model calls a tool, reads the result, and decides what to do next, is the easy, fun 1%. The 99% is everything around it: how you give it enough token capacity, how you deploy and autoscale it, how you stop it leaking data, how you evaluate whether it is actually any good, how you monitor it in production, how you give it the right context, and how you share it safely.
That 99% is where agent projects die. They demo beautifully and then never ship, because nobody scoped the deployment, the evaluation harness, the cost controls, or the security review. Agent Bricks' pitch is that it handles that 99% as platform, so your team spends its time on the 1% that is actually your business logic. As the framing goes, that is what turns an impressive demo into a system you can trust in production.
This is genuinely the right problem to attack. In our delivery work it is exactly the gap we see: enthusiasm for agents, and a complete underestimate of what it takes to run one safely at scale.
Model choice, and no lock-in
Agent Bricks lets you use the model that fits the task: OpenAI, Anthropic, Gemini, Qwen, Kimi, and now Grok through a newly announced SpaceX partnership. You can fine-tune your own with Mosaic AI Model Training and reinforcement learning, and Databricks offers custom models it positions as competitive with frontier models like Opus and Sonnet at lower cost.
The strategic point is no lock-in. As Edmunds' VP of Technology Gregory Rokita put it, "Databricks gives us a secure, governed foundation to run multiple models and switch providers as our needs evolve, all while keeping costs in check." For an enterprise, the ability to swap models without re-platforming is worth more than any single model's benchmark score, because the benchmark leader changes every few months and your architecture should not.
Bring your own harness, and deploy it properly
You are not forced into a Databricks-specific agent framework. Agent Bricks supports open harnesses including LangGraph, Agno, and CrewAI, the Claude Code SDK and OpenAI Agent SDKs, and Databricks released a managed open-source meta-harness called Omnigent. Deployment runs with horizontal autoscaling via Databricks Apps, so a successful agent can scale with demand rather than falling over.
This matters because most enterprises already have agent experiments in two or three frameworks. A platform that governs and deploys all of them beats one that demands you rewrite everything in its own SDK first.
Context: where Agent Bricks meets the foundation
An agent is only useful if it can reach the right data with the right meaning. Agent Bricks adds several pieces here, and this is where it connects to the rest of the platform.
- Governed tool access via MCP. Model Context Protocol support is built into Unity Catalog, with managed connections to Google Drive, JIRA, Slack, and GitHub, and the Databricks Agent Tools suite governed centrally in Unity Catalog.
- Business context via the Genie Ontology. This is the business-semantics layer that gives agents instant context, fiscal-year timing, organisational hierarchy, customer definitions, table usage and data authority, which we covered in our piece on the Unity Catalog semantic layer.
- Agentic search, with a reported 3x speed improvement and quality gains, across Lakehouse and external data.
- Managed memory via Lakebase, so agents keep session history and context, with cross-session and cross-agent memory planned.
Notice the pattern. Every one of these is only as good as the data and meaning underneath it. The Genie Ontology needs a real semantic layer. Governed MCP needs a governed Unity Catalog. Memory needs data worth remembering. Agent Bricks gives you the machinery; the foundation is still yours.
Security and governance: Sandbox and the Gateway
Two pieces make this safe enough for serious enterprises.
Databricks Sandbox runs agent code in secure, isolated VMs with downscoped Unity Catalog access, for code-interpreter tools, subagents, and experimentation. An agent that writes and runs code is a security event waiting to happen unless it is contained; the Sandbox is the containment.
Unity AI Gateway governs the whole estate: discovery of agents, models, MCP services and skills in Unity Catalog, fine-grained access controls, per-user and per-group budgets, intelligent traffic routing, agent traces stored in the Lakehouse (not in a silo), LakeWatch integration for PII-violation alerts and incident response, and contextual security policies written in SQL. We went deep on this in our Unity AI Gateway breakdown, and it is the reason Agent Bricks can be trusted in production rather than just demoed.
There is also Document Intelligence (GA): the SQL functions ai_parse_document, ai_extract, and ai_classify, which Databricks positions as better in quality and cost than frontier LLMs and specialist tools for document processing.
The catch, and where it gets real for you
Agent Bricks solves a real and expensive problem. But read the 99% list again, deployment, security, evaluation, monitoring, context, sharing, and notice two things.
First, platform features are not the same as a working practice. Agent Bricks gives you an evaluation harness; it does not tell you what "good" means for your use case, or stop you from shipping an agent that is confidently wrong. It gives you cost controls; someone still has to set the budgets and decide the trade-offs. It gives you governed tool access; someone still has to govern the catalog.
Second, and this is the thread through the whole Summit 2026 announcement set: the agent is only as good as the foundation it reads from. The same way CustomerLake assumes a real Customer 360, Agent Bricks assumes governed data, real business context, and a security posture you can stand behind. Point a beautifully-engineered agent platform at an ungoverned estate and you have automated your problems at quadrillion-token scale.
The European angle
For regulated European enterprises, the combination of Databricks Sandbox (isolated execution with downscoped access), Unity AI Gateway (governed routing, budgets, PII alerts, audit), and model choice (including the ability to keep workloads on approved providers and regions) is what makes production agents approvable by a risk team. The governance is not a tax on the fun part; for a regulated business it is the thing that lets the fun part happen at all.
How to get "agent-production-ready"
Adopting Agent Bricks well is less about the platform and more about the practice around it.
- Govern the foundation first. Classify data, organise it into domains, define the business glossary and metrics. The Genie Ontology and governed tools depend on it.
- Decide what "good" means. Before you build, define how you will evaluate the agent: what tasks, what success criteria, what failure modes you will not tolerate. The harness is useless without the definition.
- Set budgets and guardrails up front. Per-user and per-group caps, contextual policies, and a security review of any tool the agent can call, especially code execution.
- Pick a model strategy, not a model. Use Agent Bricks' model choice deliberately: cheap models for simple tasks, frontier models where they earn it, and the freedom to switch.
- Design for audit and residency. For regulated workloads, make sure traces, access, and data location are all traceable from day one.
Do this and Agent Bricks accelerates you enormously. Skip it and you have a faster way to ship agents you cannot trust.
The Cosmos Thrace perspective
This is our wheelhouse, the 99%. We are a Databricks Silver Partner, and the unglamorous production engineering and governance, the part most teams underestimate, is exactly where we spend our time so the agents on top are safe and useful. We have delivered dozens of data platform implementations across Europe, many on Databricks, with more than $50M saved for clients in 2025, a 100% client retention rate, and 106 million data points moved daily.
Our honest read on Agent Bricks: it is one of the most consequential things Databricks shipped this year, because it attacks the real reason agent projects fail, the 99%, not the 1%. But the platform is an accelerant, not a substitute for the foundation and the discipline. The enterprises that win with it will be the ones who governed their data, defined what good looks like, and treated security and cost as design inputs rather than afterthoughts. Get the 99% right and the agents are transformative. Skip it and they are a very fast way to scale a mess.
Sources
Databricks blog: Agent Bricks at Data + AI Summit 2026
Databricks blog: Expanding agent governance with Unity AI Gateway
What people ask about Databricks Agent Bricks
Agent Bricks is Databricks' developer platform for building, deploying, governing, and running production AI agents. Announced as a major expansion at the Data + AI Summit 2026, it handles the "99%" of agent work beyond the core loop: deployment, security, evaluation, monitoring, context, memory, and cost. More than 100,000 agents have been built on it, processing over a quadrillion tokens a year.
An agent builder helps you create a good agent. An agent platform also handles everything required to run it in production at scale, deployment, autoscaling, evaluation, monitoring, security, memory, governance, and cost control. Agent Bricks made exactly that transition in 2026.
OpenAI, Anthropic, Gemini, Qwen, Kimi, and Grok (via a newly announced SpaceX partnership), plus models you fine-tune with Mosaic AI Model Training and Databricks' own custom models. The emphasis is on no lock-in: switch providers without re-platforming.
Yes. It supports open harnesses including LangGraph, Agno, and CrewAI, the Claude Code SDK and OpenAI Agent SDKs, and a managed open-source meta-harness called Omnigent. You do not have to rewrite agents in a Databricks-specific SDK.
A secure, isolated execution environment, VMs with downscoped Unity Catalog access, for running agent code such as code-interpreter tools, subagents, and experiments without exposing your wider estate. It is the containment that makes code-executing agents safe.
Through Unity AI Gateway: agents, models, MCP services, and tools are registered in Unity Catalog with fine-grained access controls, per-user and per-group budgets, traffic routing, traces stored in the Lakehouse, LakeWatch alerts, and contextual security policies in SQL.
Govern the data foundation first, define how you will evaluate agents and what failure modes are unacceptable, set budgets and guardrails up front, adopt a deliberate multi-model strategy, and design audit and residency in from the start. The platform accelerates a good practice; it does not replace one.