Databricks Agent Bricks: The Production Agent Platform Explained

TL;DR

What changed: Agent Bricks went from "build a good agent" to "run a fleet of production agents," owning the unglamorous 99%: deployment, evaluation, monitoring, security, memory, context, and cost.
The framing: the agent loop is 1% of the work. The 99% is the technical debt that kills agent projects after the demo.
Model choice, no lock-in: OpenAI, Anthropic, Gemini, Qwen, Kimi, and Grok (via a SpaceX partnership), plus fine-tuned and Databricks custom models. Bring any harness too (LangGraph, CrewAI, Claude Code SDK, Omnigent).
Production plumbing: managed agent memory via Lakebase, Databricks Sandbox for secure isolated execution, Document Intelligence SQL functions (GA), agentic search, and deployment that autoscales.
The catch: Agent Bricks handles the production engineering. It still assumes governed data and real business context underneath. The agent is only as good as the foundation it reads from.

The reframe that matters: 1% loop, 99% debt

Anyone who has put an agent into production recognises this immediately. Writing the loop, the bit where a model calls a tool, reads the result, and decides what to do next, is the easy, fun 1%. The 99% is everything around it: how you give it enough token capacity, how you deploy and autoscale it, how you stop it leaking data, how you evaluate whether it is actually any good, how you monitor it in production, how you give it the right context, and how you share it safely.

That 99% is where agent projects die. They demo beautifully and then never ship, because nobody scoped the deployment, the evaluation harness, the cost controls, or the security review. Agent Bricks' pitch is that it handles that 99% as platform, so your team spends its time on the 1% that is actually your business logic. As the framing goes, that is what turns an impressive demo into a system you can trust in production.

This is genuinely the right problem to attack. In our delivery work it is exactly the gap we see: enthusiasm for agents, and a complete underestimate of what it takes to run one safely at scale.

Model choice, and no lock-in

Agent Bricks lets you use the model that fits the task: OpenAI, Anthropic, Gemini, Qwen, Kimi, and now Grok through a newly announced SpaceX partnership. You can fine-tune your own with Mosaic AI Model Training and reinforcement learning, and Databricks offers custom models it positions as competitive with frontier models like Opus and Sonnet at lower cost.

The strategic point is no lock-in. As Edmunds' VP of Technology Gregory Rokita put it, "Databricks gives us a secure, governed foundation to run multiple models and switch providers as our needs evolve, all while keeping costs in check." For an enterprise, the ability to swap models without re-platforming is worth more than any single model's benchmark score, because the benchmark leader changes every few months and your architecture should not.

Bring your own harness, and deploy it properly

You are not forced into a Databricks-specific agent framework. Agent Bricks supports open harnesses including LangGraph, Agno, and CrewAI, the Claude Code SDK and OpenAI Agent SDKs, and Databricks released a managed open-source meta-harness called Omnigent. Deployment runs with horizontal autoscaling via Databricks Apps, so a successful agent can scale with demand rather than falling over.

This matters because most enterprises already have agent experiments in two or three frameworks. A platform that governs and deploys all of them beats one that demands you rewrite everything in its own SDK first.

Context: where Agent Bricks meets the foundation

An agent is only useful if it can reach the right data with the right meaning. Agent Bricks adds several pieces here, and this is where it connects to the rest of the platform.

Governed tool access via MCP. Model Context Protocol support is built into Unity Catalog, with managed connections to Google Drive, JIRA, Slack, and GitHub, and the Databricks Agent Tools suite governed centrally in Unity Catalog.
Business context via the Genie Ontology. This is the business-semantics layer that gives agents instant context, fiscal-year timing, organisational hierarchy, customer definitions, table usage and data authority, which we covered in our piece on the Unity Catalog semantic layer.
Agentic search, with a reported 3x speed improvement and quality gains, across Lakehouse and external data.
Managed memory via Lakebase, so agents keep session history and context, with cross-session and cross-agent memory planned.

Notice the pattern. Every one of these is only as good as the data and meaning underneath it. The Genie Ontology needs a real semantic layer. Governed MCP needs a governed Unity Catalog. Memory needs data worth remembering. Agent Bricks gives you the machinery; the foundation is still yours.

Security and governance: Sandbox and the Gateway

Two pieces make this safe enough for serious enterprises.

Databricks Sandbox runs agent code in secure, isolated VMs with downscoped Unity Catalog access, for code-interpreter tools, subagents, and experimentation. An agent that writes and runs code is a security event waiting to happen unless it is contained; the Sandbox is the containment.

Unity AI Gateway governs the whole estate: discovery of agents, models, MCP services and skills in Unity Catalog, fine-grained access controls, per-user and per-group budgets, intelligent traffic routing, agent traces stored in the Lakehouse (not in a silo), LakeWatch integration for PII-violation alerts and incident response, and contextual security policies written in SQL. We went deep on this in our Unity AI Gateway breakdown, and it is the reason Agent Bricks can be trusted in production rather than just demoed.

There is also Document Intelligence (GA): the SQL functions ai_parse_document, ai_extract, and ai_classify, which Databricks positions as better in quality and cost than frontier LLMs and specialist tools for document processing.

The catch, and where it gets real for you

Agent Bricks solves a real and expensive problem. But read the 99% list again, deployment, security, evaluation, monitoring, context, sharing, and notice two things.

First, platform features are not the same as a working practice. Agent Bricks gives you an evaluation harness; it does not tell you what "good" means for your use case, or stop you from shipping an agent that is confidently wrong. It gives you cost controls; someone still has to set the budgets and decide the trade-offs. It gives you governed tool access; someone still has to govern the catalog.

Second, and this is the thread through the whole Summit 2026 announcement set: the agent is only as good as the foundation it reads from. The same way CustomerLake assumes a real Customer 360, Agent Bricks assumes governed data, real business context, and a security posture you can stand behind. Point a beautifully-engineered agent platform at an ungoverned estate and you have automated your problems at quadrillion-token scale.

The European angle

For regulated European enterprises, the combination of Databricks Sandbox (isolated execution with downscoped access), Unity AI Gateway (governed routing, budgets, PII alerts, audit), and model choice (including the ability to keep workloads on approved providers and regions) is what makes production agents approvable by a risk team. The governance is not a tax on the fun part; for a regulated business it is the thing that lets the fun part happen at all.

How to get "agent-production-ready"

Adopting Agent Bricks well is less about the platform and more about the practice around it.

Govern the foundation first. Classify data, organise it into domains, define the business glossary and metrics. The Genie Ontology and governed tools depend on it.
Decide what "good" means. Before you build, define how you will evaluate the agent: what tasks, what success criteria, what failure modes you will not tolerate. The harness is useless without the definition.
Set budgets and guardrails up front. Per-user and per-group caps, contextual policies, and a security review of any tool the agent can call, especially code execution.
Pick a model strategy, not a model. Use Agent Bricks' model choice deliberately: cheap models for simple tasks, frontier models where they earn it, and the freedom to switch.
Design for audit and residency. For regulated workloads, make sure traces, access, and data location are all traceable from day one.

Do this and Agent Bricks accelerates you enormously. Skip it and you have a faster way to ship agents you cannot trust.

The Cosmos Thrace perspective

This is our wheelhouse, the 99%. We are a Databricks Silver Partner, and the unglamorous production engineering and governance, the part most teams underestimate, is exactly where we spend our time so the agents on top are safe and useful. We have delivered dozens of data platform implementations across Europe, many on Databricks, with more than $50M saved for clients in 2025, a 100% client retention rate, and 106 million data points moved daily.

Our honest read on Agent Bricks: it is one of the most consequential things Databricks shipped this year, because it attacks the real reason agent projects fail, the 99%, not the 1%. But the platform is an accelerant, not a substitute for the foundation and the discipline. The enterprises that win with it will be the ones who governed their data, defined what good looks like, and treated security and cost as design inputs rather than afterthoughts. Get the 99% right and the agents are transformative. Skip it and they are a very fast way to scale a mess.

Sources

Databricks blog: Agent Bricks at Data + AI Summit 2026

Databricks blog: Expanding agent governance with Unity AI Gateway

Ready to implement AI where your executives, data scientists, and business teams all understand ROI, decisions, and outcomes?

FAQ

What people ask about Databricks Agent Bricks

What is Databricks Agent Bricks?

Agent Bricks is Databricks' developer platform for building, deploying, governing, and running production AI agents. Announced as a major expansion at the Data + AI Summit 2026, it handles the "99%" of agent work beyond the core loop: deployment, security, evaluation, monitoring, context, memory, and cost. More than 100,000 agents have been built on it, processing over a quadrillion tokens a year.

What is the difference between an agent builder and an agent platform?

An agent builder helps you create a good agent. An agent platform also handles everything required to run it in production at scale, deployment, autoscaling, evaluation, monitoring, security, memory, governance, and cost control. Agent Bricks made exactly that transition in 2026.

Which models does Agent Bricks support?

OpenAI, Anthropic, Gemini, Qwen, Kimi, and Grok (via a newly announced SpaceX partnership), plus models you fine-tune with Mosaic AI Model Training and Databricks' own custom models. The emphasis is on no lock-in: switch providers without re-platforming.

Does Agent Bricks support frameworks like LangGraph and CrewAI?

Yes. It supports open harnesses including LangGraph, Agno, and CrewAI, the Claude Code SDK and OpenAI Agent SDKs, and a managed open-source meta-harness called Omnigent. You do not have to rewrite agents in a Databricks-specific SDK.

What is Databricks Sandbox?

A secure, isolated execution environment, VMs with downscoped Unity Catalog access, for running agent code such as code-interpreter tools, subagents, and experiments without exposing your wider estate. It is the containment that makes code-executing agents safe.

How are agents governed in Agent Bricks?

Through Unity AI Gateway: agents, models, MCP services, and tools are registered in Unity Catalog with fine-grained access controls, per-user and per-group budgets, traffic routing, traces stored in the Lakehouse, LakeWatch alerts, and contextual security policies in SQL.

How should an enterprise prepare to use Agent Bricks?

Govern the data foundation first, define how you will evaluate agents and what failure modes are unacceptable, set budgets and guardrails up front, adopt a deliberate multi-model strategy, and design audit and residency in from the start. The platform accelerates a good practice; it does not replace one.

Databricks Agent Bricks Explained: The Production Agent Platform (and the 99% Nobody Budgets For)

Summary

Last Updated

Published

Authored By

Reviewed By

TL;DR

The reframe that matters: 1% loop, 99% debt

Model choice, and no lock-in

Bring your own harness, and deploy it properly

Context: where Agent Bricks meets the foundation

Security and governance: Sandbox and the Gateway

The catch, and where it gets real for you

The European angle

How to get "agent-production-ready"

The Cosmos Thrace perspective

Sources

Ready to implement AI where your executives, data scientists, and business teams all understand ROI, decisions, and outcomes?

What people ask about Databricks Agent Bricks

Ready to implement AI where your executives, data scientists, and business teams all understand ROI, decisions, and outcomes?

Services

Links

Help

Crafted By