
Databricks CustomerLake Explained: What an Agentic CDP Is, and Whether Your Data Foundation Is Ready for It

Summary

Databricks CustomerLake is an agentic Customer Data Platform (CDP) built natively on the Databricks lakehouse and governed by Unity Catalog. Databricks announced it on 16 June 2026 at the Data + AI Summit, and it is currently in Private Preview. Instead of the old model of planning and scheduling campaigns across a dozen disconnected tools, CustomerLake puts governed customer data, AI models, and execution in one place, and lets agents continuously decide and act in real time. That is the headline. The part that matters more for your business is quieter: an agentic CDP only works if the data underneath it is unified, governed, and identity-resolved. Most enterprises are not there yet.

Last Updated: 23 Jun 2026

Published: 23 Jun 2026

Image: many streams converging into one calm lake at dawn, in purple and orange, representing fragmented customer data unifying in Databricks CustomerLake.

TL;DR

  • What it is: CustomerLake is Databricks' first-party agentic CDP, embedded in the lakehouse. Announced 16 June 2026, in Private Preview now.
  • What "agentic" means: governed customer context, AI models, and activation in a single environment, with agents that analyse, decide, and act continuously instead of waiting for a marketer to launch a campaign.
  • The new vocabulary: "infinity campaigns" (always-on engagement), Profile Agents (build Customer 360 from raw data), Campaign Agents (build audiences and activate), and agentic identity resolution.
  • Who it challenges: legacy and composable CDPs and the broader martech stack (the Salesforce and Adobe ecosystems).
  • The catch: it runs on governed Unity Catalog data. If your data foundation is fragmented or ungoverned, you cannot adopt it, no matter how good the agents are.
  • For EU enterprises: the identity and enrichment layer is US-data-broker-centric, so GDPR and EU data residency need to be designed in from the start.

What is Databricks CustomerLake?

CustomerLake is a Customer Data Platform that lives inside Databricks rather than beside it. A traditional CDP is a separate system you copy customer data into. CustomerLake flips that: it brings Customer 360, identity resolution, segmentation, audience building, activation, and personalisation directly to the governed data you already hold in the lakehouse, so you move faster without duplicating sensitive data into yet another silo.

It is built on the Databricks Lakehouse, governed by Unity Catalog, and uses the same agentic building blocks Databricks has been shipping across its platform: Genie for natural-language agentic analysis, Lakebase for operational data, and Agent Bricks for building and running agents. It was announced on 16 June 2026 at the Data + AI Summit, alongside Genie One and Databricks' acquisition of Panther Labs. It is in Private Preview, with named early customers including HP, Circle K, AB InBev, and Getnet by Santander.

In Databricks CEO Ali Ghodsi's framing, "marketers need to reimagine their entire foundation, not just the campaigns they run, but the customers they run them for, which now include agents." That last point is easy to miss and worth holding onto: in this model the audience increasingly includes other software agents, not only people.

What is an "agentic CDP"?

A legacy CDP runs on a waterfall. Someone plans a campaign, builds an audience, schedules it, pushes it to an activation tool, waits, then reads a report. The data, the decisioning, and the execution usually live in different systems stitched together with pipelines.

An agentic CDP collapses that. It puts the governed customer context, the AI models, and the execution capability in one environment, and it does not wait for a human to pull the trigger. It continuously analyses customer data, makes decisions, and acts. Databricks calls the result an "infinity campaign": always-on, agent-driven engagement that reacts to customer context in real time, rather than a one-off blast.

The mechanics, in Databricks' own terms:

  • Profile Agents turn raw, messy customer data into business-ready Customer 360 profiles through agentic loops.
  • Campaign Agents build audiences, recommend the next best action, activate across channels, and keep optimising toward a business goal.
  • Agentic identity resolution combines deterministic rules with agents to unify fractured customer records into one profile.
  • A built-in identity marketplace offers enrichment from data providers including Acxiom, Epsilon, LiveRamp, TransUnion, and Adstra.
  • Native integrations and reverse ETL push data bi-directionally into the existing martech and adtech stack, with an open partner ecosystem spanning Adobe, Meta, The Trade Desk, Braze, Bloomreach, Iterable, Twilio, Snapchat, and more.
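To make the deterministic half of identity resolution concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not CustomerLake's implementation: the record fields (`id`, `email`, `phone`), the `resolve_identities` helper, and the sample records are all hypothetical. Records that share an exactly matching, normalised email or phone are merged into one cluster using a union-find structure.

```python
from collections import defaultdict


def normalise(value):
    """Lower-case and trim a match key; return None for empty values."""
    if value is None:
        return None
    cleaned = value.strip().lower()
    return cleaned or None


def resolve_identities(records, keys=("email", "phone")):
    """Group record ids that share any exact (normalised) match key."""
    parent = {r["id"]: r["id"] for r in records}

    def find(x):
        # Follow parent links to the cluster root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)  # smallest id becomes the root

    first_seen = {}  # (key name, value) -> first record id carrying that value
    for r in records:
        for key in keys:
            value = normalise(r.get(key))
            if value is None:
                continue
            owner = first_seen.setdefault((key, value), r["id"])
            union(owner, r["id"])

    clusters = defaultdict(list)
    for r in records:
        clusters[find(r["id"])].append(r["id"])
    return sorted(sorted(ids) for ids in clusters.values())


# Hypothetical sample data: records 1-3 are linked by email, then by phone.
records = [
    {"id": 1, "email": "Ana@Example.com ", "phone": None},
    {"id": 2, "email": "ana@example.com", "phone": "+359 111"},
    {"id": 3, "email": None, "phone": "+359 111"},
    {"id": 4, "email": "boris@example.com", "phone": None},
]
profiles = resolve_identities(records)
```

Note that record 3 never shares an email with record 1, yet both land in the same profile because record 2 bridges them. That transitive behaviour is why a defensible matching strategy matters: one bad key can silently fuse two different customers.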

Why it matters: Databricks just walked into martech

Two things make this more than a product release.

First, the competitive signal. By embedding a CDP in the lakehouse, Databricks is positioning directly against the legacy and composable CDP market and, by extension, the Salesforce and Adobe ecosystems that own enterprise marketing data today. The pitch is "AI-native and agent-driven versus traditional," and the launch partner list (Adobe, Meta, The Trade Desk, Twilio) is the kind of ecosystem you assemble when you intend to compete at the centre of the stack, not at the edge.

Second, the architectural shift. For a decade the CDP was a place you sent a copy of your customer data. CustomerLake argues the opposite: the lakehouse, where the governed data already sits, becomes the CDP. If that idea holds, the centre of gravity for customer data moves from the marketing department's tools back to the enterprise data platform. That is a meaningful change in who owns customer data and where it is governed.

The catch nobody is putting on the slide

Here is the part the launch coverage skips, and it is the part that decides whether any of this works for you.

An agentic CDP is only as good as the data underneath it. Agents that build profiles, resolve identity, and choose the next best action need data that is unified, clean, governed, and identity-resolved. CustomerLake assumes that foundation, in the form of governed tables in Unity Catalog. It does not create it for you.

In our implementation work, this is exactly where most enterprise data projects stall, and it is rarely the technology's fault. It is scoping and governance. Customer data is spread across systems that never agreed on a key. Identity is fractured. Ownership and lineage are unclear. Consent and residency were never modelled properly. Point an agent at that, and you get fast, confident, wrong decisions at scale, which is worse than the slow manual process it replaced.

So the honest readiness question is not "should we buy CustomerLake when it is generally available?" It is "is our data foundation in a state where an agentic CDP could safely act on it?" For most organisations the honest answer today is no, but that gap is fixable.
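One way to make that question concrete is to measure a few basic gaps before any agent touches the data. The sketch below is a hedged illustration in plain Python. The field names (`customer_id`, `email`, `consent_status`), the `readiness_report` helper, and the sample rows are assumptions chosen for illustration; they are not part of CustomerLake or Unity Catalog.

```python
from collections import Counter


def readiness_report(records, key="customer_id", required=("email", "consent_status")):
    """Summarise basic gaps in a list of customer records (dicts)."""
    total = len(records)
    if total == 0:
        return {"total": 0, "missing_key": 0, "duplicate_keys": 0, "incomplete": 0}

    keys = [r.get(key) for r in records]
    counts = Counter(k for k in keys if k is not None)
    return {
        "total": total,
        # Records with no customer key at all cannot be joined to anything.
        "missing_key": sum(1 for k in keys if k is None),
        # Number of distinct keys that appear on more than one record.
        "duplicate_keys": sum(1 for c in counts.values() if c > 1),
        # Records where any required field is absent or empty.
        "incomplete": sum(
            1 for r in records if any(r.get(f) in (None, "") for f in required)
        ),
    }


# Hypothetical sample rows.
sample = [
    {"customer_id": "c1", "email": "a@example.com", "consent_status": "granted"},
    {"customer_id": "c1", "email": "a@example.com", "consent_status": ""},
    {"customer_id": None, "email": None, "consent_status": "granted"},
    {"customer_id": "c2", "email": "b@example.com", "consent_status": "denied"},
]
report = readiness_report(sample)
```

Counts like these do not prove a foundation is ready, but non-zero values for missing keys, duplicates, or empty consent fields are a cheap early signal that agents would be acting on unreliable profiles.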

How to get "CustomerLake-ready"

You do not need to wait for general availability to start, and you should not. The foundation work below is valuable on its own, and it is the prerequisite for any agentic CDP, Databricks' or anyone else's.

  • Unify and govern in Unity Catalog. One governed source of customer data, with clear ownership, access control, and lineage. This is the substrate CustomerLake reads from.
  • Get identity resolution right. A real Customer 360 with a defensible matching strategy, deterministic where it can be, before you let agents loop on top of it.
  • Build the medallion foundation. Bronze to silver to gold, so the "business-ready" layer agents consume is actually business-ready, conformed and typed, not raw. See our guide to medallion architecture.
  • Design consent and EU data residency in from the start rather than bolting them on. For European enterprises this is not optional. The enrichment and identity layer in CustomerLake leans on US data providers, so the GDPR controller and processor model, the lawful basis for enrichment, and the physical location of regulated data have to be settled up front, not retrofitted after an agent has already acted on the data.
  • Decide what an agent is allowed to do. Governance is not only about data access. It is about which decisions and actions you are comfortable delegating, with what guardrails and what human oversight.
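The last item, deciding what an agent is allowed to do, can be written down as an explicit policy rather than left implicit. The following is a minimal sketch, assuming a hypothetical policy table and a hypothetical `authorise` helper; the action names and thresholds are invented for illustration and do not reflect any CustomerLake or Agent Bricks API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionPolicy:
    allowed: bool             # may an agent ever perform this action?
    max_audience: int         # largest audience the agent may act on alone
    needs_human_review: bool  # always route to a person before executing


# Hypothetical policy table; real action names and limits are a business decision.
POLICIES = {
    "send_email": ActionPolicy(True, 10_000, False),
    "change_discount": ActionPolicy(True, 1_000, True),
    "delete_profile": ActionPolicy(False, 0, True),
}


def authorise(action, audience_size, policies=POLICIES):
    """Return 'allow', 'review', or 'deny' for a proposed agent action."""
    policy = policies.get(action)
    if policy is None or not policy.allowed:
        return "deny"  # default-deny: unknown or forbidden actions never run
    if policy.needs_human_review or audience_size > policy.max_audience:
        return "review"  # a human signs off before anything is executed
    return "allow"


decisions = [
    authorise("send_email", 500),
    authorise("send_email", 50_000),
    authorise("change_discount", 10),
    authorise("delete_profile", 1),
    authorise("export_everything", 1),
]
```

The design choice worth copying is the default-deny branch: an action that nobody has explicitly approved is refused, so new agent capabilities have to be reviewed before they can take effect.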

Get those five right and you are ready for an agentic CDP whenever you choose to adopt one. Skip them and the agents will simply automate your existing mess.

Get a straight answer on where you stand. We run a short data-foundation readiness assessment for the agentic-CDP era: is your customer data unified, governed, and identity-resolved enough to let agents act on it safely, and what would it take to close the gap? Talk to our team.

The EU angle the US coverage is missing

Most of the writing on CustomerLake so far is US-centric, and it shows in one place especially: the identity marketplace. Acxiom, Epsilon, LiveRamp, TransUnion, and Adstra are powerful enrichment sources, and they are built around the US data-broker model. For a European enterprise, third-party enrichment of customer data sits squarely in GDPR territory: lawful basis, purpose limitation, data-subject rights, and cross-border transfer all apply.

None of this makes CustomerLake unusable in Europe. It makes the governance design non-negotiable. The advantage of an agentic CDP built on Unity Catalog is precisely that residency, lineage, and access can be governed at the data layer rather than bolted onto a separate marketing tool. But that advantage only materialises if someone designs it that way deliberately. That is a foundation decision, made before the first agent runs, not a setting you flip afterwards.
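As one illustration of governing this at the data layer, the sketch below gates third-party enrichment on recorded consent and storage region before any profile leaves the platform. It encodes a single possible rule, not a statement of what GDPR requires in a given case. The field names (`consents`, `storage_region`), the region labels, and both helper functions are hypothetical.

```python
def eligible_for_enrichment(profile, allowed_regions=frozenset({"eu-central", "eu-west"})):
    """True only if the profile opted in and is stored in an allowed region."""
    consents = profile.get("consents", ())
    return (
        "third_party_enrichment" in consents
        and profile.get("storage_region") in allowed_regions
    )


def split_for_enrichment(profiles):
    """Partition profile ids into (send to provider, hold back)."""
    send, hold = [], []
    for p in profiles:
        (send if eligible_for_enrichment(p) else hold).append(p["id"])
    return send, hold


# Hypothetical sample profiles.
profiles = [
    {"id": "p1", "consents": {"email_marketing", "third_party_enrichment"},
     "storage_region": "eu-central"},
    {"id": "p2", "consents": {"email_marketing"}, "storage_region": "eu-central"},
    {"id": "p3", "consents": {"third_party_enrichment"}, "storage_region": "us-east"},
    {"id": "p4", "storage_region": "eu-west"},  # no consent recorded at all
]
send, hold = split_for_enrichment(profiles)
```

A profile with no recorded consent is held back by default, which mirrors the point above: the safe behaviour has to be the one that happens when nobody configured anything.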

The Cosmos Thrace perspective

We are a Databricks Silver Partner, and our work is the unglamorous layer underneath announcements like this one: getting the data foundation unified, governed, and trustworthy enough that the clever things on top actually work. We have delivered dozens of data platform implementations across Europe, many on Databricks, with a consistent pattern behind the results: more than $50M saved for clients in 2025, a 100% client retention rate, and 106 million data points moved daily.

Our read on CustomerLake is straightforward. The agentic CDP is a genuine shift, and the enterprises that benefit first will not be the ones who buy fastest. They will be the ones whose data foundation was already in shape: unified in Unity Catalog, identity-resolved, governed, with EU residency and consent designed in. That work is worth doing now whether or not you ever turn on CustomerLake, because every agentic capability Databricks ships next will assume the same foundation. The platform moved the goalposts toward agents. The teams that win are the ones who get the foundation right before the agents arrive.

Sources

Databricks press release: Databricks Enters the Marketing Industry with CustomerLake

Databricks blog: Introducing CustomerLake: The Agentic CDP embedded in Databricks

Databricks blog: Introducing the Agentic CDP: A New Species of CDP for a New Era of Agents

Databricks product page: CustomerLake: The Agentic CDP built in Databricks

FAQ

What people ask about Databricks CustomerLake

What is Databricks CustomerLake?
What is an agentic CDP?
Is Databricks CustomerLake available yet?
Does CustomerLake replace Salesforce or Adobe?
Do I need Databricks to use CustomerLake?
Is CustomerLake GDPR compliant for EU companies?
What is the difference between an agentic CDP and a composable CDP?