
What Is Databricks CustomerLake?

Databricks CustomerLake, announced at the Databricks Data + AI Summit on June 16, 2026, is an agentic Customer Data Platform (CDP) built natively on the lakehouse and governed by Unity Catalog. It consolidates the work a traditional CDP spreads across separate tools - customer 360 profiles, identity resolution, audience building, campaign automation, and activation - into a single AI-native foundation, and adds agents that act on customer data continuously rather than waiting for someone to launch a campaign. With it, Databricks formally enters the marketing technology market.

The shift is from a CDP that stores and segments customer data to one that decides and acts. As with the rest of the agentic stack Databricks announced, the result depends less on the agents themselves than on whether the underlying customer data is unified, trustworthy, and governed - which is where context and governance become decisive.

TL;DR

CustomerLake is Databricks' agentic CDP (private preview, June 2026): customer 360, identity resolution, audience building, campaign automation, and activation - plus reverse ETL - unified on the lakehouse and governed by Unity Catalog. Its "infinity campaigns" replace one-off sends with continuous, agent-driven loops that read customer signals and act in real time. Early customers include HP, Circle K, Getnet by Santander, and Zé Delivery (AB InBev). An agentic CDP is only as good as the governed data behind it: what a segment means, which profile is authoritative, and which fields are sensitive. A cross-platform context layer governs that meaning across every system and serves it to any agent via open MCP.

What Is CustomerLake?

CustomerLake is an agentic CDP embedded directly in Databricks. Built on the lakehouse and governed by Unity Catalog, it brings the core functions of a customer data platform into one AI-native system:

  • Customer 360 - unify customer data into complete profiles on the lakehouse.
  • Identity resolution - including agentic identity resolution to stitch records into a single view of each customer.
  • Audience building and segmentation - define and manage audiences governed alongside the rest of your data.
  • Campaign automation and activation - run and activate campaigns across channels.
  • Reverse ETL - move unified data bi-directionally between the lakehouse and the marketing and advertising stack.

It uses Lakehouse Federation to reach data across systems and ships with an open partner ecosystem spanning advertising, messaging, and identity vendors. CustomerLake is currently in private preview, with early customers including HP, Circle K, Getnet by Santander, and Zé Delivery (AB InBev).
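
To make the identity resolution step concrete, here is a minimal sketch of deterministic record stitching with a union-find structure. It is an illustration only: the record fields (`email`, `phone`), the sample records, and the `resolve_identities` function are assumptions made for this example, and nothing here describes how CustomerLake resolves identities internally.

```python
from collections import defaultdict


def resolve_identities(records, keys=("email", "phone")):
    """Group records that share any identifier value (hypothetical matching rule)."""
    parent = list(range(len(records)))

    def find(i):
        # Walk up to the cluster root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    first_seen = {}  # (key, normalized value) -> index of first record carrying it
    for idx, rec in enumerate(records):
        for key in keys:
            value = rec.get(key)
            if not value:
                continue
            token = (key, value.strip().lower())
            if token in first_seen:
                union(idx, first_seen[token])
            else:
                first_seen[token] = idx

    clusters = defaultdict(list)
    for idx, rec in enumerate(records):
        clusters[find(idx)].append(rec["id"])
    return sorted(clusters.values())


# Made-up records from three hypothetical source systems.
records = [
    {"id": "crm-1", "email": "Ana@example.com", "phone": None},
    {"id": "web-7", "email": "ana@example.com", "phone": "555-0100"},
    {"id": "app-3", "email": None, "phone": "555-0100"},
    {"id": "crm-2", "email": "li@example.com", "phone": None},
]
profiles = resolve_identities(records)
print(profiles)  # [['crm-1', 'web-7', 'app-3'], ['crm-2']]
```

The first three records collapse into one profile because they are linked transitively: `crm-1` and `web-7` share an email, and `web-7` and `app-3` share a phone number. Real identity resolution usually adds fuzzy or probabilistic matching on top of exact rules like these.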

What Makes It Agentic

The "agentic" label is the substance, not just branding. CustomerLake deploys agents - including campaign agents and profile agents - that continuously analyze customer behavior, decide on a next-best action, and act across channels without a person assembling each workflow.

Databricks frames this as infinity campaigns: "continuous, agent-driven engagement loops that analyze customer signals, decide on the next-best action, and act across channels based on real-time customer context and business goals." Instead of building a campaign, scheduling it, and measuring it after the fact, the system runs as an ongoing loop that reacts to each customer's context as it changes. That is a meaningful change in how marketing operates - and it raises the stakes on data quality, because an agent acting automatically on the wrong understanding of a customer does so continuously, at scale.
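
The analyze, decide, act loop described above can be sketched in a few lines. This is a toy model under stated assumptions: `CustomerContext`, `decide_next_best_action`, `run_loop_once`, the thresholds, and the channel names are all hypothetical and are not part of any Databricks API.

```python
from dataclasses import dataclass, field


@dataclass
class CustomerContext:
    customer_id: str
    days_since_last_order: int
    cart_value: float
    consented_channels: set = field(default_factory=set)


def decide_next_best_action(ctx):
    """Toy decision rule; the thresholds are made up for illustration."""
    if not ctx.consented_channels:
        return None  # no consent on any channel, so no action
    if ctx.cart_value > 0 and "push" in ctx.consented_channels:
        return ("push", "cart_reminder")
    if ctx.days_since_last_order > 30 and "email" in ctx.consented_channels:
        return ("email", "win_back_offer")
    return None


def run_loop_once(contexts, send):
    """One pass of the loop: read each context, decide, then act via `send`."""
    actions = []
    for ctx in contexts:
        decision = decide_next_best_action(ctx)
        if decision is not None:
            channel, message = decision
            send(ctx.customer_id, channel, message)
            actions.append((ctx.customer_id, channel, message))
    return actions


sent = []
contexts = [
    CustomerContext("c1", 3, 42.0, {"push", "email"}),
    CustomerContext("c2", 45, 0.0, {"email"}),
    CustomerContext("c3", 60, 10.0, set()),
]
actions = run_loop_once(contexts, lambda cid, ch, msg: sent.append((cid, ch, msg)))
print(actions)  # [('c1', 'push', 'cart_reminder'), ('c2', 'email', 'win_back_offer')]
```

In a continuous system this pass would be re-run whenever customer signals change. That is why a wrong value in the context matters so much: every pass acts on it again.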

Why Governed Customer Data Decides It

An agentic CDP is only as good as the governed data and definitions behind it. When agents act automatically on customer data, three things have to be true, or the loop amplifies mistakes:

  • Shared meaning. "Active customer," "high-value segment," and "churn risk" have to mean the same thing across teams, or different agents and campaigns optimize for different definitions. This is the job of a governed business glossary.
  • A trustworthy single view. Identity resolution and customer 360 depend on clean, reconciled records - the same discipline as master data management. An agent acting on a duplicated or stale profile makes the wrong call confidently.
  • Sensitivity, governed. Customer data is full of personal information, so classification and policy have to travel with it - especially when agents move it through reverse ETL into ad platforms and downstream tools.
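
The three conditions above can be read as checks that run before any automated activation. The sketch below is one way to express them, assuming a hypothetical in-memory glossary and classification map; `GLOSSARY`, `CLASSIFICATION`, `build_export`, and the 90-day rule are invented for illustration and do not represent Unity Catalog or CustomerLake interfaces.

```python
# Hypothetical governed glossary: exactly one definition per business term.
GLOSSARY = {
    "active_customer": lambda p: p["days_since_last_order"] <= 90,
}

# Hypothetical field classifications; unlisted fields are treated as restricted.
CLASSIFICATION = {
    "customer_id": "internal",
    "days_since_last_order": "internal",
    "email": "pii",
}


def build_export(profiles, term, allowed_classes=("internal",)):
    """Apply the shared definition, reject duplicate profiles, drop disallowed fields."""
    ids = [p["customer_id"] for p in profiles]
    if len(ids) != len(set(ids)):
        raise ValueError("duplicate profiles: resolve identities before activation")

    matches = GLOSSARY[term]  # shared meaning: every caller uses the same rule
    rows = []
    for p in profiles:
        if not matches(p):
            continue
        # Sensitivity: only fields whose class is allowed leave the platform.
        row = {k: v for k, v in p.items()
               if CLASSIFICATION.get(k, "restricted") in allowed_classes}
        row["segment"] = term
        rows.append(row)
    return rows


profiles = [
    {"customer_id": "c1", "email": "a@example.com", "days_since_last_order": 12},
    {"customer_id": "c2", "email": "b@example.com", "days_since_last_order": 200},
]
export = build_export(profiles, "active_customer")
print(export)
# [{'customer_id': 'c1', 'days_since_last_order': 12, 'segment': 'active_customer'}]
```

Treating unclassified fields as restricted is a deliberate default in this sketch: a field that nobody has classified does not leave the platform.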

Unity Catalog gives CustomerLake a strong governance foundation inside Databricks. The harder, broader challenge is that customer data, and the meaning attached to it, rarely lives in one place.

[Figure: How Databricks CustomerLake works. CustomerLake, an agentic CDP on the lakehouse: sources (CRM, web, app, ad and messaging, warehouses) feed Customer 360 (identity resolution, unified profiles, audiences) and agents (campaign, profile) that run infinity campaigns as a continuous analyze, decide, act loop on real-time customer context. Databricks Unity Catalog provides governance, access control, and classification across the lakehouse. The gap: customer data and its meaning also live in CRM, ad platforms, and other warehouses - govern it once, everywhere.]

The Cross-Platform Gap

CustomerLake unifies and governs customer data well - on the lakehouse. But customer data is famously scattered: a single customer shows up in a CRM, a web analytics tool, an email platform, ad networks, billing, and one or more warehouses. CustomerLake uses Lakehouse Federation and reverse ETL to reach across some of that, yet the meaning attached to a customer - the definitions, the sensitivity, the lineage - still needs to be governed consistently wherever the data lives.

Two gaps matter for an agentic CDP specifically:

  • Meaning that spans systems. A segment definition or a consent flag is only safe to act on automatically if it means the same thing in every system the data touches, not just inside Databricks. Lineage and definitions have to follow the customer across platforms.
  • Governed context for any agent. Marketing agents, and the copilots and assistants around them, need the same governed business meaning - delivered through an open standard, the Model Context Protocol (MCP) - so they act on one consistent understanding of the customer.
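
One way to picture the first gap is a drift check that compares how each system defines the same term. The sketch below assumes definitions have already been collected per system as plain text; the system names, the definitions, and the `find_definition_drift` function are hypothetical.

```python
def find_definition_drift(definitions_by_system):
    """Return terms whose normalized definition differs (or is missing) across systems."""
    terms = set()
    for defs in definitions_by_system.values():
        terms.update(defs)

    drift = {}
    for term in sorted(terms):
        variants = {}  # normalized definition (or None if missing) -> systems using it
        for system, defs in definitions_by_system.items():
            text = defs.get(term)
            normalized = " ".join(text.lower().split()) if text else None
            variants.setdefault(normalized, []).append(system)
        if len(variants) > 1:
            drift[term] = {k: sorted(v) for k, v in variants.items()}
    return drift


# Made-up definitions for illustration.
definitions = {
    "lakehouse": {
        "active_customer": "Ordered in the last 90 days",
        "consent_email": "Opt-in recorded",
    },
    "crm": {
        "active_customer": "ordered in the last  90 days",
        "consent_email": "No opt-out recorded",
    },
    "ad_platform": {
        "active_customer": "Logged in within 30 days",
    },
}
drift = find_definition_drift(definitions)
for term, variants in drift.items():
    print(term, variants)
```

Here `active_customer` drifts because the ad platform uses a different rule, and `consent_email` drifts twice: the CRM treats a missing opt-out as consent, and the ad platform has no definition at all. Comparing text only catches wording differences, so a real check would also need to compare the logic behind each definition.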

Without that, customer context becomes another set of context islands, and an agentic CDP acting in a continuous loop will faithfully act on whatever partial understanding it has.

How Dawiso Fits

Dawiso is the cross-platform context layer that governs the meaning of customer data across the whole estate - with Databricks and CustomerLake as first-class sources inside it. It supplies what agents need to act on customer data safely:

  • Customer concepts defined once. The business glossary defines "active customer," "high-value segment," and consent and lifecycle terms once, connected to the underlying data wherever it lives, so every team and agent shares one definition.
  • Classification and lineage across systems. Classification marks personal and sensitive customer fields, and interactive data lineage traces how customer data flows across CRM, warehouses, and the lakehouse - complementing Unity Catalog, not duplicating it.
  • Served to any agent via open MCP. The context layer delivers this governed context to any MCP-compatible agent through the MCP Server - so marketing agents and every other agent act on the same trustworthy view of the customer.
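
To show what serving governed context over MCP can look like on the wire, here is a minimal sketch of a handler that answers a `tools/call` request from a glossary. The envelope follows the JSON-RPC 2.0 shape MCP uses for tool calls; the tool name `lookup_term`, the glossary contents, and `handle_request` are assumptions for this example and are not Dawiso's MCP Server API.

```python
import json

# Hypothetical governed glossary entries.
GLOSSARY = {
    "active customer": {
        "definition": "Customer with at least one order in the last 90 days.",
        "classification": "internal",
    },
}


def _error(req_id, code, message):
    return json.dumps({"jsonrpc": "2.0", "id": req_id,
                       "error": {"code": code, "message": message}})


def handle_request(raw):
    """Answer one JSON-RPC request; only the hypothetical `lookup_term` tool exists."""
    req = json.loads(raw)
    req_id = req.get("id")
    params = req.get("params", {})
    if req.get("method") != "tools/call":
        return _error(req_id, -32601, "Method not found")
    if params.get("name") != "lookup_term":
        return _error(req_id, -32602, "Unknown tool")

    term = params.get("arguments", {}).get("term", "").strip().lower()
    entry = GLOSSARY.get(term)
    if entry is None:
        text, is_error = f"No governed definition for '{term}'.", True
    else:
        text, is_error = json.dumps(entry), False
    result = {"content": [{"type": "text", "text": text}], "isError": is_error}
    return json.dumps({"jsonrpc": "2.0", "id": req_id, "result": result})


request = json.dumps({
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "lookup_term", "arguments": {"term": "Active Customer"}},
})
response = json.loads(handle_request(request))
print(response["result"]["content"][0]["text"])
```

A production server would normally be built with an MCP SDK, which also handles initialization and tool listing. The point of the sketch is only that every agent asking for "active customer" gets the same governed answer.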

Unity Catalog keeps governing customer data inside Databricks; Dawiso makes sure the definitions, sensitivity, and lineage that make automated action safe are consistent across every system the customer touches.

Conclusion

CustomerLake is Databricks bringing the CDP onto the lakehouse and making it agentic: customer 360, identity resolution, audiences, and activation in one governed system, with agents running continuous infinity campaigns instead of one-off sends. It is a strong move, and it makes one thing clear - when agents act on customer data automatically, the governed meaning behind that data is what decides whether the outcome is right. Customer data and its meaning span far more than one platform, so the discipline that makes an agentic CDP safe is cross-platform context: definitions, classification, and lineage governed once and served to any agent through open MCP. Govern customer data in Databricks, then make that context consistent everywhere the customer lives.
