Databricks Medallion Architecture Best Practices for the Agentic Era (Updated Post-DAIS 2026)

The best practices for Databricks medallion architecture come down to discipline, not design. Silver should handle cleaning, conforming, and typing. Gold should handle anything that requires domain knowledge: joins, aggregations, and business logic. The architecture itself is sound. What breaks it is convenience. Engineers place logic in whatever layer they happen to be working in, and the boundaries erode over time. The fix is structural: use a domain-first Unity Catalog layout, apply the "does it require domain knowledge?" decision rule at every transformation, and run a regular 3-question health check on your Gold layer.

Figure: The medallion architecture as we implement it in production. Bronze for raw ingestion, Silver for cleaned and typed data, Gold for business-ready aggregates, all under Unity Catalog governance. (Diagram by Cosmos Thrace.)

Every Bronze → Silver → Gold pattern we ship in EMEA enterprise engagements follows this structure, with Unity Catalog enforcing access controls and lineage across all three layers.

TL;DR

Medallion architecture doesn't fail fast or loudly. It degrades slowly until maintenance overtakes development.
The Silver vs Gold boundary should be defined by one question: does this transformation require domain knowledge?
Convenience is the primary failure mechanism. "I'm already in Silver, let me do it here" erodes the architecture over time.
Unity Catalog should be structured domain-first, not layer-first. Align catalogs to business domains, not Bronze/Silver/Gold.
Run the 3-question diagnostic regularly: who consumes Gold? Who owns the boundary? When was Gold last updated?
You don't need to rebuild. You need to move the logic to the right place.
After DAIS 2026, your Gold layer is no longer just for dashboards — it is the grounding surface for Genie Ontology and every AI agent you deploy. A degraded Gold layer now degrades your agents, not just your reports.

Introduction

At Data + AI Summit 2026, Ali Ghodsi told 30,000 attendees that AI doesn't have an intelligence problem — it has a context problem. Every headline announcement, from Genie One and Genie Ontology to LTAP and the Unity AI Gateway, traced back to that idea.

Here's what most recaps missed: the context problem starts in your medallion architecture. If business logic is scattered across Silver, reporting tools, and dashboards, no ontology layer can extract clean context from it. The discipline problems we described in this article's first edition just got more expensive.

We have seen this pattern across dozens of data platform implementations. A team sets up Bronze, Silver, and Gold. The initial structure looks clean. Six months later, data engineers spend more time on maintenance than development. Reporting tools point at Silver tables. Gold layers go stale. On paper, it still looks like a medallion architecture. In practice, it is a mess of misplaced logic and blurred boundaries.

This article breaks down the specific practices that prevent that degradation, drawn from what we have observed across real enterprise implementations and from practitioner community discussions that have been growing louder since 2024.

Why Enterprise Medallion Implementations Break Down

The tricky part about medallion architecture is that it is actually quite good. The Bronze/Silver/Gold pattern is intuitive, easy to explain to stakeholders, and maps cleanly to the extract-transform-load paradigm that data teams already understand. That is also what makes it dangerous.

Because the pattern is simple, teams assume the execution will be simple too. It is not. The architecture does not fail because the design is wrong. It fails because convenience overrides discipline.

Here is the pattern we see repeatedly. A data engineer is working in the Silver layer, cleaning and conforming data. They need to add a join that combines two source tables with a business rule. They know it belongs in Gold. But they are already in Silver. The pipeline is already running. "Let me just do it here." One shortcut becomes two. Two becomes ten. Within a few months, Silver contains business logic, Gold is underused, and the reporting layer bypasses Gold entirely.

The result: 70% of compute can end up driven by a single materialized view. Reporting tools query Silver directly. Gold tables go weeks without being updated. Data engineers start doing more maintenance than development.

The community has noticed. Reddit threads like "What the CRAP is the difference between SILVER and GOLD???" reflect genuine confusion. Practitioners debate whether medallion is anything more than marketing rebranding of traditional data warehousing concepts. The frustration is real, but the solution is not to abandon the pattern. It is to enforce the boundaries that make it work.

In 2026 the cost of drift compounds. Genie Ontology, the live context layer announced at DAIS 2026, continuously extracts business meaning from your tables, metrics, lineage, and queries to ground Genie One and Genie Agents. It learns from what it finds. If your business logic lives in Silver, in materialized views, or inside Power BI measures, the ontology inherits that ambiguity — and your agentic coworkers give confidently wrong answers. Boundary discipline is no longer just a maintenance cost issue; it is an AI accuracy issue.

The Silver vs Gold Decision Rule

The single most useful rule we apply is this: does this transformation require domain knowledge?

If yes, it belongs in Gold. If no, it stays in Silver.

Silver handles cleaning, conforming, and typing. These are operations that any engineer can understand without knowing what the business does with the data. Deduplication, null handling, schema enforcement, timestamp standardisation. If a new hire on day one could look at the transformation and understand what it does without asking a single business question, it belongs in Silver.

Gold handles joins, aggregations, and business logic. These are operations that require understanding of how the organisation uses the data. Revenue attribution rules. Customer segmentation logic. Inventory thresholds that trigger alerts. If the transformation requires a conversation with a domain expert to understand why it exists, it belongs in Gold.

This rule eliminates the grey area that causes most medallion failures. Teams no longer debate where logic belongs. They ask one question and place it accordingly.

The "new hire test" makes it practical. Walk through each transformation in your Silver layer. For each one, ask: could someone who joined the team yesterday understand what this does and why? If the answer is no, that logic has drifted into the wrong layer.
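As a rough illustration, the decision rule and the walk-through of Silver can be sketched in a few lines of Python. Everything here is hypothetical: the `Transformation` class, the transformation names, and the `placement` mapping are invented for the example, and the `requires_domain_knowledge` flag is assumed to be set by a human reviewer rather than inferred automatically.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transformation:
    name: str
    # Answer to "does this transformation require domain knowledge?"
    # Assumed to come from a human reviewer, not from the code itself.
    requires_domain_knowledge: bool


def target_layer(t: Transformation) -> str:
    """Apply the decision rule: domain knowledge goes to Gold, the rest to Silver."""
    return "gold" if t.requires_domain_knowledge else "silver"


def find_drift(placement: dict, catalogue: list) -> list:
    """Return names of transformations whose current layer disagrees with the rule."""
    return [t.name for t in catalogue if placement.get(t.name) != target_layer(t)]


# Hypothetical inventory of transformations and where they currently live.
transformations = [
    Transformation("deduplicate_orders", False),
    Transformation("standardise_timestamps", False),
    Transformation("revenue_attribution", True),
    Transformation("customer_segmentation", True),
]
placement = {
    "deduplicate_orders": "silver",
    "standardise_timestamps": "silver",
    "revenue_attribution": "silver",  # the "I'm already in Silver" shortcut
    "customer_segmentation": "gold",
}

drifted = find_drift(placement, transformations)
print(drifted)
```

The value of a sketch like this is not the code but the inventory behind it: once every transformation carries an explicit yes/no answer, drift becomes a list you can review rather than a feeling.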

Where Unity Catalog Metrics Fits

Unity Catalog Metrics (metric views) now lets you define business metrics such as revenue attribution, churn, and margin once, governed in Unity Catalog and consumed identically by Databricks SQL, AI/BI, Genie, Excel, and agents.

This gives Gold a formal home for the "requires domain knowledge" logic that used to leak into BI tools. The decision rule now has three tiers: Silver holds transformations that need no domain knowledge; Gold holds domain logic materialized as tables; metric views hold domain definitions that must stay consistent across every consumer.

Practical guidance: audit your Power BI and Tableau measures. Anything defined in two or more dashboards belongs in a metric view.
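The measure audit can be approximated with a small script once you have exported an inventory of which measures each dashboard defines. The sketch below is a minimal version under stated assumptions: the `inventory` dictionary and the dashboard and measure names are made up, and measures are matched by normalised name only, so two measures with the same name but different formulas would still need a manual look.

```python
from collections import defaultdict


def metric_view_candidates(dashboard_measures: dict, min_dashboards: int = 2) -> dict:
    """Map each measure name to the dashboards defining it, keeping only
    measures that appear in at least `min_dashboards` dashboards."""
    usage = defaultdict(set)
    for dashboard, measures in dashboard_measures.items():
        for measure in measures:
            # Normalise names so "Net Revenue" and "net revenue" count as one measure.
            usage[measure.strip().lower()].add(dashboard)
    return {
        measure: sorted(dashboards)
        for measure, dashboards in sorted(usage.items())
        if len(dashboards) >= min_dashboards
    }


# Hypothetical export: dashboard name -> measures defined inside that dashboard.
inventory = {
    "sales_overview": ["Net Revenue", "Churn Rate"],
    "finance_monthly": ["net revenue", "Gross Margin"],
    "retention": ["Churn Rate"],
}

candidates = metric_view_candidates(inventory)
for measure, dashboards in candidates.items():
    print(measure, "->", dashboards)
```

Measures that show up in this list are the ones most likely to have diverging definitions across dashboards, which is exactly what a single governed definition is meant to prevent.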

How to Structure Unity Catalog for Medallion Architecture

Unity Catalog gives you three structural options for organising your medallion layers. The choice matters more than most teams realise, because it determines how governance, access control, and discoverability scale as your data estate grows.

Option 1: Single catalog. Everything in one catalog, layers separated by schemas. Simple to start with. Does not scale. Access control becomes a nightmare when you have hundreds of tables across business domains.

Option 2: Catalog-per-layer. One catalog for Bronze, one for Silver, one for Gold. This is the most common pattern we see in the wild. It is neat and intuitive. It also creates silos. Cross-layer lineage becomes harder to trace. Teams organise around layers instead of around business outcomes.

Option 3: Domain-first catalog. Catalogs aligned to business domains (finance, supply chain, customer), with layers as schemas within each domain catalog. Harder to set up initially. Scales cleanly. Governance maps to organisational structure. Teams own their domain end-to-end.

We recommend domain-first. Every time.

The reason is that layer-first organisation optimises for the engineering team's view of the world. Domain-first organisation optimises for the business's view of the world. When you align catalog structure to business domains, ownership becomes natural. The finance team owns the finance catalog. Access requests make sense. Data discovery follows the way people actually think about their data, not the way it was technically processed.
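To make the domain-first layout concrete, here is a small Python sketch that generates the catalog and schema statements for it. The domain names and the `monthly_revenue` table are illustrative assumptions; the generated statements use standard Unity Catalog SQL (`CREATE CATALOG` and `CREATE SCHEMA`), but the script only builds strings and does not connect to a workspace.

```python
LAYERS = ("bronze", "silver", "gold")


def domain_first_ddl(domains: list) -> list:
    """Build one catalog per business domain, with one schema per medallion layer."""
    statements = []
    for domain in domains:
        statements.append(f"CREATE CATALOG IF NOT EXISTS {domain};")
        for layer in LAYERS:
            statements.append(f"CREATE SCHEMA IF NOT EXISTS {domain}.{layer};")
    return statements


def table_name(domain: str, layer: str, table: str) -> str:
    """Three-level name under the domain-first layout: domain.layer.table."""
    if layer not in LAYERS:
        raise ValueError(f"unknown layer: {layer}")
    return f"{domain}.{layer}.{table}"


# Hypothetical domains; replace with the domains your organisation actually has.
ddl = domain_first_ddl(["finance", "supply_chain", "customer"])
for statement in ddl:
    print(statement)

print(table_name("finance", "gold", "monthly_revenue"))
```

With this layout the layer is visible in every table name, while grants and ownership can be managed at the catalog level per domain.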

What DAIS 2026 changes:

External lineage (GA): lineage now extends beyond Databricks to upstream sources and downstream BI reports — making the domain-first structure even more valuable, because ownership and lineage finally align end-to-end.
Column-level popularity in Table Insights: a new derived signal showing which columns queries actually read; it also feeds Genie Ontology. Use it in your Gold-consumption audits (it ties directly to Question 1 of the health check).
Multimodal FILE type (Beta): managed Delta and Iceberg tables can now govern PDFs, images, audio, and video. Your Bronze layer definition should expand: "raw ingestion" now includes unstructured files under the same Unity Catalog governance, not a side bucket.
Iceberg v3 GA + Delta interoperability: deletion vectors, row lineage, and VARIANT are now implemented compatibly across Delta and Iceberg, so Parquet data files are shared between formats. Practical implication: the medallion pattern is now format-agnostic — stop debating Delta vs Iceberg per layer; the storage layer is unifying (with Iceberg v4 / Delta 5 metadata convergence targeted for Q4 2026).

The Agentic Data Stack: What DAIS 2026 Means for Each Medallion Layer

Here is what the announcements mean, layer by layer.

Bronze: ingestion goes agentic and real-time. Lakeflow Connect passed 100+ connectors (Salesforce, Workday, NetSuite, SharePoint, Google Analytics). Zerobus (GA) lands high-volume event data directly into Delta tables with no Kafka to manage, offering near real-time writes, with Kafka-compatible APIs in beta. Spark Real-Time Mode (GA) brings millisecond-latency streaming to Spark Declarative Pipelines. Bronze discipline matters more when ingestion is this easy: more connectors mean more raw tables, and more raw tables mean more temptation to shortcut.

Silver: pipelines get built by more people, so the boundary needs more enforcement. Lakeflow Designer (GA) is a natural-language, drag-and-drop pipeline builder aimed at non-engineers; everything it builds compiles to editable Spark Declarative Pipelines. Genie Code assists authoring; Genie ZeroOps (private preview) watches production pipelines, traces failures through Unity Catalog lineage, and tests fixes on shallow clones before a human approves. When business users can build Silver transformations in plain English, the "does it require domain knowledge?" rule must be encoded in PR review and boundary ownership, which reinforces Question 2 of the health check. Lakeflow Jobs can also now trigger on data readiness instead of fixed schedules.

Gold: from dashboard layer to context layer. Genie Ontology grounds Genie One (GA), Genie Agents (GA), and Agent Bricks in your governed data. Unity Catalog Metrics carries business definitions to every consumer. Question 1 of the health check ("who consumes Gold?") gets a 2026 update: the answer should increasingly be agents, and agents are far less forgiving of stale or misplaced logic than analysts.

Serving: the case for a fat Gold layer just got stronger. Lakehouse//RT (Beta), powered by the new Reyden engine, claims sub-100ms latency at 12,000 QPS directly on governed Delta/Iceberg tables, collapsing the separate real-time serving tier (the ClickHouse/Druid/Pinot pattern). LTAP (Lake Transactional/Analytical Processing, coming soon) lets Lakebase transactions and lakehouse analytics share a single copy of data in open formats, with no sync ETL. Fewer copies and fewer serving tiers mean the layer boundaries you do keep, Silver vs Gold above all, carry even more of the architectural weight.

Governance: a fourth concern joins the medallion. Unity AI Gateway is now the control point for AI spend, agent activity, token budgets, and model routing. We mention it only briefly here; it warrants a dedicated article.

The 3-Question Medallion Health Check

Before proposing any changes to a client's medallion architecture, we run a simple diagnostic. Three questions that tell you whether the architecture is healthy or quietly degrading.

Question 1: Who consumes your Gold layer?

Check your query logs. If reporting tools, dashboards, and downstream applications primarily query Silver tables, your Gold layer is not doing its job. In a healthy medallion architecture, Gold is where consumption happens. Silver is an intermediate step, not a destination. If 70% or more of reporting queries hit Silver, the boundary has already eroded. Also check column-level popularity in Table Insights and Genie query logs, not just BI query history. If Genie spaces are grounding on Silver tables, your agents inherit the drift.
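One way to put a number on Question 1 is to compute the share of reporting queries per layer from an export of your query history. The sketch below assumes a domain-first layout, so the layer is the second part of each three-level table name; the `log` list and the table names are invented sample data, and how you export the query history is left open.

```python
from collections import Counter

SILVER_THRESHOLD = 0.70  # the 70% rule of thumb from the health check


def layer_share(queried_tables: list) -> dict:
    """Share of queries per layer, given names shaped like domain.layer.table."""
    counts = Counter(name.split(".")[1] for name in queried_tables)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {layer: count / total for layer, count in counts.items()}


# Hypothetical export: one entry per reporting query, naming the table it read.
log = (
    ["finance.silver.invoices"] * 8
    + ["finance.gold.monthly_revenue"]
    + ["customer.gold.segments"]
)

share = layer_share(log)
eroded = share.get("silver", 0.0) >= SILVER_THRESHOLD
print(share, "boundary eroded:", eroded)
```

Running the same calculation separately for BI tools and for Genie spaces shows whether agents and dashboards are drifting toward Silver at the same rate.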

Question 2: Who owns the Silver/Gold boundary?

If the answer is "everyone" or "no one," you have a problem. The boundary between Silver and Gold is the most important architectural decision in the entire pattern. Someone needs to own it. That means a named person or team who reviews pull requests that add transformations, who decides whether new logic belongs in Silver or Gold, and who flags drift before it compounds. Boundary ownership now includes reviewing Lakeflow Designer pipelines built by non-engineers — a new drift vector.

Question 3: When was your Gold layer last meaningfully updated?

Not a schema change. Not a column rename. A meaningful update: new business logic, new aggregation, new domain-specific transformation. If Gold has not seen a meaningful update in weeks or months, it is likely that business logic is accumulating elsewhere: in Silver, in the application layer, or worse, in the reporting tools themselves.
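Question 3 can be tracked with a simple staleness measure, provided changes to Gold are labelled by kind. The sketch below is a minimal version under that assumption: the change-kind labels, the `changes` list, and the dates are hypothetical, and in practice the labels would have to come from commit messages or PR tags that your team maintains.

```python
from datetime import date
from typing import Optional

# Change kinds that count as "meaningful" per the definition above (labels are hypothetical).
MEANINGFUL = {"new_business_logic", "new_aggregation", "new_domain_transformation"}


def days_since_meaningful_update(changes: list, today: date) -> Optional[int]:
    """Days since the most recent meaningful change, or None if there has never been one."""
    meaningful_dates = [changed_on for changed_on, kind in changes if kind in MEANINGFUL]
    if not meaningful_dates:
        return None
    return (today - max(meaningful_dates)).days


# Hypothetical change history for one Gold schema: (date, kind of change).
changes = [
    (date(2026, 1, 10), "new_aggregation"),
    (date(2026, 3, 2), "column_rename"),   # not meaningful
    (date(2026, 3, 20), "schema_change"),  # not meaningful
]

staleness = days_since_meaningful_update(changes, today=date(2026, 4, 10))
print(staleness)
```

Recent renames and schema changes do not reset the clock here, which matches the distinction the question draws between activity and meaningful change.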

Good answers to all three questions mean your medallion architecture is working as designed. Bad answers to any one of them mean the quiet failure is already underway.

The Cosmos Thrace Perspective

We have seen teams walk into organisations and propose roadmaps and big restructuring projects without first understanding the current situation. That is not how we work.

What we have learned from implementing data platforms across Europe is that medallion architecture is not very different from traditional data warehousing. The layers map to staging, integration, and presentation. The concepts are proven. What breaks is not the architecture. What breaks is the discipline to maintain boundaries when it is inconvenient.

The most common mistake is not a bad technical decision. It is a cultural one. Teams that treat the Silver/Gold boundary as a suggestion rather than a contract will always end up with a degraded architecture, regardless of what tools they use.

The good news is that you do not have to rebuild. In most cases, the data is there, the pipelines are there, and the infrastructure is sound. You just have to move the logic to the right place. That is a refactoring project, not a migration project. It is weeks, not months.

We run the 3-question diagnostic at the start of every engagement where medallion architecture is involved. It takes thirty minutes. It tells us whether the architecture needs a tune-up or a structural intervention. That clarity saves our clients time and money before a single line of code is written.

Conclusion

Medallion architecture works when the boundaries are enforced. It fails when convenience overrides discipline. The practices that matter are not complex: use the domain knowledge rule to place logic correctly, structure Unity Catalog around business domains, and run the 3-question health check regularly.

Medallion architecture was designed for analytics. In 2026 it is also the foundation of the agentic data stack: Genie Ontology, Genie One, and every agent you deploy will only ever be as trustworthy as your Gold layer. The boundaries you enforce today are the context your agents inherit tomorrow.

If your medallion architecture feels like it is creating more work than it saves, the problem is likely not the pattern. It is where the logic lives. Download our free medallion architecture guide from here.

You can also read the latest feedback from our Managing Director, Idan Harel, on his visit to DAIS 2026 here.

Cosmos Thrace is a Databricks Partner that specializes in exactly this kind of diagnostic and remediation work. If you want to talk through your medallion architecture, reach out at hi@cosmosthrace.com.

Book a 30-minute Databricks readiness review with one of our senior engineers. No pitch deck. We'll look at where you are, where you want to be, and the fastest path between the two.

FAQ

What people ask about Medallion Architecture

What is the difference between Silver and Gold in medallion architecture?

Silver handles cleaning, conforming, and typing. These are generic data quality operations that do not require business context. Gold handles joins, aggregations, and business logic that require domain knowledge. The simplest test: if a new hire cannot understand the transformation without asking a business question, it belongs in Gold.

How do I know if my medallion architecture is broken?

Check three things. First, look at your query logs. If most reporting queries hit Silver instead of Gold, the boundary has eroded. Second, check who owns the Silver/Gold boundary. If no one has explicit ownership, drift is inevitable. Third, check when Gold was last meaningfully updated. Stale Gold layers indicate business logic is accumulating elsewhere.

Should I use one Unity Catalog or multiple catalogs for medallion architecture?

We recommend a domain-first catalog structure, where each catalog aligns to a business domain (finance, supply chain, customer) with layers as schemas within each domain. This scales better than a single catalog or catalog-per-layer approach because governance maps to organisational structure and teams own their domain end-to-end.

Is medallion architecture just traditional data warehousing?

The concepts are closely related. Bronze maps to staging, Silver to integration, Gold to presentation. The medallion pattern adds the lakehouse context: schema evolution, Delta format, Unity Catalog governance, and streaming capabilities. But the discipline required is the same discipline that made traditional data warehousing work.

How long does it take to fix a broken medallion architecture?

In most cases, you do not need to rebuild. The data and infrastructure are already in place. The fix is moving misplaced logic from Silver to Gold. This is a refactoring project, not a migration. Depending on the size of the estate, it takes weeks rather than months. The 3-question diagnostic takes about thirty minutes and tells you the scope.

What is the most common cause of medallion architecture failure?

Convenience. Engineers place business logic in Silver because they are already working there. One shortcut compounds into ten. The architecture degrades slowly rather than failing dramatically. This is why explicit boundary ownership and regular health checks matter more than the initial design.

Sources

Databricks Data+AI Summit 2024 — "What's Wrong with Medallion Architecture" session. Acknowledged that companies regret the layering of their lake when boundaries are not maintained.
Reddit r/databricks — Community discussion: "What the CRAP is the difference between SILVER and GOLD???" Thread reflects widespread practitioner confusion about layer boundaries.
Daniel Beach — "This so called 'Medallion Architecture' is NOTHING more than Marketing Speak." Practitioner blog critique comparing medallion to traditional data warehousing patterns.
Databricks documentation — Unity Catalog best practices for multi-layer data architectures. Covers catalog, schema, and table organisation patterns.
Cosmos Thrace implementation experience — Patterns observed across dozens of data platform implementations in Financial Services, Manufacturing, and Retail across Europe.

More in this series: executive data roadmaps for the Databricks lakehouse, how Databricks lakehouse architecture is evolving, the CTO’s guide to Databricks SQL lakehouse analytics, enterprise AI agents on the lakehouse, and what to do when a lakehouse implementation stalls.
