Data Architecture
20 November 2025 · 5 min read

Why Medallion Architecture Works

Exploring the bronze, silver, gold pattern and why it's becoming the standard for modern data platforms.

Arc Horizon Team


The medallion architecture — bronze, silver, gold — has become the de facto standard for organising data in modern platforms. But why has this particular pattern caught on so effectively? And more importantly, when does it actually work?

The Problem It Solves

Traditional data architectures often suffer from a fundamental tension: the need for data to be both raw (for auditability and reprocessing) and refined (for consumption and analysis).

Historically, organisations attempted to solve this by either:

  • Keeping everything raw and transforming on the fly (expensive, slow)
  • Transforming once and discarding the source (brittle, inflexible)
  • Creating sprawling ETL pipelines with unclear lineage (unmaintainable)

The medallion architecture provides a structured approach that addresses all three concerns.

The Three Layers Explained

Bronze: The Ingestion Layer

The bronze layer is your system of record. Data lands here in its raw form, preserving:

  • Original schema and data types
  • Ingestion timestamps and metadata
  • Source system identifiers
  • Historical records (append-only or slowly changing)

Key Principle

Bronze data should be immutable. Once ingested, it shouldn't change. This gives you the ability to reprocess and rebuild downstream layers without losing fidelity.
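As a minimal sketch of this principle, the following appends raw records alongside ingestion metadata without ever touching the payload itself. The function and field names (`ingest_to_bronze`, `_ingested_at`, `_source_system`, `_batch_id`) are illustrative, not part of any particular platform's API:

```python
from datetime import datetime, timezone

def ingest_to_bronze(bronze: list, records: list, source: str, batch_id: str) -> list:
    """Append raw records with ingestion metadata; never mutate existing rows."""
    now = datetime.now(timezone.utc).isoformat()
    for rec in records:
        bronze.append({
            **rec,                      # raw payload, preserved exactly as received
            "_ingested_at": now,        # when we received it
            "_source_system": source,   # where it came from
            "_batch_id": batch_id,      # which load produced it
        })
    return bronze

bronze = []
ingest_to_bronze(bronze, [{"id": 1, "email": " A@X.COM "}], source="crm", batch_id="b001")
```

Note that the messy email value survives untouched; cleansing is deliberately deferred to silver.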

Silver: The Conformance Layer

Silver is where data becomes enterprise-ready. This layer handles:

  • Data type standardisation
  • Deduplication and entity resolution
  • Business key alignment
  • Basic quality checks and filtering
  • Joining related entities from different sources

The silver layer is often where you'll find your conformed dimensions and fact tables — the building blocks of analytics.
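A toy sketch of the cleanse-and-conform step, assuming bronze rows carry an `_ingested_at` timestamp and using email as a hypothetical business key: standardise the value, filter out obvious bad rows, and keep the latest version of each entity.

```python
def to_silver(bronze_rows: list) -> list:
    """Standardise types, normalise the business key, and deduplicate."""
    seen = {}
    for row in bronze_rows:
        email = str(row.get("email", "")).strip().lower()  # standardise
        if not email:
            continue                                       # basic quality filter
        # entity resolution: keep the most recently ingested version
        if email not in seen or row["_ingested_at"] > seen[email]["_ingested_at"]:
            seen[email] = {"customer_email": email, "_ingested_at": row["_ingested_at"]}
    return list(seen.values())

bronze_rows = [
    {"email": " A@X.COM ", "_ingested_at": "2025-11-19T00:00:00"},
    {"email": "a@x.com",   "_ingested_at": "2025-11-20T00:00:00"},
    {"email": "",          "_ingested_at": "2025-11-20T00:00:00"},
]
silver = to_silver(bronze_rows)
```

Because the logic reads only from bronze, it can be re-run from scratch whenever the rules change.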

Gold: The Consumption Layer

Gold is purpose-built for specific use cases. Typical examples include:

  • Aggregated datasets for dashboards
  • Feature stores for machine learning
  • API-ready data products
  • Department-specific data marts

Gold tables are denormalised, pre-aggregated, and optimised for read performance.
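A sketch of a gold-style pre-aggregation, assuming silver order rows with an ISO `order_ts` and an `amount` field (both hypothetical names): totals are computed once at build time so dashboard reads are a simple lookup.

```python
from collections import defaultdict

def build_gold_daily_revenue(silver_orders: list) -> dict:
    """Pre-aggregate order amounts by calendar day for fast dashboard reads."""
    totals = defaultdict(float)
    for order in silver_orders:
        day = order["order_ts"][:10]   # YYYY-MM-DD prefix of the ISO timestamp
        totals[day] += order["amount"]
    return dict(totals)

silver_orders = [
    {"order_ts": "2025-11-20T09:00:00", "amount": 40.0},
    {"order_ts": "2025-11-20T17:30:00", "amount": 60.0},
    {"order_ts": "2025-11-21T08:15:00", "amount": 25.0},
]
gold = build_gold_daily_revenue(silver_orders)
```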

Why This Pattern Works

1. Clear Separation of Concerns

Each layer has a single responsibility:

  • Bronze: capture and preserve
  • Silver: cleanse and conform
  • Gold: aggregate and serve

This makes debugging straightforward. When something goes wrong, you know exactly where to look.

2. Incremental Reprocessing

Because each layer builds on the previous, you can:

  • Reprocess silver from bronze when business rules change
  • Rebuild gold without touching upstream data
  • Add new gold layers without modifying silver

3. Multiple Consumption Patterns

Different consumers have different needs:

| Consumer | Typical Layer | Reason |
|----------|---------------|--------|
| Data Scientists | Bronze/Silver | Need granular, historical data |
| Analysts | Gold | Need pre-aggregated, fast queries |
| ML Engineers | Silver/Gold | Need clean features at scale |
| Compliance | Bronze | Need immutable audit trail |

4. Technology Agnostic

The pattern works equally well on:

  • Databricks with Delta Lake
  • Snowflake with dynamic tables
  • AWS with Glue and Athena
  • Azure with Synapse and Data Lake

Common Pitfalls to Avoid

Over-Engineering Bronze

Bronze should be simple. Don't add complex transformations here — that defeats the purpose of having raw data.

Do: Add ingestion metadata (timestamps, source system, batch ID)

Don't: Apply business logic, filtering, or complex parsing

Skipping Silver

Some teams try to go directly from bronze to gold. This leads to:

  • Duplicated transformation logic across gold tables
  • Inconsistent business rules
  • Harder debugging and maintenance

Too Many Gold Tables

Gold should be purposeful. If you have 500 gold tables, you've likely:

  • Created tables for one-off analyses that should be views
  • Duplicated logic that belongs in silver
  • Lost control of your data catalogue

A good rule of thumb: you should have 5-10x more silver tables than gold tables. Gold is for repeated, high-value use cases only.

Implementing Medallion Architecture

Start Small

Don't try to migrate everything at once. Pick a single domain — perhaps customer data or transactions — and build out the full bronze-silver-gold pipeline.

Invest in Lineage

Tools like dbt make it straightforward to document and visualise the relationships between layers. This lineage is essential for debugging and governance.

Automate Quality Checks

Each layer transition should include data quality checks:

  • Bronze → Silver: Schema validation, null checks, type casting
  • Silver → Gold: Business rule validation, referential integrity
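A minimal sketch of a bronze-to-silver gate, assuming rows are plain dicts; the function name and required fields are illustrative. The idea is to fail fast, collecting every violation before any row is promoted:

```python
def check_bronze_to_silver(rows: list, required=("id", "email")) -> list:
    """Return (row_index, field) pairs for required fields that are missing or null."""
    failures = []
    for i, row in enumerate(rows):
        for field in required:
            if row.get(field) in (None, ""):
                failures.append((i, field))
    return failures

rows = [{"id": 1, "email": "a@x.com"}, {"id": 2, "email": None}]
failures = check_bronze_to_silver(rows)
```

In practice this role is usually filled by a framework such as dbt tests or Great Expectations rather than hand-rolled checks.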

Plan for Evolution

Your silver and gold schemas will change. Plan for this by:

  • Using schema evolution features in your platform
  • Versioning your transformation logic
  • Maintaining backward compatibility where possible
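One way to version transformation logic, sketched here with a hypothetical decorator-based registry: old versions stay callable, so historical gold tables can be rebuilt exactly as they were originally produced.

```python
# Registry of versioned silver transforms; superseded versions are kept,
# not deleted, so past outputs remain reproducible.
TRANSFORMS = {}

def transform(version: str):
    def register(fn):
        TRANSFORMS[version] = fn
        return fn
    return register

@transform("v1")
def clean_v1(row: dict) -> dict:
    return {"email": row["email"].strip()}

@transform("v2")
def clean_v2(row: dict) -> dict:
    return {"email": row["email"].strip().lower()}  # new business rule: lowercase

row = {"email": " A@X.COM "}
old, new = TRANSFORMS["v1"](row), TRANSFORMS["v2"](row)
```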

When Medallion Isn't the Answer

No architecture fits every situation. Consider alternatives when:

  • Real-time requirements dominate: Streaming architectures may be more appropriate
  • Data volumes are tiny: The overhead may not be justified
  • Single source, single use case: A simpler staging → production model may suffice

Conclusion

The medallion architecture has become standard because it elegantly solves the core tension in data management: preserving raw data whilst making it consumable.

By providing clear layer boundaries, enabling incremental processing, and supporting multiple consumption patterns, it gives organisations a flexible foundation for their data platforms.

The key is implementation discipline — keeping bronze simple, investing in silver, and being purposeful about gold.

