Lakehouse Medallion Architecture

The medallion architecture (also known as multi-hop architecture) is a data design pattern that organizes data into layers of increasing quality and refinement. BimlFlex supports this pattern by mapping its integration stages to each layer, as outlined below.

Architecture Layers

Bronze Layer (Raw)

The Bronze layer contains raw data as received from source systems with minimal transformation. Data is typically:

  • Appended incrementally
  • Retained in its original format
  • Augmented with metadata (load timestamps, source identifiers)

BimlFlex Implementation: Staging and Persistent Staging Area
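
The Bronze characteristics listed above can be illustrated with a minimal Python sketch. It is not BimlFlex-generated code: the function `to_bronze` and the metadata column names `_load_timestamp` and `_record_source` are hypothetical, chosen only to show raw records being appended unchanged with load metadata attached.

```python
from datetime import datetime, timezone


def to_bronze(records, source_system, bronze_table, load_ts=None):
    """Append raw records unchanged, adding load metadata columns.

    All names here are illustrative assumptions, not BimlFlex conventions.
    """
    load_ts = load_ts or datetime.now(timezone.utc)
    for record in records:
        row = dict(record)  # copy so the source record is not mutated
        row["_load_timestamp"] = load_ts.isoformat()  # when the row arrived
        row["_record_source"] = source_system         # where it came from
        bronze_table.append(row)                      # append only, no updates
    return bronze_table


# Two incremental loads of the same customer: both versions are retained as received.
bronze = []
to_bronze([{"customer_id": "C001", "name": " Alice "}], "crm", bronze)
to_bronze([{"customer_id": "C001", "name": "Alice B."}], "crm", bronze)
```

Note that the untrimmed value `" Alice "` is kept as received; cleansing is deferred to the Silver layer.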

Silver Layer (Curated)

The Silver layer contains cleansed, conformed, and integrated data. BimlFlex supports two approaches:

  1. Data Vault (Recommended)

    • Hub, Link, and Satellite patterns
    • Maximum flexibility and auditability
    • Ideal for complex enterprise scenarios
    • See: Data Vault Documentation
  2. Normal Form

    • Traditional 3NF relational modeling
    • Suitable for simpler use cases
    • Familiar to teams with RDBMS backgrounds
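
To make the Data Vault option more concrete, the following is a rough Python sketch of loading one Hub and one Satellite from Bronze-style rows. It assumes SHA-256 hash keys over trimmed, upper-cased business keys and a hash diff over the descriptive attribute; these choices, and the names `digest` and `load_customer`, are illustrative assumptions and do not describe how BimlFlex generates its Data Vault loads.

```python
import hashlib


def digest(*parts):
    """Hash trimmed parts into a deterministic hex string (illustrative choice)."""
    joined = "|".join(str(p).strip() for p in parts)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def load_customer(row, hub, satellite):
    """Load one row into a hypothetical customer Hub and Satellite."""
    business_key = row["customer_id"].strip().upper()
    hub_key = digest(business_key)

    # Hub: one row per distinct business key; later loads do not overwrite it.
    hub.setdefault(
        hub_key,
        {"customer_id": business_key, "record_source": row["_record_source"]},
    )

    # Satellite: add a new version only when the descriptive attributes change.
    hash_diff = digest(row["name"])
    history = satellite.setdefault(hub_key, [])
    if not history or history[-1]["hash_diff"] != hash_diff:
        history.append(
            {
                "name": row["name"].strip(),
                "hash_diff": hash_diff,
                "load_timestamp": row["_load_timestamp"],
            }
        )


hub, satellite = {}, {}
rows = [
    {"customer_id": "c001", "name": "Alice", "_record_source": "crm",
     "_load_timestamp": "2024-01-01T00:00:00+00:00"},
    {"customer_id": "C001 ", "name": " Alice ", "_record_source": "crm",
     "_load_timestamp": "2024-01-02T00:00:00+00:00"},
    {"customer_id": "C001", "name": "Alice B.", "_record_source": "crm",
     "_load_timestamp": "2024-01-03T00:00:00+00:00"},
]
for row in rows:
    load_customer(row, hub, satellite)
```

In this sketch the three rows collapse to a single Hub entry, and the Satellite keeps two versions because the second row differs from the first only in whitespace and casing of the key.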

Gold Layer (Business-Ready)

The Gold layer contains aggregated, business-ready data optimized for analytics and reporting:

  • Dimensional models (Star Schema)
  • Fact and Dimension tables
  • Pre-aggregated metrics
  • Business-specific data marts

BimlFlex Implementation: Data Mart Documentation
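
As a small illustration of a Gold-layer structure, the sketch below builds a tiny star schema and a pre-aggregated table using Python's built-in `sqlite3` module. SQLite stands in for the target platform only so the example is runnable anywhere; the table names `dim_product`, `fact_sales`, and `agg_sales_by_category` and the sample rows are made up for the example.

```python
import sqlite3

# In-memory database used purely for demonstration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension table: descriptive attributes keyed by a surrogate key.
    CREATE TABLE dim_product (
        product_key INTEGER PRIMARY KEY,
        category    TEXT NOT NULL
    );

    -- Fact table: measures that reference the dimension.
    CREATE TABLE fact_sales (
        product_key INTEGER NOT NULL REFERENCES dim_product (product_key),
        amount      REAL NOT NULL
    );

    INSERT INTO dim_product VALUES (1, 'Bikes'), (2, 'Helmets');
    INSERT INTO fact_sales VALUES (1, 500.0), (1, 250.0), (2, 40.0);

    -- Pre-aggregated metric table for reporting queries.
    CREATE TABLE agg_sales_by_category AS
    SELECT d.category,
           SUM(f.amount) AS total_amount,
           COUNT(*)      AS sale_count
    FROM fact_sales AS f
    JOIN dim_product AS d ON d.product_key = f.product_key
    GROUP BY d.category;
    """
)

result = conn.execute(
    "SELECT category, total_amount, sale_count "
    "FROM agg_sales_by_category ORDER BY category"
).fetchall()
```

Reporting tools can then read `agg_sales_by_category` directly instead of re-joining the fact and dimension tables on every query.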

Benefits of Medallion Architecture

Benefit                 Description
Separation of Concerns  Each layer has a distinct purpose and SLA
Data Quality            Progressive refinement improves quality at each stage
Flexibility             Raw data retention enables reprocessing and new use cases
Performance             Gold layer optimized for query performance
Governance              Clear lineage from source to consumption

Platform Considerations

Microsoft Fabric Lakehouse

  • Uses OneLake for unified storage
  • Delta tables for all layers
  • Native integration with Power BI for Gold layer consumption

Databricks

  • Unity Catalog for governance
  • Delta Lake format throughout
  • Supports both SQL and notebook-based processing

Snowflake

  • Native Snowflake tables
  • Zero-copy cloning for development
  • Automatic clustering optimization