Lakehouse Medallion Architecture
The medallion architecture (also known as multi-hop architecture) is a data design pattern that organizes data into layers of increasing quality and refinement. BimlFlex supports this pattern by mapping its integration stages to the Bronze, Silver, and Gold layers described below.
Architecture Layers
Bronze Layer (Raw)
The Bronze layer contains raw data as received from source systems with minimal transformation. Data is typically:
- Appended incrementally
- Retained in its original format
- Augmented with metadata (load timestamps, source identifiers)
BimlFlex Implementation: Staging and Persistent Staging Area
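The Bronze characteristics listed above can be illustrated with a minimal Python sketch. It is not BimlFlex-generated code: the function name `to_bronze`, the column names `load_timestamp` and `record_source`, and the in-memory list standing in for a Bronze table are all assumptions made for illustration.

```python
from datetime import datetime, timezone


def to_bronze(records, source_system, load_ts=None):
    """Wrap raw source records with load metadata, leaving the payload untouched."""
    load_ts = load_ts or datetime.now(timezone.utc)
    return [
        {
            "payload": dict(record),               # original fields, copied as received
            "load_timestamp": load_ts.isoformat(),  # when the batch was loaded
            "record_source": source_system,         # which system it came from
        }
        for record in records
    ]


# A plain list stands in for an append-only Bronze table (hypothetical).
bronze_table = []

batch = [{"customer_id": "42", "name": " Ada "}]
bronze_table.extend(to_bronze(batch, source_system="crm"))
```

Note that the untrimmed value `" Ada "` is stored exactly as received. Cleansing is deferred to the Silver layer so the raw data stays available for reprocessing.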
Silver Layer (Curated)
The Silver layer contains cleansed, conformed, and integrated data. BimlFlex supports two approaches:
- Data Vault (Recommended)
  - Hub, Link, and Satellite patterns
  - Maximum flexibility and auditability
  - Ideal for complex enterprise scenarios
  - See: Data Vault Documentation
- Normal Form
  - Traditional 3NF relational modeling
  - Suitable for simpler use cases
  - Familiar to teams with RDBMS backgrounds
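To make the Hub and Satellite patterns more concrete, the following Python sketch splits one raw customer row into a Hub record (the business key) and a Satellite record (its descriptive attributes). This is a simplified illustration rather than the code BimlFlex generates: SHA-256, the `|` delimiter, the trim-and-uppercase key normalization, and every column name here are assumptions.

```python
import hashlib


def hash_key(*values):
    """Hash normalized values into a deterministic key (illustrative convention)."""
    normalized = "|".join(str(v).strip().upper() for v in values)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def split_customer(raw_row):
    """Split one raw customer row into a Hub record and a Satellite record."""
    payload = raw_row["payload"]
    customer_hk = hash_key(payload["customer_id"])

    hub = {
        "customer_hk": customer_hk,
        "customer_id": payload["customer_id"].strip(),  # business key
        "load_timestamp": raw_row["load_timestamp"],
        "record_source": raw_row["record_source"],
    }

    attributes = {"name": payload["name"].strip()}       # cleansed descriptive data
    satellite = {
        "customer_hk": customer_hk,                      # points back to the Hub
        "load_timestamp": raw_row["load_timestamp"],
        "hash_diff": hash_key(*attributes.values()),     # used to detect attribute changes
        **attributes,
    }
    return hub, satellite


# Hypothetical raw row, shaped like a record with load metadata attached.
raw_row = {
    "payload": {"customer_id": " 42 ", "name": " Ada "},
    "load_timestamp": "2024-01-01T00:00:00+00:00",
    "record_source": "crm",
}
hub, satellite = split_customer(raw_row)
```

Because the key is derived from the normalized business key, the same customer arriving from different batches resolves to the same `customer_hk`, while a changed `hash_diff` signals that a new Satellite row should be written.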
Gold Layer (Business-Ready)
The Gold layer contains aggregated, business-ready data optimized for analytics and reporting:
- Dimensional models (Star Schema)
- Fact and Dimension tables
- Pre-aggregated metrics
- Business-specific data marts
BimlFlex Implementation: Data Mart Documentation
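As a rough illustration of how a fact table, a dimension table, and a pre-aggregated metric relate, the sketch below rolls fact rows up by a dimension attribute. The table contents, the `customer_key` surrogate key, and the `revenue_by_region` function are hypothetical. In practice this aggregation would run as SQL or a notebook on the target platform, not over in-memory Python structures.

```python
from collections import defaultdict

# Dimension table keyed by surrogate key (hypothetical sample data).
dim_customer = {
    1: {"customer_name": "Ada", "region": "EMEA"},
    2: {"customer_name": "Grace", "region": "AMER"},
}

# Fact table: one row per sale, referencing the dimension by key.
fact_sales = [
    {"customer_key": 1, "amount": 120.0},
    {"customer_key": 1, "amount": 80.0},
    {"customer_key": 2, "amount": 50.0},
]


def revenue_by_region(facts, customers):
    """Join facts to the customer dimension and sum the amount per region."""
    totals = defaultdict(float)
    for row in facts:
        region = customers[row["customer_key"]]["region"]
        totals[region] += row["amount"]
    return dict(totals)


sales_by_region = revenue_by_region(fact_sales, dim_customer)
# sales_by_region == {"EMEA": 200.0, "AMER": 50.0}
```

Storing a result like `sales_by_region` as its own table is what "pre-aggregated" means here: reports read the small summary instead of scanning every fact row.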
Benefits of Medallion Architecture
| Benefit | Description |
|---|---|
| Separation of Concerns | Each layer has a distinct purpose and SLA |
| Data Quality | Progressive refinement improves quality at each stage |
| Flexibility | Raw data retention enables reprocessing and new use cases |
| Performance | Gold layer optimized for query performance |
| Governance | Clear lineage from source to consumption |
Platform Considerations
Microsoft Fabric Lakehouse
- Uses OneLake for unified storage
- Delta tables for all layers
- Native integration with Power BI for Gold layer consumption
Databricks
- Unity Catalog for governance
- Delta Lake format throughout
- Supports both SQL and notebook-based processing
Snowflake
- Native Snowflake tables
- Zero-copy cloning for development
- Automatic clustering optimization