Bronze Layer - Raw Data Ingestion
The Bronze layer is the foundation of the medallion architecture, capturing raw data exactly as received from source systems. In BimlFlex, this maps to the Staging and Persistent Staging Area integration stages.
Bronze Layer Characteristics
- Raw Format: Data stored as received with minimal transformation
- Append-Only: New data appended without modifying existing records
- Full History: Complete record of all data ever received
- Metadata Enriched: Augmented with load timestamps and source identifiers
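The append-only and metadata-enriched characteristics can be illustrated with a minimal Python sketch. This is not BimlFlex-generated code: the names `BronzeRecord`, `append_to_bronze`, `load_timestamp`, and `record_source` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Dict, Iterable, List, Optional


@dataclass(frozen=True)
class BronzeRecord:
    """One raw row plus the load metadata attached at ingestion (hypothetical shape)."""
    payload: Dict[str, Any]      # source row, stored as received
    load_timestamp: datetime     # when this batch was loaded
    record_source: str           # identifier of the originating system


def append_to_bronze(
    bronze: List[BronzeRecord],
    rows: Iterable[Dict[str, Any]],
    record_source: str,
    load_timestamp: Optional[datetime] = None,
) -> List[BronzeRecord]:
    """Append rows to the Bronze list without touching existing records."""
    ts = load_timestamp or datetime.now(timezone.utc)
    for row in rows:
        # Copy the row so later changes to the caller's dict cannot alter history.
        bronze.append(BronzeRecord(dict(row), ts, record_source))
    return bronze
```

Loading the same business key twice yields two records, because earlier rows are never updated in place.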
BimlFlex Implementation
Landing Area
For cloud-based sources, data first lands in a Landing Area whose location depends on the target platform:
- OneLake (Microsoft Fabric)
- Azure Blob Storage / ADLS Gen2 (Databricks)
- Snowflake stages (Snowflake)
Staging Area
The Staging Area holds the current batch of data being processed:
- Truncated with each load cycle
- Used for delta detection
- Temporary holding area before downstream processing
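As a rough illustration of delta detection against a staged batch, the sketch below compares a hash of each staged row with the last hash seen for its key. Hash comparison is one common approach and is assumed here for illustration; this section does not describe how BimlFlex implements delta detection, and `row_hash`, `detect_delta`, and the `id` key column are hypothetical.

```python
import hashlib
import json
from typing import Any, Dict, Hashable, List


def row_hash(row: Dict[str, Any]) -> str:
    """Hash a row deterministically by serializing it with sorted keys."""
    canonical = json.dumps(row, sort_keys=True, default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def detect_delta(
    staging_rows: List[Dict[str, Any]],
    known_hashes: Dict[Hashable, str],
    key: str = "id",
) -> List[Dict[str, Any]]:
    """Return staged rows that are new or changed compared with known_hashes.

    known_hashes maps each business key to the hash of the last row seen for it.
    """
    delta = []
    for row in staging_rows:
        if known_hashes.get(row[key]) != row_hash(row):
            delta.append(row)
    return delta
```

Only the rows returned by `detect_delta` would need to be passed downstream; unchanged rows in the staged batch are skipped.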
Persistent Staging Area (PSA)
The Persistent Staging Area provides historical retention:
- Retains all data ever received
- Enables reprocessing and recovery
- Supports full audit trail requirements
Tip: For detailed configuration of staging layers, see the Delivering Staging Layer documentation.
Configuration Options
BimlFlex provides two history modes for the Bronze layer:
| Mode | Description | Use Case |
|---|---|---|
| Full History (PSA) | Retains all changes over time | Audit requirements, reprocessing needs |
| Current State (ODS) | Retains only latest version | Operational reporting, reduced storage |
Configure the history mode via the Persist History setting on source connections.
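The behavioral difference between the two modes can be sketched in a few lines of Python. This is a conceptual model only, not BimlFlex output: `HistoryMode`, `apply_batch`, and the `id` key column are hypothetical names.

```python
from enum import Enum
from typing import Any, Dict, List


class HistoryMode(Enum):
    FULL_HISTORY = "psa"    # keep every version received
    CURRENT_STATE = "ods"   # keep only the latest version per key


def apply_batch(
    table: List[Dict[str, Any]],
    batch: List[Dict[str, Any]],
    mode: HistoryMode,
    key: str = "id",
) -> List[Dict[str, Any]]:
    """Return the table contents after loading one batch under the given mode."""
    if mode is HistoryMode.FULL_HISTORY:
        # Append-only: existing rows are preserved, new versions are added.
        return table + [dict(row) for row in batch]
    # Current state: the newest row for each key replaces the previous one.
    current = {row[key]: row for row in table}
    for row in batch:
        current[row[key]] = dict(row)
    return list(current.values())
```

Loading two versions of the same key leaves two rows under `FULL_HISTORY` and one row under `CURRENT_STATE`, which is the storage trade-off the table describes.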
Platform-Specific Considerations
Microsoft Fabric
- Land data in OneLake Files area
- Load to Bronze Delta tables
- Use shortcuts for external data sources
Databricks
- Land data in Azure Blob Storage or ADLS Gen2 as Parquet files
- Load to Delta tables in Bronze schema
- Leverage Auto Loader for streaming ingestion
Snowflake
- Stage data in internal/external stages
- Load to Bronze schema tables
- Use Snowpipe for continuous ingestion