Implementing Fabric Lakehouse Using Data Factory
BimlFlex provides an intuitive process for implementing Microsoft Fabric Lakehouse using Fabric Data Factory (FDF) in cloud-based data warehousing solutions. This integration extends metadata-driven automation to the Fabric ecosystem, enabling customers to design and generate Lakehouse solutions directly within the platform.
Architecture Overview
BimlFlex uses Data Factory copy commands to ingest and land (stage) source data in OneLake, Azure Blob Storage, or Data Lake Storage Gen2. BimlFlex provides logic to map the resulting files so that the generated code can load the data into Fabric Lakehouse tables.
Fabric Lakehouse Features
Microsoft Fabric Lakehouse support in BimlFlex provides capabilities similar to the Databricks and Snowflake integrations:
Metadata Import Support
Schema and object metadata can be imported directly from Fabric Lakehouse to drive automation patterns. This reduces manual mapping and accelerates the creation of Data Vault and Data Mart solutions based on existing Lakehouse structures.
To import metadata from a source:
- Navigate to the source connection in the Connections editor
- Click the Preview button to view available objects
- Select the tables you want to import (you can import thousands of tables at once)
- BimlFlex imports the metadata and makes it available for modeling
Data Vault Templates
New templates generate hubs, links, and satellites within Fabric Lakehouse. These accelerate Data Vault implementation on Fabric and ensure consistent, metadata-driven automation across the platform.
Supported Data Vault constructs include:
- Hubs: Core business entities
- Links: Relationships between hubs
- Satellites: Descriptive attributes with history
PIT (Point-in-Time) tables and Bridge tables are not yet supported for Fabric Lakehouse. These constructs are available when using other integration templates such as ADF with SQL Server or Snowflake targets.
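The hub, link, and satellite patterns above all key on hash values derived from source business keys. As an illustration only (this is not BimlFlex's generated code, and the delimiter and normalization rules are assumptions), a minimal Python sketch of that derivation:

```python
import hashlib

def hash_key(*business_keys: str, delimiter: str = "~") -> str:
    """Derive a deterministic hash key from one or more business keys.
    Values are trimmed and uppercased so equivalent inputs hash identically."""
    normalized = delimiter.join(k.strip().upper() for k in business_keys)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

# A hub row keys on a single business key; a link row keys on the
# combination of the related hubs' business keys.
customer_hk = hash_key("CUST-001")
order_link_hk = hash_key("CUST-001", "ORD-9942")
```

The delimiter between keys matters: without it, the key pairs ("A", "BC") and ("AB", "C") would collide.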
Data Mart Templates
Specialized templates for Fabric Lakehouse automate the creation of dimensional models and reporting structures. This provides optimized Data Mart solutions that align with Fabric's analytics and performance capabilities.
Delete Detection Templates
Delete detection for Fabric Lakehouse is currently under development and is not yet available in the generated output. For pipelines that require delete detection, consider using ADF with Databricks or Snowflake integration templates.
Lakehouse Medallion Architecture Support
BimlFlex supports the medallion architecture pattern (Bronze/Silver/Gold) for Fabric Lakehouse implementations:
| Layer | BimlFlex Implementation | Fabric Components |
|---|---|---|
| Bronze | Staging + Persistent Staging | Landing files in OneLake, Delta tables for raw data |
| Silver | Data Vault or Normal Form | Cleansed and integrated Delta tables |
| Gold | Data Mart / Dimensional | Optimized analytics tables and views |
Bronze Layer (Raw Data)
The Bronze layer captures raw data as received from source systems. In BimlFlex, this maps to:
- Landing Area: Initial data ingestion via Data Factory copy commands
- Staging Area: Transient storage for current batch processing
- Persistent Staging Area: Historical retention of all received data
Silver Layer (Curated Data)
BimlFlex supports two approaches for the Silver layer:
- Data Vault (recommended): Provides flexibility, auditability, and scalability through Hub, Link, and Satellite patterns
- Normal Form: Traditional relational modeling for simpler use cases
Gold Layer (Business-Ready Data)
The Gold layer delivers business-ready data through:
- Dimensional Models: Star schema with Fact and Dimension tables
- Data Marts: Purpose-built analytics structures
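The star-schema shape that the Gold layer delivers can be pictured with a tiny in-memory example (sample data only, not generated output): a fact table carrying measures, joined to a dimension table through a surrogate key.

```python
# Dimension table keyed by surrogate key; fact table references it.
dim_customer = {1: {"customer_sk": 1, "name": "Contoso"}}
fact_sales = [{"customer_sk": 1, "amount": 250.0}]

# A typical Gold-layer query resolves the surrogate key to the
# dimension attributes the report needs.
report = [
    {"customer": dim_customer[f["customer_sk"]]["name"], "amount": f["amount"]}
    for f in fact_sales
]
```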
For detailed guidance on implementing each layer, see the Delivering Lakehouse documentation.
Prerequisites
Before implementing Fabric Lakehouse with Data Factory, ensure you have completed the following:
- Fabric Configuration: Complete the setup outlined in the Microsoft Fabric Configuration Overview
- Storage: Configure OneLake, Azure Blob Storage, or Data Lake Storage Gen2 for landing, staging, archive, and error containers
- Connections: Create and configure the Fabric Lakehouse connection in BimlFlex
Detailed prerequisites and configuration steps are provided in the Microsoft Fabric Configuration Overview section.
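The landing, archive, and error locations mentioned above follow a parallel structure: each incoming file needs a landing path plus matching archive and error paths. The folder names and date layout in this Python sketch are assumptions for illustration, not BimlFlex defaults:

```python
from datetime import date

def storage_paths(base: str, source: str, file_name: str, batch_date: date) -> dict:
    """Build parallel landing/archive/error paths for one source file.
    Folder names here are illustrative assumptions, not BimlFlex defaults."""
    stamp = batch_date.strftime("%Y/%m/%d")
    return {
        "landing": f"{base}/landing/{source}/{stamp}/{file_name}",
        "archive": f"{base}/archive/{source}/{stamp}/{file_name}",
        "error":   f"{base}/error/{source}/{stamp}/{file_name}",
    }

paths = storage_paths("Files", "adventureworks", "customer.parquet", date(2024, 5, 1))
```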
Configuring Fabric Lakehouse in BimlFlex
Loading Sample Metadata
BimlFlex provides two pre-configured sample metadata sets for Fabric Lakehouse:
| Sample | Description | Use Case |
|---|---|---|
| Fabric Data Vault | Pre-configured for Data Vault implementation | Building a silver layer with Hub, Link, and Satellite patterns |
| Fabric Datamart | Pre-configured for dimensional modeling | Building bronze-to-gold layer data marts |
To load a sample:
- Navigate to the BimlFlex Dashboard
- Select from the Load Sample Metadata dropdown
- Choose either Fabric Data Vault or Fabric Datamart
The sample metadata includes pre-configured projects, connections, and objects that demonstrate best practices for Fabric Lakehouse implementations.
At this point, the BimlFlex App displays the Dashboard with the Load Sample Metadata dropdown expanded. The dropdown lists the available sample sets, including Fabric Data Vault and Fabric Datamart, alongside any other samples available for your configured environment.
Connection Configuration
Configure your Fabric Lakehouse connections from within the BimlFlex Connections editor:
Source System Connection:
- Enable Cloud option for the source system
- Configure Staging / Landing Environment for OneLake, Blob Storage, or Data Lake Storage Gen2 with Data Factory connections
At this point, the BimlFlex App displays the Connections editor for the source system connection. The form shows the Cloud toggle enabled, and the Staging / Landing Environment field is populated with a connection referencing OneLake, Blob Storage, or Data Lake Storage Gen2. Navigate to Connections in the BimlFlex App and select your source connection to review and configure these settings.
Fabric Lakehouse Connection:
- Set System Type to Fabric Lakehouse
- Configure the Connection String appropriately for Fabric Lakehouse
- Configure Integration Template to Data Factory Source -> Target
- Set External Location to the OneLake path (e.g., abfss://<workspace>@onelake.dfs.fabric.microsoft.com/<lakehouse>/Files/)
- Set External Reference to the Fabric connection ID (every connection in Fabric has an internal ID that BimlFlex uses to reference it)
At this point, the BimlFlex App displays the Connections editor for the Fabric Lakehouse connection. The form shows System Type set to Fabric Lakehouse, the Integration Template set to Data Factory Source -> Target, the External Location field populated with the OneLake ABFSS path, and the External Reference field containing the Fabric connection ID. Navigate to Connections in the BimlFlex App and select your Fabric Lakehouse connection to review these settings.
The External Reference is required for all Fabric Lakehouse connections. This ID can be found in the Fabric portal and enables BimlFlex to properly reference the connection when generating and deploying Data Factory pipelines.
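The External Location always follows the same documented pattern, so it can be composed from the workspace and lakehouse names. A hypothetical helper (the function name and its inputs are assumptions for illustration):

```python
def onelake_external_location(workspace: str, lakehouse: str) -> str:
    """Compose the OneLake ABFSS External Location from workspace and
    lakehouse names, following the documented pattern:
    abfss://<workspace>@onelake.dfs.fabric.microsoft.com/<lakehouse>/Files/"""
    return f"abfss://{workspace}@onelake.dfs.fabric.microsoft.com/{lakehouse}/Files/"

location = onelake_external_location("MyWorkspace", "MyLakehouse")
```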
Batch Configuration
Prior to building your solution, configure batches from the BimlFlex Batches editor to:
- Assign batches to different compute resources
- Configure scaling parameters
- Set execution priorities
At this point, the BimlFlex App displays the Batches editor, showing the list of configured batches with columns for batch name, compute resource assignment, scaling parameters, and execution priority. Navigate to Batches in the BimlFlex App to create or update batches before triggering a build of your Fabric Lakehouse solution.
Generated Output
BimlFlex generates all necessary Fabric Lakehouse artifacts automatically—you do not need to write notebooks, stored procedures, or pipeline code manually:
| Artifact Type | Description |
|---|---|
| Lakehouse Tables | DDL scripts for creating all Lakehouse table structures |
| Notebooks | Spark notebooks for data processing (all code to load data is generated automatically) |
| Stored Procedures | SQL procedures for transformation logic where applicable |
| Data Factory Pipelines | Complete pipeline orchestration including copy activities, notebook execution, and error handling |
Pipeline Features
Generated pipelines include sophisticated data movement logic:
- High watermark lookups for incremental loading
- Copy activities with proper connection settings
- Notebook execution for staging layer processing
- Automatic file handling (archive/error movement)
- Error handling and retry logic
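The high-watermark pattern listed above works by looking up the last loaded value, selecting only rows modified since then, and advancing the watermark after a successful load. A minimal Python sketch of that logic (the in-memory dict and list stand in for the real watermark store and source table; this is not the generated pipeline code):

```python
# Watermark store keyed by table name; a real pipeline would persist this.
watermark_store = {"sales.orders": "2024-05-01T00:00:00"}

def incremental_rows(table: str, source_rows: list) -> list:
    """Return only rows modified after the stored watermark, then
    advance the watermark to the newest value seen."""
    last = watermark_store.get(table, "")
    new_rows = [r for r in source_rows if r["modified"] > last]
    if new_rows:
        watermark_store[table] = max(r["modified"] for r in new_rows)
    return new_rows

rows = [
    {"id": 1, "modified": "2024-04-30T09:00:00"},  # already loaded
    {"id": 2, "modified": "2024-05-02T09:00:00"},  # newer than watermark
]
loaded = incremental_rows("sales.orders", rows)
```

ISO-8601 timestamps compare correctly as strings, which keeps the sketch free of datetime parsing.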
At this point, BimlStudio displays the Build Output pane after a successful build of the Fabric Lakehouse solution. The output lists the generated artifacts by category: Lakehouse table DDL scripts, Spark notebooks, stored procedures (where applicable), and Data Factory pipeline JSON files. The output pane also shows any warnings or informational messages produced during generation. Open your project in BimlStudio and trigger a build to see this output.
Deployed Solution
Once deployed to Data Factory, the solution provides:
- Visual pipeline representation
- Monitoring and logging capabilities
- Error handling with automatic file archiving
At this point, the Microsoft Fabric Data Factory interface displays the deployed pipeline in its canvas view. The pipeline shows a sequence of activities: a high watermark lookup activity, copy activities for landing source files into OneLake, notebook execution activities for staging layer processing, and file archiving activities for post-load cleanup. Navigate to your Fabric workspace, open Data Factory, and select the deployed pipeline to view this canvas.
Monitoring and Management
After deployment, you can:
- Scale compute resources up or down
- View copy command completions and errors
- Suspend or resume solution execution
- Monitor execution status and performance
At this point, the Microsoft Fabric Data Factory Monitor view displays the pipeline run history for the deployed Lakehouse solution. Each run shows its status (succeeded, failed, in progress), start time, duration, and the number of copy activity completions and errors. Selecting an individual run expands the activity-level detail, including bytes read/written per copy activity. Navigate to your Fabric workspace, open Data Factory, and select Monitor to access this view.
Files that encounter errors are automatically moved to an error folder, and successfully processed files are moved to an archive folder. Subsequent runs therefore never pick up files that have already been processed and archived.
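The archive/error disposition can be sketched as follows. This is an illustrative stand-in for the generated pipeline's file handling, and the folder names are assumptions:

```python
import shutil
from pathlib import Path

def disposition_file(file_path: Path, succeeded: bool, root: Path) -> Path:
    """Move a processed file to the archive folder on success or the
    error folder on failure, so reruns never pick it up again."""
    target_dir = root / ("archive" if succeeded else "error")
    target_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.move(str(file_path), str(target_dir / file_path.name)))
```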
Related Resources
BimlFlex Documentation
- Microsoft Fabric Configuration Overview
- Implementing Fabric Warehouse
- Configuring a Landing Area
- Configuring Connections
External Resources
- Microsoft Fabric Documentation
- Fabric Lakehouse Overview
- Fabric Data Factory Documentation
- Fabric Data Factory Data Source Management
Video Resources
Refer to the video in the Microsoft Fabric Configuration Overview for a walkthrough of configuring BimlFlex for Microsoft Fabric, including Lakehouse implementations.
Fabric as a Source System
BimlFlex supports using Fabric Lakehouse as both a source and a target. This enables scenarios such as:
- Processing data from one Lakehouse to another Lakehouse
- Moving data between Bronze, Silver, and Gold layers within Fabric
- Using naming patterns and schemas for layer separation
To configure Fabric Lakehouse as a source:
- Create a connection with Integration Stage set to Source System
- Set System Type to Fabric Lakehouse
- Configure the appropriate connection string and external reference
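When one Lakehouse feeds another, a naming convention keeps the medallion layers apart. A hypothetical example (the prefixes below are assumptions, not BimlFlex defaults):

```python
def layer_table_name(layer: str, entity: str) -> str:
    """Prefix an entity name with an (assumed) layer abbreviation to
    separate Bronze, Silver, and Gold tables within Fabric."""
    prefixes = {"bronze": "brz", "silver": "slv", "gold": "gld"}
    return f"{prefixes[layer]}_{entity}"

source_table = layer_table_name("bronze", "customer")  # e.g. "brz_customer"
target_table = layer_table_name("silver", "customer")  # e.g. "slv_customer"
```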
Benefits of Using BimlFlex with Fabric Lakehouse
BimlFlex provides significant advantages when building Fabric Lakehouse solutions:
- No Code Required: Your team only needs to understand data modeling—BimlFlex generates all notebooks, stored procedures, and pipelines automatically
- Focus on Design: Concentrate on source-to-target mappings and transformations, not implementation details
- Automatic Updates: As Microsoft Fabric evolves, BimlFlex templates are updated to ensure optimal implementations
- Data Vault Accelerator: Full access to the Data Vault accelerator for modeling hubs, links, and satellites
- Transformation Support: Apply transformations directly in BimlFlex, including macros for reusable patterns
- Data Lineage: Complete data lineage visualization for any object in your solution
- Schema Documentation: Automatic schema diagrams and documentation generation