
Using BimlFlex with Databricks

Databricks is a unified analytics platform provided by Databricks Inc. BimlFlex supports Databricks as a target platform for building metadata-driven lakehouse solutions.

BimlFlex generates native Azure Data Factory pipelines and Databricks assets directly from your metadata. All generated artifacts remain fully native to Databricks and ADF, with no proprietary runtime required. You maintain full control of your environment.

Supported Source Systems

BimlFlex supports a wide variety of source systems when loading data into Databricks:

  • Relational databases (SQL Server, Oracle, etc.)
  • Flat files and Parquet files
  • Microsoft Dynamics 365 and Dynamics CRM
  • Salesforce
  • FTP and SFTP sources

Lakehouse Layers

BimlFlex organizes a Databricks lakehouse into distinct layers, each serving a specific purpose:

| Layer | Description | BimlFlex Components |
|---|---|---|
| Landing | Raw data extracted from source systems | Landing connection with Azure Blob Storage |
| Bronze | Historized raw data (optional) | Staging and Persistent Staging connections |
| Silver | Integrated and historized business data | Data Vault connection (Hubs, Links, Satellites) |
| Gold | Analytics-ready dimensional models | Data Mart connection (Facts, Dimensions) |

Layer Configuration Options

  • Persisted Landing: Enable to preserve raw source data exactly as extracted. Disable to overwrite on each load.
  • Persistent Staging: Enable to historize the bronze layer with full change history. Disable to keep only the current state.
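
The difference between historizing and keeping only the current state can be illustrated with a small in-memory sketch. This is not code that BimlFlex generates; the names `StagedRow`, `apply_load`, and the `persistent` flag are hypothetical and exist only to show the two behaviors described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StagedRow:
    """One version of a source record (hypothetical structure)."""
    key: str
    payload: dict
    load_id: int


def apply_load(store, incoming, load_id, persistent):
    """Apply one load to `store`, a dict of key -> list of StagedRow.

    persistent=True  : append a new version whenever the payload changes.
    persistent=False : keep only the latest version of each key.
    """
    for key, payload in incoming.items():
        versions = store.setdefault(key, [])
        if versions and versions[-1].payload == payload:
            continue  # unchanged record, nothing to write
        row = StagedRow(key, payload, load_id)
        if persistent:
            versions.append(row)   # full change history is retained
        else:
            versions[:] = [row]    # previous state is overwritten
    return store


# Two loads of the same customer with a changed attribute.
history, current = {}, {}
for load_id, batch in enumerate(
    [{"c1": {"city": "Oslo"}}, {"c1": {"city": "Bergen"}}], start=1
):
    apply_load(history, batch, load_id, persistent=True)
    apply_load(current, batch, load_id, persistent=False)
```

After both loads, `history["c1"]` holds two versions while `current["c1"]` holds only the most recent one.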

Projects

Projects define how data flows through your solution. They stitch together the connections and determine the path from source through landing and staging into your integration layers and finally into reporting.
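
As a rough mental model, a project can be thought of as an ordered chain of connections, one per layer. The sketch below assumes that model; `Connection`, `LAYER_ORDER`, and `project_path` are hypothetical names and do not reflect the actual BimlFlex metadata schema.

```python
from dataclasses import dataclass

# Assumed ordering, taken from the layer descriptions in this document.
LAYER_ORDER = ["Source", "Landing", "Bronze", "Silver", "Gold"]


@dataclass(frozen=True)
class Connection:
    """A named connection assigned to one layer (hypothetical structure)."""
    name: str
    layer: str


def project_path(connections):
    """Return connection names in the order data would flow through them."""
    for conn in connections:
        if conn.layer not in LAYER_ORDER:
            raise ValueError(f"Unknown layer: {conn.layer}")
    ordered = sorted(connections, key=lambda c: LAYER_ORDER.index(c.layer))
    return [c.name for c in ordered]


# Bronze is optional, so this example project skips it.
path = project_path([
    Connection("datamart", "Gold"),
    Connection("crm_source", "Source"),
    Connection("datavault", "Silver"),
    Connection("landing_blob", "Landing"),
])
```

Here `path` lists the source first and the data mart last, regardless of the order in which the connections were declared.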

Tip: See these additional resources on working with Databricks:

Implementing ADF pipelines to Databricks