Azure Data Factory Landing Area
When using an ELT ('pushdown') approach for data integration, BimlFlex creates a number of Execute Pipelines with various Stored Procedure calls and Copy Activities.
The Azure Data Factory (ADF) Copy Activity does not allow for transformations during the copying of the data. To support transformation of data using a Copy Activity, BimlFlex uses a Landing Area
that is accessible by SQL-based Stored Procedures in the Staging Area.
This way, data can be accessed directly to perform any required transformations while the ADF Execute Pipeline can use the Copy Activity to efficiently move the results to the Staging Area.
The general architecture is below with the Landing Area in the middle, supporting either a database or files.
Tip
For additional details on ADF or the Copy activity refer to the below guides:
Microsoft Docs: Azure Data Factory documentation
Microsoft Docs: Copy Activity in Azure Data Factory
Configure a Landing Area By Example
This section walks through the common steps and considerations for creating and configuring the landing connection. This example uses sample metadata to configure an appropriate Landing Area for a Table-Based Configuration.
For addition details of alternate configurations refer to the detailed configuration section.
Tip
The Varigence YouTube channel contains various introduction videos about using BimlFlex and BimlStudio. This video walks through how to configure a Landing Area for Azure Data Factory using BimlFlex.
High Level Steps
- Start with a sample metadata set. This example uses the
01 - MSSQL Starting Point
, which includes connections for a sample on-premises SSIS data solution including Batch and Project definitions - Archive the
BFX_DV
andBFX_DM
Project and Batches to minimize the amount of additional metadata that is visible in the app - Update the
EXT_AWLT_SRC
extract Project to have the target set to the staging area connection,BFX_STG
- Archive the
BFX_DV
andBFX_DM
Connections - Navigate to the Connections page, create a Landing connection by duplicating the BFX_STG connection
- Name the new connection
BFX_LND
- Update the Integration Stagefor the Connection to
Landing Area
- Name the new connection
- Enable the connection for ADF use by switching on the CLOUD setting
- Enabling the connection for Cloud exposes the Linked Services configuration for the connection
- Update the source connection to use the
BFX_LND
connection as the LANDING CONNECTION - Configure the Linked Service to use the appropriate configuration
- The video uses a pre-created Azure Key Vault to store the connection string information for the connections
- Add the Azure Key Vault as a configuration for the Linked Service
- Define the secret that the Linked Service should use as the connection string
- Reuse the Key Vault and secret configuration for each of the connections
- Reconfigure the
EXT_AWLT_SRC
Project to generate ADF artifacts instead of SSIS projects by updating the Integration Template used
Importing Metadata
The example uses the 01 - MSSQL Starting Point
metadata to generate Source System and Staging Area Connections.
Tip
For additional details on Importing Samples and Metadata refer to the below guides:
In order to proceed a Source System Connection and Staging Area Connection is needed.
Note
The example could be followed along with using any supported Source System and any Stage Area that supports a Table-Based Configuration for the Landing Area (Microsoft SQL Server, Azure Synapse).
Cleaning (Illustration and Example Only)
The video archives multiple BimlFlex Entities to illustrate the minimal configuration needed to implement and configure a Landing Area. These are optional steps to streamline and minimize configuration.
Warning
Do not archive any BimlFlex Entities if following along with your own metadata.
Create a Landing Connection
When using a Table Based Configuration the Landing Area and Staging Area should share the same connection details. The easiest way to achieve this is by duplicating the Staging Area Connection, naming it appropriately and updating the Integration Stage to Landing Area
.
Note
In this configuration, the Landing Area is a conceptual separation via naming practice only. Ensure that the Catalog
and Connection String
for the Staging Area and Landing Area are the same.
Configure Linked Services
Azure Data Factory requires the use of Linked Services. These are defined as part of the Connection but are configured separately on each connection once enabled. For each connection that will be used by the ADF process, the Cloud field is required to be enabled in order to expose the configuration for the corresponding Linked Service.
The example uses a pre-created Azure Key Vault to manage the complete connection string through use of a Secret. With familiarly of Azure Key Vaults and the deployment environment, the Linked Services can be configured to suit individual sensitive information management practices.
Tip
For additional details on Linked Services, Azure Key Vaults and Sensitive Information Management refer to the below guides:
Configure the BimlFlex Project for ADF
The last step is to configure the BimlFlex Project to use the integration template of Azure Data Factory
.
This step is performed last to ensure there are no validation errors when saving the Project.
Note
The Azure Data Factory
integration template requires all Connections to have cloud enabled.
The metadata is now configured to stage data via a Landing Area using Azure Data Factory.
Detailed Configuration
Depending to the system type of the Staging Area, the Landing Area can be configured as either a group of database tables or a blob storage. The below table for outlines the supported Landing Area configurations for each supported Staging Area.
Use the below selectors to configure documentation for the desired scenario.
Configured for Azure Synapse Analytics.
Available Landing Area configurations.
Configured for a table based landing.
It is important to note that although the Landing Area is configured as a separate BimlFlex Connection, in a Table Based Configuration it should be considered the same database as the Staging Area. As such the Landing Area does not deploy separately, and instead deploys with the Staging Area.
Important
A Table Based Configuration requires the Landing Area to be in the same database as the Staging Area. Ensure the connection string, catalog and applicable Linked Service configurations are identical to that of the Staging Area Connection.
When using a Table-Based Configuration, the Landing Area is a group of tables separated through use of naming conventions. By default the Landing Area tables will have a land_
prefix appended to all landing tables.
Note
The prefix appended to landing tables can be configured by updating the Append Name Landing
BimlFlex Setting in the Staging Naming
group.
The default configuration is land
.
The following Azure Settings are used to configure the table destinations.
- Staging Naming\Naming - configures naming patterns for Landing Tables and other Staging Objects
Tip
For additional details on Blob Storage refer to the below guides: