Common Patterns
Reusable BimlScript patterns for common data engineering tasks. Each pattern includes working code examples you can adapt for your projects.
Staging Layer Pattern
When to Use
- Loading data from source systems into a staging area
- Creating a consistent landing zone before transformation
- Maintaining source data structure with audit columns
Design Principles
- Temporary storage: Non-persistent, truncated before each load
- No transformations: Same attributes as source
- Audit columns: Add identity, load date for tracking
Code Example
Step 1: Import source metadata
<#@ template tier="10" #>
<#@ import namespace="Varigence.Biml.CoreLowerer.SchemaManagement" #>
<#
var sourceConnection = RootNode.OleDbConnections["Source"];
var includedSchemas = new List<string>{"Sales", "Person", "Production"};
var tables = sourceConnection.GetDatabaseSchema(
includedSchemas,
null,
ImportOptions.ExcludeForeignKey | ImportOptions.ExcludeViews
).TableNodes;
#>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<Tables>
<# foreach (var table in tables) { #>
<Table Name="<#= table.Name #>" SchemaName="Staging.<#= table.Schema #>">
<Columns>
<# foreach (var column in table.Columns) { #>
<#= column.GetBiml() #>
<# } #>
<!-- Audit columns -->
<Column Name="StageID" DataType="Int32" IsNullable="false" IdentityIncrement="1" IdentitySeed="1" />
<Column Name="LoadDateTime" DataType="DateTime2" IsNullable="false" />
</Columns>
</Table>
<# } #>
</Tables>
</Biml>
Step 2: Generate DDL deployment package
<#@ template tier="20" #>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<Packages>
<Package Name="DeployStaging" ConstraintMode="Parallel">
<Tasks>
<# foreach (var table in RootNode.Tables.Where(t => t.Schema.Name == "Staging")) { #>
<ExecuteSQL Name="Create_<#= table.Name #>" ConnectionName="Target">
<DirectInput><#= table.GetDropAndCreateDdl() #></DirectInput>
</ExecuteSQL>
<# } #>
</Tasks>
</Package>
</Packages>
</Biml>
Truncate and Load Pattern
When to Use
- Full refresh of staging or dimension tables
- Small to medium datasets where full reload is acceptable
- Scenarios where data lineage isn't required
Code Example
<#@ template tier="30" #>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<Packages>
<# foreach (var table in RootNode.Tables) { #>
<Package Name="TruncateLoad_<#= table.Schema #>_<#= table.Name #>" ConstraintMode="Linear">
<Tasks>
<!-- Step 1: Truncate target -->
<ExecuteSQL Name="TruncateTarget" ConnectionName="Target">
<DirectInput>TRUNCATE TABLE [<#= table.Schema #>].[<#= table.Name #>]</DirectInput>
</ExecuteSQL>
<!-- Step 2: Load data -->
<Dataflow Name="LoadData">
<Transformations>
<OleDbSource Name="Source" ConnectionName="Source">
<DirectInput>SELECT * FROM [<#= table.Schema #>].[<#= table.Name #>]</DirectInput>
</OleDbSource>
<!-- Add load timestamp -->
<DerivedColumns Name="AddAuditColumns">
<Columns>
<Column Name="LoadDateTime" DataType="DateTime2" Length="8">(DT_DBTIMESTAMP2,7)GETDATE()</Column>
</Columns>
</DerivedColumns>
<OleDbDestination Name="Destination" ConnectionName="Target">
<ExternalTableOutput Table="[<#= table.Schema #>].[<#= table.Name #>]" />
</OleDbDestination>
</Transformations>
</Dataflow>
</Tasks>
</Package>
<# } #>
</Packages>
</Biml>
Metadata-Driven Generation
When to Use
- Enterprise-scale solutions with many tables
- Need to filter or customize by table metadata
- Centralized control over package generation
Code Example: Using Annotations
<#@ template tier="10" #>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<Tables>
<Table Name="Customer" SchemaName="dbo">
<Annotations>
<Annotation Tag="LoadType">Full</Annotation>
<Annotation Tag="Priority">High</Annotation>
</Annotations>
<Columns>
<Column Name="CustomerID" DataType="Int32" IsNullable="false" />
<Column Name="Name" DataType="String" Length="100" />
</Columns>
</Table>
<Table Name="OrderHistory" SchemaName="dbo">
<Annotations>
<Annotation Tag="LoadType">Incremental</Annotation>
<Annotation Tag="Priority">Low</Annotation>
</Annotations>
<Columns>
<Column Name="OrderID" DataType="Int32" IsNullable="false" />
<Column Name="OrderDate" DataType="DateTime" />
</Columns>
</Table>
</Tables>
</Biml>
Generate packages based on annotations:
<#@ template tier="20" #>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<Packages>
<# foreach (var table in RootNode.Tables) {
var loadType = table.GetTag("LoadType") ?? "Full";
var priority = table.GetTag("Priority") ?? "Normal";
#>
<Package Name="Load_<#= table.Name #>_<#= loadType #>" ConstraintMode="Linear">
<Annotations>
<Annotation Tag="Priority"><#= priority #></Annotation>
</Annotations>
<Tasks>
<# if (loadType == "Full") { #>
<ExecuteSQL Name="Truncate" ConnectionName="Target">
<DirectInput>TRUNCATE TABLE [<#= table.Schema #>].[<#= table.Name #>]</DirectInput>
</ExecuteSQL>
<# } #>
<Dataflow Name="LoadData">
<!-- Data flow content -->
</Dataflow>
</Tasks>
</Package>
<# } #>
</Packages>
</Biml>
Incremental Load Pattern
When to Use
- Large tables where full refresh is too slow
- CDC (Change Data Capture) scenarios
- Delta detection based on timestamp or version columns
Code Example: Using High-Water Mark
<#@ template tier="30" #>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<Packages>
<# foreach (var table in RootNode.Tables.Where(t => t.GetTag("LoadType") == "Incremental")) { #>
<Package Name="IncrementalLoad_<#= table.Name #>" ConstraintMode="Linear">
<Variables>
<Variable Name="LastLoadDate" DataType="DateTime" EvaluateAsExpression="true">
(DT_DATE)"1900-01-01"
</Variable>
<Variable Name="CurrentLoadDate" DataType="DateTime" EvaluateAsExpression="true">
GETDATE()
</Variable>
</Variables>
<Tasks>
<!-- Get last successful load date -->
<ExecuteSQL Name="GetLastLoadDate" ConnectionName="Target" ResultSet="SingleRow">
<DirectInput>
SELECT ISNULL(MAX(LoadDateTime), '1900-01-01') AS LastLoadDate
FROM [<#= table.Schema #>].[<#= table.Name #>]
</DirectInput>
<Results>
<Result Name="LastLoadDate" VariableName="User.LastLoadDate" />
</Results>
</ExecuteSQL>
<!-- Load new/changed records -->
<Dataflow Name="LoadDelta">
<Transformations>
<OleDbSource Name="GetChanges" ConnectionName="Source">
<DirectInput>
SELECT *
FROM [<#= table.Schema #>].[<#= table.Name #>]
WHERE ModifiedDate > ?
</DirectInput>
<Parameters>
<Parameter Name="0" VariableName="User.LastLoadDate" />
</Parameters>
</OleDbSource>
<OleDbDestination Name="InsertChanges" ConnectionName="Target">
<ExternalTableOutput Table="[<#= table.Schema #>].[<#= table.Name #>]" />
</OleDbDestination>
</Transformations>
</Dataflow>
</Tasks>
</Package>
<# } #>
</Packages>
</Biml>
Error Handling Pattern
When to Use
- Production packages requiring robust error handling
- Logging failed rows for later analysis
- Graceful failure with notifications
Complete Error Handling Walkthrough
This pattern implements comprehensive error handling with:
- Data flow error row redirection
- Task-level event handlers
- Centralized error logging
Step 1: Create Error Logging Table
CREATE TABLE dbo.ErrorLog (
ErrorLogID INT IDENTITY(1,1) PRIMARY KEY,
PackageName NVARCHAR(255),
TaskName NVARCHAR(255),
ErrorCode INT,
ErrorDescription NVARCHAR(MAX),
ErrorColumnName NVARCHAR(255),
SourceRowData NVARCHAR(MAX),
CreatedDate DATETIME2 DEFAULT GETDATE()
);
Step 2: Data Flow Error Redirection
<Dataflow Name="LoadData">
<Transformations>
<OleDbSource Name="Source" ConnectionName="Source">
<DirectInput>SELECT * FROM SourceTable</DirectInput>
</OleDbSource>
<DataConversion Name="ConvertTypes">
<Columns>
<Column SourceColumn="Amount" TargetColumn="AmountDecimal" DataType="Decimal" Precision="18" Scale="2"/>
</Columns>
<ErrorHandling ErrorRowDisposition="RedirectRow" TruncationRowDisposition="RedirectRow"/>
</DataConversion>
<OleDbDestination Name="Destination" ConnectionName="Target">
<ExternalTableOutput Table="TargetTable"/>
<ErrorHandling ErrorRowDisposition="RedirectRow"/>
</OleDbDestination>
<!-- Error output path -->
<OleDbDestination Name="ErrorDestination" ConnectionName="Target">
<InputPath OutputPathName="ConvertTypes.Error"/>
<ExternalTableOutput Table="dbo.ErrorLog"/>
</OleDbDestination>
</Transformations>
</Dataflow>
Step 3: Event Handler for Task Failures
<Package Name="LoadWithErrorHandling">
<Events>
<Event Name="OnError" EventType="OnError">
<Tasks>
<ExecuteSQL Name="LogError" ConnectionName="Target">
<DirectInput>
INSERT INTO dbo.ErrorLog (PackageName, TaskName, ErrorCode, ErrorDescription)
VALUES (?, ?, ?, ?)
</DirectInput>
<Parameters>
<Parameter Name="0" VariableName="System.PackageName" DataType="String"/>
<Parameter Name="1" VariableName="System.SourceName" DataType="String"/>
<Parameter Name="2" VariableName="System.ErrorCode" DataType="Int32"/>
<Parameter Name="3" VariableName="System.ErrorDescription" DataType="String"/>
</Parameters>
</ExecuteSQL>
</Tasks>
</Event>
</Events>
<Tasks>
<!-- Your main tasks here -->
</Tasks>
</Package>
Error Row Disposition Options
| Disposition | Behavior | Use Case |
|---|---|---|
FailComponent | Stop package on error | Critical data, no tolerance for errors |
RedirectRow | Send to error output | Log errors, continue processing valid rows |
IgnoreFailure | Skip row, continue | Non-critical data, acceptable data loss |
Code Example: Complete Error Handling Package
<#@ template tier="30" #>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<Packages>
<# foreach (var table in RootNode.Tables) { #>
<Package Name="Load_<#= table.Name #>" ConstraintMode="Linear">
<Variables>
<Variable Name="ErrorCount" DataType="Int32">0</Variable>
<Variable Name="ErrorMessage" DataType="String"></Variable>
</Variables>
<Tasks>
<Dataflow Name="LoadData">
<Transformations>
<OleDbSource Name="Source" ConnectionName="Source">
<DirectInput>SELECT * FROM [<#= table.Schema #>].[<#= table.Name #>]</DirectInput>
</OleDbSource>
<!-- Redirect errors -->
<OleDbDestination Name="Destination" ConnectionName="Target">
<ErrorHandling ErrorRowDisposition="RedirectRow" />
<ExternalTableOutput Table="[<#= table.Schema #>].[<#= table.Name #>]" />
</OleDbDestination>
<!-- Capture error rows -->
<FlatFileDestination Name="ErrorOutput" ConnectionName="ErrorFile">
<InputPath OutputPathName="Destination.Error" />
</FlatFileDestination>
</Transformations>
</Dataflow>
</Tasks>
<Events>
<Event EventType="OnError">
<Tasks>
<ExecuteSQL Name="LogError" ConnectionName="Target">
<DirectInput>
INSERT INTO dbo.ErrorLog (PackageName, TaskName, ErrorCode, ErrorDescription)
VALUES ('<#= table.Name #>', ?, ?, ?)
</DirectInput>
<Parameters>
<Parameter Name="0" VariableName="System.SourceName" DataType="String" />
<Parameter Name="1" VariableName="System.ErrorCode" DataType="Int32" />
<Parameter Name="2" VariableName="System.ErrorDescription" DataType="String" />
</Parameters>
</ExecuteSQL>
</Tasks>
</Event>
</Events>
</Package>
<# } #>
</Packages>
</Biml>
Master Workflow Pattern
When to Use
- Orchestrating multiple child packages
- Controlling parallel vs sequential execution
- Grouping packages by schema or priority
Code Example
<#@ template tier="50" #>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<Packages>
<Package Name="Master_LoadAll" ConstraintMode="Linear">
<Tasks>
<!-- Execute high-priority packages first -->
<Container Name="HighPriority" ConstraintMode="Parallel">
<Tasks>
<# foreach (var pkg in RootNode.Packages.Where(p => p.GetTag("Priority") == "High")) { #>
<ExecutePackage Name="Execute_<#= pkg.Name #>">
<Package PackageName="<#= pkg.Name #>" />
</ExecutePackage>
<# } #>
</Tasks>
</Container>
<!-- Then execute normal priority -->
<Container Name="NormalPriority" ConstraintMode="Parallel">
<Tasks>
<# foreach (var pkg in RootNode.Packages.Where(p => p.GetTag("Priority") != "High")) { #>
<ExecutePackage Name="Execute_<#= pkg.Name #>">
<Package PackageName="<#= pkg.Name #>" />
</ExecutePackage>
<# } #>
</Tasks>
</Container>
</Tasks>
</Package>
</Packages>
</Biml>
Quick Reference: Common Extension Methods
| Method | Purpose | Example |
|---|---|---|
GetDatabaseSchema() | Import tables from connection | connection.GetDatabaseSchema() |
GetBiml() | Convert object to Biml XML | column.GetBiml() |
GetDropAndCreateDdl() | Generate DDL script | table.GetDropAndCreateDdl() |
GetColumnList() | Get comma-separated columns | table.GetColumnList() |
GetTag() | Read annotation value | table.GetTag("LoadType") |
Next Steps
- Automation and Patterns - Why automation matters and metadata-driven development concepts
- Build Staging Layer - Complete end-to-end example
- Biml for SSIS - Deep dive into SSIS generation
- Troubleshooting Guide - Debug compilation errors