Skip to main content

Driving SSIS Generation From Custom Metadata Tables

Why Drive Generation From a Custom Table

Generating SSIS packages from system catalogs like INFORMATION_SCHEMA.TABLES works well when every package follows the same pattern. Many real loads do not. Some tables need a derived column. Some are not ready to be regenerated yet. Some pull from a SELECT statement that joins or filters the source. A purpose-built metadata table captures these per-package decisions in one place and lets BimlScript read them at compile time.

The Scenario

Three source tables need to be loaded into matching destinations. One of them needs a derived column today, and others might need similar transformations later. New tables that follow the same shape will be added over time. Rather than copy and paste packages, the generator reads a control table and emits one package per active row.

The Metadata Table

A table named LoadControl carries one row per package that should be generated. The columns drive the loop:

CREATE TABLE dbo.LoadControl
(
LoadControlId INT IDENTITY(1,1) PRIMARY KEY,
SourceTableName NVARCHAR(128) NOT NULL,
TargetTableName NVARCHAR(128) NOT NULL,
SourceSqlQuery NVARCHAR(MAX) NOT NULL,
NeedsTransform BIT NOT NULL,
IsActive BIT NOT NULL
);

SourceSqlQuery holds the SELECT that will feed the OLE DB source. NeedsTransform is a flag the BimlScript checks to decide whether to emit a derived column. IsActive lets a row be skipped without deleting it.

Reading the Control Table

The script connects to the metadata database, runs a SELECT against LoadControl filtered by IsActive, and walks the rows:

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<Connections>
<Connection Name="WorkArea"
ConnectionString="Provider=SQLNCLI11.1;Data Source=localhost;Initial Catalog=WorkArea;Integrated Security=SSPI;Auto Translate=False;" />
</Connections>
<Packages>
<#
string controlConn = "Provider=SQLNCLI11.1;Data Source=localhost;Initial Catalog=WorkArea;Integrated Security=SSPI;Auto Translate=False;";

string targetTableName;
string targetSchema;
string sourceTableName;
int needsTransform;
string sourceSqlQuery;

DataTable controlRows = ExternalDataAccess.GetDataTable(
controlConn,
"SELECT SourceTableName, TargetTableName, NeedsTransform, SourceSqlQuery FROM dbo.LoadControl WHERE IsActive = 1");

foreach (DataRow row in controlRows.Rows)
{
targetTableName = row["TargetTableName"].ToString();
targetSchema = "dbo";
sourceTableName = row["SourceTableName"].ToString();
needsTransform = Convert.ToInt32(row["NeedsTransform"]);
sourceSqlQuery = row["SourceSqlQuery"].ToString();
#>

Everything below the foreach opening brace and above its matching closing brace will repeat once per active row.

Emitting the Package

Each iteration emits a package whose name and queries come from the current row:

<Package Name="Load <#=sourceTableName#>" ConstraintMode="Linear" ProtectionLevel="EncryptSensitiveWithUserKey">
<Tasks>
<ExecuteSQL Name="Capture starting count" ConnectionName="WorkArea">
<DirectInput>
SELECT COUNT(1) AS RowTotal FROM <#=targetTableName#>
</DirectInput>
</ExecuteSQL>
<Dataflow Name="Move <#=sourceTableName#>">
<Transformations>
<OleDbSource ConnectionName="WorkArea" Name="<#=sourceTableName#> Source">
<DirectInput>
<#=sourceSqlQuery#>
</DirectInput>
</OleDbSource>

The SourceSqlQuery column from LoadControl flows directly into the OLE DB source as DirectInput. That makes per-table query customization a metadata change rather than a code change.

Conditionally Emitting a Derived Column

A nested code nugget checks the NeedsTransform flag and emits a derived column only when the flag is set:

<# if (needsTransform == 1) { #>
<DerivedColumns Name="Stamp <#=sourceTableName#> rows">
<Columns>
<Column Name="LoadDate" DataType="DateTime">
GETDATE()
</Column>
</Columns>
</DerivedColumns>
<# } #>
<OleDbDestination Name="Insert into target" ConnectionName="WorkArea">
<ExternalTableOutput Table="<#=targetSchema#>.<#=targetTableName#>" />
</OleDbDestination>
</Transformations>
</Dataflow>
</Tasks>
</Package>
<# } #>
</Packages>
</Biml>

<#@ template language="C#" tier="2" #>
<#@ import namespace="System.Data" #>
<#@ import namespace="Varigence.Biml.CoreLowerer.SchemaManagement" #>

A simple GETDATE column is shown here, but the same conditional pattern fits a Slowly Changing Dimension component, a lookup, an audit row, or any other optional shape that some tables need and others do not.

Why This Pattern Scales

Generating packages from a control table separates the logic of each load from the metadata that decides which loads exist. Adding a new source table is an INSERT into LoadControl. Pausing a package is an UPDATE to IsActive. Adding a derived column to one more table is a flip of NeedsTransform. The Biml file does not change, but the generated package set updates the next time the script runs.

The trade-off is complexity. A control table that grows to control twenty independent options becomes hard to read. The right balance is a small set of flags that capture the variation actually present in the warehouse, and a separate Biml template per genuinely different load pattern.