Generating Many SSIS Packages With BimlScript
Why Generate Packages From a Loop
Hand building extract and load packages for dozens of source tables turns into copy and paste, and copy and paste turns into inconsistencies. BimlScript embeds a small amount of C# inside a Biml template, queries the source for a list of tables, and emits one package per table. The generated packages are ordinary .dtsx files that any SSIS developer can open and maintain.
Directives and Imports
A BimlScript file that uses C# needs a directive that declares the language and imports for any namespaces the code uses. Put these directives at the top or the bottom of the file. The compiler picks them up either way:
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
</Biml>
<#@ template language="C#" #>
<#@ import namespace="System.Data" #>
<#@ import namespace="Varigence.Biml.CoreLowerer.SchemaManagement" #>
Code Nuggets
BimlScript drops into C# inside angle-bracket and pound-sign markers and returns to Biml when those markers close. A typical block of variable declarations looks like this:
<#
string sourceConnString = "Data Source=localhost;Initial Catalog=SalesSource;Provider=SQLNCLI11.1;Integrated Security=SSPI;Auto Translate=False;";
string targetTableName;
string tableSchema;
string sourceTableName;
DataTable sourceTables;
#>
Anything outside the markers is plain Biml. Anything inside is C#. The compiler runs the C# during expansion, then turns the resulting flat Biml into the .dtsx output.
Setting Up Connections
Two connections are declared at the top of the Biml: one for the source database and one for the destination. The destination database is a copy of the source schema with constraints removed so a full load can run repeatedly.
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<Connections>
<Connection Name="SalesSource"
ConnectionString="Data Source=localhost;Initial Catalog=SalesSource;Provider=SQLNCLI11.1;Integrated Security=SSPI;Auto Translate=False;" />
<Connection Name="SalesTarget"
ConnectionString="Data Source=localhost;Initial Catalog=SalesTarget;Provider=SQLNCLI11.1;Integrated Security=SSPI;Auto Translate=False;" />
</Connections>
<Packages>
Reading the Table List From System Metadata
INFORMATION_SCHEMA.TABLES returns one row per table in the source database. The script pulls schema and name into a DataTable and loops over the result:
<#
string sourceConnString = "Data Source=localhost;Initial Catalog=SalesSource;Provider=SQLNCLI11.1;Integrated Security=SSPI;Auto Translate=False;";
DataTable sourceTables = ExternalDataAccess.GetDataTable(
sourceConnString,
@"SELECT t.TABLE_CATALOG, t.TABLE_SCHEMA, t.TABLE_NAME
FROM INFORMATION_SCHEMA.TABLES t
WHERE t.TABLE_SCHEMA <> 'dbo'
AND t.TABLE_TYPE = 'BASE TABLE'");
foreach (DataRow row in sourceTables.Rows)
{
string tableSchema = row["TABLE_SCHEMA"].ToString();
string sourceTableName = row["TABLE_NAME"].ToString();
#>
Filtering happens in the WHERE clause. Drop the schemas, prefixes, or table-type values that are not in scope.
The Per-Table Package Body
Inside the loop, the Biml describes one package. Variables from the outer C# block flow into Biml attributes through the angle-bracket-pound-equals syntax:
<Package ConstraintMode="Linear" Name="Extract <#=sourceTableName#>" ProtectionLevel="EncryptSensitiveWithUserKey">
<Tasks>
<ExecuteSQL Name="Truncate <#=sourceTableName#>" ConnectionName="SalesTarget">
<DirectInput>
TRUNCATE TABLE <#=tableSchema#>.<#=sourceTableName#>
</DirectInput>
</ExecuteSQL>
<Dataflow Name="DFT Extract">
<Transformations>
<OleDbSource ConnectionName="SalesSource" Name="<#=sourceTableName#> Source">
<DirectInput>
SELECT * FROM <#=tableSchema#>.<#=sourceTableName#>
</DirectInput>
</OleDbSource>
<OleDbDestination Name="Insert <#=sourceTableName#>" ConnectionName="SalesTarget" KeepIdentity="true">
<ExternalTableOutput Table="<#=tableSchema#>.<#=sourceTableName#>" />
</OleDbDestination>
</Transformations>
</Dataflow>
</Tasks>
</Package>
<# } #>
</Packages>
</Biml>
Note the closing brace for the foreach loop sits below the closing Package tag. Everything between the opening brace and that closing brace runs once per table in the result set, including the Biml.
Result
A few dozen lines of script produce one .dtsx file per source table. Each package truncates its target table, runs an OLE DB source to OLE DB destination data flow, and uses the same naming convention. Adding or removing a source table no longer requires editing packages, only regenerating from the same script.
Where to Take It Next
The same loop can branch on table name to add per-table transformations, switch between full and incremental loads based on a flag column, or read its driver list from a custom metadata table rather than INFORMATION_SCHEMA. The compile-time loop is the foundation. What it loops over and what it emits inside the loop are open.