Creating a Staging Area with Biml
Why Generate a Staging Area
Extraction packages depend on a target environment that already exists. Building those staging tables by hand is repetitive and tightly coupled to whatever the source looks like today. A Biml script that reads INFORMATION_SCHEMA from the source database can emit a package full of ExecuteSQL tasks that build matching staging tables. Pulling from system tables means the same generator works on any database, supports type translations across providers, can apply naming conventions, can rename schemas, and can append framework columns that do not exist in the source.
A Package per Table
The shape of the output is one ExecuteSQL task per source table. The script first selects the schema and table list from INFORMATION_SCHEMA.TABLES, iterates the rows, and emits an ExecuteSQL task whose name and target table reflect the source schema and table.
<#@ import namespace="System.Data" #>
<#@ import namespace="System.Data.SqlClient" #>
<#@ template language="C#" tier="2" #>
<#
string srcConnString = @"Data Source=.\sql2019;Persist Security Info=true;Integrated Security=SSPI;Initial Catalog=RetailSource";
string tableSql = "select TABLE_SCHEMA, table_name from INFORMATION_SCHEMA.TABLES where TABLE_TYPE='BASE TABLE'";
DataTable tables = new DataTable();
SqlDataAdapter tableAdapter = new SqlDataAdapter(tableSql, srcConnString);
tableAdapter.Fill(tables);
#>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<Connections>
<OleDbConnection Name="staging_db" ConnectionString="Provider=SQLNCLI11;Data Source=.\sql2019;Persist Security Info=False;Integrated Security=SSPI;Initial Catalog=RetailStage" />
</Connections>
<Packages>
<Package Name="001_Create_Staging" ConstraintMode="Linear">
<Tasks>
<# foreach (DataRow tbl in tables.Rows) { #>
<ExecuteSQL Name="create_landing_<#= tbl["TABLE_SCHEMA"] #>_<#= tbl["TABLE_NAME"] #>" ConnectionName="staging_db">
<DirectInput>
create table landing.[<#= tbl["TABLE_SCHEMA"] + "_" + tbl["TABLE_NAME"] #>] ()
</DirectInput>
</ExecuteSQL>
<# } #>
</Tasks>
</Package>
</Packages>
</Biml>
A source table named Sales.Order becomes landing.Sales_Order in staging.
Adding Column Definitions
The next step pulls the column list per table from INFORMATION_SCHEMA.COLUMNS and turns each row into a column definition. Type translation needs care: smallint(5,0) is invalid SQL, so a helper function picks the right parenthesized arguments per data type. Character types use CHARACTER_MAXIMUM_LENGTH, NUMERIC types use precision and scale, FLOAT uses precision only, and the rest use just the type name.
public string RowConversion(DataRow Row)
{
string text = "[" + Row["COLUMN_NAME"] + "] " + Row["DATA_TYPE"];
switch (Row["DATA_TYPE"].ToString().ToUpper())
{
case "NVARCHAR":
case "VARCHAR":
case "NCHAR":
case "CHAR":
case "BINARY":
case "VARBINARY":
if (Row["CHARACTER_MAXIMUM_LENGTH"].ToString() == "-1")
text += "(max)";
else
text += "(" + Row["CHARACTER_MAXIMUM_LENGTH"] + ")";
break;
case "NUMERIC":
text += "(" + Row["NUMERIC_PRECISION"] + "," + Row["NUMERIC_SCALE"] + ")";
break;
case "FLOAT":
text += "(" + Row["NUMERIC_PRECISION"] + ")";
break;
}
return text;
}
The function lives in a class block so it is available throughout the file. The DirectInput then iterates the column rows and uses the position to control the leading comma.
<DirectInput>
create table landing.[<#= tbl["TABLE_SCHEMA"] + "_" + tbl["TABLE_NAME"] #>]
(
<#
string colSql = "select COLUMN_NAME, DATA_TYPE, CHARACTER_MAXIMUM_LENGTH, NUMERIC_PRECISION, NUMERIC_SCALE from INFORMATION_SCHEMA.COLUMNS where TABLE_SCHEMA='" + tbl["TABLE_SCHEMA"] + "' and TABLE_NAME='" + tbl["TABLE_NAME"] + "' order by ORDINAL_POSITION";
DataTable cols = new DataTable();
SqlDataAdapter colAdapter = new SqlDataAdapter(colSql, srcConnString);
colAdapter.Fill(cols);
for (int i = 0; i < cols.Rows.Count; i++)
{
DataRow row = cols.Rows[i];
if (i == 0) WriteLine(RowConversion(row));
else WriteLine(", " + RowConversion(row));
}
#>
)
</DirectInput>
Custom Framework Columns
Static Biml columns can be appended after the dynamic block. The full template adds a load_dt column to every staging table.
<#@ import namespace="System.Data" #>
<#@ import namespace="System.Data.SqlClient" #>
<#@ template language="C#" tier="2" #>
<#
string srcConnString = @"Data Source=.\sql2019;Persist Security Info=true;Integrated Security=SSPI;Initial Catalog=RetailSource";
string tableSql = "select TABLE_SCHEMA, table_name from INFORMATION_SCHEMA.TABLES where TABLE_TYPE='BASE TABLE'";
DataTable tables = new DataTable();
SqlDataAdapter tableAdapter = new SqlDataAdapter(tableSql, srcConnString);
tableAdapter.Fill(tables);
#>
<#+
public string RowConversion(DataRow Row)
{
string text = "[" + Row["COLUMN_NAME"] + "] " + Row["DATA_TYPE"];
switch (Row["DATA_TYPE"].ToString().ToUpper())
{
case "NVARCHAR":
case "VARCHAR":
case "NCHAR":
case "CHAR":
case "BINARY":
case "VARBINARY":
if (Row["CHARACTER_MAXIMUM_LENGTH"].ToString() == "-1")
text += "(max)";
else
text += "(" + Row["CHARACTER_MAXIMUM_LENGTH"] + ")";
break;
case "NUMERIC":
text += "(" + Row["NUMERIC_PRECISION"] + "," + Row["NUMERIC_SCALE"] + ")";
break;
case "FLOAT":
text += "(" + Row["NUMERIC_PRECISION"] + ")";
break;
}
return text;
}
#>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<Connections>
<OleDbConnection Name="staging_db" ConnectionString="Provider=SQLNCLI11;Data Source=.\sql2019;Persist Security Info=False;Integrated Security=SSPI;Initial Catalog=RetailStage" />
</Connections>
<Packages>
<Package Name="001_Create_Staging" ConstraintMode="Linear">
<Tasks>
<# foreach (DataRow tbl in tables.Rows) { #>
<ExecuteSQL Name="create_landing_<#= tbl["TABLE_SCHEMA"] #>_<#= tbl["TABLE_NAME"] #>" ConnectionName="staging_db">
<DirectInput>
create table landing.[<#= tbl["TABLE_SCHEMA"] + "_" + tbl["TABLE_NAME"] #>]
(
<#
string colSql = "select COLUMN_NAME, DATA_TYPE, CHARACTER_MAXIMUM_LENGTH, NUMERIC_PRECISION, NUMERIC_SCALE from INFORMATION_SCHEMA.COLUMNS where TABLE_SCHEMA='" + tbl["TABLE_SCHEMA"] + "' and TABLE_NAME='" + tbl["TABLE_NAME"] + "' order by ORDINAL_POSITION";
DataTable cols = new DataTable();
SqlDataAdapter colAdapter = new SqlDataAdapter(colSql, srcConnString);
colAdapter.Fill(cols);
for (int i = 0; i < cols.Rows.Count; i++)
{
DataRow row = cols.Rows[i];
if (i == 0) WriteLine(RowConversion(row));
else WriteLine(", " + RowConversion(row));
}
#>
, load_dt datetime
)
</DirectInput>
</ExecuteSQL>
<# } #>
</Tasks>
</Package>
</Packages>
</Biml>
The pattern is flexible: target schemas can be remapped, framework columns can be added, and per-type translation rules can substitute datatypes when porting between providers.