Skip to main content

Reverse Engineering Metadata with Biml

Why Reverse Engineer Existing Staging

Adopting a metadata driven Biml solution for staging is straightforward when the staging area is greenfield. When an existing staging database already holds dozens or hundreds of tables, populating the metadata model by hand is tedious and error prone. Even though metadata extraction is a one time job, BimlScript handles it well: the same engine that generates packages can read schema and emit T-SQL that loads the metadata tables.

The Metadata Table

The example assumes a single metadata table holds one row per source table that staging knows about. Adjust the columns to match the local model:

CREATE TABLE [dbo].[StagingMeta_Tables] (
[ConnectionName] [nvarchar](50) NULL,
[SchemaName] [nvarchar](50) NULL,
[TableName] [nvarchar](50) NULL,
[Columns] [nvarchar](max) NULL
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

The walkthrough also assumes a naming convention in the existing staging database: each table is prefixed with the source connection name, optionally followed by a schema name and the original table name. A table named CrmSource_Sales_Orders maps back to a source schema of Sales and a source table of Orders on the CrmSource connection. A table named CrmSource_Customer maps back to a source schema of dbo and a source table of Customer. Adjust the parsing logic for any other convention.

Pointing Biml at the Existing Staging Database

A static environment file declares the connection to the staging database that already holds the tables to be inventoried:

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<Connections>
<OleDbConnection Name="StagingDb" ConnectionString="Provider=SQLNCLI11;Server=.;Initial Catalog=ExistingStagingDb;Integrated Security=SSPI;" />
</Connections>
</Biml>

Building the INSERT Statements

The next file is invoked through CallBimlScript and accepts the source connection prefix as a property. It uses ImportDB with a SQL like wildcard to pull every staging table whose name begins with that prefix, then strips the prefix, splits the remainder into schema and table parts, and emits a DELETE plus INSERT pair into the metadata table for each:

<#@ template language="VB" #>
<#@ property name="srcPrefix" type="string" #>
<#
Dim tableName As String
Dim schemaName As String
Dim sourceConnection As AstDbConnectionNode = RootNode.Connections("StagingDb")
Dim stagingTables As ImportResults = sourceConnection.ImportDB(Nothing, srcPrefix + "_%")
For Each tbl As AstTableNode In stagingTables.TableNodes
tableName = tbl.Name.Substring(srcPrefix.Length + 1)
If tableName.IndexOf("_") < 0 Then
schemaName = "dbo"
Else
schemaName = tableName.Substring(0, tableName.IndexOf("_"))
tableName = tableName.Substring(tableName.IndexOf("_") + 1)
End If
#>
DELETE FROM StagingMeta_Tables WHERE ConnectionName = '<#= srcPrefix #>' AND SchemaName = '<#= schemaName #>' AND TableName = '<#= tableName #>'
INSERT INTO StagingMeta_Tables (ConnectionName, SchemaName, TableName, Columns) VALUES ('<#= srcPrefix #>', '<#= schemaName #>', '<#= tableName #>', '<#= tbl.GetColumnList() #>')
<# Next #>

The DELETE sits in front of the INSERT so the script is rerunnable without producing duplicate rows.

Writing the Output to a SQL File

Because the script only needs to run once per source, the output is dumped to a .sql file that can be opened in SSMS and executed against the metadata database:

<#@ template language="VB" #>
<# System.IO.File.WriteAllText("c:\Temp\StagingMeta_Crm.sql", CallBimlScript("BuildMetaSql.biml", "CrmSource")) #>
<Biml xmlns="http://schemas.varigence.com/biml.xsd" />

Repeat the call for every connection prefix in the staging database. The result is a complete metadata snapshot ready to drive ongoing package generation.