Skip to main content

Deriving a Metadata Model From SQL Server With BimlExpress

Why This Pattern Matters

BimlExpress does not include the metadata modeling features that ship with BimlStudio, but the same outcome can be reached by storing the model in a SQL Server schema, importing it with ImportDB, and exposing the rows as a DataSet that the rest of the project can read. The result is a flexible, table driven generator that builds connections, databases, schemas, tables, and load packages from a handful of metadata tables.

Prerequisite: A Metadata Schema in SQL Server

The example uses six tables in a 'meta' schema: Connections, Databases, Schemas, Tables, Columns, plus the foreign keys that link them. The full setup script is included at the bottom of this walkthrough. The pattern works with any custom model as long as the foreign keys are declared so that ImportDB picks up the relationships.

The Include File

A single file, INC_MetaData.biml, holds the import logic and the helper functions. Including it from any later file makes the metadata DataSet and the helper functions available everywhere:

<#@ template language="VB" optionexplicit="false" #>
<#@ import namespace="System.Data" #>
<#
connMetadata = "Data Source=.;Initial Catalog=DerivedMetadataDb;Provider=SQLNCLI11;Integrated Security=SSPI;"
Dim MetaConnection = SchemaManager.CreateConnectionNode("Meta", connMetadata)
Dim MetaSchema = MetaConnection.ImportDB("meta", nothing, nothing)
Dim Meta as new DataSet

for each tbl in MetaSchema.TableNodes
DT = ExternalDataAccess.GetDataTable(connMetadata, "select * from " & tbl.SchemaQualifiedName)
DT.TableName = tbl.Name
Meta.Tables.Add(DT)
next

for each tbl in MetaSchema.TableNodes
for each fk in tbl.Columns.OfType(Of AstTableColumnTableReferenceNode)
PK = MetaSchema.TableNodes.Where(function(c) c.Name = fk.ForeignTable.Name).First _
.Columns.Where(Function(c) c.IsUsedInPrimaryKey = true).First.Name
Meta.Relations.Add(fk.ForeignTable.Name & "_" & tbl.Name,
Meta.Tables(fk.ForeignTable.Name).Columns(PK),
Meta.Tables(tbl.Name).Columns(fk.Name))
next
next
#>

<#+
Public Function GetChildList(RS as DataRow(), Col as String) as List(of String)
Dim LST as new List(of String)
For each rec in RS
LST.Add(rec(Col))
next
If LST.Count = 0 Then LST.Add("NONEMPTYSPACER")
Return LST
End Function

Public Function GetAnnotations(tbl as DataRow) as String
Dim annostring as String
for each col in tbl.Table.Columns
annostring = annostring +
"<Annotation AnnotationType=""Tag"" Tag=""" + col.Caption + """>" +
tbl(col).ToString() + "</Annotation>" + vbCrLf
next
Return annostring
End Function
#>

ImportDB reads the schema, the GetDataTable loop fills the DataSet with one DataTable per metadata table, and the second loop wires foreign keys into DataRelations so GetChildRows works on the in memory model.

GetChildList returns a single column from a DataRow array as a List of String, with a placeholder for empty results so calls like GetDatabaseSchema do not crash on an empty filter list. GetAnnotations turns every column on a DataRow into a Tag annotation, which means new metadata columns become accessible to downstream Biml without any extra wiring.

Tier 10: Environment

The first generated file builds Connections, Databases, and Schemas straight from the imported metadata. Each connection also gets one annotation per metadata column so the rest of the project can read attributes such as Type:

<#@ template language="VB" optionexplicit="false" tier="10" #>
<#@ include file="INC_Metadata.biml" #>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<Connections>
<# For Each conn In Meta.Tables("Connections").Rows #>
<OleDbConnection Name="<#= conn("Name") #>" ConnectionString="<#= conn("ConnectionString") #>">
<Annotations>
<#= GetAnnotations(conn) #>
</Annotations>
</OleDbConnection>
<# next #>
</Connections>
<Databases>
<# For Each db In Meta.Tables("Databases").Rows #>
<Database Name="<#= db("DB") #>" ConnectionName="<#= db("Connection") #>" />
<# next #>
</Databases>
<Schemas>
<# For Each schem In Meta.Tables("Schemas").Rows #>
<Schema Name="<#= schem("Schema") #>" DatabaseName="<#= schem("DB") #>" />
<# next #>
</Schemas>
</Biml>

Tier 20: Tables With Filtering and Annotations

The next file walks the source connections, asks each one for its tables (filtered by the metadata Tables list using the foreign key relation), and emits a Biml Table per row. If the metadata Columns table contains entries for a table, the column list is filtered down to that subset; otherwise every column comes through:

<#@ template language="VB" optionexplicit="false" tier="20" #>
<#@ include file="INC_Metadata.biml" #>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<Tables>
<# For Each conn In Meta.Tables("Connections").Select("type = 'Source'")
Dim SrcConn as AstDbConnectionNode = RootNode.Connections(conn("Name"))
tbls = GetChildList(conn.GetChildRows("Connections_Tables"), "TableName")
Dim importResult = SrcConn.GetDatabaseSchema(nothing, tbls, nothing)
for each tbl in importResult.TableNodes
Metatbl = Meta.Tables("Tables").Select(
"Connection = '" & conn("name") & "'" &
" and schemaname = '" & tbl.Schema.Name & "'" &
" and tablename = '" & tbl.Name & "'").First #>
<Table Name="<#= Metatbl("ID") #>" SchemaName="<#= Metatbl("TargetSchema") #>">
<Columns>
<# if Metatbl.GetChildRows("Tables_Columns").Length = 0 then #>
<#= tbl.Columns.GetBiml() #>
<# else #>
<#= tbl.Columns.Where(Function(c)
GetChildList(Metatbl.GetChildRows("Tables_Columns"), "ColumnName")
.ConvertAll(function(u) u.ToUpper())
.IndexOf(c.Name.ToUpper) <> -1).GetBiml() #>
<# end if #>
</Columns>
<Annotations>
<#= GetAnnotations(Metatbl) #>
</Annotations>
</Table>
<# next
next #>
</Tables>
</Biml>

Tier 30: Build Staging Schemas and Tables

After tier 20 fills RootNode.Tables, generating the staging DDL is straight forward. The package below creates each Schema and then drops and recreates each Table on the connection that carries the Type=Target annotation:

<#@ template language="VB" optionexplicit="false" tier="30" #>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<Packages>
<Package Name="01_CreateStaging" ConstraintMode="Linear"
ProtectionLevel="EncryptSensitiveWithUserKey">
<Tasks>
<# Dim TargetConnName as String = RootNode.Connections.Where(Function(c) c.GetTag("Type") = "Target").First.Name #>
<# for each Schem in RootNode.Schemas #>
<ExecuteSQL Name="Create Schema <#= Schem.Connection.Name #>_<#= Schem.Name #>"
ConnectionName="<#= TargetConnName #>">
<DirectInput>
<#= Schem.GetDropAndCreateDdl() #>
IF NOT EXISTS (SELECT schema_name FROM information_schema.schemata
WHERE schema_name = '<#= Schem.Name #>')
BEGIN
EXEC sp_executesql N'CREATE SCHEMA [<#= Schem.Name #>]'
END
GO
</DirectInput>
</ExecuteSQL>
<# next #>

<# for each tbl in RootNode.Tables #>
<ExecuteSQL Name="Create Table <#= tbl.Name #>"
ConnectionName="<#= TargetConnName #>">
<DirectInput>
<#= tbl.GetDropAndCreateDdl() #>
</DirectInput>
</ExecuteSQL>
<# next #>
</Tasks>
</Package>
</Packages>
</Biml>

Tier 40: Populate the Staging Tables

The final file emits a parallel package that truncates and reloads each staging table. Each container reads from the source connection (the original one, recorded as an annotation), and writes to the table on the target connection:

<#@ template tier="40" optionexplicit="false" language="VB" #>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<Packages>
<Package Name="02_PopulateTables" ConstraintMode="Parallel"
ProtectionLevel="EncryptSensitiveWithUserKey">
<Tasks>
<# Dim TargetConnName as String = RootNode.Connections.Where(Function(c) c.GetTag("Type") = "Target").First.Name #>
<# for each tbl in RootNode.Tables #>
<Container Name="Transfer <#= tbl.Schema.Name #>_<#= tbl.Name #>" ConstraintMode="Linear">
<Tasks>
<ExecuteSQL Name="Truncate" ConnectionName="<#= TargetConnName #>">
<DirectInput>truncate table <#= tbl.ScopedName #></DirectInput>
</ExecuteSQL>
<Dataflow Name="Copy <#= tbl.Schema.Name #>_<#= tbl.Name #>">
<Transformations>
<OleDbSource Name="Get Rows" ConnectionName="<#= tbl.GetTag("Connection") #>">
<DirectInput>SELECT <#= tbl.GetColumnList() #> FROM [<#= tbl.GetTag("SchemaName") #>].[<#= tbl.GetTag("TableName") #>]</DirectInput>
</OleDbSource>
<OleDbDestination Name="Set Rows" ConnectionName="<#= TargetConnName #>">
<TableOutput TableName="<#= tbl.ScopedName #>" />
</OleDbDestination>
</Transformations>
</Dataflow>
</Tasks>
</Container>
<# next #>
</Tasks>
</Package>
</Packages>
</Biml>

Where to Take It Next

Adding new metadata columns is a no code change exercise: GetAnnotations exposes them as Tag annotations the moment they appear in the table. Supporting load patterns beyond TRUNCATE and LOAD, or sources beyond OLE DB, is mostly a matter of adding columns to the metadata model and branching on them in the load template.

Reference: Metadata Schema DDL

USE master
GO
CREATE DATABASE DerivedMetadataDb
GO
USE DerivedMetadataDb
GO
CREATE SCHEMA meta
GO

CREATE TABLE [meta].[Connections](
[Name] [nvarchar](50) NOT NULL,
[ConnectionString] [nvarchar](500) NULL,
[Type] [nvarchar](50) NULL,
CONSTRAINT [PKC_Connection] PRIMARY KEY CLUSTERED ([Name] ASC)
)
GO

CREATE TABLE [meta].[Databases](
[Connection] [nvarchar](50) NOT NULL,
[DB] [nvarchar](50) NOT NULL,
CONSTRAINT [PKC_Database] PRIMARY KEY CLUSTERED ([DB] ASC)
)
GO
ALTER TABLE [meta].[Databases] WITH CHECK
ADD CONSTRAINT [FK_Databases_REF_Connection] FOREIGN KEY([Connection])
REFERENCES [meta].[Connections] ([Name]) ON DELETE CASCADE
GO

CREATE TABLE [meta].[Schemas](
[ID] AS (([DB] + '.') + [Schema]) PERSISTED NOT NULL,
[DB] [nvarchar](50) NOT NULL,
[Schema] [nvarchar](50) NOT NULL,
CONSTRAINT [PKC_Schema] PRIMARY KEY CLUSTERED ([ID] ASC)
)
GO
ALTER TABLE [meta].[Schemas] WITH CHECK
ADD CONSTRAINT [FK_Schema_REF_Database] FOREIGN KEY([DB])
REFERENCES [meta].[Databases] ([DB]) ON DELETE CASCADE
GO

CREATE TABLE [meta].[Tables](
[ID] AS ((([Connection] + '_') + ([SchemaName] + '_')) + [TableName]) PERSISTED NOT NULL,
[Connection] [nvarchar](50) NOT NULL,
[SchemaName] [nvarchar](50) NOT NULL,
[TableName] [nvarchar](100) NOT NULL,
[TargetSchema] [nvarchar](101) NOT NULL,
CONSTRAINT [PKC_Table] PRIMARY KEY CLUSTERED ([ID] ASC)
)
GO
ALTER TABLE [meta].[Tables]
ADD CONSTRAINT [DF_Tables_TargetSchema]
DEFAULT (N'DerivedMetadataDb.Stage') FOR [TargetSchema]
GO
ALTER TABLE [meta].[Tables] WITH CHECK
ADD CONSTRAINT [FK_Tables_REF_Connections] FOREIGN KEY([Connection])
REFERENCES [meta].[Connections] ([Name]) ON DELETE CASCADE
GO
ALTER TABLE [meta].[Tables] WITH CHECK
ADD CONSTRAINT [FK_Tables_REF_Schemas] FOREIGN KEY([TargetSchema])
REFERENCES [meta].[Schemas] ([ID])
GO

CREATE TABLE [meta].[Columns](
[ID] AS (([TableID] + '_') + [ColumnName]) PERSISTED NOT NULL,
[TableID] [nvarchar](202) NOT NULL,
[ColumnName] [nvarchar](50) NOT NULL,
CONSTRAINT [PKC_Column] PRIMARY KEY CLUSTERED ([ID] ASC)
)
GO
ALTER TABLE [meta].[Columns] WITH CHECK
ADD CONSTRAINT [FK_Columns_REF_Tables] FOREIGN KEY([TableID])
REFERENCES [meta].[Tables] ([ID]) ON DELETE CASCADE
GO