Housekeeping with Biml
The Problem That Housekeeping Solves
A generated staging schema grows naturally as tables are added to the metadata. It does not shrink as easily. When a table is removed from the source list or disappears from the upstream system, nothing in the generation pipeline knows that the corresponding object in the staging database should also go away. Over time the staging schema collects orphaned tables that no longer participate in any load, taking up space and confusing anyone reading the schema.
A small housekeeping pattern fixes this. Every Biml managed table is tagged with an extended property when it is first created. The next generation run inspects the live database for tagged tables, compares the result against the current 'RootNode.Tables' collection, and emits a drop task for anything that is tagged but no longer represented in the metadata.
Tagging Tables on Creation
The first half of the pattern stamps an extended property on every table the Biml definition creates. The 'sp_addextendedproperty' call goes inside the same Execute SQL task that issues the DROP and CREATE:
<Packages>
<Package Name="01_BuildStaging" ConstraintMode="Parallel">
<Tasks>
<# foreach (var stagingTable in RootNode.Tables) { #>
<ExecuteSQL Name="Create_<#=stagingTable.Name#>" ConnectionName="StagingConn">
<DirectInput>
<#=stagingTable.GetDropAndCreateDdl()#>
GO
EXEC sys.sp_addextendedproperty
@name = N'BimlGenerated', @value = N'True',
@level0type = N'SCHEMA', @level0name = N'<#=stagingTable.Schema.Name#>',
@level1type = N'TABLE', @level1name = N'<#=stagingTable.Name#>';
</DirectInput>
</ExecuteSQL>
<# } #>
Every table created by the template now carries the 'BimlGenerated' extended property.
Detecting and Dropping Orphans
The cleanup half runs at compile time. The template uses 'ExternalDataAccess.GetDataTable' to query 'fn_listextendedproperty' for tables that carry the tag, then walks the result and asks 'RootNode.Tables' whether each one is still defined. Anything missing from the metadata gets its own Execute SQL task that drops the table:
<#
var stagingConn = (AstDbConnectionNode)RootNode.Connections["StagingConn"];
var orphanQuery =
"SELECT objname FROM fn_listextendedproperty " +
"(NULL, 'schema', 'dbo', 'table', default, NULL, NULL) " +
"WHERE name = 'BimlGenerated' AND value = 'True' AND objtype = 'Table'";
var taggedTables = ExternalDataAccess.GetDataTable(
stagingConn.ConnectionString, orphanQuery);
#>
<# foreach (DataRow row in taggedTables.Rows) {
var tableName = row[0].ToString();
if (RootNode.Tables[tableName] == null) {
#>
<ExecuteSQL Name="Remove_<#=tableName#>" ConnectionName="StagingConn">
<DirectInput>
IF EXISTS (SELECT * FROM sys.objects
WHERE object_id = OBJECT_ID(N'[dbo].[<#=tableName#>]')
AND type IN (N'U'))
DROP TABLE [dbo].[<#=tableName#>];
GO
</DirectInput>
</ExecuteSQL>
<# } } #>
</Tasks>
</Package>
</Packages>
The drop task is generated only for tagged tables that have no matching definition in the current Biml model. Tables created by other systems are left alone because they do not carry the 'BimlGenerated' property.
Why It Stays Safe
Two properties make the pattern safe to leave running unattended:
- The extended property is stamped only by the same template that generated the table. Anything that the template did not create is invisible to the cleanup query.
- The 'IF EXISTS' guard around the DROP prevents the package from failing if the table has already been removed by hand between two runs.
When a table is removed from the metadata definition, the next generation run drops it from the database. The lifecycle of a Biml managed table now matches the lifecycle of its metadata definition.