Skip to main content

Cleaning a Database with Biml

Why This Pattern Matters

Some workflows need a fresh database before every deploy. When dropping and recreating the database is not an option, the next best thing is to remove every object in it. A short SSIS package can do exactly that by looping over the table list, dropping each table it finds, and repeating the cycle until no tables are left. The repeat handles cases where foreign keys block a drop on the first pass.

The package can be expressed in plain Biml, with no BimlScript code nuggets, which makes it a useful walkthrough of how 'ForLoop', 'ForEachAdoLoop', and result set capture wire together.

The Algorithm

The control flow follows this loop:

  1. While 'TABLE_COUNT' is at least one:
    • Run a SQL query that returns the current list of tables and capture it into the object variable 'TABLES_IN_DB'.
    • For each row in 'TABLES_IN_DB', build a 'DROP TABLE' statement and execute it. Failures are tolerated because dependency order is unknown.
    • Run a count query against 'INFORMATION_SCHEMA.TABLES' and capture the count back into 'TABLE_COUNT'.

When the count reaches zero, the outer loop exits.

The Biml

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<Connections>
<OleDbConnection Name="WorkArea"
ConnectionString="provider=MSDataShape;server=warehouse_dev;uid=warehouse_admin;pwd=*****"
DelayValidation="true" />
</Connections>
<Packages>
<Package Name="01-Clean_WorkArea" ProtectionLevel="EncryptAllWithUserKey"
ConstraintMode="Linear">
<Variables>
<Variable Name="TABLES_IN_DB" DataType="Object" />
<Variable Name="TABLE_COUNT" DataType="Int32">1</Variable>
</Variables>
<Tasks>
<ForLoop Name="WHILE TABLES EXIST" ConstraintMode="Linear">
<LoopTestExpression>@TABLE_COUNT >= 1</LoopTestExpression>
<Tasks>
<ExecuteSQL Name="GET_TABLE_LIST" ResultSet="Full" ConnectionName="WorkArea">
<DirectInput>SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'workarea';</DirectInput>
<Results>
<Result Name="0" VariableName="User.TABLES_IN_DB" />
</Results>
</ExecuteSQL>
<ForEachAdoLoop Name="FOREACH_RECORD_IN_TABLE_LIST"
SourceVariableName="User.TABLES_IN_DB"
ConstraintMode="Linear">
<Variables>
<Variable Name="TABLE_NAME" DataType="String" />
<Variable Name="SQL" DataType="String" />
</Variables>
<VariableMappings>
<VariableMapping Name="0" VariableName="User.TABLE_NAME" />
</VariableMappings>
<Tasks>
<Expression Name="GENERATE SQL"
Expression="@[User::SQL]=&quot;DROP TABLE workarea.&quot; + @[User::TABLE_NAME]" />
<ExecuteSQL Name="DELETE TABLE" ConnectionName="WorkArea"
DelayValidation="true" FailPackageOnFailure="false">
<VariableInput VariableName="User.SQL" />
</ExecuteSQL>
</Tasks>
</ForEachAdoLoop>
<ExecuteSQL Name="GET_TABLES_COUNT" ResultSet="SingleRow" ConnectionName="WorkArea">
<DirectInput>SELECT COUNT(*) FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'workarea';</DirectInput>
<Results>
<Result Name="0" VariableName="User.TABLE_COUNT" />
</Results>
</ExecuteSQL>
</Tasks>
</ForLoop>
</Tasks>
</Package>
</Packages>
</Biml>

What the Generated Package Demonstrates

Several patterns show up that come back in many other Biml templates:

  • 'ForLoop' with 'LoopTestExpression' implements a WHILE.
  • 'ExecuteSQL' with 'ResultSet="Full"' captures a full result set into an Object variable.
  • 'ForEachAdoLoop' iterates a captured Object variable, with 'VariableMappings' assigning the current row column to a string variable.
  • 'Expression' tasks compose a SQL statement at runtime by concatenating a constant prefix with a variable.
  • 'ExecuteSQL' with 'VariableInput' runs SQL stored in a variable.
  • 'FailPackageOnFailure="false"' lets the inner DROP fail without aborting the package, which is what makes the retry loop possible.

The pattern works against any source that has an 'INFORMATION_SCHEMA.TABLES' view. Replace the connection details and the schema filter, regenerate, and the same package cleans a different database.