Skip to main content

Asset Declaration and Scripting Basics in Biml: Part 1

Why Add Script to Biml

Most early Biml posts focus on how to mimic SSIS task configuration. The point of BimlScript is different: it iterates over a template and produces many objects from one snippet. As the underlying metadata changes, the packages can be regenerated. Biml is a Business Intelligence Markup Language, not just a package serializer, so it can also declare BI assets such as connections, databases, schemas, and tables. Once those assets are declared, they become master data that other Biml code can reference.

Three benefits follow:

  1. Multiple developers can split the work across files.
  2. The project can fully describe an environment, not just packages.
  3. Declared objects can be referenced as master data from generation code.

The Example

The example pulls three tables out of an operational database and lands them in a staging schema. Source and target column names match, but target tables sit in dbo and are prefixed with import_. The Item table has three columns in staging while the source has more. Auto mapping handles the column subset.

CREATE TABLE dbo.import_ItemCategory(
[ItemCategoryID] [int] NOT NULL,
[Name] nvarchar(50) NOT NULL
);

CREATE TABLE dbo.import_ItemSubcategory(
[ItemSubcategoryID] [int] NOT NULL,
[ItemCategoryID] [int] NOT NULL,
[Name] nvarchar(50) NOT NULL
);

CREATE TABLE dbo.import_Item(
[ItemID] [int] NOT NULL,
[Name] nvarchar(50) NOT NULL,
[ItemSubcategoryID] [int] NULL
);

The Per-Table Template

Each table needs a Truncate followed by a Dataflow that loads from source to target. Movements run in parallel, but the dataflow inside one movement waits on its own truncate via a precedence constraint. The naming convention is consistent so that the same template can serve every table by substituting the name.

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<Packages>
<Package Name="build_stage" ConstraintMode="Parallel" ProtectionLevel="EncryptSensitiveWithUserKey">
<Tasks>
<ExecuteSQL Name="TRUNC_Item" ConnectionName="stage">
<DirectInput>Truncate Table dbo.import_Item;</DirectInput>
</ExecuteSQL>

<Dataflow Name="LOAD_Item">
<Transformations>
<OleDbSource Name="SRC_Item" ConnectionName="retail_source">
<DirectInput>Select * from dbo.Item;</DirectInput>
</OleDbSource>

<OleDbDestination Name="TGT_Item" ConnectionName="stage">
<ExternalTableOutput Table="dbo.import_Item" />
</OleDbDestination>
</Transformations>

<PrecedenceConstraints>
<Inputs>
<Input OutputPathName="TRUNC_Item.Output" />
</Inputs>
</PrecedenceConstraints>
</Dataflow>
</Tasks>
</Package>
</Packages>
</Biml>

Two design choices matter here. First, the Package's ConstraintMode is Parallel, so SSIS does not put precedence between the truncate and the dataflow by default. Second, an explicit PrecedenceConstraints block is added inside the Dataflow so the dataflow waits on the matching truncate.

Splitting Into Tiered Files

Modularization helps. There is a natural build order: connections exist before databases, databases before schemas, schemas before tables, and tables before packages that reference them. Splitting work across files also lets one person own connections and another own packages.

The example uses two files. The first, declared as tier 1, defines the environment (connections, databases, schemas, tables). The second, declared as tier 2, generates the staging package by iterating over the tables declared in tier 1. The compiler honors the tier order so tier 2 can rely on the assets created in tier 1.

Up Next

Part 2 shows the actual contents of the two files, walks through the foreach over RootNode.Tables, and explains how code nuggets and expression nuggets cooperate to produce the final package.