Biml Annotations and ObjectTags
Why This Pattern Matters
Biml objects already carry a long list of built-in attributes, but real projects often need more. A staging table built from source metadata often needs to remember which source schema and source table it came from. A package may need to carry a hint that files in a higher tier can read. Annotations and ObjectTags are the two mechanisms Biml provides for attaching that extra metadata to any object.
Annotations are string-to-string pairs that can be declared directly on objects in Biml. ObjectTags are string-to-object pairs that are added through BimlScript code in a higher tier file.
Biml Annotations Versus SSIS Annotations
The word annotation means two different things in this space. SSIS annotations are the sticky-note-style comments that appear on the SSIS design surface. Biml annotations are a separate concept: they store extra metadata on Biml objects and never show up on the SSIS design surface.
Biml annotations come in four flavors: 'CodeComment', 'Documentation', 'Description', and 'Tag'. This walkthrough focuses on the 'Tag' flavor, which is the right tool for storing arbitrary string metadata for later retrieval.
Declaring Annotations on an Object
Annotations are declared inline. The 'Annotation' element has a 'Tag' attribute that names the key, and the element body holds the value:
<OleDbConnection Name="StagingDb" ConnectionString="...">
    <Annotations>
        <Annotation Tag="DestinationDatabase">Warehouse</Annotation>
        <Annotation Tag="DestinationSchema">stg</Annotation>
    </Annotations>
</OleDbConnection>
Reading an Annotation From a Higher Tier File
A higher tier file can pull the value back out with 'GetTag', passing the same key that was declared on the annotation:
<#=RootNode.OleDbConnections["StagingDb"].GetTag("DestinationSchema")#>
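A slightly fuller sketch of the same lookup is shown here: a higher tier file reads both annotations from the 'StagingDb' connection and uses them to declare a database and a schema. The tier number assumes the annotated connection is declared in tier 0, and the variable names and the 'dbo' fallback are hypothetical choices made only for illustration.

```xml
<#@ template tier="1" #>
<#
    // Assumes a lower tier file declares the StagingDb connection with the
    // DestinationDatabase and DestinationSchema annotations shown earlier.
    var stagingConn = RootNode.OleDbConnections["StagingDb"];
    var databaseName = stagingConn.GetTag("DestinationDatabase");
    var schemaName = stagingConn.GetTag("DestinationSchema");

    // Hypothetical fallback for a missing or empty annotation.
    if (string.IsNullOrEmpty(schemaName)) { schemaName = "dbo"; }
#>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Databases>
        <Database Name="<#=databaseName#>" ConnectionName="<#=stagingConn.Name#>" />
    </Databases>
    <Schemas>
        <Schema Name="<#=schemaName#>" DatabaseName="<#=databaseName#>" />
    </Schemas>
</Biml>
```

Reading the tags into variables once keeps the key strings in one place, which makes a misspelled key easier to spot than when 'GetTag' calls are scattered through the markup.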
Example: Source Table Metadata in Annotations
A common staging pattern is to build one staging table per source table, derive a new name for the staging table, and remember the original source name on the staging table so that the load packages know what to read from. Two tiered files split that work between table creation and package creation.
The tier 0 file imports the source schema and creates one staging table per source table. The original schema-qualified source name is attached to each staging table as an annotation:
<#@ template tier="0" #>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Connections>
        <OleDbConnection Name="OpsSource" ConnectionString="Data Source=.;Initial Catalog=OperationalDB;Provider=SQLNCLI11.1;Integrated Security=SSPI;" />
        <OleDbConnection Name="StagingDb" ConnectionString="Data Source=.;Initial Catalog=Warehouse;Provider=SQLNCLI11.1;Integrated Security=SSPI;" />
    </Connections>
    <Databases>
        <Database Name="Warehouse" ConnectionName="StagingDb" />
    </Databases>
    <Schemas>
        <Schema Name="stg" DatabaseName="Warehouse" />
    </Schemas>
    <Tables>
        <!-- Import table metadata from the source database. -->
        <# var srcConn = SchemaManager.CreateConnectionNode("OpsSource", "Data Source=.;Initial Catalog=OperationalDB;Provider=SQLNCLI11.1;Integrated Security=SSPI;"); #>
        <# var srcSchema = srcConn.GetDatabaseSchema(ImportOptions.ExcludeViews | ImportOptions.ExcludeIdentity | ImportOptions.ExcludeForeignKey); #>
        <# foreach (var srcTable in srcSchema.TableNodes) { #>
        <!-- One staging table per source table, named schema_table. -->
        <Table Name="<#=srcTable.Schema.Name#>_<#=srcTable.Name#>" SchemaName="Warehouse.stg">
            <Columns>
                <#=srcTable.Columns.GetBiml()#>
            </Columns>
            <Annotations>
                <!-- Remember where this staging table came from. -->
                <Annotation Tag="SourceSchemaQualifiedName"><#=srcTable.SchemaQualifiedName#></Annotation>
            </Annotations>
        </Table>
        <# } #>
    </Tables>
</Biml>
Each staging table receives a name of the form 'schema_table', for example 'Sales_Orders'. The annotation with tag 'SourceSchemaQualifiedName' stores the bracketed source name, for example '[Sales].[Orders]', so the load package can read it back later.
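For a hypothetical source table '[Sales].[Orders]', the expanded output of the loop would look roughly like the fragment below. The two columns and their data types are invented for illustration; the real column list is whatever 'GetBiml()' produces from the imported source metadata.

```xml
<Table Name="Sales_Orders" SchemaName="Warehouse.stg">
    <Columns>
        <!-- Hypothetical columns; the real list is generated from the source table. -->
        <Column Name="OrderID" DataType="Int32" />
        <Column Name="OrderDate" DataType="DateTime" />
    </Columns>
    <Annotations>
        <Annotation Tag="SourceSchemaQualifiedName">[Sales].[Orders]</Annotation>
    </Annotations>
</Table>
```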
The tier 1 file builds one package per staging table. The 'OleDbSource' uses 'GetTag' to retrieve the source name from the annotation:
<#@ template tier="1" #>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Packages>
        <# foreach (var stgTable in RootNode.Tables) { #>
        <Package Name="Load_<#=stgTable.Name#>" ConstraintMode="Linear">
            <Tasks>
                <Dataflow Name="Load <#=stgTable.Name#>">
                    <Transformations>
                        <OleDbSource Name="Source <#=stgTable.Name#>" ConnectionName="OpsSource">
                            <ExternalTableInput Table="<#=stgTable.GetTag("SourceSchemaQualifiedName")#>" />
                        </OleDbSource>
                        <OleDbDestination Name="Destination <#=stgTable.Name#>" ConnectionName="StagingDb">
                            <ExternalTableOutput Table="<#=stgTable.SchemaQualifiedName#>" />
                        </OleDbDestination>
                    </Transformations>
                </Dataflow>
            </Tasks>
        </Package>
        <# } #>
    </Packages>
</Biml>
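Continuing with the hypothetical 'Sales_Orders' staging table, the package that this loop expands to would look roughly as follows. The source reads from the name stored in the annotation, while the destination uses the staging table's own schema-qualified name.

```xml
<Package Name="Load_Sales_Orders" ConstraintMode="Linear">
    <Tasks>
        <Dataflow Name="Load Sales_Orders">
            <Transformations>
                <OleDbSource Name="Source Sales_Orders" ConnectionName="OpsSource">
                    <!-- Value read back from the SourceSchemaQualifiedName annotation. -->
                    <ExternalTableInput Table="[Sales].[Orders]" />
                </OleDbSource>
                <OleDbDestination Name="Destination Sales_Orders" ConnectionName="StagingDb">
                    <!-- SchemaQualifiedName of the staging table itself. -->
                    <ExternalTableOutput Table="[stg].[Sales_Orders]" />
                </OleDbDestination>
            </Transformations>
        </Dataflow>
    </Tasks>
</Package>
```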
ObjectTags
ObjectTags solve the same problem as annotations but carry richer values. Where an annotation value is always a string, an ObjectTag value can be any .NET object, including arrays and lists. The trade-off is that ObjectTags cannot be declared inline in flat Biml. They are added in BimlScript code, in a higher tier than the one that created the object they attach to.
In practice this means that ObjectTags are not a drop-in replacement for the annotation pattern above, because that pattern adds the metadata in the same file that creates the table. ObjectTags fit better when the metadata is global to the project. Attaching an ObjectTag to the 'RootNode' makes the value available to every file in the project without include files.
An ObjectTag is set using indexer syntax with a string key:
<# RootNode.ObjectTag["IncludedEntities"] = new List<string> { "Customer", "CustomerCategory" }; #>
The same indexer reads the value back. Because the value is typed as 'object', the read is usually cast:
<# var included = (List<string>)RootNode.ObjectTag["IncludedEntities"]; #>
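Two dictionary-style helpers test for the presence of a tag by key or by value. 'ContainsValue' compares against each stored value as a whole, so the second call below looks for a tag whose value is the string 'Customer' itself; it would not be expected to match an element inside a list such as the one stored under 'IncludedEntities' above: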
<#=RootNode.ObjectTag.ContainsKey("IncludedEntities")#>
<#=RootNode.ObjectTag.ContainsValue("Customer")#>
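Put together, a higher tier file might guard the read with 'ContainsKey' and then use the list to filter tables, as in the sketch below. It assumes that a lower tier file has already set 'IncludedEntities', that the entity names match table names, and that 'ObjectTag' behaves like a standard .NET dictionary; the tier number and the 'Process_' package prefix are hypothetical.

```xml
<#@ template tier="2" #>
<#@ import namespace="System.Collections.Generic" #>
<#@ import namespace="System.Linq" #>
<#
    // Assumes a lower tier file has already set RootNode.ObjectTag["IncludedEntities"].
    // Fall back to an empty list when the key is missing, so the indexer is never
    // asked for a key that is not there.
    var included = RootNode.ObjectTag.ContainsKey("IncludedEntities")
        ? (List<string>)RootNode.ObjectTag["IncludedEntities"]
        : new List<string>();
#>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Packages>
        <# foreach (var table in RootNode.Tables.Where(t => included.Contains(t.Name))) { #>
        <!-- Hypothetical placeholder package; a real one would contain tasks. -->
        <Package Name="Process_<#=table.Name#>" ConstraintMode="Linear" />
        <# } #>
    </Packages>
</Biml>
```

Membership of a single entity is tested on the list itself with 'included.Contains(...)', which is the check to reach for when the tag value is a collection.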
Summary
Annotations and ObjectTags both carry custom metadata between Biml files. Annotations are simple string pairs that can be declared inline in flat Biml. ObjectTags are richer string-to-object pairs that have to be added in BimlScript code in a higher tier than the object they attach to. For staging patterns that need to remember a source name on a staging table, annotations are usually the right choice. For project-wide settings, attaching an ObjectTag to the 'RootNode' keeps the value available everywhere.